Using XML-RPC
Joe Johnston
Senior Software Engineer, O'Reilly and Associates
July, 2001
The World Wide Web is an accidental system. Its original requirements were to simply make the files of Tim Berners-Lee (see Resources (1)) accessible from any workstation on CERN's Local Area Network. This simple idea was so useful and easy that it quickly moved beyond the hands of its creator.
It is unlikely most of us would have heard of the Web without the addition of the Common Gateway Interface (CGI) to the original HyperText Transfer Protocol (HTTP). CGI turned the Web from a read-only environment into an interactive one. Once HyperText Markup Language (HTML) was extended to include FORM widgets, user input collected from web browsers could be processed and dynamic content returned to the user. Whenever a file from a predefined CGI directory is requested, the web server creates a command shell with a dozen or so HTTP-specific environment variables and tries to run the requested file as a program. The CGI program reads in these environment variables and possibly standard input, processes the data and prints an HTML document to standard output. The web server pipes this output back to the requester. It was the advent of CGI-enabled web servers like Apache that spurred the rapid development of the World Wide Web during the last five years of the twentieth century.
Originally, human-driven web browsers were the sole clients of CGI programs. This implied that CGI programs were generating content that a human could immediately use. Of course, programs that talk directly to CGI programs weren't long in coming. These "bots" or "spiders" could visit many more web sites than a human and could pull apart HTML pages for storage in very large databases. To complete this karmic circle, yet more CGI programs were built to access these bot-generated databases so that humans could use that information. These Internet search engines were among the first widely used web applications.
In many ways, Web Services were born out of these bots. The idea of using HTTP to let programs talk to programs has many applications outside basic web spidering. At their core, Web Services like XML-RPC and SOAP are simply eXtensible Markup Language (XML) messages with a well-defined structure that are passed over standard HTTP. Web Services are built to ease communication between very different platforms. Because Web Services mostly happen over HTTP, their conversations follow HTTP's familiar client-server pattern of stateless requests and responses.
The remarkable thing about Web Services is that they are platform agnostic. A Linux box running a Python Web Service can respond to requests from Active Server Pages executed on a Microsoft Windows box. In this way, Web Services have the potential to unite disparate systems more easily than ever before.
SOAP and XML-RPC: Two Web Service Wire Protocols
Simple Object Access Protocol (SOAP) and XML-RPC (eXtensible Markup Language Remote Procedure Call) function similarly. They differ in the structure of the XML that they send over the wire. Because of this, SOAP and XML-RPC can be thought of as Web Service Wire Protocols. However similar these wire protocols seem on the surface, there are some differences that will affect which one suits a given project.
SOAP is designed to solve the thorny problem of passing live objects over the network. Remember that Web Services are language neutral, so in theory, a Python client can create a new Web Service object that is implemented by a C# server. In practice, SOAP objects merely store instance data and provide access to methods. SOAP structures its messages into headers and payloads. The payload can be any valid XML structure. In other words, SOAP is a general XML message passing system. Its flexibility comes at the cost of a fairly complex message structure. This hasn't deterred companies like Microsoft and IBM from embracing SOAP and making it their preferred Web Service wire protocol.
XML-RPC, on the other hand, isn't concerned with objects or passing
arbitrary XML payloads. It is designed to be a simple procedural way for
a client program to make function requests of another program. XML-RPC
allows very complex data structures to be communicated, but stops short
of making object passing easy. XML-RPC conversations are stateless,
meaning that each XML-RPC request is considered in isolation from previous
requests, although there are workarounds for this limitation. XML-RPC first
appeared in UserLand's Frontier web content management system, which
helps explain why the first Perl XML-RPC library was called Frontier::RPC2.
Looking Under the Hood of XML-RPC
To get a better understanding of how Web Services in general and XML-RPC
in particular work, let's look at a typical XML-RPC dialog. In order for
an XML-RPC client to make a remote procedure call, it needs the URL of the
server and the API of that Web Service, which defines the publicly
available functions including expected inputs and outputs. Currently, there
is no universally accepted way to describe an XML-RPC interface, although
the Web Services Description Language (WSDL) does do this for SOAP services.
However the client obtains the required information, it will encode its RPC
request according to the XML-RPC specification. For this discussion, let's
assume that the client wishes to call the function repeat_string, which
expects to be passed a string and an integer. The function will return a new
string which is the original string repeated the number of times
indicated by the passed-in integer. The client message for requesting
this function might look like this:
Listing 1: XML-RPC sample client call
<?xml version="1.0"?>
<methodCall>
  <methodName>repeat_string</methodName>
  <params>
    <param>
      <value><string>hip, hip, hurray!</string></value>
    </param>
    <param>
      <value><i4>3</i4></value>
    </param>
  </params>
</methodCall>
As is typical of all XML documents, this one starts off with the XML declaration. Although the XML specification allows for Unicode, XML-RPC messages ought to be in ASCII (see Resources (2)). The next XML tag, "methodCall", identifies this document as an XML-RPC request. It is followed by the tag "methodName", which indicates the desired remote procedure to execute. In order to support arguments to the remote procedure, the "params" tag may contain zero or more individual "param" tags. Each "param" tag indicates a function argument, which must further be tagged to indicate the passed datatype. XML-RPC supports a small but very adequate set of datatypes including strings, integers, floats, booleans, arrays and structs, which behave like dictionaries (see the XML-RPC spec). In Listing 1, two arguments are passed: the string "hip, hip, hurray!" and the integer "3". By explicitly tagging data, XML-RPC allows loosely typed languages like Perl to interact with more strongly typed languages like C.
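As a rough illustration of this explicit tagging, the sketch below uses Python's standard xmlrpc.client module to encode the same call as Listing 1 and then decode it again. Python is used here only because its standard library includes an XML-RPC implementation; the article's own examples are in Perl. Note that this library emits the "int" tag, which the XML-RPC specification treats as a synonym for "i4".

```python
import xmlrpc.client

# Encode the call from Listing 1: one string argument and one integer argument.
request_xml = xmlrpc.client.dumps(
    ("hip, hip, hurray!", 3), methodname="repeat_string"
)
print(request_xml)

# Decoding the same document recovers native Python values and the method name.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

The type tags in the printed XML are what let the receiving side rebuild a string and an integer without guessing.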
The server's response to the above client request is shown in Listing 2.
Listing 2: XML-RPC sample server response
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value>
        <string>hip, hip, hurray!hip, hip, hurray!hip, hip, hurray!</string>
      </value>
    </param>
  </params>
</methodResponse>
The "methodResponse" tag after the ubiquitous XML declaration informs us that this is an XML-RPC response document. It has a simpler structure because it only needs to deliver the results of the requested remote procedure. Whatever data is returned begins with the "params" tag, which may be confusing in this context since there are no parameters. XML-RPC remote procedures have at most one return value, although that value may be a complex data structure. Here, a simple string is returned.
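To make the shape of the response concrete, here is a minimal sketch that feeds the document from Listing 2 to Python's standard xmlrpc.client.loads function. The variable names are illustrative only.

```python
import xmlrpc.client

# The response document from Listing 2.
response_xml = """<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value>
        <string>hip, hip, hurray!hip, hip, hurray!hip, hip, hurray!</string>
      </value>
    </param>
  </params>
</methodResponse>"""

# loads() returns a tuple of values plus a method name (None for responses).
values, method = xmlrpc.client.loads(response_xml)
result = values[0]  # a response carries a single return value
print(result)
```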
In order to get a request document to an XML-RPC server, the client program must make an HTTP POST request to the XML-RPC server's URL. Once the server has processed the request, it sends a standard HTTP response including headers back to the client.
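The request side of that exchange can be sketched without any XML-RPC client library at all. The following Python fragment only builds the HTTP POST; it does not send it. The URL is a placeholder for a locally running server, and the "text/xml" content type is the one the XML-RPC specification calls for.

```python
import urllib.request
import xmlrpc.client

# Placeholder endpoint; substitute the URL of a real XML-RPC server.
SERVER_URL = "http://127.0.0.1:1080/RPC2"

# The request document becomes the body of the POST.
body = xmlrpc.client.dumps(
    ("hip, hip, hurray!", 3), methodname="repeat_string"
).encode("utf-8")

request = urllib.request.Request(
    SERVER_URL,
    data=body,
    headers={"Content-Type": "text/xml"},
    method="POST",
)

# urllib.request.urlopen(request) would send it and return the HTTP response,
# whose body is the methodResponse document.
print(request.get_method(), request.full_url, len(body), "bytes")
```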
This low-level view of XML-RPC obscures how programs will typically
interact with it. All the XML-RPC libraries available mask many of the
details of encoding an RPC request into XML, making the HTTP POST, and
parsing the response. Instead, a typical client request might look like the
following Perl code that uses the XMLRPC::Lite library:
Listing 3: XML-RPC sample Perl client
1 #!/usr/bin/perl --
2
3 use strict;
4 use XMLRPC::Lite;
5 use constant SERVER_URL => 'http://127.0.0.1:1080/RPC2';
6
7 my $client = XMLRPC::Lite->proxy(SERVER_URL);
8 my $ret = $client->call('repeat_string', 'hip hip hurray!', 3);
9 print $ret->result;
The reader need not be familiar with Perl to see that there is no
direct manipulation of either XML or HTTP happening at this layer of code.
After pulling in the library and setting up a constant which holds the URL
of the XML-RPC repeat_string service, a new XMLRPC::Lite object is created
on line 7. The RPC call happens on line 8, in which the name of the
remote procedure along with any arguments is passed to the call
method. This returns an XMLRPC::SOM object, whose result method
recovers the value returned by the server. Note that in
this client program, there is only native language data. The details
of translating Perl data into XML-RPC tags are handled by the library.
An even bigger win is that the return from the server is also translated
back into Perl data. The server's implementing language is irrelevant;
it is the server's API that is important. In this way, code is separated
from service. This has some powerful implications for the Open Source
community, but that is beyond the scope of this paper
(see Resources (3)).
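The same separation of code from service can be sketched in Python, whose standard library provides both sides of the conversation. The example below is an assumption-laden stand-in, not the server the Perl client talks to: it starts a throwaway repeat_string server on a free local port and calls it through xmlrpc.client.ServerProxy.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def repeat_string(text, count):
    """Return text repeated count times, as in Listings 1 and 2."""
    return text * count


# Port 0 asks the operating system for any free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(repeat_string)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client sees only native Python data; XML and HTTP stay hidden.
client = xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}/RPC2")
result = client.repeat_string("hip hip hurray!", 3)
print(result)

server.shutdown()
server.server_close()
```

Nothing in the client lines reveals that the server is written in Python; a Perl or PHP client could call the same procedure.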
An Extended Example of a Perl Client (Meerkat)
To show a real world example of XML-RPC in action, let's look at how to build a client to talk to O'Reilly's Meerkat wire service. Meerkat amalgamates Rich Site Summary (RSS) feeds from other news sites and provides a simple UI to browse their headlines. Meerkat also has an XML-RPC interface that allows easier programmatic access to this system (see Resources (4)).
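Meerkat assigns every site's RSS feed an integer ID. RSS feeds are grouped into topic categories, each of which also has an integer ID. The client below relies on these three procedures from the XML-RPC API:
meerkat.getCategories: returns the list of categories, each with a title and an ID.
meerkat.getChannelsByCategory: given a category ID, returns the RSS feeds (channels) in that category.
meerkat.getItems: given a structure of search criteria, returns the matching headline items.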
The program that follows is a simple command line interface Meerkat client that allows users to first search for Meerkat categories, then find recent stories under that category.
Listing 4: Meerkat client, part 1
1   #!/usr/bin/perl --
2   # CLI meerkat client -- jjohn 7/01
3
4   use strict;
5   use XMLRPC::Lite;
6   use Getopt::Std;
7   use constant SERVER_URL =>
8       'http://www.oreillynet.com/meerkat/xml-rpc/server.php';
9
10  my %opts;
11  getopts('?c:l', \%opts);
12
13  unless( $opts{'c'} || $opts{'l'} ){
14      die qq{
15  $0 - command line interface to Meerkat
16
17  USAGE:
18      $0 [OPTIONS]
19
20  OPTIONS:
21
22      c <STRING>  get stories associated with this Channel
23      l           list available Channels with IDs
24      ?           print this screen
25  };
26  }
27
28  my $meerkat = XMLRPC::Lite->proxy(SERVER_URL);
29
Listing 4 shows the beginning of the program. As in the brief example
in Listing 3, the XMLRPC::Lite library is used. As with most Perl
modules, XMLRPC::Lite is available on CPAN (see Resources (5)).
Lines 7 and 8 define a constant that contains the URL of the Meerkat
XML-RPC server. In order to process command line arguments easily, the
standard Perl library Getopt::Std is used. On line 11, the getopts
function looks for three command line flags: '?' for help, 'c' to get the
desired category's stories, and 'l' to list all categories. The colon after
the 'c' indicates that this flag should be invoked from the command line
with an additional argument. Any flags that are detected are stored as
keys in the %opts hash. Options that do not take additional arguments
and are present will have the value of '1' in the hash.
Here, $opts{'c'} will contain the value supplied on the command line.
Line 13 examines how the program was invoked. If the user hasn't
requested a category's headlines or the listing of all categories, then
a usage screen is displayed. The qq{} operator is a synonym for double
quotes. The special variable $0 holds the name of the program as it
was invoked. A new XMLRPC::Lite object is created for the Meerkat service
on line 28.
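For readers more at home in Python, the flag handling just described can be approximated with the standard getopt module, which accepts the same '?c:l' option string. This is only a sketch: parse_flags is a hypothetical helper, and "Linux" is a made-up category name.

```python
import getopt


def parse_flags(argv):
    """Parse the '?', 'c <value>' and 'l' flags into a dict, like %opts."""
    pairs, _rest = getopt.getopt(argv, "?c:l")
    # Flags that take no argument get the value 1, mirroring Getopt::Std.
    return {flag.lstrip("-"): (value if value else 1) for flag, value in pairs}


opts = parse_flags(["-c", "Linux", "-l"])
print(opts)

if not (opts.get("c") or opts.get("l")):
    print("usage: client [-c CATEGORY] [-l] [-?]")
```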
Listing 5: Meerkat client, part 2
30  # Execute command
31  if( $opts{'c'} ){
32      print "Fetching RSS feeds for $opts{c}\n";
33      my $rss_feeds = get_feeds( $meerkat, $opts{c} );
34      print "Found ", scalar @{ $rss_feeds }, " channels\n";
35      my $cnt = 0;
36      for my $id ( @{ $rss_feeds } ){
37          my $items = get_links( $meerkat, $id );
38          for my $i ( @{ $items } ){
39              printf "%3d: %s (%s)\n", ++$cnt, $i->{title}, $i->{link};
40          }
41      }
42
In Listing 5, the command to get a category's headlines is implemented.
After giving the user some feedback on line 32, the XMLRPC::Lite
object and the requested category are passed into the subroutine
get_feeds(), which returns a reference to an array of RSS feed IDs.
The subroutine get_feeds() is shown in Listing 9. Because there can be
some latency in making a Web Service call, it is important to give the
user some feedback before and after the call. In lines 36-41, each feed ID
is passed to get_links(), which returns an array reference of hashes, each
of which is a headline record containing a "title" and "link" field. Each
headline record is then neatly displayed with a running count.
Listing 6: Meerkat client, part 3
43  }elsif( $opts{'l'} ){
44      print "Meerkat Categories\n";
45      my $cats = get_categories( $meerkat );
46      for my $c (@{$cats}){
47          print "\t$c->{title} ($c->{id})\n";
48      }
49
Listing 6 presents the code that lists all the categories on Meerkat.
The heart of this code block is get_categories(), which returns a
reference to a list of hashes with the keys "title" and "id". Lines 46-48
simply print out the returned list.
Listing 7: Meerkat client, part 4
50  }else{
51      # Should never get here
52      die "ERROR: Unknown option\n";
53  }
54
55  exit;
56
Listing 7 is just a sanity check. After checking for known conditions,
it is important to handle the unexpected ones. The usage check in Listing 4
should catch most of these, but this bit of defensive programming is
useful if more flags are added to the getopts line without further
code being added to this if-elsif block. The main block of this
program ends on line 55. The meat of the XML-RPC interaction happens in
the subroutines.
Listing 8: Meerkat client, part 5
57  #---------
58  # subs
59  #---------
60  # Return an array ref of records { title => "", id => "" }
61  sub get_categories {
62      my ($c) = @_;
63      my $ret = $c->call('meerkat.getCategories');
64      if( $ret->fault ){
65          die "ERROR: ", $ret->fault->{faultString}, "\n";
66      }
67      return $ret->result;
68  }
69
Listing 8 shows the implementation of get_categories(). It expects
to be passed a live XMLRPC::Lite object. On line 63, the remote
procedure meerkat.getCategories is called. It returns an XMLRPC::SOM
object that can be tested for server errors. If none are present, the
payload is returned to the main line.
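The same test-for-fault-then-return pattern can be sketched with Python's standard xmlrpc.client, where a fault response surfaces as a Fault exception when the document is decoded. The fault code and message below are invented for illustration, and unpack is a hypothetical helper.

```python
import xmlrpc.client

# Build a fault response like one a server might send (values are made up).
fault_xml = xmlrpc.client.dumps(
    xmlrpc.client.Fault(4, "Unknown method"), methodresponse=True
)


def unpack(response_xml):
    """Return the single result value, or raise RuntimeError on a fault."""
    try:
        values, _method = xmlrpc.client.loads(response_xml)
    except xmlrpc.client.Fault as fault:
        raise RuntimeError(f"ERROR: {fault.faultString}") from fault
    return values[0]


try:
    unpack(fault_xml)
except RuntimeError as err:
    print(err)
```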
Listing 9: Meerkat client, part 6
70  # return the RSS IDs for the given category
71  sub get_feeds {
72      my ($c, $cat) = @_;
73
74      my $cat_id = -1;
75      if( $cat =~ /^\d+$/ ){
76          $cat_id = $cat;
77      }else{
78          # Need to look this up (take the first match)
79          my $cats = get_categories( $c );
80          for my $c ( @{ $cats } ){
81              if( $c->{title} =~ /\Q$cat/i ){
82                  $cat_id = $c->{id};
83                  last;
84              }
85          }
86      }
87
88      # Find the feeds
89      my @feeds;
90
91      my $ret = $c->call('meerkat.getChannelsByCategory', $cat_id);
92      if( $ret->fault ){
93          die "ERROR: ", $ret->fault->{faultString}, "\n";
94      }else{
95          for my $r (@{$ret->result} ){
96              push @feeds, $r->{id};
97          }
98      }
99      return \@feeds;
100 }
101
Listing 9 has the implementation of get_feeds(). This subroutine expects
to be passed an XMLRPC::Lite object along with the "category", which can
be either an integer ID or a string. If passed an integer, it is assumed
to be an ID. Otherwise, the string is matched against the
list of categories to determine the associated ID. Once the category ID is
known, the remote Meerkat procedure getChannelsByCategory() can be called.
The RSS IDs are then extracted from the list of records returned by
the remote procedure and handed back to the caller.
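The lookup rule described above (use a numeric argument as the ID, otherwise take the first category whose title contains the string, ignoring case) can be isolated from the network call. The Python sketch below assumes category records shaped like those from get_categories(); the sample titles and IDs are invented.

```python
import re


def resolve_category_id(wanted, categories):
    """Return the category ID for wanted, or -1 when nothing matches."""
    if re.fullmatch(r"[0-9]+", wanted):
        return int(wanted)  # already an ID
    for category in categories:
        # First case-insensitive substring match wins.
        if wanted.lower() in category["title"].lower():
            return category["id"]
    return -1


# Made-up records in the { title, id } shape the client expects.
sample = [{"title": "Linux", "id": 7}, {"title": "Open Source", "id": 12}]
print(resolve_category_id("open", sample))
```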
Listing 10: Meerkat client, part 7
102 # return an array of structs for this RSS_ID
103 sub get_links {
104     my ($c, $rss_id) = @_;
105
106     my $ret = $c->call('meerkat.getItems', { channel   => $rss_id,
107                                              num_items => 10,
108                                            }
109                       );
110
111     if( $ret->fault ){
112         die "ERROR: ", $ret->fault->{faultString}, "\n";
113     }else{
114         return $ret->result;
115     }
116 }
The last subroutine, get_links(), is shown in Listing 10. Given a live
XMLRPC::Lite object and an RSS ID, it returns up to 10 of that feed's
most recent headlines. Meerkat's getItems() function takes a single
structure that allows very flexible control of both the search criteria
and the format of the returned headline objects.
Assuming no server errors, the results are passed back to the caller.
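The hash reference passed to getItems() travels over the wire as a single XML-RPC struct. The following Python sketch shows that encoding with the standard xmlrpc.client module; the channel ID 42 is a made-up value, and only the two keys used in Listing 10 appear.

```python
import xmlrpc.client

# Same two keys as the Perl hash in Listing 10; 42 is a hypothetical channel ID.
criteria = {"channel": 42, "num_items": 10}

# One parameter, whose value is a struct with two members.
request_xml = xmlrpc.client.dumps((criteria,), methodname="meerkat.getItems")
print(request_xml)

params, method = xmlrpc.client.loads(request_xml)
```

Passing one struct rather than a list of positional arguments lets a procedure accept optional, named criteria without changing its signature.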
Buried in the Meerkat example is controversy. Meerkat's XML-RPC service runs as an ordinary PHP page. There are those who question the practice of co-mingling Web Services with traditional web documents. The argument against doing such a thing is that normal web pages are generally for human consumption and Web Services are for programs that will be a lot more demanding of server resources. Web Services, say these separatists, should be isolated on their own TCP port, away from port 80. In fact, many XML-RPC libraries support creating standalone HTTP servers just for a particular application.
On the other hand, freely mixing Web Services with traditional CGI applications has a lot less administrative overhead. Because Web Services happen over HTTP, why not let the web server, already heavily optimized to deliver content over that protocol, do the heavy lifting for the Web Service? Things like HTTP authentication and SSL can happen inside the web server, allowing the Web Service to be completely isolated from those decisions. Both arguments have their supporters, and it is best to consider both sides before deploying a production quality XML-RPC service. The next section creates a simple standalone XML-RPC Perl server that allows users to look up popular American slang from the 1970s.
Listing 11: Slang Server, part 1
1  #!/usr/bin/perl --
2  # Provide access to 70's terms from
3  # http://www.inthe70s.com/generated/terms.shtml
4
5  use strict;
6  use Frontier::Daemon;
7
8  my $Terms = get_terms();
9
10 Frontier::Daemon->new(
11     methods => {
12         random => \&random,
13         lookup => \&define,
14         define => \&add,
15     },
16     LocalPort => 1080,
17     Reuse     => 1,
18 );
Listing 11 shows the beginning of the server. Line 8 defines the
file-scoped hash reference that holds all the slang terms along with their
definitions. The get_terms() subroutine is shown in Listing 15.
Perl has three XML-RPC libraries, the oldest of which is Frontier::RPC2.
On line 6, the server library Frontier::Daemon is brought in.
It creates a standalone
HTTP server to field RPC requests. This class is derived from
HTTP::Daemon, itself derived from IO::Socket.
When creating any Web Service, one has to define the API, which consists of publicly defined names and inputs defined in XML-RPC data types. The server maps the API procedure names into Perl subroutines. This library does nothing to enforce the API procedure signatures. This task is left up to the individual subroutines.
The 'methods' parameter in lines 11-15 takes a hash reference whose keys are the API procedure names and whose values are references to the Perl subroutines that will implement the API.
This new Frontier::Daemon will listen on local TCP port 1080. The 'Reuse'
parameter indicates that the operating system should not wait for all the
client sockets to close before allowing other programs to use the 1080
port.
Once created, this server object never returns. It merely waits to service incoming connections.
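The mapping from public API names to implementing functions can be sketched in Python with the standard library's SimpleXMLRPCDispatcher, which is the piece of xmlrpc.server that performs this lookup. This is an analogue, not the Frontier::Daemon implementation: the one-entry TERMS table is invented, and the example calls the dispatcher's underscore-prefixed _marshaled_dispatch method directly so that no socket is needed.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

TERMS = {"BOSS!": "Cool; awesome"}  # tiny stand-in for the slang table


def define(term):
    """Implementation behind the public name 'lookup'."""
    return TERMS.get(term.upper(), "__UNKNOWN__")


dispatcher = SimpleXMLRPCDispatcher()
# Public API name on the right, implementing function on the left.
dispatcher.register_function(define, "lookup")

# Feed the dispatcher a request document and decode its response document.
request = xmlrpc.client.dumps(("boss!",), methodname="lookup")
response = dispatcher._marshaled_dispatch(request.encode("utf-8"))
values, _method = xmlrpc.client.loads(response)
print(values[0])
```

As with the Perl library, nothing here checks argument types or counts; each function is responsible for validating what it receives.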
Listing 12: Slang Server, part 2
19 #------
20 # subs
21 #------
22 # XML-RPC API functions
23 #
24 sub random {
25     my @keys = keys %{$Terms};
26     my $idx  = int rand(scalar @keys);
27     my $term = $keys[ $idx ];
28     return $term . "($idx): " . $Terms->{ $term };
29 }
30
When clients call the 'random' API procedure, a term selected at random
is returned to the user. Notice that the return is a simple string that
contains the term, its index and its definition. The library encodes the
returned value into an XML-RPC message with the Frontier::RPC2 class. All
this is transparent to the server author.
Listing 13: Slang Server, part 3
31 sub define {
32     my ($term) = @_;
33
34     if( exists $Terms->{uc $term} ){
35         return $Terms->{uc $term};
36     }else{
37         return "__UNKNOWN__";
38     }
39 }
40
When users have a term, they can get its definition using the API procedure
'lookup', which is implemented with the Perl subroutine define(). So as not
to make users remember proper casing, all lookups are upcased. Again, a
simple string is returned to the requester.
Listing 14: Slang Server, part 4
41 sub add {
42     my ($term, $definition) = @_;
43
44     if( !$term || !$definition ){
45         return;
46     }else{
47         $Terms->{ uc $term } = $definition;
48     }
49
50     return 1;
51 }
52
Users can also define new terms or update existing ones with the API
method 'define', which is implemented with the Perl subroutine add().
Line 44 does some sanity checking to make sure that both a term and a
definition string were passed in. Notice that this function is dealing
only with Perl data. The Frontier library took care of the details of
translating the XML request into Perl data.
Listing 15: Slang Server, part 5
53 #-------------
54 # helper subs
55 #-------------
56 sub get_terms {
57     my %terms;
58     while(<DATA>){
59         chomp;
60         my($k,$v) = split/:/, $_, 2;
61         $terms{uc $k} = $v;
62     }
63     return \%terms;
64 }
65
While not a part of the API, get_terms() is important. It reads in
a colon-separated list of terms and definitions stored in the __DATA__
section of the script. An abbreviated list of terms and definitions
stored in that section is shown below. Please note that long lines were
wrapped here for display purposes. In the real code, one line holds both the
term and definition.
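The parsing step is small enough to sketch in a few lines of Python. The detail worth noticing is the split limit: only the first colon separates term from definition, so definitions may themselves contain colons. The function name parse_terms is hypothetical, and skipping lines without a colon is an addition the Perl version does not make.

```python
def parse_terms(lines):
    """Build an upper-cased term -> definition table from 'term:definition' lines."""
    terms = {}
    for line in lines:
        line = line.rstrip("\n")
        if ":" not in line:
            continue  # skip blank or malformed lines (not in the Perl original)
        key, value = line.split(":", 1)  # split on the first colon only
        terms[key.upper()] = value
    return terms


sample_lines = [
    "Boss!:Cool; awesome\n",
    'Bread:Money; Cash. "Do you have any bread?"\n',
]
terms = parse_terms(sample_lines)
print(terms["BOSS!"])
```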
Listing 16: Slang Server, __DATA__ section
66 __DATA__
67 A.F.A.:Insead of signing letters, photos,etc. sincerely we would
sig A.F.A. That stood for A Friend Always
68 Boss!:Cool; awesome
69 Bread:Money; Cash. "Do you have any bread?"
70 C.B. Radio Slanguage:The on air language used over cb radio,
one of the most memorable fads of the 70's 10-4 GOOD BUDDY!
The server is only a brief sketch of what could be done with XML-RPC. Creating interfaces to databases like this one is a natural fit for Web Services.
PHP is a popular web scripting language that allows users to add programming
logic to otherwise plain HTML. This is an example of a technology called
server side includes, of which ASP, ColdFusion and HTML::Mason are
also a part. The XML-RPC library used here was written by Edd Dumbill
(see Resources (6)). Figure 1 shows what this
client looks like to the user.
Figure 1: PHP Client in action
Listing 17: PHP Client, part 1
1 <html>
2 <body bgcolor="#FFFFFF">
3 <h1>Seventies Term Server Client</h1>
4 <?
5 include("xmlrpc/xmlrpc.inc");
6 $Client = new xmlrpc_client ("/RPC2", "127.0.0.1", 1080);
7 $Client->setDebug( 0 );
8 ?>
Many PHP pages start out as typical web pages. Lines 1-3 of Listing 17
are standard HTML. The PHP block that begins on line 4 pulls in the XML-RPC
PHP library and creates a new client object, $Client. Because this client
object will be used in other code blocks, I have capitalized it to remind me
that it is a global. The setDebug() method on line 7 may be given a non-zero
value to make debugging much easier.
Listing 18: PHP Client, part 2
9  <p>A random term:
10 <?
11     $m = new xmlrpcmsg("random");
12     $r = $Client->send( $m );
13     if( !$r->faultCode() ){
14         $v = $r->value();
15         print "<b>" . $v->scalarval() . "</b><br>";
16     }
17 ?>
Listing 18 shows a simple XML-RPC call in PHP. Here, the API procedure
'random' is called. With this library, a new xmlrpcmsg object is first
created. Its constructor takes two arguments: the name of the API procedure
and an optional array of arguments for the RPC. On line 12, that xmlrpcmsg
object is passed to the send() method, which returns an xmlrpcresp object.
This new object can be tested for server errors, as line 13 shows. To get at
the returned data, the value() method is invoked to return an xmlrpcval
object. On line 15, the actual return value is recovered with the method
scalarval().
Listing 19: PHP Client, part 3
18 <?
19     if( $term ){
20         # lookup previously submitted term
21         $d = new xmlrpcval($term, "string");
22         $m = new xmlrpcmsg("lookup", array($d));
23         $r = $Client->send( $m );
24         if( !$r->faultCode() ){
25             $v = $r->value();
26             print "<p>'$term' is defined as:<br>";
27             print "<b>" . $v->scalarval() . "</b><br>";
28         }
29     }
30 ?>
Previously submitted form elements in PHP are typically available as scalars.
Here, $term will hold the sought slang term submitted by the form shown
in Listing 20. If there is a term to search for, it needs to be wrapped
in an xmlrpcval object (see line 21). On line 22, a new xmlrpcmsg
object is created for the call to 'lookup'. Notice that the argument to
'lookup' needs to be in an array. The rest of the code proceeds in much the
same way as the previous listing.
Listing 20: PHP Client, part 4
31 <p>Look up a new one:
32 <form>
33 <p>Term: <input name="term"> <input type="submit" value="Search">
34 </form>
35 </body>
36 </html>
Listing 20 is a standard HTML form, used here to gather the user's input for a slang term to look up. Although PHP operates in a different environment than Perl, XML-RPC provides a simple bridge between them.
This paper has introduced the XML-RPC protocol and provided some examples of where this Web Service Wire Protocol is useful. While the XML-RPC specification is done, the development of the individual libraries continues. Better handling of HTTP authentication and providing automatic introspection of the Web Service API are two areas where individual language implementations vary greatly. Exploring better ways to encrypt message payloads using SSL or something like the Blowfish encryption algorithm is also an area under development. There is a great effort underway to make XML-RPC and SOAP interoperate, which will improve the adoption rate of XML-RPC. Perhaps the most interesting aspect of XML-RPC's future lies with the new developers that are just discovering the protocol. Open Source advocate Eric Raymond posited that "given enough eyeballs, all bugs are shallow". It is also true that new eyes see old problems in a new way. XML-RPC's future is brighter than ever.
More information on using XML-RPC can be found in the recently published O'Reilly book, Programming Web Services with XML-RPC (see Resources (7)).