Latest News

25th September 2009

It occurred to me today that there is an important philosophical difference between the way older web service protocols like XML-RPC and SOAP think of themselves and the way REST does.

SOAP and XML-RPC spend a lot of their specs on the format of their messages, in particular how application data is serialized into XML. How that message is moved from one host to another is less important. Both protocols assume that developers will use HTTP, but other transports can certainly be used. In fact, Leostream at one point sent XML-RPC messages across a pipe rather than a TCP socket.

REST, on the other hand, focuses almost entirely on how messages are passed. That is, REST is tightly bound to HTTP. Application data can be serialized in a number of ways, none of which is particularly important to REST.
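
To make the contrast concrete, here is a rough sketch of the same imaginary "shorten a URL" call in both styles. The method name shorten, the host api.example.com and the longUrl parameter are placeholders I made up for illustration; no real service is being described.

```perl
use strict;
use warnings;

# Hypothetical input; a real client would XML-escape and URL-encode it.
my $long_url = 'http://example.com/some/long/path';

# XML-RPC style: the spec defines this XML envelope. How the bytes reach
# the server (HTTP, a pipe, a raw socket) is a separate question.
my $xmlrpc_body = <<"XML";
<?xml version="1.0"?>
<methodCall>
  <methodName>shorten</methodName>
  <params>
    <param><value><string>$long_url</string></value></param>
  </params>
</methodCall>
XML

# REST style: the HTTP verb and the URL carry the meaning of the call.
# The response body could be XML, JSON or anything else.
my $rest_request = "GET /shorten?longUrl=http%3A%2F%2Fexample.com%2Fsome%2Flong%2Fpath HTTP/1.1\r\n"
                 . "Host: api.example.com\r\n\r\n";

print $xmlrpc_body, "\n", $rest_request;
```

In the first case all the interesting detail lives in the message body; in the second it lives in the request line.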

20th July 2009

(Note: Thanks to gizmo, I have corrected an abbreviation expansion problem.)

Uniform Resource Locators are an addressing scheme at the heart of the Web. Without them, there would be no standard way to refer to a resource offered by a web server. URLs remove the ambiguity of addressing a resource, but at the cost of creating some rather formidable namespaces (e.g. https://addons.mozilla.org/en-US/firefox/addon/9549).

In general, long URLs aren't a problem. Either through web page hyperlinks or web browser bookmarks, URLs fade into the background for most users. However, sometimes it is more convenient to have a shorter reference to a resource than the fully qualified URL. For example, in the late nineties on IRC, it was common to see tiny.cc URLs pasted into chat rooms, because long URLs tend to clutter up already busy chat room windows. With the advent of text message-based systems like Twitter, which limit status updates to 140 characters, long URLs actually consume a valuable resource. The most common URL shortener used on Twitter.com appears to be bit.ly.

There are several URL shortening services out there and they all work pretty much the same way. The user supplies the full URL. The service hashes the URL into something smaller and appends this to its own namespace. Using the bit.ly service, the Mozilla URL becomes: http://bit.ly/g0Z9. When someone accesses this bit.ly URL, he will be seamlessly redirected to the original resource.
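
I don't know how bit.ly actually derives its keys, so the following is only a toy sketch of the general idea. Instead of a real hash it hands out a counter encoded in base 62 and keeps the mapping in memory. The names base62, shorten and expand, and the short.example host, are all made up for illustration.

```perl
use strict;
use warnings;

# Toy in-memory shortener. This is NOT bit.ly's algorithm.
my @ALPHABET = ('0'..'9', 'a'..'z', 'A'..'Z');   # 62 symbols
my %url_for_key;    # short key -> long URL
my %key_for_url;    # long URL  -> short key, so repeats reuse a key
my $next_id = 1;

# Encode a non-negative integer as a base-62 string.
sub base62 {
    my ($n) = @_;
    my $s = '';
    do {
        $s = $ALPHABET[$n % 62] . $s;
        $n = int($n / 62);
    } while ($n > 0);
    return $s;
}

# Return the short key for a URL, allocating a new one if needed.
sub shorten {
    my ($long_url) = @_;
    return $key_for_url{$long_url} if exists $key_for_url{$long_url};
    my $key = base62($next_id++);
    $url_for_key{$key}      = $long_url;
    $key_for_url{$long_url} = $key;
    return $key;
}

# Look up the long URL for a key (undef if the key is unknown).
sub expand {
    my ($key) = @_;
    return $url_for_key{$key};
}

my $key = shorten('https://addons.mozilla.org/en-US/firefox/addon/9549');
print "http://short.example/$key\n";
print expand($key), "\n";
```

A real service would persist the table and answer a request for the short URL with an HTTP redirect to the long one.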

Bit.ly provides a REST interface to their service (API). To use this, create an account on bit.ly's system. Now you are ready to build a Perl REST client for the shorten service (http://api.bit.ly/shorten).

The following code is a listing of a small command line Perl script that expects to be passed a long URL. It uses the bit.ly REST service to return a shortened version.

use strict;
use LWP::UserAgent;
use Getopt::Std;
use HTTP::Request;
use URI;

my $VERSION = "1.0";
my $Opts = {};
my $bitly_api_url = q[http://api.bit.ly/shorten];
getopts('u:p:?', $Opts);
my $long_url = shift @ARGV;  # first argument left over after option parsing

if (!$long_url || $Opts->{'?'}) {
    print usage();
    exit;
}

set_defaults($Opts);

my $ua = LWP::UserAgent->new;
my $fetch_url = URI->new($bitly_api_url);
$fetch_url->query_form({'version' => "2.0.1",
                         'format'  => "xml",
                         'longUrl' => $long_url,
                     });
my $req = HTTP::Request->new(GET => $fetch_url);
$req->authorization_basic($Opts->{u} => $Opts->{p});

my $res = $ua->request($req);
if ($res->code == 200) {
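    # Pull the short URL out of the XML response. The <shortUrl> element
    # name is assumed from bit.ly's XML response format.
    my ($url) = ($res->content
       =~ m!<shortUrl>([^<]+)</shortUrl>!);
    unless ($url) {
        warn("FAIL: [". $res->content . "]\n");
        exit 1;
    }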
    print "$url\n";
    exit;
} else {
    warn("FAIL:[".$res->content."]\n");
    exit 1;
}

#-----
# sub
#-----
sub usage {
    return <<"EOT";
Usage: $0 [OPTIONS] LONG_URL

OPTIONS

  ? - Display this screen
  u [USERNAME] - Bit.ly username
  p [PASSWORD] - Bit.ly password

EOT
}

sub set_defaults {
    my ($h) = @_;
    $h->{u} ||= "taskboy3000";
    $h->{p} ||= "s3c3rt";
}

This code uses the standard Perl module Getopt::Std to parse optional command line arguments. The set_defaults function merely uses my bit.ly credentials if none are provided through optional parameters. Next, a new LWP::UserAgent object is created to make client HTTP calls. The bit.ly shorten service expects a GET request with optional arguments encoded as query parameters in the URL. The bit.ly service can respond to requests with data in various formats (e.g. XML, JSON). In this case, the format parameter is set to "xml".

The URI class manages the extra parameters through the query_form method and urlencodes these into the new URL. A simple HTTP::Request object is passed the new URL and the bit.ly credentials are added to the HTTP request header using the authorization_basic method.
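
To see roughly what query_form is doing on my behalf, here is a hand-rolled sketch that uses only core Perl. The helper names percent_encode and build_query are my own, not part of the URI module, and the real module handles more cases than this, so treat it as an illustration rather than a replacement.

```perl
use strict;
use warnings;

# Percent-encode everything except the RFC 3986 "unreserved" characters.
# Assumes plain ASCII/byte strings.
sub percent_encode {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $s;
}

# Build a query string from an ordered list of key/value pairs.
sub build_query {
    my @pairs = @_;
    my @parts;
    while (my ($k, $v) = splice(@pairs, 0, 2)) {
        push @parts, percent_encode($k) . '=' . percent_encode($v);
    }
    return join('&', @parts);
}

my $query = build_query(
    version => '2.0.1',
    format  => 'xml',
    longUrl => 'https://addons.mozilla.org/en-US/firefox/addon/9549',
);
print "http://api.bit.ly/shorten?$query\n";
```

The point is simply that the colon and slashes in the long URL must be escaped before they can ride along as a parameter value.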

Once the HTTP request has all the information, it is ready to be sent to the bit.ly server. The HTTP::Request object is passed to the LWP::UserAgent::request method, which contacts the server and returns the response as an HTTP::Response object.

If an error occurred in transmission, the response will have an HTTP status code other than 200. Even if the request succeeds, the service might fail due to missing or bad credentials. A simple regex extracts the shortened URL from the XML message and prints it to standard output for easy consumption by other command line tools.

This script will run on any platform supported by Perl.

17th July 2009
Perl as internet duct tape

For a long time, I've ignored the Representational State Transfer (REST) architecture. For one thing, I don't particularly agree with its premise that remote procedure calls (RPC) that use HTTP as a transport mechanism should obey the same semantics as regular web traffic. Things like XML-RPC and SOAP are, to my thinking, happening on an entirely different layer of the application stack than HTTP. Indeed, there are implementations of XML-RPC that do not use HTTP at all.

I remember pretty heated arguments I witnessed at tech conferences in the early 2000s about this seemingly unimportant technical point. For REST adherents, web services are another form of web traffic and should be treated as such. Given that Twitter, Facebook and Bit.ly all use REST for their APIs and older apps like LiveJournal use XML-RPC/SOAP, I guess REST is the new hotness.

I've recently had reason to interact with the Twitter and Bit.ly APIs. This has made me come to terms with REST RPC mechanisms. I admit, the sad, sick part of me that enjoys playing around with low-level HTTP stuff finds satisfaction in the way these APIs leverage existing HTTP features like basic authentication, extra path info, and GET and POST semantics. In this post, I thought I would show a bit of Perl code I wrote to post status updates to Twitter, an activity more commonly referred to as "tweeting."

Twitter's API documentation is relatively straightforward, if you already have a solid grounding in HTTP. The API call to tweet is called "statuses/update". The basics of the RPC mechanism are easy enough:

  • The caller makes an HTTP GET or POST request
  • The server replies with content in the form of JSON or XML

Let's start with the request. There are several bits of information required by the API: user credentials, the URL and additional query parameters. The user credentials are passed as part of the HTTP request header as a basic authentication field, which is merely a base64 encoding of the username and password of your Twitter account joined by a colon. Fortunately, Perl's HTTP::Request::Common module makes it easy to add basic auth credentials to the request without knowing how this information is encoded in the HTTP request.
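
For the curious, here is a small sketch of what that header looks like when built by hand with the core MIME::Base64 module. The function name basic_auth_header is my own invention; authorization_basic() does the equivalent work for you.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Build the value of an HTTP "Authorization" header for basic auth.
sub basic_auth_header {
    my ($user, $pass) = @_;
    # The empty second argument stops encode_base64 from appending a newline.
    return 'Basic ' . encode_base64("$user:$pass", '');
}

my $header = basic_auth_header('taskboy3000', 's3cr3t');
print "Authorization: $header\n";
```

Note that base64 is an encoding, not encryption. Anyone who can see the request can decode the password, which is worth remembering when the API is served over plain HTTP.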

The next bit is the URL to the function. This is a core idea of REST -- function calls should have URIs and look like ordinary web resources. In this case, the URL is http://twitter.com/statuses/update.xml. Interestingly, the response from Twitter can be encoded in a number of formats. These formats are determined by the extension you give to the URL. For instance, I could have requested the metainformation about myself in JSON with the following URL: http://twitter.com/users/show/taskboy3000.json.

The text of the tweet must be passed to the URL as if it were POSTed from an HTML form, in a parameter named status whose value is form-encoded. Again, Perl makes this very easy, as will be shown below.

use LWP::UserAgent;
use HTTP::Request::Common ('POST');

my $api_url = q[http://twitter.com/statuses/update.xml];
my $status = "Tweeting from the API!";
my $twitter_username = "taskboy3000";
my $twitter_password = "s3cr3t";

my $ua = LWP::UserAgent->new;
my $req = POST($api_url => [status => $status]);
$req->authorization_basic($twitter_username => $twitter_password);

# Make the request
my $res = $ua->request($req);

The code above sets up and makes the status RPC call to Twitter. The first thing needed is an LWP::UserAgent object, which is kind of like a web browser. It makes HTTP requests of web servers. To construct the POST request, I use HTTP::Request::Common::POST. Because I can pass in form parameters as plain Perl data structures, it frees me from worrying about urlencoding values and fooling around with HTTP headers that are germane to the task at hand. POST() returns an HTTP::Request object.

Adding my twitter account credentials to the request is a simple one line call to authorization_basic(). Very handy and very clean. That's all the setup I need to make the request. I pass in the HTTP::Request object to the User Agent object. That makes the actual network connection to the URL. The response comes back in the form of an HTTP::Response object, which I'll discuss next.

If all has gone well with the request, I'll get back an XML document that looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<status>
<created_at>Tue Apr 07 22:52:51 +0000 2009</created_at>
<id>1472669360</id>
<text>At least I can get your humor through tweets. 
RT @abdur: I don't mean this in a bad way, but 
genetically speaking your a cul-de-sac.</text>
<truncated>false</truncated>
<in_reply_to_status_id>1472669230</in_reply_to_status_id>
<in_reply_to_user_id>10759032</in_reply_to_user_id>
<favorited>false</favorited>
<in_reply_to_screen_name></in_reply_to_screen_name>
<user>
<id>1401881</id>
 <name>Doug Williams</name>
 <screen_name>dougw</screen_name>
 <location>San Francisco, CA</location>
 <description>Twitter API Support. Internet, greed, 
users, dougw and opportunities are 
my passions.</description>
 <url>http://www.igudo.com</url>
 <protected>false</protected>
 <followers_count>1027</followers_count>
 <profile_text_color>000000</profile_text_color>
 <profile_link_color>0000ff</profile_link_color>
 <friends_count>293</friends_count>
 <created_at>Sun Mar 18 06:42:26 +0000 2007</created_at>
 <favourites_count>0</favourites_count>
 <utc_offset>-18000</utc_offset>
 <time_zone>Eastern Time (US &amp; Canada)</time_zone>
 <profile_background_tile>false</profile_background_tile>
 <statuses_count>3390</statuses_count>
 <notifications>false</notifications>
 <following>false</following>
 <verified>true</verified>
</user>
</status>

Most of this, I don't care about. However, I do want to see if there's an <error> tag. If so, there was a problem with the post. The way I handle this error checking can be seen in the following code.

 
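unless ($res->is_success) {
    my $c = $res->content;
    my ($errstr) = ($c =~ m!<error>([^<]+)</error>!);
    # Fall back to a generic message when no <error> tag is present
    $errstr = "unknown error" unless defined $errstr;
    warn(sprintf("Post failed (%d): %s\n", $res->code, $errstr));
    exit 1;
}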

print "OK\n";
exit 0; 

Without the services of a full XML parser, it's relatively easy to look for an error tag and extract the contents for display. The error message I've encountered most is essentially "you used the API too much". Twitter does restrict the usage of some of their API calls, but not the status one.

If you collapse all the Perl code, you're looking at fewer than 20 lines of code. If you wanted to, you could even make posts using the very handy command line tool curl:

curl -u taskboy:s3cr3t -d "status=hello curl" \
  http://twitter.com/statuses/update.xml

I will leave the checking of error messages from curl output as an exercise for the reader.

As I said, REST RPC mechanisms are fun and interesting if you already understand HTTP. However, not everyone does. I think XML-RPC and SOAP libraries do a better job of insulating the programmer from HTTP, allowing him to focus on the API task at hand.

About this blog

The taskboy blog is an exploration of computer technology by Joe Johnston. Topics of posts include practical examples in Perl, PHP, Python and Java as well as book reviews, industry insights and miscellaneous good stuff.
