Taskboy: A Web Services Exhibition. Part 2

(This article continues my thoughts on the taskboy CMS.)

How the taskboy CMS works

Once I decided that content would be managed through emacs (to as large a degree as possible), the rest fell mostly into place. The blog, the music section, the polls and the ratings would all be stored in mysql and accessed through an XML-RPC API. I would use PHP to define the layout, pulling the content from the database where needed. Templates as such are not used. To my thinking, a PHP page is the template. I also decided against database abstraction classes, since I’m unlikely to move from mysql any time soon. I do have a collection of PHP utility functions (like sql_insert, sql_delete, sql_select) to make database access less painful. Each PHP page calls the same header and footer pages. Much of this code was developed alongside State Secrets. Together, this makes the PHP stuff pretty easy to modify.

Getting from emacs to PHP is a little circuitous, so please bear with me. It is straightforward to write a perl script that’s an XML-RPC client using the Frontier::RPC2 library. So that’s what I did first. I verified that I could talk to the PHP page that processes XML-RPC requests. Emacs is an extensible editor whose macro language is lisp. The creator of Perl, Larry Wall, said that lisp “has all the visual appeal of oatmeal with fingernail clippings mixed in” and I agree. However, I did learn just enough lisp to write the current emacs buffer to standard out, where it is read by a perl script that makes the appropriate XML-RPC request and returns some snappy response that emacs can deal with. This solution is what I wrote about on use.perl.org.

Gnu Privacy Guard

The new wrinkle for taskboy is security. The XML-RPC messages go across the network in clear text. The primary risk I wanted to address is not that someone will see my blog before it’s posted, but that an unauthorized fool would mess with my XML-RPC service. Whatever authorization mechanism I chose would have to work over clear text. It’s true I could have used SSL with HTTP Authentication for the web services PHP page, but I didn’t want to. Fortunately there is already a solution for this kind of problem, but for a different form of internet messaging.

Back in the mid-80’s, Phil Zimmermann had a problem: he couldn’t prove he was him. That is, email that claimed to be from him could have been forged by some joker only claiming to be him. How could those receiving email from him be assured that the sender Phil Zimmermann was the Phil Zimmermann? The answer became known as Pretty Good Privacy and it involved some very scary math. But you can think of it as something like a lock and key mechanism. When an email is sent out, a Very Big Number is computed with the content of the message and your private key. Your private key has a sibling called a public key that the recipient of mail will already have (and have verified). When the recipient gets this message, pgp uses the public key on file to decode the message (or signature). If nothing has been changed in the message, the math will work out (via magic) and you can be pretty sure that Phil indeed has told you to “go pound sand.”

The important concept here is that PGP was meant to guarantee the identity of a sender using a message that anyone could read, but not change. Now in web services, I also have messages that anyone could read, but I want the server to accept only requests from me. Although it’s not a seamless fit, PGP turns out to be a good authentication method for private web services. Here’s how I modified Edd Dumbill’s XML-RPC PHP library and Ken MacLeod’s Frontier::RPC to use Gnu Privacy Guard (an open source version of PGP) to lock down my web service. The strategy in both cases is that requests should be signed, not responses. It would be straightforward to implement response signing too, but I don’t deem it necessary for my application.

Tweaking the PHP server

This class merely extends the xmlrpc_server class found in xmlrpcs.inc. I need to intercept the content, verify the signature, remove it if the message checks out and pass the rest of the XML doc to the parent class for handling. Hats off to Edd and the boys for getting the class partitioned so that I needed to override only one method.

One PHP tip: name your class files with .php. That way, you can point a browser to them and check the syntax. After all the syntax typos are gone, the page will appear blank. The contents of files with .inc extensions are typically just displayed by the web server without parsing.

VerifyRequest($data)) {
               return $this->RPCError("Couldn't verify request");
            }
            $data = $this->RemoveSignature($data);
        }
        # pass off to parent
        return parent::parseRequest($data);
    }
    #-----------------------------------------
    # Look at the body of the request.  Does it have
    # a signature to verify?
    function VerifyRequest ($data="") {
       # BTW: I hate this solution
       # write out to a tmpfile
       $infile = "/tmp/" . posix_getpid() . ".vrf";
       if ($fh = fopen($infile, "w")) {
           fwrite($fh, $data);
           fclose($fh);
       } else {
           return 0;
       }
       # is this signed by someone I trust?
       $cmd = "/usr/bin/gpg --homedir=/path/to/gpg "
              . "--verify <$infile 2> /dev/null";
       $retval = 1; # default to failure
       if (file_exists($infile)) {
          system($cmd, $retval);
       } else {
          return 0;
       }
       unlink($infile);
       return $retval ? 0 : 1;
    }
    #-------------------------------------------
    # remove signature header/footer
    function RemoveSignature ($data="") {
        # for GPG
        # strip off the GPG stuff to get the basic XML back
        $preamble = "/-----BEGIN PGP SIGNED MESSAGE-----\r?\n"
                    . "Hash: SHA1\r?\n\r?\n/";
        $footer = "/-----BEGIN PGP SIGNATURE-----\r?\n"
                  . "Version: .+\r?\n\r?\n(\S+\r?\n)+"
                  . "-----END PGP SIGNATURE-----/";
        $data = preg_replace($preamble,"", $data);
        $data = preg_replace($footer,"",$data);
        return $data;
    }
    #--------------------------------------------
    # wrapper for easier (and non-granular) error reporting
    function RPCError ($msg=0) {
        return new xmlrpcresp(0,500,"Bad request: $msg");
    }
}
?>

A few notes on this amateurish PHP code. First, any security wonk will tell you not to create temp files with PID names. In my case, I trust the other users on my server and don’t feel compelled to improve the security here. You may want to. Second, I’m relying on the fact that the gpg process has an exit value of 0 if the verify succeeds. The only way I saw of getting the exit value of a process in PHP is by using system(). There are a couple of other process handling functions, but those didn’t seem to give me this simple result to check (I could have used popen() and grepped through the output, but that seemed painful [although I might have done that if this were a perl module]).
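
For anyone who does want to tighten up the temp file handling, here is a rough perl sketch of the same write, run, check-the-exit-value idea. It is not the code running on taskboy. check_with_command is a made-up helper name, and handing gpg the file name as its last argument (instead of redirecting standard input) is an assumption about how you would call it. File::Temp ships with perl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write $data to a randomly named temp file, run @cmd with the file
# name appended as the last argument, and return 1 if the command
# exited with status 0 (0 otherwise).
sub check_with_command {
    my ($data, @cmd) = @_;

    # tempfile() creates the file exclusively under a random name in
    # the system temp directory, so the name cannot be guessed from a PID.
    my ($fh, $path) = tempfile('vrfXXXXXX', TMPDIR => 1);
    print {$fh} $data;
    close $fh;

    # List-form system() bypasses the shell; it returns 0 only when
    # the command ran and exited cleanly.
    my $status = system(@cmd, $path);
    unlink $path;
    return $status == 0 ? 1 : 0;
}
```

With gpg, the call would look something like check_with_command($data, '/usr/bin/gpg', '--homedir=/path/to/gpg', '--verify'). Treat that as an untested guess and try it against your own keyring first.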

parseRequest() is called by the parent class to unpack the XML request. Here, I look for the GPG signature and if all goes well, I pass just the XML string to the parent parseRequest() for processing.
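
The unwrapping step can be sketched in perl as well. This is an illustration only. strip_clearsign is a hypothetical name, and the pattern assumes the usual clearsign layout: a begin line, armor headers such as "Hash: SHA1", a blank line, the signed text, then the signature block. Unlike the PHP above, it does not care which hash or version string gpg prints.

```perl
use strict;
use warnings;

# Return the signed text from a clearsigned message, or nothing if the
# input does not look like one. This does NOT verify the signature.
sub strip_clearsign {
    my ($msg) = @_;
    return unless $msg =~ m{
        \A -----BEGIN\ PGP\ SIGNED\ MESSAGE-----\r?\n
        (?: [^\r\n]+ \r?\n )*      # armor headers, e.g. "Hash: SHA1"
        \r?\n                      # blank line ends the headers
        (.*?)                      # the signed text itself
        \r?\n -----BEGIN\ PGP\ SIGNATURE-----\r?\n
        .*?
        -----END\ PGP\ SIGNATURE-----
    }xs;
    my $body = $1;
    $body =~ s/^- //mg;    # undo dash-escaping of lines that began with "-"
    return $body;
}
```

As in the PHP class, stripping should only happen after the signature has verified. The function by itself proves nothing about who sent the message.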

Keep in mind that PHP runs as whichever user Apache runs as. This affects GPG. You have to set up the file ownership for the keys so that Apache can read and write to the GPG home directory (the one named by --homedir). You should create keys specifically for this web service and not reuse your own GPG stuff. You were warned.

This class is used identically to the xmlrpc_server class defined in xmlrpcs.inc. No, I don’t know what the “da_” stands for in the class name. I thought I wrote “ds_”, which would have stood for “digital signature.”

Expanding the Frontier

For the perl client, I simply defined two classes at the start of the program. Keep in mind, this is a win32 perl program.

package RPCEncoder;
use Frontier::RPC2;
@RPCEncoder::ISA = qw[Frontier::RPC2];
sub encode_call {
    my ($self) = shift;
    my $request = $self->SUPER::encode_call(@_);

    # sign it.  2-way opens hurt my brain
    my $outfile = "C:/blog/tmp.txt";
    unlink $outfile;

    my $cmd = qq[|C:/blog/gnupg/gpg.exe --homedir=/blog/gnupg ]
              . qq[--clearsign  > $outfile];
    open GPG, $cmd or die "Can't proc open: $!";
    print GPG $request;
    close GPG;

    open IN, $outfile or die "Can't open signed $outfile: $!";
    undef($request);
    while (<IN>) {
        $request .= $_;
    }
    close IN;
    unlink($outfile);
    return $request;
}

sub decode {
    my ($self) = shift;
    my ($string) = shift;
    my %args = ('Style' => 'Frontier::RPC2',
                'use_objects' => $self->{'use_objects'},
               );                          
    $self->{'parser'} = XML::Parser->new(%args);
    return $self->{'parser'}->parsestring($string);
}
#-----------------------------------------------------
package RPCClient;
use Frontier::Client;
@RPCClient::ISA = qw[Frontier::Client];
sub new {
    my ($self) = shift->SUPER::new(@_);
    my %args = ('encoding'    => $self->{'encoding'},
                'use_objects' => $self->{'use_objects'}
               );
    $self->{'enc'} = RPCEncoder->new(%args);
    return $self;
}

The perl is a little weirder because of the way the Frontier Client works with XML::Parser, itself a horrible creation of Cthulhu. The Frontier::Client constructor needs to be overridden so that I can insert my custom RPCEncoder class, which is a thin coating over Frontier::RPC2. All the XML encoding and decoding happens in Frontier::RPC2 and that’s what I need to intercept.

When making a request, I need to sign the XML string before it goes on the wire. All things being equal, I’d like to open the gpg process for writing (to feed it the string I’ve got in memory), but also read from it to get the output. This is a kind of double pipe, which is easy to do in shell, but weird to do with perl and especially so on Windows. Once again, I write a temp file and I don’t even pretend to pay security any mind. Windows boxes are typically single user machines and mine doubly so. Also note that I don’t need to worry about running as a different user when I make the XML-RPC request. I’m in emacs (which runs as the current user); it spawns a shell to run perl; perl spawns a shell to run gpg.exe. All these processes will run as me.
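
For the curious, the double pipe can be had in perl with the core IPC::Open2 module. The sketch below is an alternative to the temp file, not what the client above does. filter_through is a hypothetical helper name, and I make no promises about how well it behaves on Windows.

```perl
use strict;
use warnings;
use IPC::Open2;

# Send $input to a child process on its stdin and return whatever it
# prints on stdout, with no temp file in between.
sub filter_through {
    my ($input, @cmd) = @_;

    my $pid = open2(my $from_child, my $to_child, @cmd);
    print {$to_child} $input;
    close $to_child;                 # signal end-of-input to the child

    my $output = do { local $/; <$from_child> };   # slurp all output
    close $from_child;
    waitpid $pid, 0;                 # reap the child; exit status lands in $?
    die "command failed: @cmd\n" if $? != 0;
    return $output;
}
```

Signing would then be something like filter_through($request, 'C:/blog/gnupg/gpg.exe', '--homedir=/blog/gnupg', '--clearsign'). One caveat: writing everything before reading anything can deadlock if the child starts producing a lot of output before it has consumed its input, so this suits small messages like an XML-RPC request better than large files.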

I had to also override decode(), because the parent uses ref($self) to determine the class name of the XML callbacks (n.b. BAD MONKEY!). This really should have been hard coded to ‘Frontier::RPC2’ since the callbacks all have hardcoded class names (see the code for the real scoop). I think this was an attempt to make child classes easier to write, but this trick backfired.
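
The problem is easy to reproduce outside of Frontier. In this toy sketch every class and sub name is invented, and it assumes (as the text above implies) that the callbacks are found by package name rather than by method lookup. Inheritance makes Base's methods callable on a Child object, but it does not put a sub called Child::start into the symbol table.

```perl
use strict;
use warnings;

package Base;
sub new   { my $class = shift; return bless {}, $class }
sub start { return 'start handler' }    # a callback living in Base
# Two ways of naming the package that holds the callbacks:
sub style_by_ref  { my $self = shift; return ref $self }   # follows the object's class
sub style_by_name { return 'Base' }                         # hard-coded

package Child;
our @ISA = ('Base');    # inherits methods, but defines no subs of its own

package main;

# True if a sub with this fully qualified name is actually defined.
sub has_sub {
    my ($name) = @_;
    no strict 'refs';
    return defined &{$name};
}

my $obj = Child->new;
for my $style ($obj->style_by_ref, $obj->style_by_name) {
    printf "%-5s %s\n", $style, has_sub("${style}::start") ? 'found' : 'missing';
}
```

The ref($self) version names the Child package, where no start sub exists, while the hard-coded version finds the one in Base. That is the same reason decode() above passes the literal 'Frontier::RPC2' as the Style.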

A Quick Note on GPG setup

Getting up to speed on how GPG works took longer than integrating it into the taskboy web service. I cannot go into all the setup details here, but if you are familiar with ssh key management, you will be well ahead of the game in GPG. If ssh keys make your brain hurt, GPG is a veritable migraine. But it boils down to this: you must make a GPG key pair on the source machine with the perl/emacs setup. You must copy the public key to the server. You must import that key into GPG on the server and verify it (with gpg --edit-key). If you don’t do all of these steps, this digital signature for XML-RPC hack won’t work and you’ll be mystified at what went wrong. Verify your GPG setup at every stage using test files, so that you can see the GPG errors.

Note to jjohn: Move the *gpg files to wherever gpg wants to find them. It will make things go easier on you.

Next: Some dirty thoughts on SOAP