This article originally appeared in The Perl Journal, issue #17.
The scene: A dusty afternoon in a rickety one-horse town. The sign over the "Last Chance" saloon leans drunkenly forward and tumbleweeds skip lazily across your path. You've fought your way through seven ambushing web projects and just barely escaped to tell about them. To your left, a shifty-eyed city slicker named ASP hawks his miracle invention to eliminate work-a-day web drudgery. To your right, a young, ruddy-faced preacher thumps his ham fist righteously on his leather-bound Cold Fusion manual. All around you, the young and blind pound the dry earth, desperately trying to hold together their company's legacy home page with Notepad and FrontPage. And staring down at you from the end of the street is the meanest, neediest, most market-driven web site east of the Mississippi, and it threatens to eat your lunch.
Yep, there's no doubt about it. You're in web country.
In an environment in which the person responsible for creating
an appealing web site layout and the person who writes the code
that makes that happen are different people, traditional hard-wired
CGI scripts just get in the way. As a web programmer, you probably
don't find adding print statements that spew HTML overly challenging.
Every time the layout person wants to alter the site, a traditional
CGI script will require a programmer to implement those changes, even
if no new functionality is added to the site. Wouldn't you rather
the layout person manage the HTML instead of you?
Mason solves this problem.
Mason (http://www.masonhq.com) is an Open Source project authored by Jonathan Swartz which, together with mod_perl and Apache, offers web developers a tool to slay the maintenance dragon. In the words of the FAQ, Mason is "a Perl-based web site development and delivery engine."
Mason accomplishes its magic with a venerable trick. It allows embedded Perl code to be written in an otherwise ordinary HTML file. In fact, these bits of embedded Perl can be collected into files called components, which in turn can be called from other Mason-rendered HTML files. Components are to Mason what subroutines are to Perl.
Yes, Server Side Include technology is alive and well. In fact, Mason has some very successful closed source brethren. Microsoft's Active Server Pages and Allaire's Cold Fusion also use a special SSI language. Let's not forget about open source competitors like Python's Zope, Java Server Pages or PHP! SSI is here to stay.
To tame the wild beast of creating and maintaining a living web site, traditional HTML-spewing CGI programs are not enough. Even with a flexible language like Perl, making UI changes to traditional CGI scripts often requires an experienced coder. "Vital" changes thought up by marketing folks and their graphic designers can often amount to several hours of patching and testing new CGI code. Even simple changes, like the movement of a button or the addition of text, can become a non-trivial task when a web site's presentation is tied to its functionality. This is the issue that transcends the choice of implementation language and speaks to the core of dynamic web site design.
Using any SSI technology should greatly reduce the friction between your coders and graphics people, because site functionality, like a navigation widget for example, can be encapsulated into a component which is called from an otherwise static web page. The graphic designer can simply treat this code, which looks like a funny HTML tag, as a black box and move this widget to wherever his fickle black heart desires. The good news is that, after implementing the navigation widget, the coder is no longer required (assuming the delivered widget works as advertised).
For those that want the benefits of code reusability and data hiding, HTML::Mason components can be used in a very Object Oriented fashion.
Mason works best with Apache and mod_perl. For the record, the system I used was a Red Hat 6.0 Celeron 400 with 128M of RAM, stock Red Hat distributed Perl, compiled-from-source Apache 1.3.9, mod_perl 1.21 and HTML::Mason 0.8. If you don't already have mod_perl or Mason, try your local CPAN mirror. Better yet, use the CPAN module. From your shell as an account with administrator privileges, type:
perl -MCPAN -e 'install mod_perl; install HTML::Mason'
Have I mentioned how much I love the CPAN module? A lot.
Mason comes with a very complete installation guide (Mason.html). For those familiar with Apache, the httpd.conf changes are trivial, although I'm not sure I'd commit my entire web directory to Mason use as this installation guide suggests. I made a directory off the root of my htdocs called 'mason'.
Next, you'll need to create a handler.pl file in your new mason root directory. Besides configuring path information (see below), this file is where you would 'use' modules common to all your components. This avoids the cost of loading the same module in multiple components. You'll notice in the unpacked Mason directory, there is an 'eg' subdirectory which has a very serviceable handler.pl nearly ready to go. I recommend uncommenting the line
#return -1 if $r->content_type && $r->content_type !~ m|^text/|io;
in the handler() subroutine. This prevents Mason from trying to parse non-text files served from your mason directory. I suppose an entry for next year's Obfuscated Perl Contest might include a carefully engineered GIF that is meant to be parsed by Mason to produce "The Perl Journal", but it won't be submitted by me.
Another source of confusion about configuring the handler.pl file concerns the initialization of Mason's Interp (Interpreter) object, which requires a few user-dependent paths. Although most new users won't need to manipulate it directly, the Interpreter object is responsible for executing the components and directing the resulting output. The first path is the "comp_root", the directory where Mason will begin to search for called components. Because I wasn't overly security conscious, I chose the "mason" directory. The data directory, which also needs to be specified, is a kind of scratch directory where debug files and previews are stored. Again being a simple caveboy, I chose 'mason/data'. For a production system, you'll want to choose these directories a bit more carefully.
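To make the pieces above concrete, here is a minimal sketch of a handler.pl, assuming the general layout of the sample file in the 'eg' subdirectory of the 0.8-era distribution. The two paths are placeholders I made up, and the list of preloaded modules is only an assumption based on what the listings in this article use; check the sample file shipped with your version, since constructor names and options may differ between releases.

```perl
# handler.pl (sketch only; paths below are hypothetical placeholders)
package HTML::Mason;
use strict;
use HTML::Mason;
use HTML::Mason::ApacheHandler;

# Modules 'use'd here are loaded once and are visible to all components
{
    package HTML::Mason::Commands;
    use DBI;            # assumed: used by the departments component
    use LWP::Simple;    # assumed: provides get() for the news dhandler
    use XML::RSS;       # assumed: parses the fetched headlines
}

my $parser = HTML::Mason::Parser->new;
my $interp = HTML::Mason::Interp->new(
    parser    => $parser,
    comp_root => '/usr/local/apache/htdocs/mason',        # where components live
    data_dir  => '/usr/local/apache/htdocs/mason/data',   # scratch directory
);
my $ah = HTML::Mason::ApacheHandler->new(interp => $interp);

sub handler {
    my ($r) = @_;
    # Let Apache serve non-text files (images and the like) itself
    return -1 if $r->content_type && $r->content_type !~ m|^text/|io;
    $ah->handle_request($r);
}

1;
```

This file only runs inside Apache with mod_perl, so treat it as a configuration fragment rather than a standalone script.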
Mason 0.8 has some new syntax and does some implementation a bit differently than the 0.7x series. While I believe the development is heading in the right direction, there are some issues which are worth noting. For instance, Mason 0.8 won't send HTTP headers for a page with no text. This makes redirection and cookie issuing less than ideal since one would need to write a dummy page just to have the HTTP headers served. There is a workaround on the Mason mailing list, but I'd recommend staying with the last 0.7x version or waiting for 0.81, which may be out by the time you read this.
The site I designed was intended to demonstrate common tasks that most web designers face. Please note I am not a layout expert. One of the compelling reasons to use Mason is to bridge the gap between coders and layout people. One common task is that a web page display information stored in a database. The layout person needs the coder to provide a method for accessing this data. This is where a Mason component comes in handy. I will be querying my web site Aliens, Aliens, Aliens (A3), which is about aliens. It is a MySQL-driven web site with a mod_perl front end. Because I didn't know about Mason, I wrote my own system of embedded symbols which could be mixed with HTML. Although I like my little A3 site, I don't want anyone else to reinvent this wheel. Just as CPAN doesn't need another XBase module, we probably have enough embedded Perl systems now.
The best place to begin a discussion of components is with the Mason equivalent of "Hello, World". Many sites like to have standard headers and footers. They help provide a common look and feel to web sites that make some marketing types soil themselves.
Listing 1.
<html>
<title><%$title%></title>
<body bgcolor="<%$color%>">
<h1><%$title%></h1>
<%args>
$title => 'Nonsuch'
$color => 'FFFFFF' # white
</%args>
Components are any mixture of HTML and specially delimited Perl code. Listing 1 is the source for my header. For the most part, it looks like boring HTML. There are two different Mason tags you should note here.
The first is the ubiquitous <% %> tag.
Any arbitrary Perl code found in that tag will be executed and the
resulting value displayed in the rendered page.
<% 2 + 2 %> will display in a browser as 4.
Mason also has a small set of special tags used for more complex or special purpose blocks of code. Here, the <%args> </%args> section is used to "prototype" expected arguments for this component. In this case, two scalars may be passed to the header component. In the absence of any values, I am setting some defaults. You may declare arguments without defaults, which forces the caller to pass them. These parameters are lexically scoped, which means these variables have no life outside of the component. For those that have wanted stronger subroutine prototyping in Perl, this may appeal to you.
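If the <%args> section feels unfamiliar, it may help to see a rough plain-Perl analogy: a subroutine that takes named parameters and fills in defaults. This is only a sketch of the idea, not how Mason implements components, and render_header is a name I made up for illustration.

```perl
use strict;
use warnings;

# Plain-Perl analogy for the header component's <%args> block.
# render_header is a hypothetical name, not part of Mason.
sub render_header {
    my %args = (
        title => 'Nonsuch',    # default, as in Listing 1
        color => 'FFFFFF',     # white
        @_,                    # caller-supplied values override the defaults
    );
    return qq{<html> <title>$args{title}</title> }
         . qq{<body bgcolor="$args{color}"> <h1>$args{title}</h1>};
}

# With no arguments the defaults apply; named arguments replace them
print render_header(), "\n";
print render_header(title => 'Welcome to the World of Mason', color => 'tan'), "\n";
```

As in the component, the values live in a lexical (%args here) and vanish when the call returns.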
Listing 2.
<hr>
<div align=center>
<address>
© <% 1900+(localtime)[5] %> Joe Johnston<BR>
Use this code to your maximum advantage, but due credit is always appreciated.
</address>
</div>
</body>
</html>
The footer component, listing 2, is even simpler, since it takes no arguments at all.
Mason provides many flexible ways to pass arguments to components. One way is to simply attach the URLencoded arguments to the URL of the component, a la a GET query. Another is to call the component directly from another component, as seen in the first line of listing 3 which is my index page.
Mason's <& &> is similar to Perl's ampersand operator in that it calls a component much like a subroutine. The return value is discarded, but the side effects are the important feature. In this case, calling:
Listing 3.
<& header, title=>'Welcome to the World of Mason', color=>'tan' &>
<P>Gawk in amazement as I build an interactive, database driven
site before your eyes!
<P>Here's a link to a nonexistent <a href="microsoft">subdirectory</a>.
<P>Pssst! Want to look at some headlines from other sites?
<UL>
<LI><a href="news/slashdot">Slashdot</a>
<LI><a href="news/perl_news">Perl News</a>
<LI><a href="news/a3">Aliens, Aliens, Aliens</a>
<LI><a href="news/missing_uri">Microsoft News</a>
<LI><FORM Method=post Action="news/dhandler">
URL to your favorite RDF: <input type=text name=RDF>
<input type=submit>
</FORM>
</UL>
<& departments &>
<& footer &>
inserts the rendered version of the header modified with the appropriate parameters. The rendered version of this page appears in figure 1. Yet another way to pass arguments is to use default handlers and extra path information.
When a component is called that Mason cannot find, it looks in that directory for a file called 'dhandler'. For example, in the 'mason' directory, I have a dhandler (listing 4) which is really just a custom 404 document (figure 2). In the 'news' subdirectory, I have a dhandler (listing 5) which is meant to be called with extra path information. In this case the dhandler will try to retrieve from the given site a Rich Site Summary (RSS) file, an XML file many news sites use to broadcast their headlines. Looking back at the index.html component, listing 3, the parameter to dhandler looks like a file in the news subdirectory. Selecting the A3 link produces the page seen in figure 3. This is the kind of magic that makes some coders soil themselves.
But something else is going on in this news/dhandler component. Because users can enter an arbitrary URL to an RDF, this component also accepts the more traditional parameter passing method in a variable called %ARGS.
Listing 4.
<& header &>
<b>Oops! I'm not certain where you were going!</b>
<P><a href="index.html">Back</a>
<& footer &>
Listing 5.
<%init>
  my $news_site = $m->dhandler_arg;
  my $rss = new XML::RSS;
  my $rdf;

  for ( $news_site ){   # think "switch"
    /slashdot/ && do {
      $rdf = get('http://slashdot.org/slashdot.rdf');
      last;
    };

    /perl_news/ && do {
      $rdf = get('http://www.news.perl.org/perl-news.rdf');
      last;
    };

    /a3/ && do {
      $rdf = get('http://aliensaliensaliens.com/a3.rdf');
      last;
    };
  }

  $rdf ||= get($ARGS{RDF}); # was I passed in something?

  unless( $rdf ){
    # a little tricky, use the existing mechanism
    # for this 404, use old standby CGI env hack
    use CGI qw/:all/;
    print
      redirect("http://$ENV{SERVER_NAME}/mason/tpj/404");
    return;
  }

  $rss->parse($rdf);
</%init>
<& ../header, title=> ($news_site||$rss->{'channel'}->{'title'}) &>
<P>See the rest of <a href="<% $rss->{'channel'}->{'link'} %>">
<% $rss->{'channel'}->{'title'} %></a>
<UL>
% for my $bit ( @{ $rss->{'items'}} ){ # not very OO ;-)
<LI><a href="<% $bit->{'link'} %>"><% $bit->{'title'} %></a>
% if( $bit->{'description'} ) {
: <% $bit->{'description'} %>
% }
% }
</UL>
<a href="/mason/tpj/">Back</a>
<& ../footer &>
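The "switch" in Listing 5 could also be written as a hash lookup, which keeps the list of known feeds in one place. The sketch below is plain Perl that runs outside Mason; feed_url and %feed_for are names I made up. Note one difference from the listing: the hash matches the name exactly, while the regular expressions in Listing 5 match anywhere in the string.

```perl
use strict;
use warnings;

# Known feeds, keyed by the dhandler argument (URLs as in Listing 5)
my %feed_for = (
    slashdot  => 'http://slashdot.org/slashdot.rdf',
    perl_news => 'http://www.news.perl.org/perl-news.rdf',
    a3        => 'http://aliensaliensaliens.com/a3.rdf',
);

# feed_url is a hypothetical helper: return the feed URL for a known
# site name, otherwise fall back to a user-supplied RDF URL (or undef)
sub feed_url {
    my ($news_site, $user_rdf) = @_;
    return $feed_for{$news_site}
        if defined $news_site && exists $feed_for{$news_site};
    return $user_rdf;
}

print feed_url('a3'), "\n";
```

Inside the component, the result would be handed to get(), and an undefined result would trigger the same redirect to the 404 page.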
If you're familiar with DBI, database access is done no differently in Mason. In fact, you can use Apache::DBI to transparently give you persistent database handles. Aliens, Aliens, Aliens is divided up into several departments, which contain other departments. The idea of the departments component, listing 6, is to generate a nice table with links to all the top level departments.
Listing 6.
<%init>
  my $dbh = DBI->connect("DBI:mysql:aliens:nfs.daisypark.org",
                         "username", "password") or
      die "ERROR: Couldn't connect to DB $DBI::errstr";

  # find all the top level departments
  # All top level departments have 'home' as a parent
  my $sth = $dbh->prepare(<<EOT) or die "ERROR: prepare failed ".$dbh->errstr;
    select homepage_id,segment from departments
    where parent_id=1 order by segment
EOT
  $sth->execute or die "ERROR: couldn't get departments! " . $dbh->errstr;
</%init>
<TABLE Border=1>
<TR>
% while( my $hr = $sth->fetchrow_hashref ){
<TH><A HREF=
"http://nfs.daisypark.org/cgi-bin/render_article.pl?article_id=<%$hr->{homepage_id}%>">
<% $hr->{segment} %></a></TH>
% }
</TR>
</TABLE>
<%cleanup>
#$dbh->disconnect;
</%cleanup>
I'll skip the discussion of DBI and SQL and draw your attention to the embedded fetchrow loop, which cleanly retrieves all the pertinent links and labels. Notice how, even though the 'while' statement is set off with the % symbol (meaning that the rest of the line is Perl code), the plain HTML is repeated as needed. Compare this to a more traditional Perl CGI in which the loop has a print statement outputting HTML. Although this may seem like two sides of the same coin, the difference with Mason is that your layout expert can now tweak the non-code bits easily without bothering you. This generally leads to more beer time, which is the second thing any good job should give you.
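For comparison, here is roughly what the same table looks like in the traditional style, with the HTML buried inside Perl strings. This is a sketch that runs without a database: the rows are made-up sample data standing in for what fetchrow_hashref would return, the link is shortened, and departments_table is a hypothetical name.

```perl
use strict;
use warnings;

# Stand-in rows (made-up sample data); in the real component these
# come from $sth->fetchrow_hashref
my @departments = (
    { homepage_id => 1, segment => 'Abductions' },
    { homepage_id => 2, segment => 'Sightings'  },
);

# Traditional CGI style: every piece of markup lives inside Perl code,
# so any layout change means editing this subroutine
sub departments_table {
    my @rows = @_;
    my $html = qq{<TABLE Border=1>\n<TR>\n};
    for my $hr (@rows) {
        $html .= qq{<TH><A HREF="render_article.pl?article_id=$hr->{homepage_id}">}
               . qq{$hr->{segment}</a></TH>\n};
    }
    $html .= qq{</TR>\n</TABLE>\n};
    return $html;
}

print departments_table(@departments);
```

Moving a cell or changing an attribute here means touching code, which is exactly the chore the Mason version hands back to the layout person.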
Finally, you'll notice the <%cleanup> section. This is Perl code that gets executed at the end of the component's run. Here, I would normally kill my DBH handle, close filehandles or free objects. However, since my DBH handle isn't going away due to Apache::DBI, I have commented this out.
I have provided merely the briefest introduction to this great tool. Other topics that await you in Mason-land are the fabulous Component Manager (written by Mark Schmick), lots of documentation, component debugging files and component staging. Do yourself a favor and check Mason out for yourself.