Building iTunes podcast feeds with XML::RSS

UPDATE 2: Please see this post for the latest working version of this module.

UPDATE 1: This code works with XML::RSS version 1.05 or so. The newest versions of this library removed the encode() method for reasons beyond my reckoning. You can either use this code as a starting point for porting to the new XML::RSS module (and tell me how you did it!) or simply use the older version, which is still available on CPAN.

Like searching for Bigfoot, creating a podcast feed that Apple will recognize can be an elusive, furtive and lonely process. Most people know that podcasts are really just RSS 2.0 feeds with some extra tags. This seems like something Perl should handle. The Perl module XML::RSS has nearly everything necessary, but there’s always a catch: the itunes namespace.

All is not lost, because XML::RSS is a class that can be inherited from. With a little overriding goodness, you too can make valid feeds that even Apple’s iTunes Music Store will accept. Here’s my module that inherits from XML::RSS. While it’s not a complete solution, it works well enough for me.


package XML::RSS::Podcast;
use XML::RSS;
@XML::RSS::Podcast::ISA = qw[XML::RSS];


sub as_string {
  my $self = shift;
  return $self->as_podcast_rss;
}

sub as_podcast_rss {
  my $self = shift;
  my $enc = $self->{encoding};
  my $output = <<EOT;
<?xml version="1.0" encoding="$enc"?>
<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" 
     version="2.0">

EOT

  $output .= $self->podcast_start_channel;

  for my $i (@{$self->{items}}) {
    $output .= $self->podcast_item($i);
  }

  $output .= $self->podcast_end_channel;
  return $output .= "\n</rss>\n";
}

sub podcast_start_channel {
  my $self = shift;
  my @fields = qw[ttl title description link language 
                  pubDate lastBuildDate creator 
                  webMaster copyright
                 ];
  my @image_fields  = qw[title url description link     
                         width height];
  my @itunes_fields = qw[subtitle author summary
                         image];

  my $output = "<channel>\n";

  for my $f (@fields) {
    if (length($self->{channel}->{$f})) {
      my $s = $self->encode($self->{channel}->{$f});
      $output .= "\t<$f>$s</$f>\n";
    }
  }

  my $seen_image = 0;
  for my $f (@image_fields) {
    if (length($self->{image}->{$f})) {
      unless ($seen_image) {
        $output .= "\t<image>\n";
        $seen_image = 1;
      }
      my $s = $self->encode($self->{image}->{$f});
      $output .= "\t\t<$f>$s</$f>\n";
    }
  }

  if ($seen_image) {
    $output .= "\t</image>\n";
  }

  # Owner name/email not handled
  for my $f (@itunes_fields) {
    if (length($self->{channel}->{itunes}->{$f})) {
      my $s=$self->encode($self->{channel}->{itunes}->{$f});
      $output .= "\t<itunes:$f>$s</itunes:$f>\n";
    }
  }

  # FIXME: Doesn't handle sub cats.
  if (ref $self->{channel}->{itunes}->{category}) {
    for my $c (@{$self->{channel}->{itunes}->{category}}) {
      my $s = $self->encode($c);
      $output .= qq[\t<itunes:category text="$s" />\n];
    }
  }

  return $output . "\n";
}

sub podcast_end_channel {
  return "</channel>\n";
}

sub podcast_item {
  my $self = shift;
  my $item = shift;

  my @fields = qw[title guid pubDate description];
  my @itunes_fields = qw[author subtitle summary 
                         duration keywords explicit];

  my $output = "\t<item>\n";

  for my $f (@fields) {
    if (defined $item->{$f}) {
      my $s = $self->encode($item->{$f});
      $output .= "\t\t<$f>$s</$f>\n";
    }
  }

  if (ref $item->{enclosure}) {
    $output .= "\t\t<enclosure";
    for my $f (qw[url length type]) {
      if (defined $item->{enclosure}->{$f}) {
        # Escape attribute values so an "&" in a URL can't break the XML
        my $s = $self->encode($item->{enclosure}->{$f});
        $output .= qq[ $f="$s"];
      }
    }
    $output .= " />\n";
  }

  for my $f (@itunes_fields) {
    if (defined $item->{itunes}->{$f}) {
      my $s = $self->encode($item->{itunes}->{$f});
      $output .= "\t\t<itunes:$f>$s</itunes:$f>\n";
    }
  }

  return $output . "\t</item>\n";
}

1; # a module file must return a true value
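
If you are stuck with a newer XML::RSS that lacks encode() (see UPDATE 1), the escaping this module leans on can be approximated by hand. What follows is only a minimal sketch, not XML::RSS's own implementation: xml_escape() is a made-up name, and it assumes the input is plain text that has not already been entity-encoded (it would happily turn an existing &amp; into &amp;amp;).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the escaping step: replaces the five
# characters XML reserves with their predefined entity references.
sub xml_escape {
    my ($text) = @_;
    return '' unless defined $text;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

print xml_escape(q{Tom & Jerry <live> "uncut"}), "\n";
# prints: Tom &amp; Jerry &lt;live&gt; &quot;uncut&quot;
```

One way to wire this in would be to define an encode() method in the subclass that calls such a helper, which leaves the rest of the module untouched.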

A word about the RFC 822 pubDate. This seemingly arbitrary date format is easily generated with a call to strftime(). The format string is "%a, %e %b %Y %H:%M:%S %z". You might think that you can use MySQL’s DATE_FORMAT() to replicate this, but you’d be wrong. Instead, have your MySQL queries return UNIX_TIMESTAMP(), feed that result to localtime() and feed that to strftime(). Simple, no? No, but such are the challenges in programming.
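
The timestamp-to-pubDate chain described above can be sketched in a few lines. The helper name rfc822_date() is made up for illustration, and the epoch value would normally come from a UNIX_TIMESTAMP() column rather than time(). Note that strftime() honors the process locale, so the day and month names are only English when LC_TIME says so.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: turn an epoch timestamp (e.g. the result of
# MySQL's UNIX_TIMESTAMP()) into an RFC 822 style date string.
sub rfc822_date {
    my ($epoch) = @_;
    # localtime() in list context yields the fields strftime() expects
    return strftime('%a, %e %b %Y %H:%M:%S %z', localtime($epoch));
}

# time() stands in for a value fetched from the database
print rfc822_date(time), "\n";
```

Because %e pads single-digit days with a space, dates early in the month come out with two spaces after the comma.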

Here’s a sample of how I use this for pseudocertainty.com.

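use strict;
use DBI;
use POSIX qw[strftime];
use MP3::Info;
# The XML::RSS::Podcast package above must be loaded too: either paste it
# into this file, or save it as XML/RSS/Podcast.pm and "use" it here.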

my $rssfile = shift || "./ps-pod.rss";

my $dbh=DBI->connect("dbi:mysql:pseudo", "pwrUser", "s3cr3t") 
        or die "connect: $DBI::errstr";

my $shows    = get_shows($dbh);
$dbh->disconnect;

my $rss = XML::RSS::Podcast->new(version => "2.0");
my $rfc822_fmt = '%a, %e %b %Y %H:%M:%S %z';

my $iMeta = { "author" => "Joe Johnston and Mike Lord",
              "summary" => 'UFOlogy, Cryptozoology and 
                            the people who love them are 
                            discussed on this 
                            internet-only radio show',
              "subtitle" => "Don't be Certain.  
                             Be PseudoCertain.",
              "category" => ["Talk Radio"],
            };

$rss->channel(title => 'PseudoCertainty',
              "ttl" => 60, # time to live
              link  => 'http://www.pseudocertainty.com/',
              language => 'en-us',
              description => 'UFOlogy, Cryptozoology and the 
                             people who love them are discussed 
                             on this internet-only radio show',
              copyright => "Copyright Joe Johnston and Mike Lord",
              webMaster => "jjohn@pseudocertainty.com",
              pubDate => strftime($rfc822_fmt,localtime()),
              "itunes" => $iMeta,
              );

for my $r (@$shows) {
    # no more than 30 words
    my @words = map { s/(<[^>]+>)//g; $_; } 
                  split /\s+/, $r->{about};
    my $desc = "not set";
    if (@words > 30) {
        $desc = join " ", @words[0..29], "...";
    } else {
        $desc = join " ", @words;
    }

    my $finfo = get_mp3info("/path/to/shows/$r->{mp3_filename}");
    my $pl = qq[http://pseudocertainty.com/$r->{mp3_filename}];
    my $enc = { url => $pl,
                length => -s "/path/to/shows/$r->{mp3_filename}",
                type => "audio/mpeg",
              };
    my $itunes = {
                  explicit => "no",   # iTunes expects "yes", "no" or "clean"
                  keywords => "UFO aliens zorknapp",
                  summary  => $desc,
                  duration => $finfo->{TIME},
                 };
    $rss->add_item( title => $r->{title},
                    link  => $pl,
                    pubDate => strftime($rfc822_fmt, 
                               localtime($r->{pretty_created})),
                    enclosure => $enc,
                    permaLink => $pl,
                    description => $desc,
                    "itunes" => $itunes,
                   );
}

$rss->save($rssfile);

#------------------------
# subs
#-------------------------
sub get_shows {
    my ($dbh) = shift;
    my $sql = qq[
  SELECT *,UNIX_TIMESTAMP(created) as pretty_created 
    FROM shows WHERE publish = 1 ORDER BY created DESC;
                 ];

    my $sth = $dbh->prepare($sql);
    die "get_shows: '$sql': " . $sth->errstr unless $sth->execute;
    return $sth->fetchall_arrayref({});
}
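
The word-limiting step in the loop above can also be pulled out into a small helper. This is a sketch rather than the code actually running on the site: summarize() is a made-up name, and it strips tags before splitting into words (the loop above splits first), so that a tag containing spaces, such as a link with an href, is removed in one piece.

```perl
use strict;
use warnings;

# Hypothetical helper: strip anything that looks like an HTML tag and
# keep at most $max_words words, appending "..." when text was cut.
sub summarize {
    my ($html, $max_words) = @_;
    (my $text = $html) =~ s/<[^>]+>//g;              # drop tags
    my @words = grep { length } split /\s+/, $text;  # skip empty leading field
    return join ' ', @words if @words <= $max_words;
    return join ' ', @words[0 .. $max_words - 1], '...';
}

print summarize('<p>Bigfoot was <b>not</b> found again this week</p>', 5), "\n";
# prints: Bigfoot was not found again ...
```

A regex is a blunt instrument for HTML, so this is only reasonable for the simple markup a show description is likely to contain.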

And they called me a fool for wanting to make XML::RSS spew podcast RSS. But I showed them! I showed them all! Bwahahhaha!