Latest News
It seems that a few of you are using or trying to use the XML::RSS::Podcast module I published on this blog. As was mentioned there, modern version of XML::RSS do not support the encode function required for generating well-formed XML documents.
That's all changed thanks to Rohan Carly. [Add your thanks to Rohan in the comments.]
I present the complete version of XML::RSS::Podcast below. If you find this module
useful, I'll attempt to create a CPAN module for it.
package XML::RSS::Podcast;
use XML::RSS;
use HTML::Entities qw[encode_entities_numeric encode_entities];
@XML::RSS::Podcast::ISA = qw[XML::RSS];
our $VERSION = q[1.1];
# encode by Rohan Carly
# Stolen from XML::RSS::Private::Output::Base;
sub encode {
my ($self,$text) = @_;
return unless defined($text);
my $encoded_text = '';
while ($text =~ s,(.*?)(<!\[CDATA\[.*?\]\]>),,s) {
# we use &named; entities here because it's HTML
$encoded_text .= encode_entities($1) . $2;
}
# we use numeric entities here because it's XML
$encoded_text .= encode_entities_numeric($text);
return $encoded_text;
};
sub as_string {
my $self = shift;
return $self->as_podcast_rss;
}
sub as_podcast_rss {
my $self = shift;
my $enc = $self->{encoding};
my $output = qq[<?xml version="1.0" encoding="$enc"?>
<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
version="2.0">];
$output .= $self->podcast_start_channel;
for my $i (@{$self->{items}}) {
$output .= $self->podcast_item($i);
}
$output .= $self->podcast_end_channel;
return $output .= "\n</rss>\n";
}
sub podcast_start_channel {
my $self = shift;
my @fields = qw[ttl title description link language
pubDate lastBuildDate creator
webMaster copyright
];
my @image_fields = qw[title url description link
width height];
my @itunes_fields = qw[subtitle author summary
image explicit]; # Thanks Rohan
my $output = "<channel>\n";
for my $f (@fields) {
if (defined($self->{channel}->{$f})) {
my $s = $self->encode($self->{channel}->{$f});
$output .= "\t<$f>$s</$f>\n";
}
}
my $seen_image = 0;
for my $f (@image_fields) {
if (defined($self->{image}->{$f})) {
unless ($seen_image) {
$output .= "\t<image>\n";
$seen_image = 1;
}
my $s = $self->encode($self->{image}->{$f});
$output .= "\t\t<$f>$s</$f>\n";
}
}
if ($seen_image) {
$output .= "\t</image>\n";
}
# Owner name/email not handled
for my $f (@itunes_fields) {
if (defined($self->{channel}->{itunes}->{$f})) {
my $s=$self->encode($self->{channel}->{itunes}->{$f});
$output .= "\t<itunes:$f>$s</itunes:$f>\n";
}
}
# Rohan's sub-cat handling code
# Expects an array: [category, sub-category]
if (ref $self->{channel}->{itunes}->{category} eq 'ARRAY') {
my $major = $self->encode(${$self->{channel}->{itunes}->{category}}[0]);
my $minor = $self->encode(${$self->{channel}->{itunes}->{category}}[1]);
$output .= qq[\t<itunes:category text="$major">\n];
$output .= qq[\t\t<itunes:category text="$minor">\n];
$output .= qq[\t\t</itunes:category>\n];
$output .= qq[\t</itunes:category>\n];
}
return $output . "\n";
}
sub podcast_end_channel {
return "</channel>\n";
}
sub podcast_item {
my $self = shift;
my $item = shift;
my @fields = qw[title guid pubDate description link];
my @itunes_fields = qw[author subtitle summary
duration keywords explicit];
my $output = "\t<item>\n";
for my $f (@fields) {
if (defined($item->{$f})) {
$s = $self->encode($item->{$f});
my $perma = "";
if ($f eq "guid") {
$perma = qq[isPermaLink="false"];
}
$output .= "\t\t<$f$perma>$s</$f>\n";
}
}
if (ref $item->{enclosure}) {
$output .= "<enclosure";
for my $f (qw[url length type]) {
if (defined $item->{enclosure}->{$f}) {
$output .= sprintf("$f=%s", $self->encode($item->{enclosure}->{$f}));
}
}
$output .= "/>";
}
for my $f (@itunes_fields) {
if (defined $item->{itunes}->{$f}) {
$s = $self->encode($item->{itunes}->{$f});
$output .= "\t\t<itunes:$f>$s</itunes:$f>\n";
}
}
return $output .= "\t</item>\n";
}
1;
For an example of usage, please see the previous post cited above. Report bugs in the comments or through email. Thanks.
Let's all say it together: XML::Parser Sucks! There, that was cleansing.
After a much prodding of my XML buddies (hi jmac!) and an evil notion of using goto (thankfully Perl doesn't let you jump into the middle of a function), I came across a seemingly little used XML::Parser function So, here's a very goofy example of how to work with this bod boy. I'll be looking to shove this into Frontier::RPC2 in a most Eee-VEIL way. ;-)
parse_start which returns a new XML::Parser::ExpatNB object (with oh so little documentation) that does EXACTLY WHAT I NEED! I need a parser that parses a stream in increments.
Consider how useful this is for dealing with XML messages coming across the network that might f'ing HUGE! This parsing method will at least give me an opportunity to chunk the data into smaller bits (save for the pathological 45TB between a single
use strict;
use warnings;
use XML::Parser;
my $p = XML::Parser->new(
Style => 'My::Pkg',
);
print "Reading from __DATA__\n";
my $data; # A place for my text data
# Don't be fooled: it's an
# object constructor
my $nb_p = $p->parse_start(data => \$data);
while(my $l = <DATA>){
chomp($l);
$nb_p->parse_more($l);
if(my $s = ${$nb_p->{data}}){
print "Back at the range, I got $s\n";
}
}
$nb_p->parse_done;
package My::Pkg;
sub Init {
my($expat) = @_;
print "Hello!\n";
}
sub Start {
my($expat, $tag, %attrs) = @_;
${$expat->{data}} = undef;
print "Start: $tag\n";
}
sub Char {
my($expat, $text) = @_;
${$expat->{data}} = undef;
return if $text =~ /^\s*$/;
$expat->{char_bag} = $text;
}
sub End {
my($expat, $tag) = @_;
print "End: $tag\n";
${$expat->{data}} =
$expat->{char_bag};
# clean up
$expat->{char_bag} = '';
return;
}
__END__
<?xml version="1.0" ?>
<a>
<b>
<c>
<d>fiddlesticks</d>
</c>
</b>
</a>
UPDATE 2: Please see this post for the latest working version of this module.
UPDATE 1: This code works with XML::RSS version 1.05 or so. The newest versions of this library removed the encode() method for reasons beyond my reckoning. You can either use this code as a starting point for porting to the new XML::RSS module (and tell me how you did it!) or simply use the older version, which is still available on CPAN.
Like searching for Bigfoot, creating a podcast feed that's recognized by Apple can be an elusive, furtive and lonely process. Most know that podcasts are really just RSS 2.0 feeds with some extra tags. This seems like something perl should handle. The perl module XML::RSS nearly has everything necessary, but there's always a catch: the itunes namespace.
All is not lost, because XML::RSS is a class that can be inherited from. With a little overriding goodness, you too get make valid feeds that even Apple's iTunes music store will accept. Here's my module that inherits from XML::RSS. While it's not a complete solution, it works well enough for me.
package XML::RSS::Podcast;
use XML::RSS;
@XML::RSS::Podcast::ISA = qw[XML::RSS];
sub as_string {
my $self = shift;
return $self->as_podcast_rss;
}
sub as_podcast_rss {
my $self = shift;
my $enc = $self->{encoding};
my $output = <<EOT;
<?xml version="1.0" encoding="$enc"?>
<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
version="2.0">
EOT
$output .= $self->podcast_start_channel;
for my $i (@{$self->{items}}) {
$output .= $self->podcast_item($i);
}
$output .= $self->podcast_end_channel;
return $output .= "n</rss>n";
}
sub podcast_start_channel {
my $self = shift;
my @fields = qw[ttl title description link language
pubDate lastBuildDate creator
webMaster copyright
];
my @image_fields = qw[title url description link
width height];
my @itunes_fields = qw[subtitle author summary
image];
my $output = "<channel>n";
for my $f (@fields) {
if (length($self->{channel}->{$f})) {
my $s = $self->encode($self->{channel}->{$f});
$output .= "t<$f>$s</$f>n";
}
}
my $seen_image = 0;
for my $f (@image_fields) {
if (length($self->{image}->{$f})) {
unless ($seen_image) {
$output .= "t<image>n";
$seen_image = 1;
}
my $s = $self->encode($self->{image}->{$f});
$output .= "tt<$f>$s</$f>n";
}
}
if ($seen_image) {
$output .= "t</image>n";
}
# Owner name/email not handled
for my $f (@itunes_fields) {
if (length($self->{channel}->{itunes}->{$f})) {
my $s=$self->encode($self->{channel}->{itunes}->{$f});
$output .= "t<itunes:$f>$s</itunes:$f>n";
}
}
# FIXME: Doesn't handle sub cats.
if (ref $self->{channel}->{itunes}->{category}) {
for my $c (@{$self->{channel}->{itunes}->{category}}) {
my $s = $self->encode($c);
$output .= qq[t<itunes:category text="$s" />n];
}
}
return $output . "n";
}
sub podcast_end_channel {
return "</channel>n";
}
sub podcast_item {
my $self = shift;
my $item = shift;
my @fields = qw[title guid pubDate description];
my @itunes_fields = qw[author subtitle summary
duration keywords explicit];
my $output = "t<item>n";
for my $f (@fields) {
if (defined $item->{$f}) {
$s = $self->encode($item->{$f});
$output .= "tt<$f>$s</$f>n";
}
}
if (ref $item->{enclosure}) {
$output .= "<enclosure";
for my $f (qw[url length type]) {
if (defined $item->{enclosure}->{$f}) {
$output .= qq[ $f="$item->{enclosure}->{$f}"];
}
}
$output .= "/>";
}
for my $f (@itunes_fields) {
if (defined $item->{itunes}->{$f}) {
$s = $self->encode($item->{itunes}->{$f});
$output .= "tt<itunes:$f>$s</itunes:$f>n";
}
}
return $output .= "t</item>n";
}
A word about the RFC822 pubDate. This seemingly arbitrary date format
can be easily generated with a call to strftime(). The
format string is "%a, %e %b %Y %H:%M:%S %z". You might think
that you can use mysql's DATE_FORMAT() to replicate this, but you'd be wrong.
Instead, generate mysql queries with UNIX_TIMESTAMP(), feed the result of
that to localtime() and feed that to strftime(). Simple, no? No, but such
are the challenges in programming.
Here's a sample of how I use this for pseudocertainty.com.
use strict;
use DBI;
use POSIX qw[strftime];
use MP3::Info;
my $rssfile = shift || "./ps-pod.rss";
my $dbh=DBI->connect("dbi:mysql:pseudo", "pwrUser", "s3cr3t")
or die "connect: $DBI::errstr";
my $shows = get_shows($dbh);
$dbh->disconnect;
my $rss = XML::RSS::Podcast->new(version => "2.0");
my $rfc822_fmt = '%a, %e %b %Y %H:%M:%S %z';
my $iMeta = { "author" => "Joe Johnston and Mike Lord",
"summary" => 'UFOlogy, Cryptozology and
the people who love them are
discussed on this
internet-only radio show',
"subtitle" => "Don't be Certain.
Be PseudoCertain.",
"category" => ["Talk Radio"],
};
$rss->channel(title => 'PseudoCertainty',
"ttl" => 60, # time to live
link => 'http://www.pseudocertainty.com/',
language => 'en-us',
description => 'UFOlogy, Cryptozology and the
people who love them are discussed
on this internet-only radio show',
copyright => "Copyright Joe Johnston and Mike Lord",
webMaster => "jjohn@pseudocertainty.com",
pubDate => strftime($rfc822_fmt,localtime()),
"itunes" => $iMeta,
);
for my $r (@$shows) {
# no more than 30 words
my @words = map { s/(<[^>]+>)//g; $_; }
split /s+/, $r->{about};
my $desc = "not set";
if (@words > 30) {
$desc = join " ", @words[0..29], "...";
} else {
$desc = join " ", @words;
}
my $finfo = get_mp3info("/path/to/shows/$r->{mp3_filename}");
my $pl = qq[http://pseudocertainty.com/$r->{mp3_filename}];
my $enc = { url => $pl,
length => -s "/path/to/shows/$r->{mp3_filename}",
type => "audio/mpeg",
};
my $itunes = {
explicit => "N",
keywords => "UFO aliens zorknapp",
summary => $desc,
duration => $finfo->{TIME},
}
$rss->add_item( title => $r->{title},
link => $pl,
pubDate => strftime($rfc822_fmt,
localtime($r->{pretty_created})),
enclosure => $enc,
permaLink => $pl,
description => $desc,
"itunes" => $itunes,
);
}
$rss->save($rssfile);
#------------------------
# subs
#-------------------------
sub get_shows {
my ($dbh) = shift;
my $sql = qq[
SELECT *,UNIX_TIMESTAMP(created) as pretty_created
FROM shows WHERE publish = 1 ORDER BY created DESC;
];
my $sth = $dbh->prepare($sql);
die "get_shows: '$sql': " . $sth->errstr unless $sth->execute;
return $sth->fetchall_arrayref({});
}
And they called me a fool for wanting to make XML::RSS spew podcast RSS. But I showed them! I showed them all! Bwahahhaha!
About this blog
The taskboy blog is a exploration of computer technology by Joe Johnston. Topics of posts include practical examples Perl, PHP, Python and Java as well as book reviews, industry insights and miscellaneous good stuff.
Current Status
Watching _Brass Latern_. Ah IF, your coyness is your charm.
Posted: Sun Sep 05 16:02:15 +0000 2010
Latest Feedbag
- Do It Anyway
- Reader recommendation: Atlas, Schmatlas
- Scientists Cut Greenland Ice Loss Estimate By Half
- (Video) George Parker: Agencies Need To Buy Themselves Back
- Android Is As Open As The Clenched Fist Id Like To Punch The Carriers With
- Eden Ventures Joins The Super Angels Gang, Five Investments Down
- Human Translation Startup myGengo Raises Seed Round From International Investors
- Another Instant Music Video
- DARPA Wants Extreme Wireless Interference Buster
- Film Industry Hires Cyber Hitmen To Take Down Pirates
Generated: 04:37 on 09/Sep/2010
Recent posts
- Very quick git primer for basic functionality
- Tips for spammers: don't insult me
- CakePHP vs. Symfony: a quick note
- Creating events for Yahoo and Google calendars
- SANs on a budget: iSCSI under Ubuntu
- iPad, iTouch and Kindle: Which is the better mousetrap?
- Rise of the Ad-Hocracy, Part II
- Rise of the Ad-Hocracy, Part I
- Small Hiatus
