Monthly Archives: November 2011

London Perl Workshop Review

Unfortunately O’Reilly’s Josette Garcia couldn’t be at the London Perl Workshop, so she asked if I could write something about it for her blog.

It took me longer than it should have done, but my post has just been published over at Josetteorama.

Hopefully Josette will be back at next year’s event. She was much missed (although, of course, Alice did a fine job of making up for Josette’s absence).

Programming Like It’s 1999

This article was published yesterday. It shows a way to extract data about a film from IMDB and put it into a local database. Actually, it doesn’t even do that. It produces SQL that you can then run to insert the data.

It’s all rather nasty stuff and indicative of the fact that most people using Perl are still using idioms that would have made us shudder ten years ago.

There were a few things that I didn’t like about the code: the use of curl to grab data from the web site, the indirect object syntax when creating the XML::Simple object and, in particular, the huge amount of repetitive code used to create the SQL statements.

So I did what anyone would do in my situation. I rewrote it. Here’s my version.

#!/usr/bin/perl

use strict; 
use warnings;

use XML::Simple;
use LWP::Simple;

@ARGV or die "Please provide movie title in quotes\n";

my $movie = shift;
$movie =~ s/\s/+/g;

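# get() returns undef on failure, so stop before trying to parse nothing
my $movieData = get "http://www.imdbapi.com/?r=XML&t=$movie"
  or die "Could not fetch data for '$movie'\n";
my $data = XMLin( $movieData );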

my @fields = qw[released rating director genre writer runtime plot id
                title votes poster year rated actors];

my %film = %{$data->{movie}};

foreach (@fields) {
  $film{$_} =~ s/'/\\'/g;
}

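my $tstamp = time();

# The explicit column list must match the values one-for-one, so the
# timestamp needs a column too. 'tstamp' is an assumed column name;
# change it to match your table.
my $sql = 'INSERT INTO movie_collection (';
$sql .= join ', ', @fields, 'tstamp';
$sql .= ') VALUES (';
$sql .= join ', ', map { qq['$film{$_}'] } @fields;
$sql .= ", '$tstamp');\n";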

print $sql;

I haven’t actually changed that much. I’ve tidied up a bit: switched to using LWP::Simple, removed some unnecessary escaping, things like that. But I have made two pretty big changes. The first is that I’ve got rid of all of the nasty variables containing data about a film. A film is a single object and therefore should be stored in a single variable. And, happily enough, the $data you get back from XMLin contains a hash that does the trick perfectly.

The second change I made was to rejig the way that the SQL is created. By using an array that contains the names of all of the columns in the table, I can generate the SQL programmatically without all of that repetitive code. I’ve even made the SQL a little safer by explicitly listing the columns that we are inserting data into (this has the side effect of no longer needing to insert a NULL into the id column).

Of course, this would just be a first step. The whole idea of generating SQL to run against the database is ugly. You’d really want to use DBIx::Class (or, at the very least, DBI) to insert the data directly into the database. And why mess around with raw XML when you can use something like IMDB::Film to do it?

At that point in my thought process I had an epiphany. You don’t need the database at all. The IMDB data changes all the time. Why take a local copy? Why not just use the web service directly with IMDB::Film (or perhaps WebService::IMDB – I haven’t used either of them so I have no strong opinions on this)?

In general, I think that the original code was too complicated, which made it hard to maintain. My version is better (but I am, of course, biased) but it can be made even better by using more from CPAN.

CPAN is Perl’s killer app. If you’re not using CPAN then you’re not using half the power of Perl.

What do you think? How would you write this program?

Update: A few people have mentioned the fact that I’m directly interpolating random data into my SQL statements – which is generally seen as a bad thing as it opens the door to SQL injection attacks. In my defence, I’d like to make a couple of points.

Firstly, the data I’m using isn’t just any old data. It’s data that is returned from the IMDB API. So it would be hard to use this for a malicious attack on the system (at least until Hollywood makes a film about the life of Bobby Tables).

Secondly, I am cleaning the data before using it. I’m escaping any single quotes in the input data. I think that removes the possibility of attack. I could be wrong though. If that’s the case, please let me know what I’m missing.

But, in general, I agree that this approach is dangerous. This is one of the major advantages of using DBI. By using bound parameters in your SQL statements you can remove the possibility of SQL injection attacks.
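Here is a minimal sketch of what the bound-parameter approach might look like. The helper name build_insert, the three-column field list and the sample film are all made up for illustration, and the DBI calls are left as comments because the connection details ($dsn, $user, $pass) are assumptions that depend on your database.

```perl
use strict;
use warnings;

# Build an INSERT statement with one '?' placeholder per column.
# Returns the SQL string followed by the values to bind, in column order.
# (build_insert is a hypothetical helper, not part of DBI.)
sub build_insert {
    my ($table, $fields, $row) = @_;
    my $sql = sprintf 'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        join(', ', @$fields),
        join(', ', ('?') x @$fields);
    return ($sql, map { $row->{$_} } @$fields);
}

# Sample data, invented for this example.
my @fields = qw[title year director];
my %film   = (title => q[Bobby's Tables], year => 2011, director => 'Nobody');

my ($sql, @bind) = build_insert('movie_collection', \@fields, \%film);
print "$sql\n";

# With a connected handle, the values travel separately from the SQL,
# so the quote in the title needs no escaping at all:
#
#   use DBI;
#   my $dbh = DBI->connect($dsn, $user, $pass, { RaiseError => 1 });
#   $dbh->do($sql, undef, @bind);
```

The column names still come from a fixed list in the program, so only the values are ever supplied by the outside world.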

Update 2: You can, of course, rely on Zefram to point out the issues here. His comment is well worth reading.

Other people (on IRC) raised the potential of other Unicode characters that databases treat as quote characters but that aren’t covered by my substitution.
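To make the danger concrete, here is one way the quote-escaping can fail. This is a sketch of the general class of problem rather than a summary of any particular comment, and it assumes a database that treats a backslash as an escape character inside string literals (MySQL does this by default). The hostile value and the escape_quotes wrapper are invented for the example.

```perl
use strict;
use warnings;

# The same substitution the script applies to each field,
# wrapped in a (hypothetical) helper so it can be shown in isolation.
sub escape_quotes {
    my ($value) = @_;
    $value =~ s/'/\\'/g;
    return $value;
}

# A made-up hostile value: a backslash immediately before a quote.
my $input   = "\\'); DROP TABLE movie_collection; --";
my $escaped = escape_quotes($input);

# The quote gains a backslash, but the attacker's own backslash now
# escapes that one, leaving the quote free to end the string literal.
print "input:   $input\n";
print "escaped: $escaped\n";
print "in SQL:  '$escaped'\n";
```

On a database that follows standard SQL quoting instead (where a quote is escaped by doubling it), the backslash does nothing and even a plain single quote would end the string, so the substitution is not safe there either.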

Update 3: Here’s a local copy of the original code.

Saint Pierre and Miquelon

Does Saint Pierre and Miquelon mean anything to you? It’s a small French-owned territory just off the coast of Newfoundland.

Why would this be of any interest on a Perl blog? Well, it’s a French territory with its own ccTLD. And that ccTLD is .pm.

Ever since Perl Mongers started we’ve looked longingly at that TLD, thinking how cool it would be to own a .pm domain. But domain registration in .pm is run by the French registry, AFNIC, and for at least the last thirteen years they have refused all registrations under that domain. This made many Perl Mongers very sad.

But that is about to change. It appears that from 6th December, AFNIC are going to open registrations under a number of their previously suspended domains – including .pm. I think you’ll need to be in the EU in order to register a .pm domain, but I don’t think that will be a huge problem.

And it’s not just for Perl Monger groups. You’ll also be able to have domains for your favourite Perl modules too (or, at least, the ones without ‘::’ in their names).

Which .pm domains do you have your eye on? And what are you going to do with them?

Maybe one year we should have YAPC::NA in Saint Pierre and Miquelon and YAPC::EU in Poland.

A Brief History of the LPW

In his opening remarks on Saturday, Mark Keating suggested that we might be at the tenth London Perl Workshop. That seemed unlikely to me, so I’ve done a little research.

And it seems that I was right. The first LPW was in 2004, which makes this year’s the eighth. In a way, I’m happy that it wasn’t the tenth, as we now have two years to ensure that the tenth LPW is celebrated appropriately.

Here’s a list of the LPWs so far. I’ve also included details of the talks I gave at each workshop – mainly so that I can disprove Mark when he claims that I always show up and run training.

It seems that the web sites for some of the earlier workshops have fallen off the internet. This makes me a little sad. If I’m wrong and it’s just that Google can’t find them, then please let me know.

1st LPW – 11 Dec 2004
Lanyrd link
At Imperial College. I gave a 20 minute talk about OO Perl.

2nd LPW – 26 Nov 2005
Lanyrd link
At City University. I gave a 20 minute talk on Databases and Perl.

3rd LPW – 9 Dec 2006
Lanyrd link
I think this was the first LPW at its current home of the University of Westminster. I can’t be sure as I wasn’t there. I have a good excuse though – I was on holiday celebrating my tenth wedding anniversary.

4th LPW – 1 Dec 2007
Lanyrd link
At the University of Westminster. I gave a training course on Beginning Perl.

5th LPW – 29 Nov 2008
Lanyrd link
At the University of Westminster. I gave the keynote (a history of london.pm as it was our tenth anniversary) and a training course on Web Programming.

6th LPW – 5 Dec 2009
Lanyrd link
At the University of Westminster. I gave the keynote (about marketing Perl) and a training course called “The Professional Programmer”.

7th LPW – 4 Dec 2010
Lanyrd link
At the University of Westminster (although not in the usual building). I gave a training course on Modern Web programming (i.e. Plack) and a talk on Roles and Traits in Moose.

8th LPW – 12 Nov 2011
Lanyrd link
At the University of Westminster. I gave a training course on Modern Core Perl.