I Wrote Some Perl

Ok, yes, I’ve been writing Perl for over twenty years. But Perl 5.26 was released this week and for the first time, my name is mentioned in the release notes. Because I have not one, but two fixes in this release of Perl.

The first is this commit which fixes a piece of documentation to make it clear that grep() returns a list, not an array.
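
The distinction matters in scalar context, among other places. A tiny illustration (my example, not the one from the patch):

    my @lines = ('foo', 'bar', 'foobar');

    my @matches = grep { /foo/ } @lines;  # list context: the matching elements
    my $count   = grep { /foo/ } @lines;  # scalar context: a count of matches,
                                          # not an array or array reference
    print "$count of ", scalar @lines, " lines match\n";  # "2 of 3 lines match"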

The second is this commit which fixes some sample code so that it runs without warnings under use strict.

It’s a small start, I admit, but I have a taste for it now. In a year’s time, I hope to report that I have more than two commits in Perl 5.28.

And you can help too. Instructions on how to contribute are in the perlhack manual page. There is more information in perlhacktips and perlhacktut.

The people working on Perl all do a great job. But it’s a hard job and it might just get a little easier if more of us helped out.

Shaving Last.FM Yaks

Long-time readers might remember that I once had a bit of an obsession with aggregating web feeds on sites that I called “planets”. I wrote Perlanet to make this job easier and I registered the domain theplanetarium.org to host these planets.

The planets I built were of varying levels of usefulness – but of all of them, planet davorg was the vanity project. It was simply a way to aggregate all the web feeds that I produced. There were feeds from various blogs along with things like Flickr, Twitter and CPAN.

One of the things I liked about planets was that they were self-maintaining. Once you’ve configured a planet, it will just keep on running (well, as long as the cron job is running). If the web feeds they are aggregating have new content, the planet will have new content. And many of the feeds that powered planet davorg were still running.

But last weekend I found a couple of problems with it. Firstly, it looked like it was designed by an idiot. Which, to be fair, it was. Web design was never my strong point. But we have Bootstrap now, so there’s no excuse for web sites to look that bad. So that’s how I spent the first hour or so – slapping a bit of Bootstrap paint onto the site. I think it now looks acceptable.

The second problem was that not all of the feeds were still running; some of them (Delicious, for example) were just dead. I can’t remember the last time I posted anything to Delicious – can you? So I spent some time tweaking and fixing the feeds (replacing CPAN with MetaCPAN, for example). Most of this was easy.

However, one feed was a problem. My Last.fm feed was dead. For over ten years I’ve been “scrobbling” every song I’ve listened to and one of the feeds I was aggregating was that list. According to this page on their web site, my feed is supposed to be at http://ws.audioscrobbler.com/1.0/user/davorg/recenttracks.rss – and that was the URL in my Perlanet configuration. But it doesn’t work. It returns a 404 error.

I tried to contact someone at Last.fm to find out what was going on, but I haven’t had any kind of response. It looks like they’ve been running on a skeleton staff since CBS took them over and they don’t seem to have the time to support their users (not, I suspect, a recipe for long-term success!).

But there was one possibility. You can get the same data through their API. And some quick experimentation revealed that their API hasn’t been turned off.

And CPAN has Net::LastFM which will make the API calls for me. Ok, so it hasn’t been updated since 2009, but it still works (I’ve just noticed that there’s also Net::LastFMAPI which is a little more recent).

So it just took a small amount of work to write a little program which uses the Last.fm API to grab some JSON containing the information that I want and converts it to an Atom feed. In case this is useful to anyone else, I’ve put the code on Github. Please let me know if you do anything interesting with it.
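
The heart of it is only a dozen or so lines. Here’s a minimal sketch of the approach (method and field names are from memory of Net::LastFM and the Last.fm JSON, and the feed metadata is simplified; the real program is in the Github repo):

    use strict;
    use warnings;
    use Net::LastFM;
    use XML::Atom::SimpleFeed;

    my $lastfm = Net::LastFM->new(
        api_key    => $ENV{LASTFM_API_KEY},
        api_secret => $ENV{LASTFM_API_SECRET},
    );

    # user.getRecentTracks doesn't need a signed request
    my $data = $lastfm->request(
        method => 'user.getRecentTracks',
        user   => 'davorg',
    );

    my $feed = XML::Atom::SimpleFeed->new(
        title => 'Recently played tracks',
        link  => 'http://www.last.fm/user/davorg',
    );

    for my $track (@{ $data->{recenttracks}{track} }) {
        $feed->add_entry(
            title => "$track->{artist}{'#text'} - $track->{name}",
            link  => $track->{url},
        );
    }

    $feed->print;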

And if anyone from Last.fm reads this: please either turn the web feeds back on or remove the documentation that still claims they exist.

Two Weekend Projects

It’s far too long since I’ve posted anything here. I’ve no excuse really. Following the end of my contract in Canary Wharf, I was off work for seven weeks. OK, I was on holiday for two of those weeks, but that still leaves five weeks when I could have been doing something constructive, but actually just spent a lot of time watching Netflix.

But there were a couple of things I did. Neither of them took more than a few hours, but I thought it was worth writing them up – if only to give an example of a couple of really useful (to me, at least) things that I was able to build really quickly with Perl.

Cooking Vinyl

If you were a music fan in the 1990s, then there’s a good chance that you own at least one album released on Cooking Vinyl Records. At times, it seemed like pretty much every album I bought was released by them. Back in 2005, I wrote a blog post where I tried to explain how much they meant to me.

In particular, they produced a series of compilation albums that introduced me to so many of my favourite acts. Ten years ago, I tried to find a definitive list of all of the songs and artists which appeared on those compilations. As I failed to find one, I created it myself. At the time, it was a static page listing the tracks and artists on each of the albums. For ten years I’ve had it in the back of my head to do something more interesting with the data. A few weeks ago, I finally got round to it.

As I said, the original page just had a list of albums with artists and song titles. That’s useful, but it would be more interesting to be able to cross-reference the data in various ways – list all of the albums that an artist appeared on, for example. And for that, we need a database.

If you’ve come on any of my database training courses over the last ten years, you’ll know that I use a CD database example. The model that I use is pretty simple and, in particular, it assumes that all tracks on a given CD are by the same artist. As I say in the class, “various artists compilations don’t exist in this simplified universe”. Obviously, that’s not going to work in this example. So I needed to come up with another database model.

Compilation album data model

Here’s the data model I designed. You’ll see that it all hinges on the track table. A track is an instance of a particular song, recorded by a particular artist, appearing on a particular album. The only extra data on the track table is the “number” column which allows us to declare the order in which tracks appear on an album.
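
In DBIx::Class terms, the track table comes out something like this sketch (class and column names are my guesses; the real schema is in the Github repo):

    package CookingVinyl::Schema::Result::Track;
    use strict;
    use warnings;
    use base 'DBIx::Class::Core';

    __PACKAGE__->table('track');
    __PACKAGE__->add_columns(
        id       => { data_type => 'integer', is_auto_increment => 1 },
        song_id  => { data_type => 'integer' },
        album_id => { data_type => 'integer' },
        number   => { data_type => 'integer' },  # position on the album
    );
    __PACKAGE__->set_primary_key('id');
    __PACKAGE__->belongs_to(song  => 'CookingVinyl::Schema::Result::Song',  'song_id');
    __PACKAGE__->belongs_to(album => 'CookingVinyl::Schema::Result::Album', 'album_id');

    1;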

Advanced students will have spotted an omission from the data model. An artist might well have different versions of a song. There could be the original version, an edited single version and many live or remixed versions. So actually, we could add a “recording” table and it’s the recording that appears on an album. That’s, perhaps, an enhancement for the future.

Having designed the database, the rest of the code just falls out really. I already had a data file so it was just a case of parsing that and inserting the data into an SQLite database. DBIx::Class (and, particularly, the find_or_create method) makes this trivial. I then wrote another program that generated the web site using the Template Toolkit. Nothing complex there at all.
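
The loading loop amounts to little more than this (the parser output and the column names are invented for illustration):

    # For each track parsed from the data file: find or create the artist
    # and song, then add the track row linking the song to its album.
    my $schema = CookingVinyl::Schema->connect('dbi:SQLite:dbname=cookingvinyl.db');

    for my $row (@parsed_tracks) {
        my $artist = $schema->resultset('Artist')->find_or_create({
            name => $row->{artist},
        });
        my $song = $schema->resultset('Song')->find_or_create({
            title     => $row->{title},
            artist_id => $artist->id,
        });
        $schema->resultset('Track')->create({
            song_id  => $song->id,
            album_id => $row->{album_id},
            number   => $row->{number},
        });
    }

    # And the payoff: cross-referencing queries like "every album this
    # artist appears on" become a single search across the relationships.
    my @albums = $schema->resultset('Album')->search(
        { 'artist.name' => 'Billy Bragg' },
        { join => { tracks => { song => 'artist' } }, distinct => 1 },
    );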

The site is at http://cookingvinyl.dave.org.uk/. And all of the code is on Github. It could do with being made a bit prettier – perhaps I can add some pictures.

Why not have a look. And check out some Cooking Vinyl recordings.

Tower Bridge

I’ve lived in London for thirty-five years. And in all that time I have never seen Tower Bridge opening. Oh, I’ve seen it when it’s open, but I’ve never been in the right place at the right time to see it actually opening. As a Londoner, that’s a matter of supreme embarrassment to me.

But the office I’m working in currently is three minutes’ walk from Tower Bridge. All I need is a way to get a notification a few minutes before the bridge lifts. Surely, there must be a way to get that?

Sadly, no. The Tower Bridge web site has a page listing the upcoming lifts, but no service that would send any kind of notification. So, once again, it was up to me to provide one. I asked the London Perl Mongers on IRC what would be a good way to get notifications of upcoming events on an Android phone and Ilmari pointed out that the obvious method was to create a calendar that could be read by the calendar app on my phone.

So that’s what I’ve done. I use Web::Query to scrape the data from the Tower Bridge web site (doing some over-complicated madness to account for the fact that they are missing the year from their dates) and then create a .ics file using Date::ICal and Data::ICal. I also create a JSON version of the data in case it’s useful to anyone (if it is, please let me know).
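
The shape of the program is roughly this (the URL, the CSS selectors and the guess_year() helper are placeholders; the real code is on Github):

    use strict;
    use warnings;
    use Web::Query;
    use Date::ICal;
    use Data::ICal;
    use Data::ICal::Entry::Event;

    my $calendar = Data::ICal->new;

    # Scrape each scheduled lift from the bridge's web page
    wq('http://www.towerbridge.org.uk/lift-times')->find('.lift')->each(sub {
        my (undef, $elem) = @_;
        my $when   = $elem->find('.date')->text;    # e.g. "Sat 3 Jun 17:00" - no year!
        my $vessel = $elem->find('.vessel')->text;

        # guess_year() is the "over-complicated madness": assume this year,
        # unless that puts the lift in the past, in which case assume next year
        my $epoch = guess_year($when);

        my $event = Data::ICal::Entry::Event->new;
        $event->add_properties(
            summary => "Tower Bridge lift: $vessel",
            dtstart => Date::ICal->new(epoch => $epoch)->ical,
        );
        $calendar->add_entry($event);
    });

    print $calendar->as_string;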

The site is at http://towerbridge.dave.org.uk/ and (of course) the code is on Github.


So, there you are. Two (hopefully useful) little projects that I threw together in a very small amount of time using the power of Perl. Please let me know if you find either of them useful.

Hacking Symbol::Approx::Sub

In October, for (I think) the second year, Digital Ocean ran Hacktoberfest – a campaign encouraging people to submit pull requests to Github repos in exchange for free t-shirts.

A few of us thought that this might be a good way to do a small bit of easy Perl advocacy, so we tagged some issues on Perl repos with “hacktoberfest” and waited to see what would happen.

I created a few issues on some of my repos. But the one I concentrated on most was symbol-approx-sub. This is a very silly CPAN module that allows you to make errors in the names of your subroutines. I wrote it many years ago and there’s an article I wrote for The Perl Journal explaining why (and how) I did it.

Long-time readers might remember that in 2014 I wrote an article for the Perl Advent Calendar about Perl::Metrics::Simple. I used Symbol::Approx::Sub as the example module in the article, and it showed me that the module had some depressingly high complexity scores. I planned to get round to doing something about that but, of course, real life got in the way and Symbol::Approx::Sub isn’t exactly high on my list of things to do, so nothing happened. Until this October.
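
If you haven’t used Perl::Metrics::Simple, getting those scores is straightforward; something like this (method names from the module’s documentation, from memory):

    use strict;
    use warnings;
    use Perl::Metrics::Simple;

    my $analyzer = Perl::Metrics::Simple->new;
    my $analysis = $analyzer->analyze_files('lib/');

    printf "%d files, %d subs\n", $analysis->file_count, $analysis->sub_count;

    # Each subroutine gets a McCabe complexity score
    for my $sub (@{ $analysis->subs }) {
        printf "%-40s complexity %d\n", $sub->{name}, $sub->{mccabe_complexity};
    }

The distribution also ships a countperl command that produces a similar report from the command line.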

Over the month, a lot of changes were made to the module. I probably did about half of it and the rest was pull requests from other people. The fixes include:

  • Better tests (and better test coverage – it’s now at 100%)
  • Using Module::Load to load modules (see the sketch after this list)
  • Using real exceptions to report errors
  • Removing unnecessary ampersands from subroutine calls
  • Fixing a couple of long-term bugs (that were found by the improved tests)
  • Breaking monolithic subroutines down
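
The Module::Load change, for example, replaces a string-eval’d require with something like this minimal sketch (not the module’s exact internals):

    use Module::Load;
    use Carp;

    # Build the full name of a matcher plugin and load it at run time.
    # load() dies if the module can't be found, so wrap it to give a
    # friendlier error message.
    my $plugin = "Symbol::Approx::Sub::$matcher";
    eval { load $plugin; 1 } or croak "Can't load plugin $plugin: $@";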

And I’m pretty happy with how it all went. The work was mostly completed in October and this morning I finally got round to doing the last couple of admin-y bits and version 3.0.0 of Symbol::Approx::Sub is now on the way to CPAN. You still shouldn’t use it in production code though!

Thanks to everyone who submitted a pull request. I hope you did enough to earn a free t-shirt.

If you want to get involved in fixing or improving other people’s code, there’s the 24 Pull Request Challenge taking place over Advent. Or for more Perl-specific code, there’s the CPAN Pull Request Challenge.

p.s. In the Advent Calendar article, I linked to the HTML version of the results. For comparison, I’ve also put the new results online. It’s a pretty good improvement.

Code Archaeology

Long-time readers will have seen some older posts where I criticised Perl code that I’ve found in various places on the web. I thought it was about time that I admitted to some of the dodgier corners of my programming career.

You may know that one of my hobbies is genealogy. You might also know that there’s a CPAN module for dealing with GEDCOM files and a mailing list for the discussion of the intersection of Perl and genealogy. The list is usually very quiet, but it woke up briefly a few days ago when Ron Savage asked for help reconstructing some old genealogy software of his that had gone missing from his web site. Once he recovered the missing files, I noticed that in the comments he credited a forgotten program of mine for giving him some ideas. This comment included a link to my web site which (embarrassingly) was now a 404. I don’t like to leave broken links on the web, so I swiftly put a holding page in place on my site and went off to find the missing directory.

It turns out that the directory had been used to distribute a number of my early ventures into open source software. The Wayback Machine had many of them but not everything. And then I remembered that I had full back-ups of some earlier versions of my web site squirrelled away somewhere and it only took an hour or so to track them down. So that I don’t mislay them again, I’ve put them all on Github – in an appropriately named repository.

I think that most of this code dates from around 2000-2003. There’s evidence that a lot of it was stored in CVS or Subversion at some time. But the original repositories are long gone.

So, what do we have there? And just how bad is it?

There’s a really old formmail program. And it immediately becomes apparent that when I wrote it, not only did I not know as much Perl as I thought, but I was pretty sketchy on the basics of internet security as well. I can’t remember if I ever put it live but I really hope not.

Then there’s the “ms” suite of programs. My freelancing company is called Magnum Solutions and it amused me when I realised that people could potentially assume that this code came from Microsoft. I don’t think anyone ever did. Here, you’ll find the beginnings of what later became the nms project – but the nms versions are far more secure.

There’s the original slavorg bot from the #london.pm IRC channel. The channel still has a similar bot, but the code has (thankfully) been improved a lot since this version.

Then there’s something just called spam. I think I was trying to get some stats on how much spam I was getting.

There are a couple of programs that date from my days wrangling Sybase in the City of London. There’s a replacement for Sybase’s own “isql” command line program. My version is called sqpl. I can’t remember what I didn’t like about isql, or how successful my replacement was. What’s interesting about this program is that there are two versions. One uses DBI to connect to the database, but the other uses Sybase’s own proprietary “CTlib” connection library. Proof, I guess, that I was talking to databases back when DBI was too new and shiny to be trusted in production.

The other Sybase-related program is called sybserv. As I recall, Sybase uses a configuration file to define the connection details of the various servers that any given client can connect to. But the format of that file was rather opaque (I seem to remember the IP address being stored as a packed integer in some cases). This program parses this file and presents the data in a far more readable format. I remember using it a lot. I believe it’s the only Perl program I’ve ever written that uses formats.
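
If you’ve never met them, formats are Perl’s built-in report generator: you declare a picture of the output once and write() fills in the current variable values. Something along these lines (the columns are my guess at what sybserv printed):

    our ($server, $host, $port);

    # The lone '.' that ends a format declaration must be at the
    # start of a line.
    format STDOUT_TOP =
    Server               Host                      Port
    -------------------- ------------------------- -----
    .

    format STDOUT =
    @<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<< @>>>>
    $server,             $host,                    $port
    .

    for my $entry (@entries) {   # @entries from the (hypothetical) parser
        ($server, $host, $port) = @{$entry}{qw(server host port)};
        write;
    }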

Then there’s toc. That reads an HTML document, looking for any headers. It then builds a table of contents based on those headers and inserts it into the document. I think it’ll still work.

The final program is webged. This is the one that Ron got inspiration from. It parses a GEDCOM file and turns it into a web site. It works in two modes, you can either pre-generate a whole site (that’s the sane way to use it) or you can use it as a CGI program where it produces each page on the fly as it is requested. I remember that parsing the GEDCOM file was unusably slow, so I implemented an incredibly naive caching mechanism where I stored a Data::Dumper version of the GEDCOM object and just “eval”ed that. I was incredibly proud of myself at the time.
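
The caching trick, for what it’s worth, is about as blunt as serialisation gets: dump the parsed object as Perl source, then eval it back instead of re-parsing. Roughly this (the file name is invented):

    use Data::Dumper;

    # Writing the cache: Purity makes Data::Dumper emit extra statements
    # so that self-referential structures can round-trip.
    local $Data::Dumper::Purity = 1;
    open my $out, '>', 'gedcom.cache' or die "Can't write cache: $!";
    print {$out} Dumper($gedcom);
    close $out;

    # Reading the cache: slurp the file and eval the Perl source.
    my $cached = do {
        local $/;                   # slurp mode
        open my $in, '<', 'gedcom.cache' or die "Can't read cache: $!";
        my $VAR1;                   # Data::Dumper's default variable name
        eval <$in> or die "Broken cache: $@";
    };

These days you’d reach for Storable or JSON, but it did the job.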

The code in most of these programs is terrible. Or, at least, it’s very much a product of its time. I can forgive the lack of “use warnings” (Perl 5.6 wasn’t widely used back when this code was written) as they all have “-w” instead. But it’s the use of ampersands on most of the subroutine calls that makes me cringe the most.
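
In case you’ve never seen the style, here’s the difference, and the subtle trap that makes the ampersand worse than just being ugly:

    sub greet { my ($name) = @_; print "Hello, $name\n" }

    greet('Dave');     # modern style

    &greet('Dave');    # old style: works, but bypasses any prototype

    &greet;            # old style, no parens: this does NOT call greet with
                       # an empty argument list - it passes the caller's
                       # current @_ along unchanged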

But please have fun looking at the code and pointing out all of the idiocies. Just don’t put any of the CGI programs on a server that is anywhere near the internet.

And feel free to share any of your early code.