The Joy of Prefetch

If you heard me speak at YAPC or you’ve had any kind of conversation with me over the last few weeks then it’s likely you’ve heard me mention the secret project that I’ve been writing for my wife’s school.

To give you a bit of background, there’s one afternoon a week where the students at the school don’t follow the normal academic timetable. On that afternoon, the teachers all offer classes on wider topics. This year’s topics include Acting, Money Management and Quilt-Making. It’s a wide-ranging selection. Each student chooses one class per term.

This year I offered to write a web app that allowed the students to make their selections. This seemed better than the spreadsheet-based mechanisms that have been used in the past. Each student registers with their school-based email address and then on a given date, they can log in and make their selections.

I wrote the app in Dancer2 (my web framework of choice) and the site started allowing students to make their selections last Thursday morning. In the run-up to the go-live time, Google Analytics showed me that about 180 students were on the site waiting to make their selections. At 7am the selections part of the site went live.

And immediately stopped working. Much to my embarrassment.

It turned out that a disk failed on the server moments after the site went live. It’s the kind of thing that you can’t predict. But it leads to lots of frustrated teenagers and doesn’t give a very good impression.

To give me time to rebuild and stress-test the site we’ve decided to relaunch at 8pm this evening. I’ve spent the weekend rebuilding the app on a new (and more powerful) server.

I’m pretty sure that the timing of the failure was coincidental. I don’t think that my app caused the disk failure. But a failure of this magnitude makes you paranoid, so I spent a lot of yesterday tuning the code.

The area I looked at most closely was the number of database queries that the app was making. There are two main actions that might be slow – the page that builds the list of courses that a student can choose from and the page which saves a student’s selections.

I started with the first of these. I set DBIC_TRACE to 1 and fired up a development copy of the app. I was shocked to see the app run about 120 queries – many of which were identical.

Of course I should have tested this before. And, yes, it’s an idiotic way to build an application. But I’m afraid that using an ORM like DBIx::Class can make it all too easy to write code like this. Fortunately, it makes it easy to fix it too. The secret is “prefetch”.

“Prefetch” is an option you can pass to the “search” method on a resultset. Here’s an example of the difference that can make.

There are seven year groups in a British secondary school. Most schools call them Year 7 to Year 13 (the earlier years are in primary school). Each year group will have a number of forms. So there’s a one-to-many relationship between years and forms. In database terms, the form table holds a foreign key to the year table. In DBIC terms, the Year result class has a “has_many” relationship with the Form result class and the Form result class has a “belongs_to” relationship with the Year result class.

A naive way to list the years and their associated forms would look like this:
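A minimal sketch of that kind of loop follows. The schema is an assumption made up for illustration: the `My::Schema` classes, the `name` and `year_id` columns, the `forms` relationship name and the in-memory SQLite database are all hypothetical, and the script needs DBIx::Class and DBD::SQLite installed.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical result classes, defined inline so the sketch runs on its own.
package My::Schema::Result::Year {
  use base 'DBIx::Class::Core';
  __PACKAGE__->table('year');
  __PACKAGE__->add_columns(qw(id name));
  __PACKAGE__->set_primary_key('id');
  __PACKAGE__->has_many(forms => 'My::Schema::Result::Form', 'year_id');
}

package My::Schema::Result::Form {
  use base 'DBIx::Class::Core';
  __PACKAGE__->table('form');
  __PACKAGE__->add_columns(qw(id name year_id));
  __PACKAGE__->set_primary_key('id');
  __PACKAGE__->belongs_to(year => 'My::Schema::Result::Year', 'year_id');
}

package My::Schema {
  use base 'DBIx::Class::Schema';
  __PACKAGE__->register_class(Year => 'My::Schema::Result::Year');
  __PACKAGE__->register_class(Form => 'My::Schema::Result::Form');
}

# Assumed test database: in-memory SQLite with a couple of rows.
my $schema = My::Schema->connect('dbi:SQLite:dbname=:memory:');
$schema->storage->dbh_do(sub {
  my ($storage, $dbh) = @_;
  $dbh->do('CREATE TABLE year (id INTEGER PRIMARY KEY, name TEXT)');
  $dbh->do('CREATE TABLE form (id INTEGER PRIMARY KEY, name TEXT, year_id INTEGER)');
  $dbh->do(q{INSERT INTO year (id, name) VALUES (7, 'Year 7'), (8, 'Year 8')});
  $dbh->do(q{INSERT INTO form (name, year_id) VALUES ('7A', 7), ('7B', 7), ('8A', 8)});
});

# Trace the SQL to STDERR, much as running with DBIC_TRACE=1 would.
$schema->storage->debug(1);

# The naive version: one query for the years, then one more per year
# each time we ask a Year object for its forms.
for my $year ($schema->resultset('Year')->all) {
  say $year->name;
  say '  ', $_->name for $year->forms;
}
```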

Run code like that with DBIC_TRACE turned on and you’ll see the proliferation of database queries. There’s one query that selects all of the years and then for each year, you get another query to get all of its associated forms.

Of course, if you were writing raw SQL, you wouldn’t do that. You’d write one query that joins the year and form tables and pulls all of the data back at once. And the “prefetch” option gives you a way to do that in DBIC as well.
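Here is a sketch of the prefetching version, under the same kind of assumptions: the `My::Schema` classes, the column names, the `forms` relationship name and the in-memory SQLite database are hypothetical and only there so the script runs on its own (with DBIx::Class and DBD::SQLite installed).

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical result classes, defined inline so the sketch runs on its own.
package My::Schema::Result::Year {
  use base 'DBIx::Class::Core';
  __PACKAGE__->table('year');
  __PACKAGE__->add_columns(qw(id name));
  __PACKAGE__->set_primary_key('id');
  __PACKAGE__->has_many(forms => 'My::Schema::Result::Form', 'year_id');
}

package My::Schema::Result::Form {
  use base 'DBIx::Class::Core';
  __PACKAGE__->table('form');
  __PACKAGE__->add_columns(qw(id name year_id));
  __PACKAGE__->set_primary_key('id');
  __PACKAGE__->belongs_to(year => 'My::Schema::Result::Year', 'year_id');
}

package My::Schema {
  use base 'DBIx::Class::Schema';
  __PACKAGE__->register_class(Year => 'My::Schema::Result::Year');
  __PACKAGE__->register_class(Form => 'My::Schema::Result::Form');
}

# Assumed test database: in-memory SQLite with a couple of rows.
my $schema = My::Schema->connect('dbi:SQLite:dbname=:memory:');
$schema->storage->dbh_do(sub {
  my ($storage, $dbh) = @_;
  $dbh->do('CREATE TABLE year (id INTEGER PRIMARY KEY, name TEXT)');
  $dbh->do('CREATE TABLE form (id INTEGER PRIMARY KEY, name TEXT, year_id INTEGER)');
  $dbh->do(q{INSERT INTO year (id, name) VALUES (7, 'Year 7'), (8, 'Year 8')});
  $dbh->do(q{INSERT INTO form (name, year_id) VALUES ('7A', 7), ('7B', 7), ('8A', 8)});
});

# Trace the SQL to STDERR, much as running with DBIC_TRACE=1 would.
$schema->storage->debug(1);

# The extra search() call asks DBIC to join to the form table and
# build the Form objects from that single query.
my $years = $schema->resultset('Year')->search(undef, { prefetch => 'forms' });

for my $year ($years->all) {
  say $year->name;
  say '  ', $_->name for $year->forms;   # served from the prefetched data
}
```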

All we have done here is to interpose a call to “search” which adds the “prefetch” option. If you run this code with DBIC_TRACE turned on, then you’ll see that there’s only one database query and it’ll be very similar to the raw SQL that you would have written – it brings back the data from both of the tables at the same time.

But that’s not all of the cleverness of the “prefetch” option. You might be wondering what the difference is between “prefetch” and the rather similar-sounding “join” option. Well, with “join” the columns from the joined table would be added to your main table’s result set. This would, for example, create some kind of mutant Year resultset object that you could ask for Form data using calls like “get_column(‘’)”. [Update: I was trying to simplify this explanation and I ended up over-simplifying to the point of complete inaccuracy – joined columns only get added to your result set if you use the “columns” or “select/as” attributes. And the argument to “get_column()” needs to be the column name that you have defined using those options.] And that’s useful sometimes, but often I find it easier to use “prefetch” as that uses the data from the form table to build Form result objects which look exactly as they would if you pulled them directly from the database.

So that’s the kind of change that I made in my code. By prefetching a lot of associated tables I was able to drastically cut down the number of queries made to build that course selection page. Originally, it was about 120 queries. I got it down to three. Of course, each of those queries is a lot larger and is doing far more work. But there’s a lot less time spent compiling SQL and pulling data from the database.

The other page I looked at – the one that saves a student’s selections – wasn’t quite so impressive. Originally it was about twenty queries and I got it down to six.

Reducing the number of database queries is a really useful way to make your applications more efficient and DBIC’s “prefetch” option is a great tool for enabling that. I recommend that you take a close look at it.

After crowing about my success on Twitter I got a reply from a colleague pointing me at Test::DBIC::ExpectedQueries which looks like a great tool for monitoring the number of queries in your app.


YAPC Europe 2015: A Community is a Home

I’m in Granada, Spain for the 2015 “Yet Another Perl Conference” (YAPC). The three-day conference finished about an hour and a half ago and, rather than going to a bar with dozens of other attendees, I thought I would try to get my impressions down while it’s all still fresh in my mind.

YAPC is a grass-roots conference. It’s specifically planned so that it will be relatively cheap for attendees. This year I think the cost for an attendee was 100 EUR (I’m not sure as I was a speaker and therefore didn’t need to buy a ticket). That’s impressively low cost for such an impressive conference. Each year since 2000 (when the first European YAPC took place in London) 250 to 300 Perl programmers gather for their annual conference in a different European city.

Day 0

Although the conference started on Wednesday, there were a few tutorials over the two days before that. On Tuesday I ran a one-day course on DBIx::Class, Perl’s de facto standard ORM. There were slightly fewer students than I would have liked, but they were an enthusiastic and engaged group.

The night before the conference was the traditional pre-conference meet-up. People generally arrive during the day before the conference starts and the local organisers designate a bar for us all to meet in. This year, Eligo (a recruitment company with a strong interest in placing Perl programmers) had arranged to buy pizza and beer for all of the attendees at the conference venue and we spent a pleasant evening catching up with old friends.

I should point out that I’m only going to talk about talks that I saw. There were four tracks at the conference which meant that most of the time I was having to make difficult choices about which talk to see. Other people blogging about the conference will, no doubt, have a different set of talks to discuss.

Day 1

The conference had a keynote at the start and end of each day. They all sounded interesting, but I was particularly interested in hearing Tara Andrews who opened the first day. Tara works in digital humanities. In particular, she uses Perl programs which track differences between copies of obscure medieval manuscripts. It’s a million miles from what you usually expect Perl programmers to be doing and nicely illustrates the breadth of Perl’s usage.

I saw many other interesting talks during the day. The one that stood out for me was Jose Luis Martinez talking about Paws. Paws wants to be the “official” Perl SDK for all of Amazon’s Web Services. If you know how many different services AWS provides, then you’ll realise that this is an impressive goal – but it sounds like they’re very nearly there.

Lunch was run on an interesting model. Granada is apparently the only remaining place in Spain where you still get served tapas whenever you order a drink in a bar. So when you registered for the conference, you were given some tokens that could be exchanged for a drink and tapas at ten local bars. It was a great way to experience a Granada tradition and it neatly avoided the huge queues that you often get with more traditional conference catering.

At the end of the day, everyone is back in the largest room for the lightning talks. These talks are only five minutes long – which makes them a good way for new speakers to try public speaking without having to commit to a longer talk. They are also often used by more experienced speakers to let their hair down a bit and do something not entirely serious. This session was the usual mixture of talks, which included me giving a talk gently ribbing people who don’t keep their Perl programming knowledge up to date.

The final session of the day was another keynote. Curtis Poe talked about turning points in the story of Perl and the Perl community. Two points that he made really struck home to me (both coming out of the venerable age of Perl) – firstly Perl is a language that is “Battle-Tested” and that isn’t going anywhere soon; and secondly the Perl community has really matured over the last few years and is now a big part of Perl’s attraction. This last point was apparently reiterated in a recent Gartner report on the relative merits of various programming languages.

Wednesday evening saw an excuse for more socialising with the official conference dinner. This was a buffet affair held around the swimming pool of a swanky Granada hotel. Conference attendees paid nothing for this event and the food and drink was still flowing freely when I slunk off back to my hotel room.

Day 2

Thursday morning started with another Perl community tradition – the “State of the Velociraptor” talk. This is an annual talk that focusses on the Perl 5 community and its achievements (in comparison with Larry Wall’s “State of the Onion” talk which generally concentrates on the Perl 6 project). This year, Matt Trout has handed over responsibility for this talk to Sawyer, who was in a more reflective mood than Matt has often been. Like Curtis the previous evening, Sawyer has noticed how the Perl community has matured and has reached the conclusion that many of us love coming to YAPC because the community feels like our home.

Next up was Jessica Rose talking about The Cult of Expertise. This was less a talk and more a guided discussion about how people become recognised as experts and whether that designation is useful or harmful in the tech industry. It was a wide-ranging discussion, covering things like imposter syndrome and the Dunning-Kruger effect. It was rather a departure for such a technical conference and I think it was a very successful experiment.

The next talk was very interesting too. As I said above, the European YAPC has 250 to 300 attendees each year. But in Japan, they run a similar conference which, this year, had over 2,000 attendees. Daisuke Maki talked about how he organised a conference of that size. A lot of what he said could be very useful for future conference organisers.

After lunch was the one session where I had no choice. I gave my talk on “Conference Driven Publishing” during the second slot. It wasn’t at all technical but I think I got some people interested in my ideas of people writing their own Perl books and publishing them as ebooks.

At the end of the day, we had another excellent session of lightning talks and another keynote – this time from Xavier Noria, a former member of the Perl community who switched to writing Ruby several years ago. He therefore had an interesting perspective on the Perl community and was happy to tell us about some of Perl’s features that fundamentally shaped how he thought about software.

There was still one more session that took us well into the evening. There is a worry that we aren’t getting many new young programmers into the Perl community, so Andrew Solomon of GeekUni organised a panel discussion on how to grow the community. A lot of ideas were shared, but I’m not sure that any concrete plans came out of it.

Day 3

And so to the final day. The conference started early with a keynote by Stevan Little. The theme of the conference was “Art and Engineering” and Stevan studied art at college rather than computer science, so he talked about art history and artistic techniques and drew some interesting comparisons with the work of software development. In the end he concluded that code wasn’t art. I’m not sure that I agree.

I then saw talks on many different topics – an example of a simple automation program written in Perl 6, a beginner’s guide to who’s who and what’s what in the Perl community, an introduction to running Perl on Android, a couple of talks on different aspects of running Perl training courses, one on the Perl recruitment market and one on a simple git-driven tool for checking that you haven’t made a library far slower when you add features. All in all, a pretty standard selection of topics for a day at YAPC.

The final keynote was from Larry Wall, the man who created Perl in 1987 and who has been steering the Perl 6 project for the last fifteen years. This was likely to include some big news. At FOSDEM in February, Larry announced his intention to release a beta test version of Perl 6 on his birthday (27 September) and version 1.0 (well, 6.0, I suppose) by Christmas. There were some caveats as there were three major pieces of work that were still needed.

Larry’s talk compared Perl 5 and Perl 6 with The Hobbit and The Lord of the Rings respectively – apparently Tolkien also spent 15 years working on The Lord of the Rings – but finished by announcing that the work on the three blockers was all pretty much finished so it sounds like we really can expect Perl 6 by Christmas. That will be a cause for much celebration in the Perl community.

After Larry, there was a final session of lightning talks (including a really funny one that was a reaction to my lightning talk on the first day) and then it only remained to give all of the organisers and helpers a standing ovation to thank them for another fabulous YAPC.

Next year’s conference will be in Cluj-Napoca. I’m already looking forward to it. Why not join us there?


Beginners Perl Tutorial

A few weeks ago I got an interesting email from someone at Udemy. They were looking for someone to write a beginners Perl tutorial that they would make available for free on their web site. I think I wasn’t the only person that they got in touch with but, after a brief email conversation, they asked me to go ahead and write it.

It turned out to be harder than I thought it would be. I expected that I could write about 6,000 words over a weekend. In the end it took two weekends and it stretched to over 8,000 words. The problem is not in the writing, it’s in deciding what to omit. I’m sure that if you read it you’ll find absolutely essential topics that I haven’t included – but I wonder what you would have dropped to make room for them.

But eventually I finished it, delivered it to them (along with an invoice – hurrah!) and waited to hear that they had published it.

Yesterday I heard that it was online. Not from Udemy (they had forgotten to tell me that it was published two weeks ago) but from a friend.

Unfortunately, some gremlins had crept in at some point during their publication pipeline. Some weird character substitutions had taken place (which had disastrous consequences for some of the Perl code examples) and a large number of paragraph breaks had vanished. But I reported those all to Udemy yesterday and I see they have all been fixed overnight.

So finally I can share the tutorial with you. Please feel free to share it with people who might find it useful.

Although it’s 8,000 words long, it really only scratches the surface of the language. Udemy have added a link to one of their existing Perl courses, but unfortunately it’s not a very good Perl course (Udemy don’t seem to have any very good Perl courses). I understand why they have done that (that is, after all, the whole point of commissioning this tutorial – to drive more people to pay for Perl courses on Udemy) but it’s a shame that there isn’t anything of higher quality available.

So there’s an obvious hole in Udemy’s offerings. They don’t have a high quality Perl course. That might be a hole that I try to fill when I next get some free time.

Unless any other Perl trainers want to beat me to it.

Oh, and please let me know what you think of the tutorial.


Driving a Business with Perl

I’ve been a freelance programmer for over twenty years. One really important part of the job is getting paid for the work I do. Back in 1995 when I started out there wasn’t all of the accounting software available that you get now and (if I recall correctly) the little that was available was all pretty expensive stuff.

At some point I thought to myself “I don’t need to buy one of these expensive systems, I’ll write something myself”. So I sat down and sketched out a database schema and wrote a few Perl programs to insert data about the work I had done and generate invoices from that data.

I don’t remember much about the early versions. I do remember coming to the conclusion that the easiest way to generate PDFs of the invoices was using LaTeX and then wasting a lot of time trying to bend LaTeX to my will. I got something that looked vaguely ok eventually, but it was always incredibly painful if I ever needed to edit it in any way. These days, I use wkhtmltopdf and my life is far easier. I understand HTML and CSS in a way that I will never understand LaTeX.

Why am I telling you this, twenty years after I started using this code? Well, during this last week, I finally decided it was time to put the code on Github. There were two reasons for this. Firstly, I thought that it might be useful for other people. And secondly, I’m ashamed to admit that this is the first time that the code has ever been put under any kind of version control (and, yes, this is an embarrassing case of “do as I say, not as I do”). I have no excuses. The software I used to drive my business was in a few files on a single hard drive. Files that I was hacking away at with gay abandon when I thought they needed changing. I am a terrible role model.

Other than all the obvious reasons, I’m sad that it wasn’t in version control as it would have been interesting to trace the evolution of the software over the last twenty years. For example, the database access started as raw DBI, spent a brief time using Class::DBI and at some point all got moved to DBIx::Class. It’s likely that I wasn’t using the Template Toolkit when I started – but I can’t remember what I was using in its place.

Anyway, the code is there now. I don’t give any guarantees for its quality, but it does the job for me. Let me know if you find any of it interesting or useful (or, even, laughable).

p.s. An interesting side effect of putting it under (public) version control – since I uploaded it to Github I have been constantly tweaking it. The potential embarrassment of having my code available for anyone to see means that I’ve made more improvements to it in the last week than I have in the previous five years. I’m even considering replacing all the command line programs with a Dancer app.

p.p.s. I actually use FreeAgent for all my accounting these days. It’s wonderful and I highly recommend it. But I still use my own system to generate invoices.


Culling My Modules

About a year ago, I dabbled briefly with Travis CI. I even gave a talk about my experiences. The plan was that I would start to use it for all of my code. But real life intervened and I never got round to getting any further with that project.

This weekend, I finally made some progress. I added a .travis.yml file to all of my Github repositories that hold CPAN modules. I even fed the details through to Coveralls so I get test coverage reports. From there it was a simple step to building a dashboard that monitors the health of all of my CPAN modules.

And it’s not a pretty picture. You’ll see a lot of grey boxes on that page, indicating that Travis couldn’t run the tests or, worse, red boxes showing that the tests failed for some reason.

Yesterday I made a few quick fixes to some of the modules (particularly in the WWW::Shorten namespace) and a couple more of them now work. But I want to work out how much effort it’s worth investing in the ones that are still failing. And, widening my scope a little, I’ve decided to take a close look at my CPAN modules and work out which ones are worth keeping and which ones I should just delete.

For example, twelve years ago I was really excited about the idea of AudioFile::Info. Most people were ripping music to MP3s, but I wasn’t following the crowd and was using Ogg Vorbis instead. AudioFile::Info and its friends were an attempt to make it easy to extract information from audio files no matter which format they were in. I suppose it was a kind of DBI for ID3 tags. But twelve years on, does anyone really care about that any more? I switched all of my music collection to MP3 years ago. If I recall correctly, the AudioFile::Info modules use a convoluted hand-crafted plugin system which never worked as well as it should. I could probably switch them to use some kind of plugin architecture from CPAN. But is it worth the effort?

Then there is Guardian::OpenPlatform::API – a Perl wrapper around the Guardian’s API. I believe they changed the API end-point several years ago so the module doesn’t even work. But the fact that I’ve had no complaints about that probably indicates that no-one has ever used it.

It’s a similar story for Net::Backpack. To be honest, I have no idea whether or not it still works. Is Backpack still running? Ok, I’ve just checked and they’re no longer offering it to new customers. But if I’m not a paying customer is there any way I can test that it still works?

Finally, there is the WWW::Shorten family of modules. I released a module called WWW::MakeAShorterLink back in 2002, but it was Iain Truskett who realised that there should be a family of modules around the (at the time new) URL-shortening industry. I took over the module when Iain passed away and I’ve been maintaining it ever since. But it’s a real pain to maintain. The URL-shortening industry changes really quickly. For a long time, new services were popping up all of the time (and many of them closed down just as quickly). I haven’t been anywhere near quick enough at releasing versions that keep up with all the changes. I suspect that at least a couple of the current test failures are down to services that have closed down. I should probably investigate those over the next few days.

I don’t think WWW::Shorten is in any danger of going away (but I need to find a better way to keep abreast of changes in the industry) but the other modules I’ve mentioned here (AudioFile::Info::*, Guardian::OpenPlatform::API and Net::Backpack) are on borrowed time. If you’re using them and you’d like to see new versions of them in the future then let me know. If you’d like to take over maintenance, then that would be even better.

If I don’t hear from anyone (and I strongly suspect that I won’t) then I’ll be removing them from CPAN in a couple of months’ time.


Mailing Lists

Over the years I’ve set up a few mailing lists for the discussion of various projects I’ve been involved with. There’s always an expectation that mailing lists will flourish without much input from me. But it never works out like that.

The truth is that most mailing lists just quietly die. And, in many cases, they end up attracting a lot of spam – which the owner of the list has to check on a semi-regular basis on the off-chance that there’s something interesting or useful in amongst the crap. There never is.

So I’ve decided to close a few mailing lists that didn’t seem to be going anywhere. I don’t suppose anyone will miss them, but I’ve taken a copy of the archives and I may do something with them at some point in the future.

The lists that I have removed are:


A couple of these lists have received slightly special treatment. The xml-feed list is advertised as the support email address for XML::Feed. I’ve redirected that address so that mail now comes to me. Hopefully my spam filters will ensure that I’m not overrun with spam from it before I work out a more permanent solution.

The other list that has been treated differently is the training-news one. That was set up so that people could get information about upcoming training courses that I would be running. I still think that’s useful, so I’ve replaced it with a new list (run by MailChimp). If you’re interested in keeping in touch with what I’m doing then please sign up to the new list by entering your email address below. (The same form will now appear in the sidebar on every page of this site.)

Sign up here for occasional email about stuff I'm doing with Perl, information about upcoming talks and training courses and other updates.

(I promise not to spam you.)

So, there you are. I’ve removed a few moribund mailing lists. I hope that hasn’t ruined anyone’s day.


Building TwittElection

I was asked to write a guest post for the Built In Perl blog. I wrote something about how I built my site, TwittElection, for the recent UK general election.

In the UK we have just had a general election. Over the last few weeks many web sites have sprung up to share information about the campaign and to help people decide how to vote. I have set up my own site called TwittElection and in this article I’d like to explain a little about how it works.

But why not go over to Built In Perl and read the whole thing there.

Incidentally, on 13th June, I’ll be giving a talk about TwittElection at this year’s OpenTech conference. If you’re interested in the positive impact that technology can have on society then you’ll, no doubt, find OpenTech very interesting.


DBIC Training in Granada

It’s been a while since I’ve run a training course alongside a YAPC. By my calculations, the last time was Riga in 2011. But I’ve been talking to the organisers of this year’s conference and we have a plan.

I’m going to be running a one-day introductory course on DBIx::Class before the conference (I think it’ll be on 1st September, but that’s not 100% certain yet). Full details are on the conference web site. There’s an early-bird price of 150 Euro and the full price is 200 Euro. The web site says that the early-bird price finishes today, but I wouldn’t be at all surprised if that gets extended for a few days at least.

Of course, readers of this blog will all already be experts in DBIC and won’t need this course. But I’m sure that most of you will have a colleague who would benefit from… well… a refresher on how DBIC works. Why not see if your company will pay for them to attend the course :-)

The course size is limited. So you might want to think about booking soon.

Hope to see some of you in Granada.

Two updates:

  1. The date has now been confirmed as 1st September.
  2. The early-bird pricing has been extended until 1st June.

Subroutines and Ampersands

I’ve had this discussion several times recently, so I thought it was worth writing a blog post so that I have somewhere to point people the next time it comes up.

Using ampersands on subroutine calls (&my_sub or &my_sub(...)) is never necessary and can have potentially surprising side-effects. It should, therefore, never be used and should particularly be avoided in examples aimed at beginners.

Using an ampersand when calling a subroutine has three effects.

  1. It disambiguates the code so that the Perl compiler knows for sure that it has come across a subroutine call.
  2. It turns off prototype checking.
  3. If you use the &my_sub form (i.e. without parentheses) then the current value of @_ is passed on to the called subroutine.

Let’s look at these three effects in a little more detail.

Disambiguating the code is obviously a good idea. But adding the ampersand is not the only way to do it. Adding a pair of parentheses to the end of the call (my_sub()) has exactly the same effect. And, as a bonus, it looks the same as subroutine calls do in pretty much every other programming language ever invented. I can’t think of a single reason why anyone would pick &my_sub over my_sub().

I hope we’re agreed that prototypes are unnecessary in most Perl code (perhaps that needs to be another blog post at some point). Of course there are a few good reasons to use them, but most of us won’t be using them most of the time. If you’re using them, then turning off prototype checking seems to be a bad idea. And if you’re not using them, then it doesn’t matter whether they’re checked or not. There’s no good argument here for using ampersands.

Then we come to the invisible passing of @_ to the called subroutine. I have no idea why anyone ever thought this was a good idea. The perlsub documentation calls it “an efficiency mechanism” but admits that it is one “that new users may wish to avoid”. If you want @_ to be available to the called subroutine then just pass it in explicitly. Your maintenance programmer (and remember, that could be you in six months’ time) will be grateful and won’t waste hours trying to work out what is going on.
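To make the second and third effects concrete, here is a small sketch. All of the subroutine names are made up for illustration.

```perl
use strict;
use warnings;

# Reports whatever arguments it was given.
sub show_args { return "got: @_" }

sub outer_implicit { return &show_args; }      # caller's @_ is passed on silently
sub outer_parens   { return show_args(); }     # empty argument list
sub outer_explicit { return show_args(@_); }   # the same hand-off, but visible

print outer_implicit('a', 'b'), "\n";   # got: a b
print outer_parens('a', 'b'), "\n";     # got: (nothing after the colon)
print outer_explicit('a', 'b'), "\n";   # got: a b

# A prototype that demands exactly two scalar arguments.
sub count_two($$) { return scalar @_ }

# count_two(1, 2, 3) would be rejected at compile time ("Too many arguments"),
# but the ampersand form skips the prototype check and runs without complaint.
print &count_two(1, 2, 3), "\n";        # 3
```

Anyone reading outer_implicit has to know the ampersand rule to see that it forwards its arguments at all, which is the maintenance problem described above.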

So, no, there is no good reason to use ampersands when calling subroutines. Please don’t use them.

There is, of course, one case where ampersands are still useful when dealing with subroutines – when you are taking a reference to an existing, named subroutine. But that’s the only case that I can think of.
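As a quick sketch of that one legitimate use (the subroutine and variable names are made up for illustration):

```perl
use strict;
use warnings;

sub greet { my ($name) = @_; return "Hello, $name" }

# \&greet takes a reference to the named subroutine; it does not call it.
my $code = \&greet;
print $code->('World'), "\n";               # Hello, World

# A common use: a dispatch table mapping names to subroutines.
my %dispatch = ( greet => \&greet );
print $dispatch{greet}->('Perl'), "\n";     # Hello, Perl
```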

What do you think? Have I missed something?

It’s unfortunate that a lot of the older documentation on CPAN (and, indeed, some popular beginners’ books) still perpetuate this outdated style. It would be great if we could remove it from all example code.


Modern Perl Articles

Back in 2011 I wrote a series of three articles about “Modern Perl” for Linux Format. Although I mentioned all three articles here as they were published, I didn’t post the actual contents of the articles as I wasn’t sure about the copyright situation.

But now I suspect that enough time has passed that copyright is no longer going to be an issue, so I’ve added the full text of the articles to this site. The articles are all about writing a simple web application to track your reading. They use DBIx::Class and Dancer.

Let me know if you find them interesting or useful.
