The Joy of Prefetch

If you heard me speak at YAPC or you’ve had any kind of conversation with me over the last few weeks then it’s likely you’ve heard me mention the secret project that I’ve been writing for my wife’s school.

To give you a bit of background, there’s one afternoon a week where the students at the school don’t follow the normal academic timetable. On that afternoon, the teachers all offer classes on wider topics. This year’s topics include Acting, Money Management and Quilt-Making. It’s a wide-ranging selection. Each student chooses one class per term.

This year I offered to write a web app that allowed the students to make their selections. This seemed better than the spreadsheet-based mechanisms that have been used in the past. Each student registers with their school-based email address and then on a given date, they can log in and make their selections.

I wrote the app in Dancer2 (my web framework of choice) and the site started allowing students to make their selections last Thursday morning. In the run-up to the go-live time, Google Analytics showed me that about 180 students were on the site waiting to make their selections. At 7am the selections part of the site went live.

And immediately stopped working. Much to my embarrassment.

It turned out that a disk failed on the server moments after the site went live. It’s the kind of thing that you can’t predict.But it leads to lots of frustrated teenagers and doesn’t give a very good impression.

To give me time to rebuild and stress-test the site we’ve decided to relaunch at 8pm this evening. I’ve spent the weekend rebuilding the app on a new (and more powerful) server.

I’m pretty sure that the timing of the failure was coincidental. I don’t think that my app caused the disk failure. But a failure of this magnitude makes you paranoid, so I spent a lot of yesterday tuning the code.

The area I looked at most closely was the number of database queries that the app was making. There are two main actions that might be slow – the page that builds the list of courses that a student can choose from and the page which saves a student’s selections.

I started with the first of these. I set DBIC_TRACE to 1 and fired up a development copy of the app. I was shocked to see the app run about 120 queries – many of which were identical.

Of course I should have tested this before. And, yes, it’s an idiotic way to build an application. But I’m afraid that using an ORM like DBIx::Class can make it all too easy to write code like this. Fortunately, it makes it easy to fix it too. The secret is “prefetch”.

“Prefetch” is an option you can pass to the the “search” method on a resultset. Here’s an example of the difference that can make.

There are seven year groups in a British secondary school. Most schools call them Year 7 to Year 13 (the earlier years are in primary school). Each year group will have a number of forms. So there’s a one to many relationship between years and forms. In database terms, the form table holds a foreign key to the year table. In DBIC terms, the Year result class has a “has_many” relationship with the Form result class and the Form result class has a “belongs_to” relation with the Year result class.

A naive way to list the years and their associated forms would look like this:

Run code like that with DBIC_TRACE turned on and you’ll see the proliferation of database queries. There’s one query that selects all of the years and then for each year, you get another query to get all of its associated forms.

Of course, if you were writing raw SQL, you wouldn’t do that. You’d write one query that joins the year and form tables and pulls all of the data back at once. And the “prefetch” option gives you a way to do that in DBIC as well.

All we have done here is to interpose a call to “search” which adds the “prefetch” option. If you run this code with DBIC_TRACE turned on, then you’ll see that there’s only one database query and it’ll be very similar to the raw SQL that you would have written – it brings back the data from both of the tables at the same time.

But that’s not all of the cleverness of the “prefetch” option. You might be wondering what the difference is between “prefetch” and the rather similar-sounding “join” option. Well, with “join” the columns from the joined table would be added to your main table’s result set. This would, for example, create some kind of mutant Year resultset object that you could ask for Form data using calls like “get_column(‘’)”. [Update: I was trying to simplify this explanation and I ended up over-simplifying to the point of complete inaccuracy – joined columns only get added to your result set if you use the “columns” or “select/as” attributes. And the argument to “get_column()” needs to be the column name that you have defined using those options.] And that’s useful sometimes, but often I find it easier to use “prefetch” as that uses the data from the form table to build Form result objects which look exactly as they would if you pulled them directly from the database.

So that’s the kind of change that I made in my code. By prefetching a lot of associated tables I was able to drastically cut down the number of queries made to build that course selection page. Originally, it was about 120 queries. I got it down to three. Of course, each of those queries is a lot larger and is doing far more work. But there’s a lot less time spent compiling SQL and pulling data from the database.

The other page I looked at – the one that saves a student’s selections – wasn’t quite so impressive. Originally it was about twenty queries and I got it down to six.

Reducing the number of database queries is a really useful way to make your applications more efficient and DBIC’s “prefetch” option is a great tool for enabling that. I recommend that you take a close look at it.

After crowing about my success on Twitter I got a reply from a colleague pointing me at Test::DBIC::ExpectedQueries which looks like a great tool for monitoring the number of queries in your app.

Send to Kindle

Driving a Business with Perl

I’ve been a freelance programmer for over twenty years. One really important part of the job is getting paid for the work I do. Back in 1995 when I started out there wasn’t all of the accounting software available that you get now and (if I recall correctly) the little that was available was all pretty expensive stuff.

At some point I thought to myself “I don’t need to buy one of these expensive systems, I’ll write something myself”. So I sat down and sketched out a database schema and wrote a few Perl programs to insert data about the work I had done and generate invoices from that data.

I don’t remember much about the early versions. I do remember coming to the conclusion that the easiest way to generate PDFs of the invoices was using LaTex and then wasting a lot of time trying to bend LaTeX to my will. I got something that looked vaguely ok eventually, but it was always incredibly painful if I ever needed to edit it in any way. These days, I use wkhtmltopdf and my life is far easier. I understand HTML and CSS in a way that I will never understand LaTeX.

Why am I telling you this, twenty years after I started using this code? Well, during this last week, I finally decided it was time to put the code on Github. There were two reasons for this. Firstly, I thought that it might be useful for other people. And secondly, I’m ashamed to admit that this is the first time that the code has ever been put under any kind of version control (and, yes, this is an embarrassing case of “do as I say, not as I do“). I have no excuses. The software I used to drive my business was in a few files on a single hard drive. Files that I was hacking away at with gay abandon when I thought they needed changing. I am a terrible role model.

Other than all the obvious reasons, I’m sad that it wasn’t in version control as it would have been interesting to trace the evolution of the software over the last twenty years. For example, the database access started as raw DBI, spent a brief time using Class::DBI and at some point all got moved to DBIx::Class. It’s likely that I wasn’t using the Template Toolkit when I started – but I can’t remember what I was using in its place.

Anyway, the code is there now. I don’t give any guarantees for its quality, but it does the job for me. Let me know if you find any of it interesting or useful (or, even, laughable).

p.s. An interesting side effect of putting it under (public) version control – since I uploaded it to Github I have been constantly tweaking it. The potential embarrassment of having my code available for anyone to see means that I’ve made more improvements to it in the last week that I have in the previous five years. I’m even considering replacing all the command line programs with a Dancer app.

p.p.s. I actually use FreeAgent for all my accounting these days. It’s wonderful and I highly recommend it. But I still use my own system to generate invoices.

Send to Kindle

Subroutines and Ampersands

I’ve had this discussion several times recently, so I thought it was worth writing a blog post so that I have somewhere to point people the next time it comes up.

Using ampersands on subroutine calls (&my_sub or &my_sub(...)) is never necessary and can have potentially surprising side-effects. It should, therefore, never be used and should particularly be avoided in examples aimed at beginners.

Using an ampersand when calling a subroutine has three effects.

  1. It disambiguates the code so the the Perl compiler knows for sure that it has come across a subroutine call.
  2. It turns off prototype checking.
  3. If you use the &my_sub form (i.e. without parentheses) then the current value of @_ is passed on to the called subroutine.

Let’s look at these three effects in a little more detail.

Disambiguating the code is obviously a good idea. But adding the ampersand is not the only way to do it. Adding a pair of parentheses to the end of the call (my_sub()) has exactly the same effect. And, as a bonus, it looks the same as subroutine calls do in pretty much every other programming language ever invented. I can’t think of a single reason why anyone would pick &my_sub over my_sub().

I hope we’re agreed that prototypes are unnecessary in most Perl code (perhaps that needs to be another blog post at some point). Of course there are a few good reasons to use them, but most of us won’t be using them most of the time. If you’re using them, then turning off prototype checking seems to be a bad idea. And if you’re not using them, then it doesn’t matter whether they’re checked or not. There’s no good argument here for  using ampersands.

Then we come to the invisible passing of @_ to the called subroutine. I have no idea why anyone ever thought this was a good idea. The perlsub documentation calls it “an efficiency mechanism” but admits that is it one “that new users may wish to avoid”. If you want @_ to be available to the called subroutine then just pass it in explicitly. Your maintenance programmer (and remember, that could be you in six months time) will be grateful and won’t waste hours trying to work out what is going on.

So, no, there is no good reason to use ampersands when calling subroutines. Please don’t use them.

There is, of course, one case where ampersands are still useful when dealing with subroutines – when you are taking a reference to an existing, named subroutine. But that’s the only case that I can think of.

What do you think? Have I missed something?

It’s unfortunate that a lot of the older documentation on CPAN (and, indeed, some popular beginners’ books) still perpetuate this outdated style. It would be great if we could remove it from all example code.

Send to Kindle

Modern Perl Articles

Back in 2011 I wrote a series of three articles about “Modern Perl” for Linux Format. Although I mentioned all three articles here as they were published, I didn’t post the actual contents of the articles as I wasn’t sure about the copyright situation.

But now I suspect that enough time has passed that copyright is no longer going to be an issue, so I’ve added the full text of the articles to this site. The articles are all about writing a simple web application to track your reading. They use DBIx::Class and Dancer.

Let me know if you find them interesting or useful.

Send to Kindle

Dev Assistant

A couple of days ago, I updated to my laptop to Fedora 21. One of the new features was an application called DevAssistant which claimed that:

It does not matter if you only recently discovered the world of software development, or if you have been coding for two decades, there’s always something DevAssistant can do to make your life easier.

I thought it was worth investigating – particularly when I saw that it had support for Perl.

Starting the GUI and pressing the Perl button gives me two options: “Basic Class” and “Dancer”. I chose the “Basic Class” option. That gave me an dialogue box where I could give my new project a name. I chose “MyClass” (it’s only an example!) This created a directory called MyClass in my home directory and put two files in that directory. Here are the contents of those two files.

It’s great, of course, that the project wants to support Perl. I think that we should do everything we can to help them. But it’s clear to me that they don’t have anyone on the team who knows anything about modern Perl practices.

So who wants to volunteer to help them?

Update: So it turns out that the dev team are really responsive to pull requests :-)

Send to Kindle

Perl’s Problems

It’s been over six weeks since I wrote my blog post on Perl usage. I really didn’t mean to leave it so long to write the follow-up. But real life intervened and I haven’t had time for much blogging. That’s still the case (I should be writing a talk right now) but I thought it was worth jotting down some quick notes about what I think is causing Perl’s decline.


We have a lot to thank Matt Wright for. And I don’t mean that sarcastically. A lot of the popularity of Perl in the mid-90s stems directly from people like Matt and Selena Sol making their collections  of CGI programs available really early on. The popularity of their programs made Perl the de-facto standard for CGI programming.

But that was a double-edged sword. People searching the web for examples of CGI programming found Matt or Selena’s code and assumed they represented best practice. Which, of course, they didn’t. While people were blithely copying Matt’s programming style, good Perl programmers were using to parse their incoming parameters and separating their HTML generation out into templates.

In my previous post, I mentioned that fifteen or twenty years ago Perl was the programming language of choice for internet start-ups. That’s true, but a lot of the code written at that time was in the Matt Wright style. Matt’s style just about works for a guestbook or a form mailer. But when you try to build a business on top of code like that, it quickly becomes obvious that it’s an unmaintainable mess.

Many of the technical architects and CTOs who are making decisions about technology in companies today are the programmers who spent too many late nights battling those balls of mud in the 1990s. They were never really Perl programmers, they were only using it because it was fashionable, and they haven’t been keeping up with recent advances in Perl so it’s not surprising that they often choose to avoid using Perl.


A lot of Perl’s reputation as executable line noise is completely unwarranted. The people who were writing those 1990s balls of mud were under such pressure to deliver that they would have almost certainly delivered something just as unmaintainable whatever language they were using. But some of that reputation is fair. I’ve been teaching Perl for almost fifteen years and I know that there are some parts of Perl that people find confusing. Here are some examples:

Sigils – I can explain things like @array, $array[$key] and even @array[@keys] to people. And most of them get it. But it takes them a while. And then it all goes to pieces again when I have to explain the difference between $array[$key] and $array->[$key].

Context – Does any other programming language have the concept of context? Yes, when used correctly it’s a powerful tool. But it’s hard to explain and a good source of hard-to-find bugs. Can anyone honestly say that they haven’t been bitten by a context bug at some point in the last years?

Data Structures – Is the difference between arrays and array references really necessary? Think of all the complexity that is added because you can’t just pass arrays and hashes into subroutines without being bitten by list flattening. As experienced Perl programmers we know the problems and our brains are hard-wired to work around it. But other languages treat all aggregate data structures as references and it all becomes a lot easier.

I know that each of these features (and half a dozen other examples I could list) makes Perl a richer and more expressive language. But this comes at the cost of learnability and readability. Perhaps that trade-off once seemed like a good idea. When you’re trying to encourage people to look at your language then the advantages seem less obvious.

Of course, none of these features can be changed as they would break pretty much every existing Perl codebase. Which would be a terrible idea. But you can get away with a lot more breakage when you increase your major version number. Which Perl hasn’t been able to do for fourteen years.

Perl 6

I need to be clear here. I think that Perl 6 looks like a great language. I am really looking forward to using on production systems. And it looks like the current Perl 6 team are doing great work towards making that possible. In fact I think that our best approach to reviving Perl’s fortunes is to get a production-ready version of Perl 6 out and to make a big noise about that.

However, that name has been a big problem.

Looking from outside the Perl echo chamber, it’s easy to believe that Perl hasn’t had a major release for twenty years. And that can probably explain a lot of Perl’s current problems.

I know that people who believe that are wrong. The current version of Perl (5.20.1 as I write this) is a lot different to the version that was current when Perl 6 was first announced (which was 5.6.0, I think). Perl has gone through huge changes in the last fourteen years. But the version number hides that.

I also know that we no longer tell people that Perl 6 is the next version of Perl. The Wikipedia page makes it clear in its first sentence that “Perl 6 is a member of the Perl family of programming languages“. So why do people continue to think it’s the next version of Perl? Well, probably because people assume that they know how software version numbers work and don’t bother to check the web site to see it a particular project has changed the standard meaning that has worked well for decades.

So Perl 6 has been simultaneously both good and bad for Perl. Good because a lot of Perl 6 ideas have been backported into Perl 5. But bad because Perl 5 has been unable to change its major version number in order to advertise these improvements to the wider software-using world.

Nothing can be done about this now. The damage is done. As I said at the start of this section, it’s likely that the only thing we can do is to bet heavily on Perl 6 and get it out as soon as possible. Perl 5 will continue to exist. People will continue to maintain and improve it. Some companies will continue to use it. But it’s usage will continue to fall. I really think it’s too late to do anything about that.

Send to Kindle

“I Do Not Want To Use Any Modules”

Almost every day on the Perl groups on LinkedIn (or Facebook, or StackOverflow, or somewhere like that) I see a question that includes the restriction “I do not want to use any modules”.

There was one on LinkedIn yesterday. He wanted to create a MIME message to pass to sendmail, but he didn’t want to install any modules. Because “getting a module installed will have to go though a long long process of approvals”.

And I understand that. I really do. We’ve all seen places where getting new software installed is a problem. But I see that problem as a bug in the development process. A bug that needs to be fixed before anything can get done in a reasonable manner. Here’s what I’ve just written in reply:

Of course it can be achieved without modules. Just create an email in the correct format and pass it to sendmail.

Ah, but what’s the right format? Well, that is (of course) the tricky bit. I have no idea what the correct format is. Oh, I could Google a bit and come up with some ideas. I might even find the RFC that defines the MIME format. And then I’d be able to knock up some code that created something that looked like it would work. But would I be sure that it works? In every case? With all the weird corner-cases that people might throw at it?

This is where CPAN modules come in handy. You’re using someone else’s knowledge. Someone who is (hopefully) an expert in the field. And because modules are used by lots of people, bugs get found and fixed.

A lot of modern Perl programming is about choosing the right set of CPAN modules and plumbing them together. That’s what makes Perl so powerful. That’s what makes Perl programmers so efficient. We’re standing on the shoulders of giants and re-using other people’s code.

If you’re not going to use CPAN then you might as well use shell-scripting or awk.

If you’re in a situation where getting CPAN modules installed is hard, then fixing that problem should be your first priority. Because that’s a big impediment to your Perl programming. And investing time in fixing that will be massively beneficial to you in a very short amount of time.

The obvious solution is to install your own module tree (alongside your own Perl) as part of your application. But that might be overkill in some situations, so you could also consider using the system Perl and asking your sysadmin to install packages from your distribution’s repositories. Of course, that might need a change in process. But it’s a change that is well worth making; a change that will improve your (programming) life immensely.

Update: Some very interesting discussion about this over on Reddit.

Send to Kindle

Perl Usage

In my last blog post, I posted a graph showing that out of 135 companies at a recent Silicon MilkRoundabout recruitment event, only one said that they were using Perl. That has led to some interesting discussions that I’d like to address here.

I should make it clear that I wasn’t presenting my graph as evidence that Perl is dead. Of course you can’t leap to conclusions like that from what I learned at one recruitment event. I do, however, think that the situation is pretty grim.

But firstly, a few points that people made to me in response to my post.

We know that Perl isn’t used in start-ups
Yes. I think we do know that. But I don’t think we’re as worried about that as we should be. Imagine if that job fair was held fifteen years ago. Or twenty years ago. Perl used to be the language of choice for internet start-ups. What happened to change that? (I have some theories that I’ll cover in another blog post) Can this trend be reversed? (Honestly, I don’t think so – but I’m open to arguments to the contrary)

Every programmer I know uses Perl in some way
I think this might have been true fifteen years ago, but it hasn’t been the case for some time. If it’s really true that all programmers that you know still use Perl, then I think you only know a really bizarre cross-section of programmers.

All companies use Perl, but the HR department or management often don’t know
This is similar to the last point. And, again, I think it’s something that used to be true and hasn’t really been true this millennium. But there’s also the idea of Perl being the programmers “secret weapon” that the suits don’t know about. Even if it’s true (and I don’t think it is), then going underground like that is likely to be harmful to Perl’s popularity in the long term.

I think we should stop fooling ourselves here. Perl usage has been declining for over a decade. To a first level of of approximation, Perl is already a dead language.

Of course, The Perl community has spent a lot of the last few years actively denying that. I’ve been responsible for some of that drum-beating myself. But we need to accept that it’s true. For most people outside of the Perl bubble, Perl is a language that they last considered using back in the last millennium.

So, if Perl is dead, why has everyone spent the last five years demonstrating that this isn’t the case? Have they been lying to us? No, I don’t think they have. I just think that they have been looking at the wrong measures of success. Let’s look at some of the arguments I’ve seen.

CPAN is growing faster than ever
We have regular releases of Perl
Some great new features have been added to Perl
These all essentially boil down to the same argument – “Perl isn’t dead because some part of Perl (or its ecosystem) is improving”. I can’t argue with any of those facts, but do they really say anything useful about the long-term viability of the language. It’s great that Perl is constantly improving, but unless the people who are currently ignoring Perl can be persuaded to investigate these improvements, then they do little or nothing to stop Perl’s decline.

Moose might be the most powerful object system in the world. DBIx::Class might be the most flexible ORM available. Projects like these are great. But they don’t seem to be doing much to bring new people to Perl.

There are more YAPCs and Perl Workshops every year
Perl Mongers groups are starting all the time
We get dozens of people to our meetings every month
These arguments all boil down to “the Perl community is growing”. Again, I can’t argue with those facts (well, to be honest, I think the rate of Perl Monger group creation has slowed over the last ten years) but, again, I don’t think they prove what their proponents think they prove.

There is a difference between the Perl community and Perl programmers. Everywhere that I work, I find people who I already know from the community. But I always find far more people who I don’t know because they aren’t at all engaged with the Perl community. And I think it’s that large, untapped, number of non-community Perl programmers who make up the increased numbers of people attending meetings or conferences. This means that we are getting better at bringing our colleagues along to meetings. It doesn’t mean that more people are using Perl.

The number of Perl jobs is rising
Our company can never find enough Perl programmers
We just started a major new project using Perl
Most of the companies who use Perl continue to use Perl. That’s not really news. And some of those companies have grown really big and therefore need lots of Perl programmers to maintain and enhance their Perl programs. And that’s great. But it’s not really evidence of a grow in Perl usage.

Not all the companies who have historically used Perl continue to do so. Over the last five years I know of at least four big Perl-using companies in London who have started to move away from it for new development.

And one reason why people are always looking for Perl programmers is because many programmers have chosen to move away from Perl. I know plenty of people who were regulars at London Perl Mongers meetings ten to fifteen years ago but who haven’t written a line of Perl for over five years. This means, of course, that there is more work to go round those of us who are left. I could probably go through to my retirement maintaining existing Perl codebases. Those of you who are younger than me might not be so lucky.


So, to summarise, people who say that Perl is thriving point to three things – technical advances in Perl, the vibrant Perl community and the number of unfilled Perl jobs that always seem to be around. All of these things are great and are, of course, necessary for a living and growing language.

But they aren’t sufficient. You also need people outside of the community to take notice. And that’s not happening.

Ask yourself three questions.

  1. When did you last read a book on general programming techniques that contained examples written in Perl?
  2. When did you last read documentation for a web site’s API that included examples written in Perl?
  3. When did you last hear of a company using Perl that you didn’t previously know about?

This is why I published that graph a couple of weeks ago. Looking at that data, it really hit home to me just how badly we’re doing.

I have a couple of theories about why most of the world started ignoring Perl. I’ll get to those in my next blog posts. But, annoyingly, I don’t have any good ideas about how we might reverse the situation.

To be honest, currently my best advice (and the course I’ll be taking) is “brush up your Javascript”.


Send to Kindle

Programming Language Usage

Back in May, I spent an afternoon at Silicon MilkRoundabout. Silicon MilkRoundabout is a recruitment fair for techies. It’s specifically aimed at people who want to work for start-ups around the Old Street area (although they aren’t particularly stringent about sticking to that – for example, the BBC were there).

We were given a booklet containing details of all of the companies who were recruiting. Those details usually included information about the tech stack that the companies used.

Over the weekend, I went through that booklet and listed the programming languages mentioned by the companies. The results speak for themselves.

There were 135 companies at the event. About twenty of them unhelpfully listed their tech stack as “ask us for details”.

Here’s the graph:Usage of Programming Languages by Companies at Silicon MilkRoundabout

Usage of Programming Languages by Companies at Silicon MilkRoundabout

I’ll obviously have some more to say about this over the next few days. But I wanted to get the raw data out there as soon as possible.

Send to Kindle

Data Munging with Perl

Data Munging with PerlMany years ago, I wrote a book called Data Munging with Perl. People were kind enough to say nice things about it. A few people bought copies. I made a bit of money.

Recently I re-read it. I thought that some of it was still pretty good. There were some bits, particularly in the early chapters, that talked about general principles that are still as relevant as they were when the book was published.

There were other bits that haven’t aged quite as well. The bits where I talk about particular CPAN modules are all a bit embarrassing as Perl fashions change and newer, better modules are released. Although it was still available from Amazon, I really didn’t want people paying for it as a lot of it was really out of date.

But today I got an interesting letter from the publishers, telling me that they have taken the book out of print. And that all  the rights in the book have reverted to me. Which means that I can now distribute it in any way that I like. And people don’t have to pay a lot of money for a rather out of date book.

So, you can download a PDF of the book from Or I’ve embedded the book below.

I might even have the original documents somewhere. So if I get some spare time I might be able to produce a more reasonable ebook version (but don’t hold your breath!)

Let me know if you find it useful.

Send to Kindle