Why Learn Perl?

A couple of months ago I mentioned some public training courses that I’ll be running in London next month. The courses are being organised by FlossUK and since the courses have been announced the FlossUK crew have been running a marketing campaign to ensure that as many people as possible know about the courses. As part of that campaign they’ve run some sponsored tweets, so information about the courses will have been displayed to people who previously didn’t know about them (that is, after all, the point of marketing).

And, in a couple of cases, the tweet was shown to people who apparently weren’t that interested in the courses.

As you’ll see, both tweets are based on the idea that Perl training is pointless in 2016. Presumably because Perl has no place in the world of modern software development. This idea is, of course, wrong and I thought I’d take some time to explain why it is so wrong.

In order for training to be relevant, I think that two things need to be true. Firstly, the training has to be in a technology that people use and secondly, there needs to be an expectation that some people who use that technology aren’t as expert in it as they would like to be (or as their managers would like them to be). Let’s look at those two propositions individually.

Do people still use Perl? Seems strange that I even have to dignify that question with a response. Of course people still use Perl. I’m a freelance programmer who specialises in Perl and I’m never short of people wanting me to work for them. I won’t deny that the pool of Perl-using companies has got smaller in the last ten years, but they are still out there. And they are still running successful businesses based on Perl.

So there’s no question that Perl satisfies the first of our two points. You just have to look at the size of the Perl groups on Facebook or LinkedIn to see that plenty of people are still using Perl. Or come along to a YAPC and see how many companies are desperate to employ Perl programmers.

I think it’s the second part of the question that is more interesting. Because I think that reveals what is really behind the negative attitude that some people have towards Perl. Are there people using Perl who don’t know all they need to know about it?

Think back to Perl’s heyday in the second half of the 1990s. A huge majority of dotcoms were using Perl to power their web sites. And because web technologies were so new, most of the Perl behind those sites was of a terrible standard. They were horrible monolithic CGI programs with hard-coded HTML within the Perl code (thereby making it almost impossible for designers to improve the look of the web site). When they talked to databases, they used raw SQL that was also hard-coded into the source. The CGI technology itself meant that as soon as your site became popular, your web server was spawning hundreds of Perl processes every minute and response times ballooned. So we switched to mod_perl which meant rewriting all of the code and in many cases the second version was even more unmaintainable than the first.

It’s not surprising that many people got a bad impression of Perl. But any technology that was being used back then had exactly the same problems. We were all learning on the job.

Many people turned their backs on Perl at that point. And, crucially, stopped caring what was going on in Perl development. And like British ex-pats who think the UK still works the way it did when they left in the 1960s, these people think the state of the art in Perl web development is those balls of mud they worked on fifteen or twenty years ago.

And it’s not like that at all. Perl has moved on. Perl has all of the tools that you’d expect to see in any modern programming language. Moose is as good as, if not better than, the OO support in any other language. DBIx::Class is as flexible an ORM as you’ll find anywhere. Plack and PSGI make writing web apps in Perl as easy as it is in any other language. Perl has always been the magpie language – it would be crazy to assume that it hasn’t stolen all the good ideas that have emerged in other languages over the last fifteen years. It has stolen those ideas and in many cases it has improved on them.

All of which brings us back to my second question. Are there people out there who need to learn more about Perl? Absolutely there are. The two people whose tweets I quoted above are good examples. They appear to have bought into the common misconception that Perl hasn’t changed since Perl 5 was released over twenty years ago.

That’s often what I find when I run these courses. There are people out there with ten or fifteen years of Perl experience who haven’t been exposed to all of the great Modern Perl tools that have been developed in the last ten years. They think they know Perl, but their eyes are opened after a couple of hours on the course. They go away with long lists of tools that they want to investigate further.

I’m not saying that everyone should use Perl. If you’re comfortable using other technologies to get your job done, then that’s fine, of course. But if you haven’t followed Perl development for over ten years, then please don’t assume that you know the current state of the language. And please try to resist making snarky comments about things that you know nothing about.

If, on the other hand, you are interested in seeing how Perl has changed in recent years and getting an overview of the Modern Perl toolset, then we’d love to see you on the courses.

Easy PSGI

When I write replies to questions on StackOverflow and places like that recommending that people abandon CGI programs in favour of something that uses PSGI, I often get some push-back from people claiming that PSGI makes things far too complicated.

I don’t believe that’s true. But I think I know why they say it. I think they say it because most of the time when we say “you should really port that code to PSGI” we follow up with links to Dancer, Catalyst or Mojolicious tutorials.

I know why we do that. I know that a web framework is usually going to make writing a web app far simpler. And, yes, I know that in the Plack::Request documentation, Miyagawa explicitly says:

Note that this module is intended to be used by Plack middleware developers and web application framework developers rather than application developers (end users).

Writing your web application directly using Plack::Request is certainly possible but not recommended: it’s like doing so with mod_perl’s Apache::Request: yet too low level.

If you’re writing a web application, not a framework, then you’re encouraged to use one of the web application frameworks that support PSGI (http://plackperl.org/#frameworks), or see modules like HTTP::Engine to provide higher level Request and Response API on top of PSGI.

And, in general, I agree with him wholeheartedly. But I think that when we’re trying to persuade people to switch to PSGI, these suggestions can get in the way. People see switching their grungy old CGI programs to a web framework as a big job. I don’t think it’s as scary as they might think, but I agree it’s often a non-trivial task.

Even without using a web framework, I think that you can get benefits from moving software to PSGI. When I’m running training courses on PSGI, I emphasise three advantages that PSGI gives you over other Perl web development environments.

  1. PSGI applications are easier to debug and test.
  2. PSGI applications can be deployed in any environment you want without changing a line of code.
  3. Plack Middleware

And I think that you can benefit from all of these features pretty easily, without moving to a framework. I’ve been thinking about the best way to do this and I think I’ve come up with a simple plan:

  • Change your shebang line to /usr/bin/plackup (or equivalent)
  • Put all of your code inside my $app = sub { ... }
  • Switch to using Plack::Request to access all of your input parameters
  • Build up your response output in a variable
  • At the end of the code, create and return the required Plack response (either using Plack::Response or just creating the correct array reference).

That’s all you need. You can drop your new program into your cgi-bin directory and it will just start working. You can immediately benefit from easier testing and later on, you can easily deploy your application in a different environment or start adding in middleware.

As an experiment to find how easy this was, I’ve been porting some old CGI programs. Back in 2000, I wrote three articles introducing CGI programming for Linux Format. I’ve gone back to those articles and converted the CGI programs to PSGI (well, so far I’ve done the programs from the first two articles – I’ll finish the last one in the next day or so, I hope).

It’s not the nicest of code. I was still using the CGI’s HTML generation functions back then. I’ve replaced those calls with HTML::Tiny. And they aren’t very complicated programs at all (they were aimed at complete beginners). But I hope they’ll be a useful guide to how easy it is to start using PSGI.

My programs are on GitHub. Please let me know what you think.

If you’re interested in modern Perl Web Development Techniques, you might find it useful to attend my upcoming two-day course on the subject.

Update: On Twitter, Miyagawa reminds me that you can use CGI::Emulate::PSGI or CGI::PSGI to run CGI programs under PSGI without changing them at all (or, at least, changing them a lot less than I’m suggesting here). And that’s what I’d probably do if I had a large amount of CGI code that I wanted to move to PSGI quickly. But I still think it’s worth showing people that simple PSGI programs really aren’t any more complicated than simple CGI programs.

London Perl Workshop Review

(Photo by Mark Keating)

Last Saturday was the annual London Perl Workshop. And, as always, it was a great opportunity to soak up the generosity, good humour and all-round-awesomeness of the European Perl community. I say “European” as the LPW doesn’t just get visitors from London or the UK. There are many people who attend regularly from all over Europe. And, actually, from further afield – there are usually two or three Americans there.

I arrived at about twenty to nine, which gave me just enough time to register and say hello to a couple of people before heading to the main room for Mark Keating’s welcome. Mark hinted that with next year’s workshop being the tenth that he will have organised, he is starting to wonder if it’s time for someone else to take over. More on that later.

I then had a quick dash back down to the basement where I was running a course on Modern Web Development with Perl. It seemed to go well, people seemed engaged and asked some interesting questions. Oh, and my timing was spot on to let my class out two minutes early so that they were at the front of the queue for the free cakes (courtesy of Exonetric). That’s just my little trick for getting slightly higher marks in the feedback survey.

After the coffee break I was in the smaller lecture theatre for three interesting talks – Neil Bowers on Boosting community engagement with CPAN (and, yes, I’ve finally got round to signing up for the CPAN Pull Request Challenge), Smylers on Code Interface Mistakes to Avoid and Neil Bowers (again) on Dependencies and the River of CPAN, which was an interesting discussion on how the way you maintain a CPAN module should change as it becomes more important to more people.

Then it was lunch, which I spent in the University cafeteria catching up with friends.

After lunch, I saw Léon Brocard on Making your website seem faster, followed by Steve’s Man Publishing Pint, which turned out to be about publishing ebooks to Amazon easily – something which I’ve been very interested in recently.

The schedule was in a bit of a state of flux, so I missed Andrew Solomon’s talk on How to grow a Perl team and instead saw Steve Mynott talking about Perl 6 Grammars. Following that, I gave my talk on Conference Driven Publishing (which is part apology for not writing the book I promised to write at the last LPW and part attempt to get more people writing and publishing ebooks about Perl).

Then there was another coffee break which I spent getting all the latest gossip from some former colleagues. We got so caught up in it that I was slightly late for Theo van Hoesel’s talk Dancer2 REST assured. I like Theo’s ideas but (as I’ve told him face to face) I would like to see a far simpler interface.

Next up was the keynote. Liz Mattijsen stood in for Jonathan Worthington (who had to cancel at the last minute) and she explained the history of her involvement in Perl and how she was drawn to working on Perl 6. She finished with a brief overview of some interesting Perl 6 features.

Then there were the lightning talks which were their usual mixture of useful, thought-provoking and insane.

Mark Keating closed the conference by thanking everyone for their work, their sponsorship and their attendance. He returned to the theme of perhaps passing on the organisation of the workshop to someone new. No-one, I think, can fail to be incredibly grateful for the effort that Mark has put into organising the last nine workshops and it makes complete sense to me that he can’t maintain that level of effort forever. So it makes sense to start looking for someone else to take over organising the workshop in the future. And, given the complexity of the task, it would be sensible if that person got involved as soon as possible so that we could have a smooth transition during the organisation of next year’s event.

If you’re interested in becoming a major hero to the European Perl community, then please get in touch with Mark.

There was no planned post-workshop event this year. So we broke up into smaller groups and probably colonised most of central London. Personally, I gathered a few friends and wandered off to my favourite restaurant in Chinatown.

I can only repeat what Mark said as he closed the workshop and give my thanks to all of the organisers, volunteers, speakers, sponsors and attendees. There’s little doubt in my mind that the LPW is, year after year, one of the best grass-roots-organised events in the European geek calendar. And this year’s was as good as any.

The Long Death of CGI.pm

CGI.pm has been removed from the core Perl distribution. From 5.22, it is no longer included in a standard Perl installation.

There are good technical reasons for this. CGI is a dying technology. In 2015, there are far better ways to write web applications in Perl. We don’t want to be seen to encourage the use of a technology which no-one should be using.

This does lead to a small problem for us though. There are plenty of web hosting providers out there who don’t have particularly strong Perl support. They will advertise that they support Perl, but that’s just because they know that Perl comes as a standard part of the operating system that they run on their servers. They won’t do anything to change their installation in any way. Neither you nor I would use a hosting company that works like that – but plenty of people do.

The problem comes when these companies start to deploy an operating system that includes Perl 5.22. All of a sudden, those companies will stop including CGI.pm on their servers. And while we don’t want to encourage people to use CGI.pm (or, indeed, the CGI protocol itself) we need to accept that there are thousands of sites out there that have been happily using software based on CGI.pm for years and the owners of these sites will at some point change hosting providers or upgrade their service plan and end up on a server that has Perl 5.22 and doesn’t have CGI.pm. And their software will break.

I’ve always assumed that this problem is some time in the future. As far as I can see, the only mainstream Linux distribution that currently includes Perl 5.22 is Fedora 23. And you’d need to be pretty stupid to run a web hosting business on any version of Fedora. Fedora is a cutting edge distribution with no long term support. Versions of Fedora are only supported for about a year after their release.

So the problem is in the future, but it is coming. At some point Perl 5.22 or one of its successors will make it into Red Hat Enterprise Linux. And at that point we have a problem.

Or so I thought. But that’s not the case. The problem is here already. Not because of Perl 5.22 (that’s still a year or two in the future for most of these web hosting companies) but because of Red Hat.

Red Hat, like pretty much everyone, include Perl in their standard installation. If you install any Linux distribution based on Red Hat, then the out of the box installation includes an RPM called “perl”. But it’s not really what you would recognise as Perl. It’s a cut down version of Perl. They have stripped out many parts of Perl that they consider non-essential. And those parts include CGI.pm.

This change in the way they package Perl started with RHEL 6 – which comes with Perl 5.10. And remember it’s not just RHEL that is affected. There are plenty of other distributions that use RHEL as a base – Centos, Scientific Linux, Cloud Linux and many, many more.

So if someone uses a server running RHEL 6 or greater (or another OS that is based on RHEL 6 or greater) and the hosting company have not taken appropriate action, then that server will not have CGI.pm installed.

What is the “appropriate action” you ask. Well it’s pretty simple. Red Hat also make another RPM available that contains the whole Perl distribution. So bringing the Perl up to scratch on a RHEL host is as simple as running:

yum install perl-core

That will work on a server running RHEL 6 (which has Perl 5.10) and RHEL 7 (which has Perl 5.16). On a future version of RHEL which includes Perl 5.22 or later, that obviously won’t work as CGI.pm won’t be part of the standard Perl installation and therefore won’t be included in “perl-core”. At that point it will still be a good idea to install “perl-core” (to get the rest of the installation that you are missing) but to get CGI.pm, you’ll need to run:

yum install perl-CGI

So this is a plea to people who are running web hosting services using Red Hat style Linux distributions. Please ensure that your servers are running a complete Perl installation by running the “yum” command above.
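If you’re not sure whether a particular server is affected, asking Perl to load the module is enough to find out. This is only a minimal sketch; it assumes that “perl” is on your PATH and that it is the same Perl your web server uses, and the variable name is mine.

```shell
# Try to load CGI.pm; "-e 1" is a do-nothing program, so only the
# success or failure of loading the module matters.
if perl -MCGI -e 1 2>/dev/null; then
    cgi_status="installed"
else
    cgi_status="missing"
fi
echo "CGI.pm is $cgi_status"
```

If that reports “missing” on a Red Hat style system, the “yum” commands above are the fix.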

All of which brings me to this blog post that Marc Lehmann wrote a couple of days ago. Marc found a web site which no longer worked because it had been moved to a new server which had a newer version of Perl – one that didn’t include CGI.pm. Marc thinks that the Perl 5 Porters have adopted a cavalier approach to backward compatibility and that the removal of CGI.pm is a good example of the problems they are causing. He therefore chose to interpret the problems this site was having as being caused by p5p’s approach to backward compatibility and the removal of CGI.pm.

This sounded unlikely to me. As I said above, it would be surprising if any web hosting company was using 5.22 at this point. So, I did a little digging. I found that the site was hosted by BlackNight solutions and that their web site says that their servers run Perl 5.8. At the same time, Lee Johnson, the current maintainer of CGI.pm, got in touch with the web site’s owner who confirmed that what I had worked out was correct.

Later yesterday I had a conversation with @BlackNight on Twitter. They told me that their hosts all ran Cloud Linux (which is based on RHEL) and that new servers were being provisioned using Cloud Linux 6 (which is based on RHEL 6).

So it seems clear what has happened here. The site was running on an older server which was running Cloud Linux 5. That includes Perl 5.8 and predates Red Hat removing CGI.pm from the “perl” RPM. It then moved to a new host running Cloud Linux 6 which is based on RHEL 6 and doesn’t include CGI.pm in the default installation. So what the site’s owner said is true: he moved to a new host with a newer version of Perl (that new version of Perl was 5.10!). But it wasn’t the new version of Perl that caused the problems. It was the new version of the operating system or, more specifically, the change in the way that Red Hat (and its derivatives) packaged Perl.

Marc is right that when Perl 5.22 hits the web hosting industry we’ll lose CGI.pm from a lot of web servers. You can make your own mind up on how important that is and whether or not you share Marc’s other opinions on how p5p is steering Perl. But he’s wrong to assume that, in this instance, the problem was caused by anything that p5p have done. In this instance, the problem was caused by Red Hat’s Perl packaging policy and was compounded by a hosting company who didn’t know that upgrading their servers to Cloud Linux 6 would remove CGI.pm.

RHEL 6 was released five years ago. I suspect it’s pretty mainstream in the web hosting industry by now. So CGI.pm will already have disappeared from a large number of web servers. I wonder why we haven’t seen a tsunami of complaints?

Update: More discussion on Reddit and Hacker News.

LPW Slides

A more detailed write-up of the LPW will follow in the next few days. But in the meantime, here are the slides to the three talks I gave.

 

 

London Perl Workshop 2015

This time next week we will all be enjoying the London Perl Workshop. I thought it was worth looking at what the day has in store.

As always (well, except that one time when they had no power) the LPW will take place at the Cavendish Campus of the University of Westminster. I’m told there are exams or something like that taking place on the same day, so it’s important to follow the signs when you get there or you might end up in the wrong place being forced to take an exam.

The workshop starts at 9am, but registration queues can be quite long, so I’d recommend getting there half an hour or so earlier than that. If you get lucky and register quickly, then why not look for an organiser and volunteer to help out for a while.

You’ll want to be in the main room for the welcome address at 9am – just in case there’s any important news about the day. But the talks start at 9:10.

My Modern Perl Web Development course starts then. Hopefully it will be in my usual classroom. Alternatively, Andrew Solomon’s Crash course on Perl, the Universe and Everything starts at the same time and goes on much longer. Or you might want to see some shorter courses. If I wasn’t running my training, I’d want to see Tom Hukins talking about Escaping Insanity and Rick Deller on Developing Your Brand – from a job seeker, Business to sole contractor/consultant – he assures me that his slides are no longer the shocking pink he has used in previous years.

At 11:00 there’s a coffee break sponsored by Evozon. My training finishes at that point, so I’m free to see a few talks. Unfortunately, I want to see all of the talks in the next slot. I suspect I’ll end up seeing Neil Bowers’ Boosting community engagement with CPAN and Smylers’ Don’t Do That: Code Interface Mistakes to Avoid, but I could well be tempted into Aaron Crane’s Write-once data: writing Perl like Haskell instead. Or, back on the workshop track, there’s Dominic Humphries on From can to can’t: An intro to functional programming. Just before lunch, I think I’ll see Neil Bowers again. This time he’s talking about Dependencies and the River of CPAN.

After lunch there’s another session where I want to see everything. I’d love to see Stevan Little talking about his latest iteration of the p5-mop, but I suspect I’ll end up seeing Leon Brocard on Making your website seem faster followed by Kaitlyn Parkhurst on Project Management For The Solo Developer. Dominic’s functional programming workshop continues after lunch and is joined by John Davies and Martin Berends talking about Parallel Processing Performed Properly in Perl on Pi.

The big talk after the next short break is going to be Matt Trout on A decade of dubious decisions but it’s another I’ll miss as I’m talking about Conference Driven Publishing in another room during the second half of it. During the first half I’d recommend Steve Mynott’s Perl 6 Grammars. But, I saw him practice it at a recent London Perl Mongers technical meeting, so I’ll be seeing Andrew Solomon explaining How to grow a Perl team. In the workshop stream, Christian Jaeger will be covering Functional Programming on Perl.

Then there’s another coffee break (this time sponsored by Perl Careers) and then we’re into the last few sessions. In the first you have a choice between Jeff Goff on From Regular Expressions to Parsing JavaScript: Learn Perl6 Grammars and Theo van Hoesel on Dancer2 REST assured. I think I’ll be in Theo’s talk.

These are followed by Jonathan Worthington’s keynote – The end of the beginning and the lightning talks. It will, no doubt, be a great end to a fabulous day.

The London Perl Workshop is always a great day of learning about Perl and catching up with old friends. And because of the brilliant sponsors, it doesn’t cost the attendees a penny.

If you’re going to be near London next weekend and you have any interest in Perl, then why not register and come along?

Here’s a brief video of last year’s workshop.

Training Courses – More Details

Last week I mentioned the public training courses that I’ll be running in London next February. A couple of people got in touch and asked if I had more details of the contents of the courses. That makes sense, of course. I don’t expect people to pay £300 for a day’s training without knowing a bit about the syllabus.

So here are details of the first two courses (the Moose one and the DBIx::Class one). I hope to have details of the others available by next weekend.

Object Oriented Programming with Perl and Moose

  • Introduction to Object Oriented programming
  • Overview of Moose
  • Object Attributes
  • Subclasses
  • Object construction
  • Data types
  • Delegation
  • Roles
  • Meta-programming
  • Further information

Database Programming with Perl and DBIx::Class

  • Brief introduction to relational databases
  • Introduction to databases and Perl
    • DBI
    • ORM
  • Schema Classes
  • Basic DB operations
    • CRUD
  • Advanced queries
    • Ordering, joining, grouping
  • Extending DBIC
  • Further information

If you have any further questions, please either ask them in the comments or email me (I’m dave at this domain).

And if I’ve sold you on the idea of these courses, the booking page is now open.

Public Training in London – February 2016

For several years I’ve been running an annual set of public training courses in London in conjunction with FLOSS UK (formerly known as UKUUG). For various scheduling reasons, we didn’t get round to running any this year, but we have already made plans for next year.

I’ll be running five days of training in central London from 8th – 12th February. The courses will take place at the Ambassador’s Hotel on Upper Woburn Place. Full details are in the process of appearing on the FLOSS UK web site, but the booking page doesn’t seem to be live yet, so I can’t tell you how much it will cost.

We’re doing something a little different this year. In previous years, I’ve been running two generic two-day courses – one on intermediate Perl and one on advanced Perl. This year we’re running a number of shorter but more focussed courses. The complete list is:

  • Object Oriented Programming with Perl and Moose (Mon 8th Feb)
  • Database Programming with Perl and DBIx::Class (Tue 9th Feb)
  • An Introduction to Testing Perl Programs (Wed 10th Feb)
  • Modern Web Programming with Perl (two day course – Thu/Fri 11th/12th Feb)

This new approach came out of some feedback we’ve received from attendees over the last couple of years. I’m hoping that by offering these shorter courses, people will be able to take more of a “mix and match” approach and will select courses that better fit their requirements. Of course, if you’re interested, there’s no reason why you shouldn’t come to all five days.

I’ll update this page when I know how much the courses will cost and how you can book. But please put these dates in your calendar.

Update: And less than 24 hours after publishing this blog post, the booking page has gone live.

Places are £300 a day (so £600 for the two-day course on web programming) and there’s a special offer of £1,320 for the full week.

Prices are cheaper (by £90 a day) for members. And given that an annual individual membership costs £35, that all sounds like a bit of a no-brainer to me.

Build RPMs of CPAN Modules

If you’ve been reading my blog for a while, you probably already know that I have an interest in building RPMs of CPAN modules. I run a small RPM repository where I make available all of the RPMs that I have built for myself. These will either be modules that aren’t available in other RPM repositories or modules where I wanted a newer version than the currently available one.

I’m happy to take requests for my repo, but I don’t often get any. That’s probably because most people very sensibly use the cpanminus/local::lib approach or something along those lines.

But earlier this week, I was sitting on IRC and Ilmari asked if I had a particular module available. When I said that I didn’t, he asked if I had a guide for building an RPM. I didn’t (well there are slides from YAPC 2008 – but they’re a bit dated) but I could see that it was a good suggestion. So here it is. Oh, and I built the missing RPM for him.

Setting Up

In order to build RPMs you’ll need a few things set up. This is stuff you’ll only need to do once. Firstly, you’ll need two new packages installed – cpanspec (which parses a CPAN distribution and produces a spec file) and rpm-build (which takes a spec file and a distribution and turns them into an RPM). They will be available in the standard repos for your distribution (assuming your distribution is something RPM-based like Fedora or Centos) so installing them is as simple as:
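Something like the following should do it. This is a sketch that assumes a Red Hat style system, the two package names mentioned above and an account with sudo rights.

```shell
# Install the spec-file generator and the RPM build tools (needs root).
sudo yum install cpanspec rpm-build
```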

If you’re using Fedora 22 or later, “yum” has been replaced with “dnf”.

Next, you’ll need a directory structure in which to build your RPMs. I always have an “rpm” directory in my home directory, but it can be anywhere and called anything you like. Within that directory you will need subdirectories called BUILD, BUILDROOT, RPMS, SOURCES, SPECS and SRPMS. We’ll see what most of those are for a little later.
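Creating that structure takes one short loop. This sketch assumes you are following my convention of an “rpm” directory in your home directory; change the path if you keep yours somewhere else.

```shell
# Create the build tree under ~/rpm; "mkdir -p" makes it safe to re-run.
for dir in BUILD BUILDROOT RPMS SOURCES SPECS SRPMS; do
    mkdir -p "$HOME/rpm/$dir"
done
```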

The final thing you’ll need is a file called “.rpmmacros” in your home directory. At a minimum, it should contain this:
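One way to create it is sketched below. The name and email address are placeholders and the path assumes the “rpm” directory described earlier. Note that this overwrites any “.rpmmacros” file you already have.

```shell
# Write a minimal ~/.rpmmacros (this replaces any existing file).
# $HOME is expanded by the shell as the file is written.
cat > "$HOME/.rpmmacros" <<EOF
%packager Your Name <you@example.com>
%vendor Your Name
%_topdir $HOME/rpm
EOF
```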

The packager and vendor settings are just to stop you having to type in that information every time you build an RPM. The _topdir setting points to the “rpm” directory that you created a couple of paragraphs up.

I would highly recommend adding the following line as well:
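A macro override along these lines is the usual way to do that. Treat the exact macro as an assumption and check that your version of rpm still honours “%__perl_requires” before relying on it.

```shell
# Append the override to ~/.rpmmacros; %{nil} expands to nothing,
# so rpm has no Perl requirement scanner to run.
echo '%__perl_requires %{nil}' >> "$HOME/.rpmmacros"
```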

This turns off the default behaviour for adding “Requires” data to the RPM. The standard behaviour is to parse the module’s source code looking for every “use” statement. By turning that off, you instead trust the information in the META.yml to be correct. If you’re interested in hearing more detail about why I think the default behaviour is broken, then ask me in a pub sometime.

Ok. Now we’re all set. We can build our first RPM.

Building an RPM

Building an RPM is simple. You use “cpanspec” to make the spec file and then “rpmbuild” to build the RPM. You can use “cpanspec” in a few different modes. If you have the module tarball, then you can pass that to “cpanspec”.

That will unwrap the tarball, parse the code and create the spec file.

But if you’re building an RPM for a CPAN module, you don’t need to download the tarball first. “cpanspec” will do that for you if you give it a distribution name.

That will connect to CPAN, find the latest version of the distribution, download the right tarball and then do all the unwrapping, parsing and spec creation.

But there’s another, even cleverer way to use “cpanspec”. If you only know the module’s name and you’re not sure which distribution it’s in, then you can just pass the name of the module.

This is the mode that I always use it in.

No matter how you invoke “cpanspec”, you will end up with the distribution tarball and the spec file – which will be called “perl-Some-Module.spec”. You need to copy these files into the correct directories under your rpm building directory. The tarball goes into SOURCES and the spec goes into SPECS. It’s also probably easiest if you change directory into your rpm building directory.
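
Assuming the build tree is in “~/rpm” and “cpanspec” left its output in the current directory, those steps might look like this (the file names are placeholders):

```shell
# Use whatever file names cpanspec actually produced
mv Some-Module-0.01.tar.gz ~/rpm/SOURCES/
mv perl-Some-Module.spec ~/rpm/SPECS/
cd ~/rpm
```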

You can now build the RPM with this command:
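
Run from the top of the build tree and using the placeholder spec file name from above, a typical invocation is:

```shell
# -ba builds both the binary RPM and the source RPM
rpmbuild -ba SPECS/perl-Some-Module.spec
```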

You’ll see a lot of output as “rpmbuild” goes through the whole CPAN module testing and building process. But eventually, with luck, you’ll see some output saying that the build has succeeded and that an RPM has been written under your RPMS directory (in either the “noarch” or “x86_64” subdirectory). You can install that RPM with any of the following commands:
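
The exact file name depends on the module version and your distribution, so this sketch uses a glob and assumes a “noarch” package built from the placeholder “Some-Module”:

```shell
# Any one of these installs the package; all need root privileges
sudo rpm -ivh RPMS/noarch/perl-Some-Module-*.noarch.rpm
sudo yum localinstall RPMS/noarch/perl-Some-Module-*.noarch.rpm
sudo dnf install RPMS/noarch/perl-Some-Module-*.noarch.rpm
```

The “yum” and “dnf” forms will also try to resolve any missing dependencies from your configured repos, which plain “rpm” will not.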

And that should be that. Of course there are a few things that can go wrong. And that’s what the next section is about.

Fixing Problems

There are a number of things that can go wrong when building RPMs. Here are some of the most common, along with suggested fixes.

Missing prerequisites

This is also known as “dependency hell”. The module you are building is likely to require other modules. And you will need to have those installed before “rpmbuild” will let you build the RPM (and, note, they’ll need to be installed as RPMs – the RPM database doesn’t know about modules you have installed with “cpan” or “cpanminus”).

If you have missing prerequisites, the first step is to try to install them using “yum” (or “dnf”). Sometimes you will get lucky, other times the prerequisites won’t exist in the repos that you’re using and you will have to build them yourself. This is the point at which building an RPM for a single module suddenly spirals into three hours of painstaking work as you struggle to keep track of how far down the rabbit-hole you have gone.

I keep thinking that I should build a tool which parses the prerequisites, works out which ones already exist and automatically tries to build the ones that are missing. It would need to work recursively of course. I haven’t summoned the courage yet.

Extra files

Sometimes at the end of an RPM build, you’ll get an error saying that files were found which weren’t listed in the spec file. This usually means that the distribution contains programs that “cpanspec” didn’t find and therefore didn’t add to the spec file. This is a simple fix. Open the spec file in an editor and look for the section labelled ‘%files’. Usually, it will look something like this:
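
A generated section for a pure-Perl module typically resembles the following. The “%doc” list is illustrative and depends on which files the distribution ships:

```
%files
%defattr(-,root,root,-)
%doc Changes README
%{perl_vendorlib}/*
%{_mandir}/man3/*
```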

This is a list of the files which will be added to the RPM. See the _mandir entry? That’s the man page for the module that is generated from the module’s Pod (section 3 is where library documentation goes). We just need to add two lines to the bottom of this section:
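
Assuming the programs are installed into the standard binary directory and their man pages into section 1, the two extra lines would be:

```
%{_bindir}/*
%{_mandir}/man1/*
```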

This says “add any files you find in the binaries directories (and also any man pages you find for those programs)”.

If you add these lines and re-run the “rpmbuild” command, the build should now succeed.

Missing header files

If you’re building an XS module that is a wrapper around a C library then you will also need the C header files for that library in order to compile the XS files. If you get errors about missing definitions, then this is probably the problem. In RedHat-land a C library called “mycoolthing” will live in an RPM called “libmycoolthing” and the headers will be in an RPM called “libmycoolthing-devel”. You will need both of those installed.

Your users, however, will only need the C library (libmycoolthing) installed. It’s well worth telling the RPM system that this external library is required by adding the following line to the spec file:
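
Using the made-up library name from above, the line would go alongside the other “Requires” lines near the top of the spec file:

```
Requires:       libmycoolthing
```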

That way, when people install your module using “yum” or “dnf”, it will pull in the correct C library too. “cpanspec” will automatically generate “Requires” lines for other Perl RPMs, but it can’t do it for libraries that aren’t declared in the META.yml file.

So that’s it. A basic guide to building RPMs from CPAN distributions. There’s a lot more detail that I could cover, but this should be enough to work for 80-90% of the modules that you will want to build.

If you have any questions, then please leave a comment below.

The Joy of Prefetch

If you heard me speak at YAPC or you’ve had any kind of conversation with me over the last few weeks then it’s likely you’ve heard me mention the secret project that I’ve been writing for my wife’s school.

To give you a bit of background, there’s one afternoon a week where the students at the school don’t follow the normal academic timetable. On that afternoon, the teachers all offer classes on wider topics. This year’s topics include Acting, Money Management and Quilt-Making. It’s a wide-ranging selection. Each student chooses one class per term.

This year I offered to write a web app that allowed the students to make their selections. This seemed better than the spreadsheet-based mechanisms that have been used in the past. Each student registers with their school-based email address and then on a given date, they can log in and make their selections.

I wrote the app in Dancer2 (my web framework of choice) and the site started allowing students to make their selections last Thursday morning. In the run-up to the go-live time, Google Analytics showed me that about 180 students were on the site waiting to make their selections. At 7am the selections part of the site went live.

And immediately stopped working. Much to my embarrassment.

It turned out that a disk failed on the server moments after the site went live. It’s the kind of thing that you can’t predict. But it leads to lots of frustrated teenagers and doesn’t give a very good impression.

To give me time to rebuild and stress-test the site we’ve decided to relaunch at 8pm this evening. I’ve spent the weekend rebuilding the app on a new (and more powerful) server.

I’m pretty sure that the timing of the failure was coincidental. I don’t think that my app caused the disk failure. But a failure of this magnitude makes you paranoid, so I spent a lot of yesterday tuning the code.

The area I looked at most closely was the number of database queries that the app was making. There are two main actions that might be slow – the page that builds the list of courses that a student can choose from and the page which saves a student’s selections.

I started with the first of these. I set DBIC_TRACE to 1 and fired up a development copy of the app. I was shocked to see the app run about 120 queries – many of which were identical.
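
One way to do that from the shell is sketched below. It assumes the app is started with “plackup” from “bin/app.psgi”, which is the layout the Dancer2 generator produces; the real app’s layout may differ.

```shell
# DBIC_TRACE=1 makes DBIx::Class print each SQL statement to STDERR
DBIC_TRACE=1 plackup bin/app.psgi
```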

Of course I should have tested this before. And, yes, it’s an idiotic way to build an application. But I’m afraid that using an ORM like DBIx::Class can make it all too easy to write code like this. Fortunately, it makes it easy to fix it too. The secret is “prefetch”.

“Prefetch” is an option you can pass to the “search” method on a resultset. Here’s an example of the difference that it can make.

There are seven year groups in a British secondary school. Most schools call them Year 7 to Year 13 (the earlier years are in primary school). Each year group will have a number of forms. So there’s a one-to-many relationship between years and forms. In database terms, the form table holds a foreign key to the year table. In DBIC terms, the Year result class has a “has_many” relationship with the Form result class and the Form result class has a “belongs_to” relationship with the Year result class.
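
As a rough sketch, the two result classes might be declared like this. The “School::Schema” namespace, the table names and the column names (“id”, “name”, “year_id”) are all assumptions made for illustration, and each class would normally live in its own file.

```perl
# lib/School/Schema/Result/Year.pm (hypothetical)
package School::Schema::Result::Year;
use strict;
use warnings;
use base 'DBIx::Class::Core';

__PACKAGE__->table('year');
__PACKAGE__->add_columns(qw(id name));
__PACKAGE__->set_primary_key('id');
# One year has many forms; form.year_id is the foreign key
__PACKAGE__->has_many(forms => 'School::Schema::Result::Form', 'year_id');

1;

# lib/School/Schema/Result/Form.pm (hypothetical)
package School::Schema::Result::Form;
use strict;
use warnings;
use base 'DBIx::Class::Core';

__PACKAGE__->table('form');
__PACKAGE__->add_columns(qw(id name year_id));
__PACKAGE__->set_primary_key('id');
# Each form belongs to exactly one year
__PACKAGE__->belongs_to(year => 'School::Schema::Result::Year', 'year_id');

1;
```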

A naive way to list the years and their associated forms would look like this:
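
This sketch assumes a hypothetical “School::Schema” class with “Year” and “Form” result classes, a “forms” relationship on Year and a “name” column on both tables. The DSN is a placeholder.

```perl
use strict;
use warnings;
use feature 'say';
use School::Schema;    # hypothetical schema class

my $schema = School::Schema->connect('dbi:SQLite:school.db');

foreach my $year ($schema->resultset('Year')->all) {
    say $year->name;
    # Each call to forms() sends another SELECT to the database
    foreach my $form ($year->forms) {
        say '  ', $form->name;
    }
}
```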

Run code like that with DBIC_TRACE turned on and you’ll see the proliferation of database queries. There’s one query that selects all of the years and then for each year, you get another query to get all of its associated forms.

Of course, if you were writing raw SQL, you wouldn’t do that. You’d write one query that joins the year and form tables and pulls all of the data back at once. And the “prefetch” option gives you a way to do that in DBIC as well.
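
A prefetching version of a loop over years and their forms could look like the following. As before, the “School::Schema” class, the “forms” relationship name, the “name” columns and the DSN are assumptions made for illustration.

```perl
use strict;
use warnings;
use feature 'say';
use School::Schema;    # hypothetical schema class

my $schema = School::Schema->connect('dbi:SQLite:school.db');

# prefetch joins the form table and builds the Form objects up front
my $years = $schema->resultset('Year')->search({}, { prefetch => 'forms' });

foreach my $year ($years->all) {
    say $year->name;
    # No extra query here: the forms were loaded with the years
    foreach my $form ($year->forms) {
        say '  ', $form->name;
    }
}
```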

All we have done here is to interpose a call to “search” which adds the “prefetch” option. If you run this code with DBIC_TRACE turned on, then you’ll see that there’s only one database query and it’ll be very similar to the raw SQL that you would have written – it brings back the data from both of the tables at the same time.

But that’s not all of the cleverness of the “prefetch” option. You might be wondering what the difference is between “prefetch” and the rather similar-sounding “join” option. Well, with “join” the columns from the joined table would be added to your main table’s result set. This would, for example, create some kind of mutant Year resultset object that you could ask for Form data using calls like “get_column(‘forms.name’)”. [Update: I was trying to simplify this explanation and I ended up over-simplifying to the point of complete inaccuracy – joined columns only get added to your result set if you use the “columns” or “select/as” attributes. And the argument to “get_column()” needs to be the column name that you have defined using those options.] And that’s useful sometimes, but often I find it easier to use “prefetch” as that uses the data from the form table to build Form result objects which look exactly as they would if you pulled them directly from the database.
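
For comparison, here is a sketch of the “join” approach described in the update, using the “+columns” attribute to pull one joined column into the result. The schema class, relationship name, DSN and the “form_name” alias are all assumptions made for illustration.

```perl
use strict;
use warnings;
use feature 'say';
use School::Schema;    # hypothetical schema class

my $schema = School::Schema->connect('dbi:SQLite:school.db');

my $rows = $schema->resultset('Year')->search(
    {},
    {
        join       => 'forms',
        # Expose forms.name on each row under the alias "form_name"
        '+columns' => [ { form_name => 'forms.name' } ],
    },
);

# You get one row per year/form pair, not one row per year
foreach my $row ($rows->all) {
    say $row->name, ': ', $row->get_column('form_name') // '(no forms)';
}
```

Note that the rows are still Year objects. The form name is only available through “get_column()”, and no Form objects are built.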

So that’s the kind of change that I made in my code. By prefetching a lot of associated tables I was able to drastically cut down the number of queries made to build that course selection page. Originally, it was about 120 queries. I got it down to three. Of course, each of those queries is a lot larger and is doing far more work. But there’s a lot less time spent compiling SQL and pulling data from the database.

The other page I looked at – the one that saves a student’s selections – wasn’t quite so impressive. Originally it was about twenty queries and I got it down to six.

Reducing the number of database queries is a really useful way to make your applications more efficient and DBIC’s “prefetch” option is a great tool for enabling that. I recommend that you take a close look at it.

After crowing about my success on Twitter I got a reply from a colleague pointing me at Test::DBIC::ExpectedQueries which looks like a great tool for monitoring the number of queries in your app.