Taming QMail

I run my own email server. It uses QMail. I realise there are at least two problems there – all of the cool kids have been using Gmail for their email since approximately forever, and who the hell uses QMail anyway?

Like most of these situations, it’s that way for historical reasons. And, of course, I’d love to change the current situation but it works well enough most of the time that the occasional pain it gives me isn’t worth the pain of changing the set-up.

So there I am. One of about three people in the world who are still running an email server using QMail. And that occasionally gives me problems.

One problem is the management of the mailing queues. The tools that QMail gives you for checking the contents of the queues and deleting any unnecessary messages are rather primitive. It seems that everyone uses third-party tools. A couple of years ago, I had a WordPress installation that was compromised and was used to send thousands of email messages to random people. The outgoing queue was full of spam and the incoming queue was full of bounce messages. I stopped QMail and started looking for a way to cleanly delete all the bad messages while retaining the tiny number of legitimate ones.

I discovered a program called qmHandle which did exactly what I wanted. It enabled me to remove messages from both queues that matched various criteria and before very long at all, I was back in business (having also cleaned up the WordPress installation and tightening its security).

The qmHandle program was written in Perl. And I always had it in the back of my mind that at some point I’d revisit the program and give something back by fixing it or improving it in some way. A few months ago I found time to do that.

I started by looking at the code. And realised that it had pretty much been written to be as hard to maintain as possible. Ok, that’s probably not true, but it certainly wasn’t written to be particularly easy to understand. What the original author, Michele Beltrame, created was really useful, but it seemed to me that he had made it far harder for himself than he needed to.

So I had found my project. I wanted to hack on qmHandle. But before  I could do that, I needed to rewrite it so that it was easier to work on. That became something that I’d hack on in quite moments over a few weeks. The new version is on Github. I started by importing the original version, so it’s interesting to read the commit history to trace the changes that I made. I think there were three main areas where I improved things.

  1. Splitting most of the logic out into a module. I say “most”, but it’s actually all. The command-line program is now a pleasingly simple:
  2. Improving (by which I mainly mean simplifying) the logic and the syntax. I moved a few variable declarations around (so their scope was smaller) and renamed some so their meaning was more obvious. Oh, and I added a couple of useful CPAN modules – Term::ANSIColor and Getopt::Std.
  3. Using Moose. Switching to an OO approach was a big win in general and Moose made this far easier than it would otherwise have been. At some point in the future, I might consider moving from Moose to Moo, for performance reasons.

For a few weeks now, I’ve been using the revised version on my email server and it seems to be working pretty much the same as the original version. So it’s time to set it loose on the wider world. This afternoon, I released it to CPAN. I’ve said above that the number of people using QMail to handle their email is tiny. But if you’re in that group and you want a more powerful way to manage your mail queues, then the new version of qmHandle might well be useful to you.

There are a few things that I know I need to do.

  1. More tests. The main point of moving most of the code into a module was to make it easier to test. Now it’s time to prove that. The current test suite is tiny. I need to improve that.
  2. Configuration. Currently, the configuration is all hard-coded. And different systems might well need different configuration (for example, the queues might be stored in a different directory). There needs to be a simple way to configure that.
  3. Bug fixes and improvements. This was, after all, why I started doing this. I don’t know what those might be, but I’m sure there are ways to improve the program.

I hope at least someone finds this useful.

Reviving WWW::Shorten

Last July I wrote a post threatening to cull some of my unused CPAN modules. Some people made sensible comments and I never got round to removing the modules (and I’m no longer planning to) but I marked the modules in question as “HANDOFF” and waited for the rush of volunteers.

As predicted in my previous post, I wasn’t overwhelmed by the interest, but this week I got an email from Chase Whitener offering to take over maintenance of my WWW::Shorten modules. I think the WWW::Shorten modules are a great idea, but (as I said in my previous post) the link-shortening industry moves very quickly and you would need to far more dedicated than I am in order to keep up with it. All too often I’d start getting failures for a module and, on investigation, discover that another link-shortening service had closed down.

So I was happy to hand over my modules to Chase. And in the week or so since he’s taken them over he’s been more proactive than I’ve been in the last five years.

  • He’s set up a p5-shorten organisation on Github to control the repos for all of the modules (all of my repos are now forks from those repos).
  • He has started cleaning up all of the individual modules so that they are all a lot more similar than they have previously been.
  • He has improved the documentation.
  • He has started to release new versions of some of the modules to CPAN.

All stuff that I’ve been promising myself that I’d get round to doing – for about the last five years. So I’m really grateful that the project seems to have been given the shot in the arm that I never had the time to give it. Thanks to Chase for taking it on.

Looking at CPAN while writing this post, I see that there are two pages of modules in the WWW::Shorten namespace – and only about 20% of them are modules that I wrote or inherited from SPOON. It’s great to see so many modules based on my work. However, I have no idea how many of them still work (services close down, HTML changes – breaking web scrapers). It would be great if the authors could all work together to share best practice on keeping up with this fast-moving industry. Perhaps the p5-shorten Github organisation would be a good place to do that.

Anyway, that’s seven fewer distributions that I own on CPAN. And that makes me happy. Many thanks to Chase for taking them on.

Now. Who wants Guardian::OpenPlatform::API, Net::Backpack or the AudioFile::Info modules?

Build RPMs of CPAN Modules

If you’ve been reading my blog for a while, you probably already know that I have an interest in building RPMs of CPAN modules. I run a small RPM repository where I make available all of the RPMs that I have built for myself. These will either be modules that aren’t available in other RPM repositories or modules where I wanted a newer version than the currently available one.

I’m happy to take requests for my repo, but I don’t often get any. That’s probably because most people very sensibly use the cpanminus/local::lib approach or something along those lines.

But earlier this week, I was sitting on IRC and Ilmari asked if I had a particular module available. When I said that I didn’t, he asked if I had a guide for building an RPM. I didn’t (well there are slides from YAPC 2008 – but they’re a bit dated) but I could see that it was a good suggestion. So here it is. Oh, and I built the missing RPM for him.

Setting Up

In order to build RPMs you’ll need a few things set up. This is stuff you’ll only need to do once. Firstly, you’ll need two new packages installed – cpanspec (which parses a CPAN distribution and produces a spec file) and rpm-build (which takes a spec file and a distribution and turns them into an RPM). They will be available in the standard repos for your distribution (assuming your distribution is something RPM-based like Fedora or Centos) so installing them is as simple as:

If you’re using Fedora 22 or later, “yum” has been replaced with “dnf”.

Next, you’ll need a directory structure in which to build your RPMs. I always have an “rpm” directory in my home directory, but it can be anywhere and called anything you like. Within that directory you will need subdirectories called BUILD, BUILDROOT, RPMS, SOURCES, SPECS and SRPMS. We’ll see what most of those are for a little later.

The final thing you’ll need is a file called “.rpmmacros” in your home directory. At a minimum, it should contain this:

The packager and vendor settings are just to stop you having to type in that information every time you build an RPM. The _topdir setting points to the “rpm” directory that you created a couple of paragraphs up.

I would highly recommend adding the following line as well:

This turns off the default behavior for adding “Requires” data to the RPM. The standard behaviour is to parse the module’s source code looking for every “use” statement. By turning that off, you instead trust the information in the META.yml to be correct. If you’re interesting in hearing more detail about why I think the default behaviour is broken, then ask me in a pub sometime.

Ok. Now we’re all set. We can build our first RPM.

Building an RPM

Building an RPM is simple. You use “cpanspec” to make the spec file and then “rpmbuild” to build the RPM. You can use “cpanspec” in a few different modes. If you have the module tarball, then you can pass that to “cpanspec”.

That will unwrap the tarball, parse the code and create the spec file.

But if you’re building an RPM for a CPAN module, you don’t need to download the tarball first, “cpanspec” will do that for you if you give it a distribution name.

That will connect to CPAN, find the latest version of the distribution, download the right tarball and then do all the unwrapping, parsing and spec creation.

But there’s another, even cleverer way to use “cpanspec” and that’s the one that I use. If you only know the module’s name and you’re not sure which distribution it’s in, then you can just pass the name of the module.

This is the mode that I always use it in.

No matter how you invoke “cpanspec”, you will end up with the distribution tarball and the spec file – which will be called “perl-Some-Module.spec”. You need to copy these files into the correct directories under your rpm building directory. The tarball goes into SOURCES and the spec goes into SPECS. It’s also probably easiest if you change directory into your rpm building directory.

You can now build the RPM with this command:

You’ll see a lot of output as “rpmbuild” goes through the whole CPAN module testing and building process. But hopefully eventually you’ll see some output saying that the build has succeeded and that an RPM has been written under your RPMS directory (in either the “noarch” or “x86_64” subdirectory). You can install that RPM with any of the following commands:

And that should be that. Of course there are a few things that can go wrong. And that’s what the next section is about.

Fixing Problems

There are a number of things that can go wrong when building RPMs. Here are some of the most common, along with suggested fixes.

Missing prerequisites

This is also known as “dependency hell”. The module you are building is likely to require other modules. And you will need to have those installed before “rpmbuild” will let you build the RPM (and, note, they’ll need to be installed as RPMS – the RPM database doesn’t know about modules you have installed with “cpan” or “cpanminus”).

If you have missing prerequisites, the first step is to try to install them using “yum” (or “dnf”). Sometimes you will get lucky, other times the prerequisites won’t exist in the repos that you’re using and you will have to build them yourself. This is the point at which building an RPM for a single module suddenly spirals into three hours of painstaking work as you struggle to keep track of how far down the rabbit-hole you have gone.

I keep thinking that I should build a tool which parses the prerequisites, works out which ones already exist and automatically tries to build the ones that are missing. It would need to work recursively of course. I haven’t summoned the courage yet.

Extra files

Sometimes at the end of an RPM build, you’ll get an error saying that files were found which weren’t listed in the spec file. This usually means that the distribution contains programs that “cpanspec” didn’t find and therefore didn’t add to the spec file. This is a simple fix. Open the spec file in an editor and look for the section labelled ‘%files’. Usually, it will look something like this:

This is a list of the files which will be added to the RPM. See the _mandir entry? That’s the man page for the module that is generated from the module’s Pod (section 3 is where library documentation goes). We just need to add two lines to the bottom of this section:

This says “add any files you find in the binaries directories (and also any man pages you find for those programs)”.

If you add these lines and re-run the “rpmbuild” command, the build should now succeed.

Missing header files

If you’re building an XS module that is a wrapped around a C library then you will also need the C header files for that library in order to compile the XS files. If you get errors about missing definitions, then this is probably the problem. In RedHat-land a C library called “mycoolthing” will live in an RPM called “libmycoolthing” and the headers will be in an RPM library called “libmycoolthing-devel”. You will need both of those installed.

Your users, however, will only need the C library (libmycoolthing) installed. It’s well worth telling the RPM system that this external library is required by adding the following line to the spec file:

That way, when people install your module using “yum” or “dnf”, it will pull in the correct C library too. “cpanspec” will automatically generate “Requires” lines for other Perl RPMs, but it can’t do it for libraries that aren’t declared in the META.yml file.

 

So that’s it. A basic guide to building RPMs from CPAN distributions. There’s a lot more detail that I could cover, but this should be enough to work for 80-90% of the modules that you will want to build.

If you have any questions, then please leave a comment below.

Installing Modules

If you’ve seen me giving my “Kingdom of the Blind” lightning talk this year, then you’ll know that I’ve been hanging around places like the LinkedIn Perl groups and StackOverflow trying to help people get the most out of Perl. It can be an “interesting” experience.

One of the most frequent questions I see is some variant of “I have found this program, but when I try to run it I get an error saying it can’t find this module”. Of course, the solution to this is simple. You tell them to install the missing module. But, as always, the devil is in the detail and I think that in many cases the answers I seen could be improved.

Most people seem to leap in and suggest that the original poster should install the module using cpan (advanced students might suggest cpanminus instead). These are, of course, great tools. But I don’t think this is the best answer in to these questions.

In most cases, the people asking questions are new to Perl. In some cases they don’t even want to learn any Perl – they just want to use a useful program that happens to be written in Perl. I think that launching these people into the CPAN ecosystem is a bad idea. Yes, eventually, it would be good to get them using perlbrew, local::lib and cpanminus. But one step at a time. First let’s show them how easy it can be to use Perl.

In many of these cases, I think that the best approach is to suggest that they use their native package manager to install a pre-packaged version of the required module.

Yes, I know the system Perl is evil and outdated. Yes, I know that they probably won’t get the latest and greatest version of your CPAN module from apt-get or yum.  And, yes, I know there’s a chance that the required module won’t be available from the package repositories. But I still think it’s worth giving it a try. Because in most cases the module will be  there and available as a recent enough version that it will solve their immediate problem and let them get on with their work.

There are three main reasons for this suggestion:

  1. People in this situation will almost certainly be using the system Perl anyway. And the  system Perl will already have pre-packaged modules installed alongside it. And installing modules using cpanminus alongside pre-packaged modules in the same library installation is a recipe for disaster. The package manager no longer knows what’s installed or what versions are installed and hilarity ensues.
  2. The pre-packaged versions will know about non-Perl requirements for the CPAN module and will pull those in as well as other required CPAN modules. One of the most common requests I see is for GD or one of its related modules. Your package manager will know about the underlying requirement for  libgd. cpanminus and friends probably won’t.
  3. The user is more likely to be used to using their package manager. Teaching them about the CPAN ecosystem can come later. Let’s ease them into using Perl by starting them off with tools that they are familiar with.

I know that both Fedora/Centos/RHEL and Ubuntu/Debian have large numbers of CPAN modules already pre-packaged for easy installation. Let’s suggest that people make use of this work to get up and running with Perl quickly. Later on we can show the the power and flexibility that comes with using the Perl-specific tools.

Of course there’s then a debate about when (and how) we start to wean these people off of the system Perl and pre-build packages and on to perlbrew/cpanminus/etc. But I think that having a community of people who are used to using CPAN modules (albeit in this slightly restricted manner) is an improvement on the current situation where people often avoid CPAN completely because module installation is seen as too difficult.

What do you think? Are there obvious errors in my thinking?

CPAN RPMs

If you’ve been reading my blog for a while, you’ll know that I have an interest in packaging CPAN modules as RPMs for Linux distributions based on Red Hat.

For a few years, I’ve been (infrequently) building spreadsheets which list the various modules that are available as RPMs from the better know repositories (and my own small repository). Over the weekend, I thought that those spreadsheets might be more useful if they were turned into web pages. So that’s what I did.

You can see the lists for Fedora and Centos. Over the next few days, I plan to set up cron jobs so that they are rebuilt daily.

Please let me know if you find these lists useful.