Taming QMail

I run my own email server. It uses QMail. I realise there are at least two problems there – all of the cool kids have been using Gmail for their email since approximately forever, and who the hell uses QMail anyway?

Like most of these situations, it’s that way for historical reasons. And, of course, I’d love to change the current situation, but it works well enough most of the time that the occasional pain it gives me isn’t worth the upheaval of changing the set-up.

So there I am. One of about three people in the world who are still running an email server using QMail. And that occasionally gives me problems.

One problem is the management of the mail queues. The tools that QMail gives you for checking the contents of the queues and deleting any unnecessary messages are rather primitive. It seems that everyone uses third-party tools. A couple of years ago, I had a WordPress installation that was compromised and was used to send thousands of email messages to random people. The outgoing queue was full of spam and the incoming queue was full of bounce messages. I stopped QMail and started looking for a way to cleanly delete all the bad messages while retaining the tiny number of legitimate ones.

I discovered a program called qmHandle which did exactly what I wanted. It enabled me to remove messages from both queues that matched various criteria and, before very long at all, I was back in business (having also cleaned up the WordPress installation and tightened its security).

The qmHandle program was written in Perl. And I always had it in the back of my mind that at some point I’d revisit the program and give something back by fixing it or improving it in some way. A few months ago I found time to do that.

I started by looking at the code. And realised that it had pretty much been written to be as hard to maintain as possible. OK, that’s probably not true, but it certainly wasn’t written to be particularly easy to understand. What the original author, Michele Beltrame, created was really useful, but it seemed to me that he had made it far harder for himself than he needed to.

So I had found my project. I wanted to hack on qmHandle. But before I could do that, I needed to rewrite it so that it was easier to work on. That became something that I’d hack on in quiet moments over a few weeks. The new version is on Github. I started by importing the original version, so it’s interesting to read the commit history to trace the changes that I made. I think there were three main areas where I improved things.

  1. Splitting most of the logic out into a module. I say “most”, but it’s actually all. The command-line program is now a pleasingly simple wrapper – there’s a sketch of its shape just after this list.
  2. Improving (by which I mainly mean simplifying) the logic and the syntax. I moved a few variable declarations around (so their scope was smaller) and renamed some so their meaning was more obvious. Oh, and I added a couple of useful CPAN modules – Term::ANSIColor and Getopt::Std.
  3. Using Moose. Switching to an OO approach was a big win in general and Moose made this far easier than it would otherwise have been. At some point in the future, I might consider moving from Moose to Moo, for performance reasons.
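
To give a flavour of that first item, here’s a minimal sketch of the kind of wrapper I mean. The module name, constructor and run() method are stand-ins for illustration – check the actual distribution for the real interface:

    #!/usr/bin/perl

    use strict;
    use warnings;

    use QmHandle;    # hypothetical module name

    # All of the interesting logic lives in the module; the script
    # just builds an object and hands over the command line.
    exit QmHandle->new->run(@ARGV);

And the third item means that the module itself can declare its state in the usual Moose style. Again, the attribute here is a guess at the kind of thing the program needs, but it shows why the switch made the code easier to work on:

    package QmHandle;    # hypothetical, as above

    use Moose;

    # Where qmail keeps its queue. Hard-coded for now, but an obvious
    # candidate for the configuration work mentioned below.
    has queue_dir => (
        is      => 'ro',
        isa     => 'Str',
        default => '/var/qmail/queue',
    );

    sub run {
        my ($self, @args) = @_;
        # ... parse options, then list, count or delete queued messages ...
        return 0;
    }

    1;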

For a few weeks now, I’ve been using the revised version on my email server and it seems to be working pretty much the same as the original version. So it’s time to set it loose on the wider world. This afternoon, I released it to CPAN. I’ve said above that the number of people using QMail to handle their email is tiny. But if you’re in that group and you want a more powerful way to manage your mail queues, then the new version of qmHandle might well be useful to you.

There are a few things that I know I need to do.

  1. More tests. The main point of moving most of the code into a module was to make it easier to test. Now it’s time to prove that. The current test suite is tiny. I need to improve that.
  2. Configuration. Currently, the configuration is all hard-coded. And different systems might well need different configuration (for example, the queues might be stored in a different directory). There needs to be a simple way to configure that.
  3. Bug fixes and improvements. This was, after all, why I started doing this. I don’t know what those might be, but I’m sure there are ways to improve the program.

I hope at least someone finds this useful.

Training in Cluj – The Poll

A couple of weeks ago, I mentioned that I was planning to run a one-day training course the day before YAPC Europe in Cluj-Napoca this year. There have been a few discussions of my ideas in various forums, so now it’s time for the next stage.

Below, you’ll see a simple questionnaire. Please use it to give your feedback on what course you would like me to run – and how much you think it should cost.

I’ll collate all of the responses in a couple of weeks and make an announcement about what I’m going to do.

Training in Cluj

I’m going to be running a day of training before YAPC Europe in Cluj. It’ll be on Tuesday 23rd August. But that’s all I know about the course so far, because I want your help to plan it.

Training has been a part of the YAPC experience for a long time. And I’ve often run courses alongside YAPC Europe. I took a look back through my talk archives and this is what I found.

  • 2003 (Paris) – I gave a half-day tutorial on “Tieing and Overloading Objects”
  • 2006 (Birmingham) – Another half-day tutorial called “Advanced Databases for Beginners”
  • 2008 (Copenhagen) – The “Perl Teach-In” was a one-day course about new and interesting Perl tools
  • 2009 (Lisbon) – A two-day “Introduction to Perl” course
  • 2010 (Pisa) – “Introducing Modern Perl”
  • 2011 (Riga) – “Introducing Modern Perl” (I had completely forgotten giving the same course two years running)
  • 2015 (Granada) – “Database Programming with DBIx::Class and Perl”

The first two (the half-day courses) were both given as part of the main conference. The others were all separate courses run before the conference. For those, you needed to pay extra – but it was a small amount compared with normal Perl training rates.

So now it’s 2016 and I want to run a training course in Cluj. But what should it be about? That’s where you come in. I want you to tell me what you want training on.

I’m happy to update any of the courses listed above. Or, perhaps I could cover something new this year. I have courses that I have never given at YAPC – on Moose, testing, web development and other things. Or I’d be happy to come up with something completely new that you want to hear about.

Please comment on this post, telling me your opinions. I’ll let the discussion run for a couple of weeks, then I’ll collate the most popular-looking choices and run a poll to choose which course I’m going to run.

Don’t forget – training in Cluj on 23rd August. If you’re booking travel and accommodation for the conference then please take that into account.

Oh, and hopefully it won’t just be me. If you’re a trainer and you’re going to be in Cluj for the conference, then please get in touch and we’ll add you to the list. The more courses we can offer, the better.

So here’s your chance to control the YAPC training schedule. What courses would you like to see?

Code Archaeology

Long-time readers will have seen some older posts where I criticised Perl code that I’ve found in various places on the web. I thought it was about time that I admitted to some of the dodgier corners of my programming career.

You may know that one of my hobbies is genealogy. You might also know that there’s a CPAN module for dealing with GEDCOM files and a mailing list for the discussion of the intersection of Perl and genealogy. The list is usually very quiet, but it woke up briefly a few days ago when Ron Savage asked for help reconstructing some old genealogy software of his that had gone missing from his web site. Once he recovered the missing files, I noticed that in the comments he credited a forgotten program of mine for giving him some ideas. This comment included a link to my web site which (embarrassingly) was now a 404. I don’t like to leave broken links on the web, so I swiftly put a holding page in place on my site and went off to find the missing directory.

It turns out that the directory had been used to distribute a number of my early ventures into open source software. The Wayback Machine had many of them but not everything. And then I remembered that I had full back-ups of some earlier versions of my web site squirrelled away somewhere and it only took an hour or so to track them down. So that I don’t mislay them again, I’ve put them all on Github – in an appropriately named repository.

I think that most of this code dates from around 2000-2003. There’s evidence that a lot of it was stored in CVS or Subversion at some time. But the original repositories are long gone.

So, what do we have there? And just how bad is it?

There’s a really old formmail program. And it immediately becomes apparent that when I wrote it, not only did I not know as much Perl as I thought, but I was pretty sketchy on the basics of internet security as well. I can’t remember if I ever put it live but I really hope not.

Then there’s the “ms” suite of programs. My freelancing company is called Magnum Solutions and it amused me when I realised that people could potentially assume that this code came from Microsoft. I don’t think anyone ever did. Here, you’ll find the beginnings of what later became the nms project – but the nms versions are far more secure.

There’s the original slavorg bot from the #london.pm IRC channel. The channel still has a similar bot, but the code has (thankfully) been improved a lot since this version.

Then there’s something just called spam. I think I was trying to get some stats on how much spam I was getting.

There are a couple of programs that date from my days wrangling Sybase in the City of London. There’s a replacement for Sybase’s own “isql” command line program. My version is called sqpl. I can’t remember what I didn’t like about isql, or how successful my replacement was. What’s interesting about this program is that there are two versions. One uses DBI to connect to the database, but the other uses Sybase’s own proprietary “CTlib” connection library. Proof, I guess, that I was talking to databases back when DBI was too new and shiny to be trusted in production.
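
For comparison, a DBI connection to Sybase only takes a few lines. The server name and credentials here are placeholders, but the DSN format is the one DBD::Sybase understands:

    use strict;
    use warnings;
    use DBI;

    # Placeholder server name and credentials.
    my $dbh = DBI->connect(
        'dbi:Sybase:server=SYBASE',
        'username', 'password',
        { RaiseError => 1, PrintError => 0 },
    );

    # A quick sanity check that we're talking to the right server.
    my ($name) = $dbh->selectrow_array('select @@servername');
    print "Connected to $name\n";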

The other Sybase-related program is called sybserv. As I recall, Sybase uses a configuration file to define the connection details of the various servers that any given client can connect to. But the format of that file was rather opaque (I seem to remember the IP address being stored as a packed integer in some cases). This program parses that file and presents the data in a far more readable format. I remember using it a lot. I believe it’s the only Perl program I’ve ever written that uses formats.
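
The packed-integer detail is from memory, so treat it as an assumption, but the decoding trick it calls for is a standard one – pack the integer into four network-order bytes and let Socket format them:

    use strict;
    use warnings;
    use Socket qw(inet_ntoa);

    # A made-up example value: 3232235777 is 192.168.1.1 stored as a
    # single 32-bit integer.
    my $int = 3232235777;

    # Pack it into four big-endian bytes, then format as a dotted quad.
    my $ip = inet_ntoa(pack 'N', $int);
    print "$ip\n";    # prints 192.168.1.1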

Then there’s toc. That reads an HTML document, looking for any headers. It then builds a table of contents based on those headers and inserts it into the document. I think it’ll still work.
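
I haven’t re-read the code while writing this, but the core of a program like that is small enough to sketch from the description – find the headers, give each one an anchor, then splice a list of links back in. This is a naive regex-based version, not the actual program:

    use strict;
    use warnings;

    # Slurp the whole document from stdin or a named file.
    my $html = do { local $/; <> };

    my (@toc, $n);

    # Find each plain <h1>..<h3>, give it an id and record a TOC entry.
    $html =~ s{<h([1-3])>(.*?)</h\1>}{
        my ($level, $text) = ($1, $2);
        my $id = 'toc' . ++$n;
        push @toc, qq{<li><a href="#$id">$text</a></li>};
        qq{<h$level id="$id">$text</h$level>};
    }gse;

    # Splice the table of contents in after the opening <body> tag.
    my $toc = qq{<ul>\n} . join("\n", @toc) . qq{\n</ul>};
    $html =~ s{(<body[^>]*>)}{$1\n$toc}i;

    print $html;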

The final program is webged. This is the one that Ron got inspiration from. It parses a GEDCOM file and turns it into a web site. It works in two modes, you can either pre-generate a whole site (that’s the sane way to use it) or you can use it as a CGI program where it produces each page on the fly as it is requested. I remember that parsing the GEDCOM file was unusably slow, so I implemented an incredibly naive caching mechanism where I stored a Data::Dumper version of the GEDCOM object and just “eval”ed that. I was incredibly proud of myself at the time.
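
That caching technique is easy to reconstruct in outline. This is a sketch of the idea rather than the original code – and note that evaluating a cache file like this trusts its contents completely:

    use strict;
    use warnings;
    use Data::Dumper;

    my $source = 'family.ged';          # hypothetical file name
    my $cache  = "$source.cache";

    my $ged;
    if (-e $cache and -M $cache < -M $source) {
        # The cache is newer than the GEDCOM file, so eval the dumped
        # structure straight back in. Fast, naive and dangerous with
        # untrusted data.
        $ged = do $cache;
    }
    else {
        $ged = parse_gedcom($source);   # the slow step
        open my $fh, '>', $cache or die "Can't write $cache: $!";
        print {$fh} Dumper($ged);       # writes '$VAR1 = {...};'
        close $fh;
    }

    # Stand-in for the real (slow) GEDCOM parser.
    sub parse_gedcom {
        my ($file) = @_;
        return { file => $file, individuals => [] };
    }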

The code in most of these programs is terrible. Or, at least, it’s very much a product of its time. I can forgive the lack of “use warnings” (Perl 5.6 wasn’t widely used back when this code was written) as they all have “-w” instead. But it’s the use of ampersands on most of the subroutine calls that makes me cringe the most.
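
If you’ve never met that style, the contrast looks something like this (a made-up fragment, not lifted from any of these programs):

    #!/usr/bin/perl -w
    # The circa-2000 style: warnings switched on with -w and an
    # ampersand on the subroutine call.
    use strict;

    &greet('world');

    # The modern equivalent is 'use warnings' at the top and a plain
    # greet('world') -- the ampersand is almost never needed now.
    sub greet {
        my ($name) = @_;
        print "Hello, $name\n";
    }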

But please have fun looking at the code and pointing out all of the idiocies. Just don’t put any of the CGI programs on a server that is anywhere near the internet.

And feel free to share any of your early code.

Reviving WWW::Shorten

Last July I wrote a post threatening to cull some of my unused CPAN modules. Some people made sensible comments and I never got round to removing the modules (and I’m no longer planning to), but I marked the modules in question as “HANDOFF” and waited for the rush of volunteers.

As predicted in my previous post, I wasn’t overwhelmed by the interest, but this week I got an email from Chase Whitener offering to take over maintenance of my WWW::Shorten modules. I think the WWW::Shorten modules are a great idea, but (as I said in my previous post) the link-shortening industry moves very quickly and you would need to be far more dedicated than I am in order to keep up with it. All too often I’d start getting failures for a module and, on investigation, discover that another link-shortening service had closed down.
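
For anyone who hasn’t used them, the interface is deliberately tiny – you pick a service when you load the module. Whether any given service (TinyURL included) still works today is exactly the moving target I’m describing:

    use strict;
    use warnings;

    # Choose a shortening service at load time.
    use WWW::Shorten 'TinyURL';

    my $long  = 'https://example.com/some/very/long/path';
    my $short = makeashorterlink($long);
    print "$short\n";

    # And back the other way.
    print makealongerlink($short), "\n";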

So I was happy to hand over my modules to Chase. And in the week or so since he’s taken them over he’s been more proactive than I’ve been in the last five years.

  • He’s set up a p5-shorten organisation on Github to control the repos for all of the modules (all of my repos are now forks from those repos).
  • He has started cleaning up all of the individual modules so that they are all a lot more similar than they have previously been.
  • He has improved the documentation.
  • He has started to release new versions of some of the modules to CPAN.

All stuff that I’ve been promising myself that I’d get round to doing – for about the last five years. So I’m really grateful that the project seems to have been given the shot in the arm that I never had the time to give it. Thanks to Chase for taking it on.

Looking at CPAN while writing this post, I see that there are two pages of modules in the WWW::Shorten namespace – and only about 20% of them are modules that I wrote or inherited from SPOON. It’s great to see so many modules based on my work. However, I have no idea how many of them still work (services close down, and HTML changes break web scrapers). It would be great if the authors could all work together to share best practice on keeping up with this fast-moving industry. Perhaps the p5-shorten Github organisation would be a good place to do that.

Anyway, that’s seven fewer distributions that I own on CPAN. And that makes me happy. Many thanks to Chase for taking them on.

Now. Who wants Guardian::OpenPlatform::API, Net::Backpack or the AudioFile::Info modules?