Yak Shaving with Aphra Behn

Aphra Behn Frontispiece

I have a few ideas for static web sites that I want to build. And, currently, the best place to host static web sites is, in my opinion, GitHub Pages.

And if you’re hosting a site on GitHub Pages, everyone knows that the best tool to use is Jekyll. Or is it?

I’ve tried to use Jekyll a couple of times and it just confused me. Something about the way it works just doesn’t fit into my head. I’m not sure what it is, but every time I change something, it all breaks completely. I’m sure the problem is with me rather than the software. Everyone else seems to get on with it just fine.

So, anyway, when faced with a problem like that I did what any self-respecting geek would do. I wrote my own tool to solve the problem.

And, of course, I wrote it in Perl (because that’s what I know best) and I used the Template Toolkit (because, well, why wouldn’t I). And because I wrote it to reflect the way that I think about building static web sites, I understand how it works.

To be honest, how it works is pretty simple so far. It takes a bunch of files in an input directory, processes them using the Template Toolkit and writes them into a mirror directory structure under an output directory. So far, not so different to ttree (the tool that comes with the Template Toolkit), but there’s one little improvement that I’m finding very useful.
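To make the "mirror directory structure" idea concrete, here is a minimal sketch using only core Perl modules. It is not aphra's actual code: the function names mirror_path and build_site are made up for illustration, and the processing step is left as a callback where aphra would run the Template Toolkit.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Basename qw(dirname);
use File::Path qw(make_path);

# Map a file under $in_dir to the matching path under $out_dir.
# (Hypothetical helper, not part of aphra.)
sub mirror_path {
    my ($in_dir, $out_dir, $file) = @_;
    my $relative = File::Spec->abs2rel($file, $in_dir);
    return File::Spec->catfile($out_dir, $relative);
}

# Walk $in_dir and call $process->($source, $target) for every plain file,
# creating the target directory first so the callback can just write to it.
sub build_site {
    my ($in_dir, $out_dir, $process) = @_;
    find({
        no_chdir => 1,
        wanted   => sub {
            return unless -f $File::Find::name;
            my $target = mirror_path($in_dir, $out_dir, $File::Find::name);
            make_path(dirname($target));
            $process->($File::Find::name, $target);
        },
    }, $in_dir);
    return;
}
```

Calling build_site('in', 'docs', sub { ... }) would visit every file under in/ and hand the callback a matching target path under docs/.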

I like writing text using Markdown. And I thought that it would be great to write text in Markdown, but have it pre-processed to HTML before passing it through the Template Toolkit. A couple of months ago I released Template::Provider::Pandoc which does just that (actually, it does a lot more than that – it will convert between any two text formats that are supported by Pandoc).

And my new site builder software uses Template::Provider::Pandoc to process all of the templates in the site. You don’t really want to be using Markdown for the main layout of your site – Markdown is rubbish for building navbars, footers or image carousels – but when I have a large amount of text, I can [% INCLUDE %] a template which includes that text in Markdown, knowing that it will be converted to HTML before being included in the page.

I’ve called the software aphra (for reasons that I’ll get to in a minute). There’s an early version on CPAN and the code is, of course, on GitHub too.

If you want to try it out, the best documentation, currently, is in the command line tool, aphra, that comes with the distribution.

What about that name?

Yes, it’s a strange name.

When I first realised I’d be writing something like Jekyll, I wanted to call it Hyde. I wanted to be able to say that it was uglier and more powerful than Jekyll. But there’s already a Python sitebuilder called that. Then I considered Utterson (he’s Henry Jekyll’s friend in the novel) but that had been taken too.

So I abandoned the idea of using the name of a character from The Strange Case of Dr Jekyll and Mr Hyde and started looking elsewhere.

I first came across Aphra Behn when I read Philip José Farmer’s Riverworld books about thirty years ago and she has stuck with me ever since. [I should point out for people who haven’t read Farmer’s books that he takes real historical characters, like Behn, and drops them into a science fiction environment.]

Behn was a British writer who wrote novels, plays and poetry in the second half of the seventeenth century. At a time when women simply didn’t do those things, it just didn’t seem to occur to her that she shouldn’t. She was a great role model to many of the great women writers of the following centuries.

Oh, and she was a spy too, during the Second Anglo-Dutch War.

All in all, she was an inspirational woman who deserves wider recognition. And I hope that, in some small way, my software will raise her profile.

What’s next?

So now I have my tool, it’s time to start creating the web sites that I wanted. I hope to have some news on those for you in a few weeks.

Or, perhaps, I’ll get bogged down creating a web site for Aphra. I’ve just registered a domain name…

Genealogical Timelines in Perl and SVG

Genealogical Timeline for Prince George

If you ever read my (mostly dead) more general blog, you might know that I’m a bit of an amateur genealogist. I’ve been tracing my family for over twenty-five years and I’ve got some branches of it back to the 1700s (actually, I have one branch back to the late 1600s).

One problem in genealogy is how to present data in a readable and easily-understandable way. Family trees are messy things. Both the roots and the branches can get very tangled. A good way to cut through all of that is to ignore unnecessary branches and just show the ancestors of a given person on the tree.

And that’s what the image at the top of this post shows. On the right-hand side of the image, halfway down, you will see Prince George of Cambridge (ok, actually, you’ll see “Princ”, that’s a bug that I need to fix – it works when someone’s lifespan is long enough to fit their name in!) Above and below him (at a quarter and three-quarters of the way down the page) you’ll see his parents. And so on back through time until, on the left of the page, you’ll see his great-great-grandparents – most of whom were born back in the nineteenth century.

It’s all created with a Perl program, of course. I’ve just uploaded SVG::Timeline::Genealogy to CPAN (it should be there at some point later today) and that can be used to draw these diagrams.

The module is very similar to SVG::Timeline which I wrote about a couple of weeks ago. And that’s completely unsurprising as it’s a sub-class of that module. Interestingly, early drafts of this module pre-date SVG::Timeline, but I recently realised that it should be a sub-class so I spent yesterday re-implementing it (and making more than a few changes to SVG::Timeline as some idiot had made it hard to sub-class!)

There are two ways to use the module. The hard way involves writing your own code:

The easy way involves putting the information in a data file and using the treeline program that is included in the distribution.

The fields in the data are separated by tabs.

The important bit to get right is the “ahnen” attribute. “Ahnen” is short for “Ahnentafel Number” and it’s a concept that is common in genealogy. You take a person in your family tree (say you, for example) and give that person a number of 1. Your father then has a number of 2 and your mother is 3. Carry on with that scheme through the generations. Your paternal grandparents are 4 and 5, your maternal grandparents are 6 and 7… and so on.

These numbers have a couple of interesting properties. Firstly, if a person has an Ahnentafel Number of $x, then their parents are 2 * $x and 2 * $x + 1. Secondly, with the exception of person 1 (who can obviously be of either sex) all the men have even numbered Ahnentafel Numbers and the women all have odd numbers.
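Those two properties are easy to check in a few lines of Perl. This is just a sketch of the arithmetic with made-up function names. It doesn't use Genealogy::Ahnentafel or any other module.

```perl
use strict;
use warnings;

# Ahnentafel numbers of a person's father and mother.
sub parents_of {
    my ($ahnen) = @_;
    return (2 * $ahnen, 2 * $ahnen + 1);
}

# Ahnentafel number of a person's child (nothing for person 1,
# who is the root of the tree).
sub child_of {
    my ($ahnen) = @_;
    return if $ahnen < 2;
    return int($ahnen / 2);
}

# Person 1 can be of either sex. After that, even numbers are
# men and odd numbers are women.
sub sex_of {
    my ($ahnen) = @_;
    return 'unknown' if $ahnen == 1;
    return $ahnen % 2 ? 'female' : 'male';
}
```

So parents_of(3) returns (6, 7), which are the maternal grandparents from the numbering above, and child_of(7) takes you back to 3.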

It is therefore these numbers that allow us to convert a flat data file into a tree structure. They tie the records together in the correct order. If you want to know more, I have a module called Genealogy::Ahnentafel which allows you to manipulate these numbers in various ways.
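As an illustration of how a number alone can place a record in the diagram, here is a rough sketch that works out a person's generation and how far down the page they could be drawn. This is my own reconstruction of the layout described at the top of the post (root halfway down, parents at a quarter and three-quarters), not the code from SVG::Timeline::Genealogy. The function names are hypothetical, and putting the father above the mother is an assumption.

```perl
use strict;
use warnings;

# Generation of an Ahnentafel number: 0 for person 1, 1 for the
# parents (2 and 3), 2 for the grandparents (4 to 7) and so on.
sub generation_of {
    my ($ahnen) = @_;
    my $generation = 0;
    $generation++ while ($ahnen >>= 1);
    return $generation;
}

# Fraction of the way down the page at which a person could be drawn,
# assuming each generation is spread evenly over the full height.
sub vertical_fraction {
    my ($ahnen) = @_;
    my $slots = 2 ** generation_of($ahnen);   # people in this generation
    my $index = $ahnen - $slots;              # 0-based position within it
    return (2 * $index + 1) / (2 * $slots);
}
```

With this scheme vertical_fraction(1) is 0.5, and the parents (2 and 3) come out at 0.25 and 0.75.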

So that’s SVG::Timeline::Genealogy. Hope you find it useful. Please share any interesting genealogies that you find.

Timeline Diagrams with Perl

Diagram of the "begat" sequence from Genesis 5

Two weeks ago, I introduced you to my new module SVG::TrafficLight and hinted that there were more SVG-based modules to follow. Today, I’d like to talk about the next one – SVG::Timeline.

It all started over a year ago when I was looking through some of the more ridiculous religious questions on Quora and came across one asking why Adam wasn’t mentioned in the Bible after the first couple of chapters of Genesis. As part of my answer I wanted to illustrate just how long a time Genesis 5 covers.

I knew that SVG would be the best approach and it only took half an hour or so to whip up the image you can see in my answer (and there’s a newer version, generated with the current version of the code, at the top of this page). It’s important to note that I didn’t hand-craft SVG that drew the diagram – I wrote code that generated the diagram from an input file.

I then realised that this could be a more generally useful tool, so I set about making the code more generic. It languished on GitHub for a year or so before I decided it was useful enough to clean it up and release it to CPAN. Let’s take a quick look at how it works.

As you can see from the example above, a timeline is made up of a number of events. An event has a start date, an end date and some text. So you can start a timeline diagram with code like this:

And once you have added all of your events, you can produce the timeline using:

That code writes the SVG document to STDOUT, so you’ll probably want to redirect that to a file.

That will draw a timeline of your events using all of the default settings (which, in most cases, produce a useful diagram). There are plenty of options that you can pass to the object constructor to tweak things. The most useful are probably the aspect ratio (if your diagram is going to be particularly long or thin – the default is 16/9) and the number of years between gridlines in the output (the default is ten and you might want to change that if your timeline covers a particularly large or small number of years – like the Genesis example above).

The default behaviour is to colour all of the events the same colour (which can be changed from the default in the constructor for the SVG::Timeline object). But you can also change it for each individual event by adding an optional “colour” parameter to the add_event() call.

But that’s all a lot of work for the simple case. So the distribution also includes a command line program called timeline which does all of that for you. It reads a datafile and produces a timeline diagram based on the contents.

Each record in the input file has three or four fields separated by tabs. The fields are the parameters for the add_event() call in the order: text, start, end and (optionally) colour.
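To show what that record format means in practice, here is a small sketch that turns one line of the data file into a hash of event parameters. It is not the code from the timeline program. The function name parse_event and the exact key names (text, start, end, colour) are my assumptions, based on the field order given above.

```perl
use strict;
use warnings;

# Turn one tab-separated record into a hash of event parameters.
# Field order: text, start, end and (optionally) colour.
sub parse_event {
    my ($line) = @_;
    chomp $line;
    my ($text, $start, $end, $colour) = split /\t/, $line;

    my %event = (text => $text, start => $start, end => $end);

    # Only include a colour when the optional fourth field is present.
    $event{colour} = $colour if defined $colour && length $colour;

    return \%event;
}
```

A record with three fields gives a hash with no colour key, so the event would fall back to whatever the default colour is.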

There are example data files in the distribution for producing some of the timelines I’ve talked about in this article – along with shell scripts showing how to produce timeline diagrams using the command line program.

There are a few things I’d like to add: support for events with unknown dates (perhaps fading the colour towards the unknown end), diagrams that go vertically instead of horizontally, and support for events that begin and end in the same year (currently, they are zero size and just vanish – I discovered that when I added Paul McGann and Christopher Eccleston to the Doctor Who example).

I find the program… well, if not exactly useful, it’s still fun to play with.

Please let me know if you produce any interesting timelines with it.

Drawing Traffic Lights With Perl

Traffic Lights

For a thing (that you may hear more about at some point in the future) I needed diagrams of traffic lights. But Google Image Search didn’t really have what I was looking for. Everything was either too realistic or not under a CC licence that would let me use the images how I wanted.

So I decided to do it myself. But I’m not exactly artistic. I far prefer it when I can get computers to draw images for me. I’ve dabbled with SVG before and it seemed like the perfect tool for the job. And there’s a module from CPAN that makes it simple to create SVG images from Perl.

It only took an hour or so before I was drawing images like the one above – which was exactly what I was looking for.

Initially, I shared my code as a Gist, but since then I’ve extracted the useful bits into a module which I’ve uploaded to CPAN as SVG::TrafficLight. I’ve tried to make it as configurable as possible, so you should be able to use it for all your traffic light drawing needs as well.

Starting to use it is pretty simple.

The default sequence of lights shows the UK’s standard traffic light sequence (green, amber, red, red and amber, green) but it’s simple enough to produce a different sequence (even one that you would never see on the roads).
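One way to think about a sequence is as a list of steps, where each step says which lamps are lit. The sketch below uses that idea to describe the UK sequence in plain Perl. The representation and the function names are my own illustration. This is not necessarily how SVG::TrafficLight stores or accepts a sequence, so check the module's documentation for the real interface.

```perl
use strict;
use warnings;

# Lamp order used by every step below: red, amber, green.
my @LAMPS = qw(red amber green);

# The UK sequence from the text, one array of on/off flags per step.
sub uk_sequence {
    return (
        [0, 0, 1],   # green
        [0, 1, 0],   # amber
        [1, 0, 0],   # red
        [1, 1, 0],   # red and amber
        [0, 0, 1],   # green
    );
}

# Describe one step as text, e.g. "red and amber".
sub describe_step {
    my ($step) = @_;
    my @lit = map { $LAMPS[$_] } grep { $step->[$_] } 0 .. $#LAMPS;
    return @lit ? join(' and ', @lit) : 'all off';
}
```

A sequence you would never see on the roads is then just a different list of steps, such as one with all three lamps lit.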

If you read the documentation, you’ll see how you can customise pretty much anything in the diagram – the size of the lights, the padding between them, even the colours used.

Let me know if you find it at all useful. SVG is fun. I think I’ll investigate it some more.

Version Numbers

Last week I mentioned how I had uploaded a new version of Symbol::Approx::Sub. Because there were pretty major changes to the inner workings of the module (although the interface still looked the same) I decided that I would move it from version 2.07 to version 3. At the same time, I decided that I would switch to a semantic versioning scheme.

Later in the week, I released minor updates to a few more of my modules. And I decided to apply semantic versioning to those as well. But as I was only making minor packaging fixes to these modules, I didn’t increment the major version number. For example, Array::Compare went from 2.12 to 2.12.1.

It turns out that was a mistake.

Well, I don’t really think it was a mistake. I think it was the right thing to do. But it appears that my opinion is at odds with what some parts of the Perl toolchain think.

Last night I got this bug report. It seems that by switching to three-part semantic versions, the version number can (in some quite common circumstances) appear to decrease.

To my mind, a version number is a dot-separated sequence of numbers. So 2.12 is smaller than 2.12.1. Any sane version number comparison will separate the two strings on dots and compare the individual components. Any missing components (2.12 is, for example, one component shorter than 2.12.1) should be assumed to be zero.
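The comparison I have in mind can be sketched in a few lines of core Perl. This is only an illustration of the "split on dots, pad with zeroes" idea and the function name compare_dotted is made up. It is deliberately not what the Perl toolchain does.

```perl
use strict;
use warnings;

# Compare two dot-separated version strings component by component,
# treating any missing components as zero. Returns -1, 0 or 1.
sub compare_dotted {
    my ($left, $right) = @_;
    my @l = split /\./, $left;
    my @r = split /\./, $right;

    # Loop over as many components as the longer version has.
    my $max = @l > @r ? @l : @r;
    for my $i (0 .. $max - 1) {
        my $cmp = ($l[$i] // 0) <=> ($r[$i] // 0);
        return $cmp if $cmp;
    }
    return 0;
}
```

Under this scheme a two-component version always sorts before the same version with a non-zero patch component added, so compare_dotted('2.12', '2.12.1') is -1, and leading zeroes are ignored, so compare_dotted('1.1', '1.01') is 0.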

But that’s not what the Perl toolchain does. Observe:

When the version number with two components (2.12) is split into components, the second component is bizarrely treated as a three-digit number, so it becomes 120 instead of 12. When it is compared with the second component of the three-component version, 120 is obviously larger than 12, so any tool which relies on this behaviour to work out which version of a module is the most recent will get the wrong answer.

This leads to other “interesting” effects. In my head, versions 1.1, 1.01 and 1.001 are all the same version. The leading zeroes mean nothing. But under this scheme, they are very different version numbers.
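You can check all of this with version.pm, which ships with Perl. The sketch below wraps two of its documented calls (parse and normal, plus the overloaded comparison operator) in small helper functions. The helper names are mine.

```perl
use strict;
use warnings;
use version;

# Expand a version string into version.pm's normalised dotted form.
sub normalised {
    my ($string) = @_;
    return version->parse($string)->normal;
}

# Compare two version strings the way version.pm does: -1, 0 or 1.
sub compare_versions {
    my ($left, $right) = @_;
    return version->parse($left) <=> version->parse($right);
}

# normalised('2.12')    gives 'v2.120.0'
# normalised('2.12.1')  gives 'v2.12.1'
# compare_versions('2.12', '2.12.1') gives 1 (2.12 counts as newer)
#
# normalised('1.1')     gives 'v1.100.0'
# normalised('1.01')    gives 'v1.10.0'
# normalised('1.001')   gives 'v1.1.0'
```

The two-component form is read as a decimal number whose fractional digits are padded out to groups of three, which is where the 120 comes from.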

I know that versioning isn’t as easy as it should be and I know that some people use bizarre versioning systems. And I’m pretty sure that no matter how bizarre a versioning system is, you’ll almost certainly find an example of it on CPAN. So I suppose that this behaviour was a “least worst” scenario that was chosen to make the most sense given CPAN’s wide range of versioning schemes.

Personally, I see it as a bug in version.pm. But I’m not going to report it as such as I’m sure the Perl toolchain gang know what they’re doing and have very good reasons for adopting this seemingly broken behaviour.

I just need to remember to be more careful when switching my modules to semantic versioning. Using a minor or patch level version change when switching to semantic versioning is likely to lead to confusion and bug reports. Only a major level change (as I did with Symbol::Approx::Sub) is guaranteed to work.

And, I suppose, I’ll need to release Array::Compare 3.0.0 to CPAN pretty soon.