Driving a Business with Perl

I’ve been a freelance programmer for over twenty years. One really important part of the job is getting paid for the work I do. Back in 1995 when I started out there wasn’t all of the accounting software available that you get now and (if I recall correctly) the little that was available was all pretty expensive stuff.

At some point I thought to myself “I don’t need to buy one of these expensive systems, I’ll write something myself”. So I sat down and sketched out a database schema and wrote a few Perl programs to insert data about the work I had done and generate invoices from that data.

I don’t remember much about the early versions. I do remember coming to the conclusion that the easiest way to generate PDFs of the invoices was using LaTex and then wasting a lot of time trying to bend LaTeX to my will. I got something that looked vaguely ok eventually, but it was always incredibly painful if I ever needed to edit it in any way. These days, I use wkhtmltopdf and my life is far easier. I understand HTML and CSS in a way that I will never understand LaTeX.

Why am I telling you this, twenty years after I started using this code? Well, during this last week, I finally decided it was time to put the code on Github. There were two reasons for this. Firstly, I thought that it might be useful for other people. And secondly, I’m ashamed to admit that this is the first time that the code has ever been put under any kind of version control (and, yes, this is an embarrassing case of “do as I say, not as I do“). I have no excuses. The software I used to drive my business was in a few files on a single hard drive. Files that I was hacking away at with gay abandon when I thought they needed changing. I am a terrible role model.

Other than all the obvious reasons, I’m sad that it wasn’t in version control as it would have been interesting to trace the evolution of the software over the last twenty years. For example, the database access started as raw DBI, spent a brief time using Class::DBI and at some point all got moved to DBIx::Class. It’s likely that I wasn’t using the Template Toolkit when I started – but I can’t remember what I was using in its place.

Anyway, the code is there now. I don’t give any guarantees for its quality, but it does the job for me. Let me know if you find any of it interesting or useful (or, even, laughable).

p.s. An interesting side effect of putting it under (public) version control – since I uploaded it to Github I have been constantly tweaking it. The potential embarrassment of having my code available for anyone to see means that I’ve made more improvements to it in the last week that I have in the previous five years. I’m even considering replacing all the command line programs with a Dancer app.

p.p.s. I actually use FreeAgent for all my accounting these days. It’s wonderful and I highly recommend it. But I still use my own system to generate invoices.

Subroutines and Ampersands

I’ve had this discussion several times recently, so I thought it was worth writing a blog post so that I have somewhere to point people the next time it comes up.

Using ampersands on subroutine calls (&my_sub or &my_sub(...)) is never necessary and can have potentially surprising side-effects. It should, therefore, never be used and should particularly be avoided in examples aimed at beginners.

Using an ampersand when calling a subroutine has three effects.

  1. It disambiguates the code so the the Perl compiler knows for sure that it has come across a subroutine call.
  2. It turns off prototype checking.
  3. If you use the &my_sub form (i.e. without parentheses) then the current value of @_ is passed on to the called subroutine.

Let’s look at these three effects in a little more detail.

Disambiguating the code is obviously a good idea. But adding the ampersand is not the only way to do it. Adding a pair of parentheses to the end of the call (my_sub()) has exactly the same effect. And, as a bonus, it looks the same as subroutine calls do in pretty much every other programming language ever invented. I can’t think of a single reason why anyone would pick &my_sub over my_sub().

I hope we’re agreed that prototypes are unnecessary in most Perl code (perhaps that needs to be another blog post at some point). Of course there are a few good reasons to use them, but most of us won’t be using them most of the time. If you’re using them, then turning off prototype checking seems to be a bad idea. And if you’re not using them, then it doesn’t matter whether they’re checked or not. There’s no good argument here for  using ampersands.

Then we come to the invisible passing of @_ to the called subroutine. I have no idea why anyone ever thought this was a good idea. The perlsub documentation calls it “an efficiency mechanism” but admits that is it one “that new users may wish to avoid”. If you want @_ to be available to the called subroutine then just pass it in explicitly. Your maintenance programmer (and remember, that could be you in six months time) will be grateful and won’t waste hours trying to work out what is going on.

So, no, there is no good reason to use ampersands when calling subroutines. Please don’t use them.

There is, of course, one case where ampersands are still useful when dealing with subroutines – when you are taking a reference to an existing, named subroutine. But that’s the only case that I can think of.

What do you think? Have I missed something?

It’s unfortunate that a lot of the older documentation on CPAN (and, indeed, some popular beginners’ books) still perpetuate this outdated style. It would be great if we could remove it from all example code.

Modern Perl Articles

Back in 2011 I wrote a series of three articles about “Modern Perl” for Linux Format. Although I mentioned all three articles here as they were published, I didn’t post the actual contents of the articles as I wasn’t sure about the copyright situation.

But now I suspect that enough time has passed that copyright is no longer going to be an issue, so I’ve added the full text of the articles to this site. The articles are all about writing a simple web application to track your reading. They use DBIx::Class and Dancer.

Let me know if you find them interesting or useful.

Dev Assistant

A couple of days ago, I updated to my laptop to Fedora 21. One of the new features was an application called DevAssistant which claimed that:

It does not matter if you only recently discovered the world of software development, or if you have been coding for two decades, there’s always something DevAssistant can do to make your life easier.

I thought it was worth investigating – particularly when I saw that it had support for Perl.

Starting the GUI and pressing the Perl button gives me two options: “Basic Class” and “Dancer”. I chose the “Basic Class” option. That gave me an dialogue box where I could give my new project a name. I chose “MyClass” (it’s only an example!) This created a directory called MyClass in my home directory and put two files in that directory. Here are the contents of those two files.

main.pl

myClass.pm

It’s great, of course, that the project wants to support Perl. I think that we should do everything we can to help them. But it’s clear to me that they don’t have anyone on the team who knows anything about modern Perl practices.

So who wants to volunteer to help them?

Update: So it turns out that the dev team are really responsive to pull requests :-)

Perl’s Problems

It’s been over six weeks since I wrote my blog post on Perl usage. I really didn’t mean to leave it so long to write the follow-up. But real life intervened and I haven’t had time for much blogging. That’s still the case (I should be writing a talk right now) but I thought it was worth jotting down some quick notes about what I think is causing Perl’s decline.

Reputation

We have a lot to thank Matt Wright for. And I don’t mean that sarcastically. A lot of the popularity of Perl in the mid-90s stems directly from people like Matt and Selena Sol making their collections  of CGI programs available really early on. The popularity of their programs made Perl the de-facto standard for CGI programming.

But that was a double-edged sword. People searching the web for examples of CGI programming found Matt or Selena’s code and assumed they represented best practice. Which, of course, they didn’t. While people were blithely copying Matt’s programming style, good Perl programmers were using CGI.pm to parse their incoming parameters and separating their HTML generation out into templates.

In my previous post, I mentioned that fifteen or twenty years ago Perl was the programming language of choice for internet start-ups. That’s true, but a lot of the code written at that time was in the Matt Wright style. Matt’s style just about works for a guestbook or a form mailer. But when you try to build a business on top of code like that, it quickly becomes obvious that it’s an unmaintainable mess.

Many of the technical architects and CTOs who are making decisions about technology in companies today are the programmers who spent too many late nights battling those balls of mud in the 1990s. They were never really Perl programmers, they were only using it because it was fashionable, and they haven’t been keeping up with recent advances in Perl so it’s not surprising that they often choose to avoid using Perl.

Complexity

A lot of Perl’s reputation as executable line noise is completely unwarranted. The people who were writing those 1990s balls of mud were under such pressure to deliver that they would have almost certainly delivered something just as unmaintainable whatever language they were using. But some of that reputation is fair. I’ve been teaching Perl for almost fifteen years and I know that there are some parts of Perl that people find confusing. Here are some examples:

Sigils – I can explain things like @array, $array[$key] and even @array[@keys] to people. And most of them get it. But it takes them a while. And then it all goes to pieces again when I have to explain the difference between $array[$key] and $array->[$key].

Context – Does any other programming language have the concept of context? Yes, when used correctly it’s a powerful tool. But it’s hard to explain and a good source of hard-to-find bugs. Can anyone honestly say that they haven’t been bitten by a context bug at some point in the last years?

Data Structures – Is the difference between arrays and array references really necessary? Think of all the complexity that is added because you can’t just pass arrays and hashes into subroutines without being bitten by list flattening. As experienced Perl programmers we know the problems and our brains are hard-wired to work around it. But other languages treat all aggregate data structures as references and it all becomes a lot easier.

I know that each of these features (and half a dozen other examples I could list) makes Perl a richer and more expressive language. But this comes at the cost of learnability and readability. Perhaps that trade-off once seemed like a good idea. When you’re trying to encourage people to look at your language then the advantages seem less obvious.

Of course, none of these features can be changed as they would break pretty much every existing Perl codebase. Which would be a terrible idea. But you can get away with a lot more breakage when you increase your major version number. Which Perl hasn’t been able to do for fourteen years.

Perl 6

I need to be clear here. I think that Perl 6 looks like a great language. I am really looking forward to using on production systems. And it looks like the current Perl 6 team are doing great work towards making that possible. In fact I think that our best approach to reviving Perl’s fortunes is to get a production-ready version of Perl 6 out and to make a big noise about that.

However, that name has been a big problem.

Looking from outside the Perl echo chamber, it’s easy to believe that Perl hasn’t had a major release for twenty years. And that can probably explain a lot of Perl’s current problems.

I know that people who believe that are wrong. The current version of Perl (5.20.1 as I write this) is a lot different to the version that was current when Perl 6 was first announced (which was 5.6.0, I think). Perl has gone through huge changes in the last fourteen years. But the version number hides that.

I also know that we no longer tell people that Perl 6 is the next version of Perl. The Wikipedia page makes it clear in its first sentence that “Perl 6 is a member of the Perl family of programming languages“. So why do people continue to think it’s the next version of Perl? Well, probably because people assume that they know how software version numbers work and don’t bother to check the web site to see it a particular project has changed the standard meaning that has worked well for decades.

So Perl 6 has been simultaneously both good and bad for Perl. Good because a lot of Perl 6 ideas have been backported into Perl 5. But bad because Perl 5 has been unable to change its major version number in order to advertise these improvements to the wider software-using world.

Nothing can be done about this now. The damage is done. As I said at the start of this section, it’s likely that the only thing we can do is to bet heavily on Perl 6 and get it out as soon as possible. Perl 5 will continue to exist. People will continue to maintain and improve it. Some companies will continue to use it. But it’s usage will continue to fall. I really think it’s too late to do anything about that.

“I Do Not Want To Use Any Modules”

Almost every day on the Perl groups on LinkedIn (or Facebook, or StackOverflow, or somewhere like that) I see a question that includes the restriction “I do not want to use any modules”.

There was one on LinkedIn yesterday. He wanted to create a MIME message to pass to sendmail, but he didn’t want to install any modules. Because “getting a module installed will have to go though a long long process of approvals”.

And I understand that. I really do. We’ve all seen places where getting new software installed is a problem. But I see that problem as a bug in the development process. A bug that needs to be fixed before anything can get done in a reasonable manner. Here’s what I’ve just written in reply:

Of course it can be achieved without modules. Just create an email in the correct format and pass it to sendmail.

Ah, but what’s the right format? Well, that is (of course) the tricky bit. I have no idea what the correct format is. Oh, I could Google a bit and come up with some ideas. I might even find the RFC that defines the MIME format. And then I’d be able to knock up some code that created something that looked like it would work. But would I be sure that it works? In every case? With all the weird corner-cases that people might throw at it?

This is where CPAN modules come in handy. You’re using someone else’s knowledge. Someone who is (hopefully) an expert in the field. And because modules are used by lots of people, bugs get found and fixed.

A lot of modern Perl programming is about choosing the right set of CPAN modules and plumbing them together. That’s what makes Perl so powerful. That’s what makes Perl programmers so efficient. We’re standing on the shoulders of giants and re-using other people’s code.

If you’re not going to use CPAN then you might as well use shell-scripting or awk.

If you’re in a situation where getting CPAN modules installed is hard, then fixing that problem should be your first priority. Because that’s a big impediment to your Perl programming. And investing time in fixing that will be massively beneficial to you in a very short amount of time.

The obvious solution is to install your own module tree (alongside your own Perl) as part of your application. But that might be overkill in some situations, so you could also consider using the system Perl and asking your sysadmin to install packages from your distribution’s repositories. Of course, that might need a change in process. But it’s a change that is well worth making; a change that will improve your (programming) life immensely.

Update: Some very interesting discussion about this over on Reddit.

Perl Usage

In my last blog post, I posted a graph showing that out of 135 companies at a recent Silicon MilkRoundabout recruitment event, only one said that they were using Perl. That has led to some interesting discussions that I’d like to address here.

I should make it clear that I wasn’t presenting my graph as evidence that Perl is dead. Of course you can’t leap to conclusions like that from what I learned at one recruitment event. I do, however, think that the situation is pretty grim.

But firstly, a few points that people made to me in response to my post.

We know that Perl isn’t used in start-ups
Yes. I think we do know that. But I don’t think we’re as worried about that as we should be. Imagine if that job fair was held fifteen years ago. Or twenty years ago. Perl used to be the language of choice for internet start-ups. What happened to change that? (I have some theories that I’ll cover in another blog post) Can this trend be reversed? (Honestly, I don’t think so – but I’m open to arguments to the contrary)

Every programmer I know uses Perl in some way
I think this might have been true fifteen years ago, but it hasn’t been the case for some time. If it’s really true that all programmers that you know still use Perl, then I think you only know a really bizarre cross-section of programmers.

All companies use Perl, but the HR department or management often don’t know
This is similar to the last point. And, again, I think it’s something that used to be true and hasn’t really been true this millennium. But there’s also the idea of Perl being the programmers “secret weapon” that the suits don’t know about. Even if it’s true (and I don’t think it is), then going underground like that is likely to be harmful to Perl’s popularity in the long term.

I think we should stop fooling ourselves here. Perl usage has been declining for over a decade. To a first level of of approximation, Perl is already a dead language.

Of course, The Perl community has spent a lot of the last few years actively denying that. I’ve been responsible for some of that drum-beating myself. But we need to accept that it’s true. For most people outside of the Perl bubble, Perl is a language that they last considered using back in the last millennium.

So, if Perl is dead, why has everyone spent the last five years demonstrating that this isn’t the case? Have they been lying to us? No, I don’t think they have. I just think that they have been looking at the wrong measures of success. Let’s look at some of the arguments I’ve seen.

CPAN is growing faster than ever
We have regular releases of Perl
Some great new features have been added to Perl
These all essentially boil down to the same argument – “Perl isn’t dead because some part of Perl (or its ecosystem) is improving”. I can’t argue with any of those facts, but do they really say anything useful about the long-term viability of the language. It’s great that Perl is constantly improving, but unless the people who are currently ignoring Perl can be persuaded to investigate these improvements, then they do little or nothing to stop Perl’s decline.

Moose might be the most powerful object system in the world. DBIx::Class might be the most flexible ORM available. Projects like these are great. But they don’t seem to be doing much to bring new people to Perl.

There are more YAPCs and Perl Workshops every year
Perl Mongers groups are starting all the time
We get dozens of people to our meetings every month
These arguments all boil down to “the Perl community is growing”. Again, I can’t argue with those facts (well, to be honest, I think the rate of Perl Monger group creation has slowed over the last ten years) but, again, I don’t think they prove what their proponents think they prove.

There is a difference between the Perl community and Perl programmers. Everywhere that I work, I find people who I already know from the community. But I always find far more people who I don’t know because they aren’t at all engaged with the Perl community. And I think it’s that large, untapped, number of non-community Perl programmers who make up the increased numbers of people attending meetings or conferences. This means that we are getting better at bringing our colleagues along to meetings. It doesn’t mean that more people are using Perl.

The number of Perl jobs is rising
Our company can never find enough Perl programmers
We just started a major new project using Perl
Most of the companies who use Perl continue to use Perl. That’s not really news. And some of those companies have grown really big and therefore need lots of Perl programmers to maintain and enhance their Perl programs. And that’s great. But it’s not really evidence of a grow in Perl usage.

Not all the companies who have historically used Perl continue to do so. Over the last five years I know of at least four big Perl-using companies in London who have started to move away from it for new development.

And one reason why people are always looking for Perl programmers is because many programmers have chosen to move away from Perl. I know plenty of people who were regulars at London Perl Mongers meetings ten to fifteen years ago but who haven’t written a line of Perl for over five years. This means, of course, that there is more work to go round those of us who are left. I could probably go through to my retirement maintaining existing Perl codebases. Those of you who are younger than me might not be so lucky.

 

So, to summarise, people who say that Perl is thriving point to three things – technical advances in Perl, the vibrant Perl community and the number of unfilled Perl jobs that always seem to be around. All of these things are great and are, of course, necessary for a living and growing language.

But they aren’t sufficient. You also need people outside of the community to take notice. And that’s not happening.

Ask yourself three questions.

  1. When did you last read a book on general programming techniques that contained examples written in Perl?
  2. When did you last read documentation for a web site’s API that included examples written in Perl?
  3. When did you last hear of a company using Perl that you didn’t previously know about?

This is why I published that graph a couple of weeks ago. Looking at that data, it really hit home to me just how badly we’re doing.

I have a couple of theories about why most of the world started ignoring Perl. I’ll get to those in my next blog posts. But, annoyingly, I don’t have any good ideas about how we might reverse the situation.

To be honest, currently my best advice (and the course I’ll be taking) is “brush up your Javascript”.

 

Programming Language Usage

Back in May, I spent an afternoon at Silicon MilkRoundabout. Silicon MilkRoundabout is a recruitment fair for techies. It’s specifically aimed at people who want to work for start-ups around the Old Street area (although they aren’t particularly stringent about sticking to that – for example, the BBC were there).

We were given a booklet containing details of all of the companies who were recruiting. Those details usually included information about the tech stack that the companies used.

Over the weekend, I went through that booklet and listed the programming languages mentioned by the companies. The results speak for themselves.

There were 135 companies at the event. About twenty of them unhelpfully listed their tech stack as “ask us for details”.

Here’s the graph:Usage of Programming Languages by Companies at Silicon MilkRoundabout

Usage of Programming Languages by Companies at Silicon MilkRoundabout

I’ll obviously have some more to say about this over the next few days. But I wanted to get the raw data out there as soon as possible.

Data Munging with Perl

Data Munging with PerlMany years ago, I wrote a book called Data Munging with Perl. People were kind enough to say nice things about it. A few people bought copies. I made a bit of money.

Recently I re-read it. I thought that some of it was still pretty good. There were some bits, particularly in the early chapters, that talked about general principles that are still as relevant as they were when the book was published.

There were other bits that haven’t aged quite as well. The bits where I talk about particular CPAN modules are all a bit embarrassing as Perl fashions change and newer, better modules are released. Although it was still available from Amazon, I really didn’t want people paying for it as a lot of it was really out of date.

But today I got an interesting letter from the publishers, telling me that they have taken the book out of print. And that all  the rights in the book have reverted to me. Which means that I can now distribute it in any way that I like. And people don’t have to pay a lot of money for a rather out of date book.

So, you can download a PDF of the book from http://perlhacks.com/dmp.pdf. Or I’ve embedded the book below.

I might even have the original documents somewhere. So if I get some spare time I might be able to produce a more reasonable ebook version (but don’t hold your breath!)

Let me know if you find it useful.

Dots and Perl

I was running a training course this week, and a conversation I had with the class reminded me that I have been planning to write this article for many months.

There are a number of operators in Perl that are made up of nothing but dots. How many of them can you name?

There are actually five dot operators in Perl. If the people on my training courses are any guide, most Perl programmers can only name two or three of them. Here’s a list. It’s in approximate order of how well-known I think the operators are (which is, coincidentally, also the order of increasing length).

One Dot

Everyone knows the one dot operator, right? It’s the concatenation operator. It converts its two operands to strings and concatenates them.

It’s also sometimes useful to use it to force scalar context onto an expression. Consider the difference between these two statements.

The difference between the two statements is tiny, but the output is very different.

There’s not much more to say about the single dot.

Two Dots

Things start to get a little more interesting when we look at two dots. The two dot operator is actually two different operators depending on how it is used. Most people know that it can be used to generate a range of values.

You can also use it in something like a for loop. This is an easy way to execute an operation a number of times.

In older versions of Perl, this could potentially eat up a lot of memory as a temporary array was created to contain the range and therefore something like

could be a problem. But in all modern versions of Perl, that temporary array is not created and the expression causes no problems.

Two dots acts as a range operator when it is used in list context. In scalar context, its behaviour is different. It becomes a different operator – the flip-flop operator.

The flip-flop operator is so-called because it flip-flops between two states. It starts as returning false and continues to do so until something “flips” it into its true state. It then continues to return true until something else “flops” it back into a false state. The cycle then repeats.

So what causes it to flip-flop between its two states? It’s the evaluation of its left and right operands. Imagine you are processing a file that contains text that you are interested in and other text that you can ignore. The start of a block that you want to process is marked with a line containing “START” and the end of the block is marked with “END”. Between and “END” marker and the next “START”, there can be lots of text that you want to ignore.

A naive way to process this would involve some kind of “process this” flag.

I’ve often seen code that looks like this. The author of that code didn’t know about the flip-flop operator. The flip-flop operator encapsulates all of that logic for you. Using the flip-flop operator, the previous code becomes this:

The flip-flop operator returns false until its left operand (/START/) evaluates as true. It then returns true until its right operand (/END/) evaluates as true. And then the cycle repeats.

The flip-flop operator has one more trick up its sleeve. One common requirement is to only process certain line numbers in a file (perhaps we just want to process lines 20 to 40). If one of the operands is a constant number, then it is compared the the current record number (in $.) and the operand is fired if the line number matches. So processing lines 20 to 40 of a file is a simple as:

Three Dots

It was the three dot operator that triggered the conversation which reminded me to write this article. A three dot operator was added in Perl 5.12. It’s called the “yada-yada” operator. It is used to stand for unimplemented code.

Programmers have been leaving “TODO” comments in their code for decades. And they’ve been using ellipsis (a.k.a. three dots) to signify unimplemented code for just as long. But now you can actually put three dots in your source code and it will compile and run. So what’s the benefit of this over just leaving a TODO comment? Well, what happens if you call a function that just contains a comment? The function executes and does nothing. You might not realise that you haven’t implemented that function yet. With the yada-yada operator standing in for the unimplemented code, Perl throws an “unimplemented” error and you are reminded that you need to write that code before calling it.

But the yada-yada operator wasn’t the first three dot operator in Perl. There has been another one in the language for a very long time (you might say since the year dot!) And I bet very few of you know what it is.

The original three dot operator is another flip-flop operator. And the difference between it and the two dot version is subtle. It’s all to do with how many tests can be run against the same line. With the two dot version, when the operator flips to true it also checks the right-hand operand in the same iteration – meaning that it can flip from false to true and back to false again as part of the same iteration. If you use the three dot version, then once the operator flips to true, then it won’t check the right-hand operand until the next iteration – meaning that the operator can only flip once per iteration.

When does this matter? Well if our text file sometimes contains empty records that contain START and END on the same line, the difference between the two and three dot flip-flops will determine exactly which lines are processed.

If the two dot version encounters one of these empty records, it flips to true (because it matches /START/) and then flops back to false (because it matches (/END/). However, the flop to false doesn’t happen until after the expression returns a value (which is true). The net effect, therefore is that the line is printed, but the flip-flop is left in the false state and the following lines won’t be printed until one contains a START.

If the three dot version encounters one of these empty records, it also flips to true (because it matches /START/) but then doesn’t check the right-hand operand. So the line gets printed and the flip-flop remains in its true state so the following lines will continue to be printed until one contains an end.

Which is correct? Well, of course, it depends on your requirements. In this case, I expect that the two dot version gives the results that most people would expect. But the three dot version is also provided for the cases when its behaviour is required. As always, Perl gives you the flexibility to do exactly what you want.

But I suspect that the relatively small number of people who seem to know about the three dot flip-flip would indicate that its behaviour isn’t needed very often.

So, there you have them. Perl’s five dot operators – concatenation, range, flip-flop, yada-yada and another flip-flop. I hope that helps you impress people in your next Perl job interview.

Update: dakkar points out that yada-yada isn’t actually an operator. He’s absolutely right, of course, it stands in place of a statement and has no operands. But being slightly loose with our terminology here makes for a more succinct article.