Installing Modules

If you’ve seen me giving my “Kingdom of the Blind” lightning talk this year, then you’ll know that I’ve been hanging around places like the LinkedIn Perl groups and StackOverflow trying to help people get the most out of Perl. It can be an “interesting” experience.

One of the most frequent questions I see is some variant of “I have found this program, but when I try to run it I get an error saying it can’t find this module”. Of course, the solution to this is simple. You tell them to install the missing module. But, as always, the devil is in the detail and I think that in many cases the answers I seen could be improved.

Most people seem to leap in and suggest that the original poster should install the module using cpan (advanced students might suggest cpanminus instead). These are, of course, great tools. But I don’t think this is the best answer in to these questions.

In most cases, the people asking questions are new to Perl. In some cases they don’t even want to learn any Perl – they just want to use a useful program that happens to be written in Perl. I think that launching these people into the CPAN ecosystem is a bad idea. Yes, eventually, it would be good to get them using perlbrew, local::lib and cpanminus. But one step at a time. First let’s show them how easy it can be to use Perl.

In many of these cases, I think that the best approach is to suggest that they use their native package manager to install a pre-packaged version of the required module.

Yes, I know the system Perl is evil and outdated. Yes, I know that they probably won’t get the latest and greatest version of your CPAN module from apt-get or yum.  And, yes, I know there’s a chance that the required module won’t be available from the package repositories. But I still think it’s worth giving it a try. Because in most cases the module will be  there and available as a recent enough version that it will solve their immediate problem and let them get on with their work.

There are three main reasons for this suggestion:

  1. People in this situation will almost certainly be using the system Perl anyway. And the  system Perl will already have pre-packaged modules installed alongside it. And installing modules using cpanminus alongside pre-packaged modules in the same library installation is a recipe for disaster. The package manager no longer knows what’s installed or what versions are installed and hilarity ensues.
  2. The pre-packaged versions will know about non-Perl requirements for the CPAN module and will pull those in as well as other required CPAN modules. One of the most common requests I see is for GD or one of its related modules. Your package manager will know about the underlying requirement for  libgd. cpanminus and friends probably won’t.
  3. The user is more likely to be used to using their package manager. Teaching them about the CPAN ecosystem can come later. Let’s ease them into using Perl by starting them off with tools that they are familiar with.

I know that both Fedora/Centos/RHEL and Ubuntu/Debian have large numbers of CPAN modules already pre-packaged for easy installation. Let’s suggest that people make use of this work to get up and running with Perl quickly. Later on we can show the the power and flexibility that comes with using the Perl-specific tools.

Of course there’s then a debate about when (and how) we start to wean these people off of the system Perl and pre-build packages and on to perlbrew/cpanminus/etc. But I think that having a community of people who are used to using CPAN modules (albeit in this slightly restricted manner) is an improvement on the current situation where people often avoid CPAN completely because module installation is seen as too difficult.

What do you think? Are there obvious errors in my thinking?

The Return of blogs.perl.org

About an hour ago we turned blogs.perl.org back on. There’s also a blog post where we explain what happened in a lot more detail.

If you have an account on the site then you will have received an email explaining what you need to do now. Basically, we’ve invalidated all of the passwords so you’ll need to ask the system for a new one.

Sorry again for the inconvenience. And huge thanks to the rest of the blogs.perl.org team (particularly Aaron Crane) for fixing this.

blogs.perl.org

It seems that last night blogs.perl.org was hacked. I first became aware of it when someone pointed me at this story a few hours ago. As you’ll see, the contents of the mt_author table have been made public.

We’re still investigating the extent of the hack. But, as a precaution, we have configured the site so that all dynamic pages return a 404 response. This will, unfortunately, prevent you from logging on to the site.

We will publish more information when we have it.

Apologies for the inconvenience.

Update:

  • As I said, the mt_author table was leaked
  • This contains both your username and password
  • The password is salted and encrypted (with crypt)
  • If you use your blogs.perl.org password elsewhere, we strongly recommend that you change it

Update 2:
Here’s a cut-down version of the published data that includes only the name columns. Hopefully you can use this to work out whether or not you have an account on the system.

Dots and Perl

I was running a training course this week, and a conversation I had with the class reminded me that I have been planning to write this article for many months.

There are a number of operators in Perl that are made up of nothing but dots. How many of them can you name?

There are actually five dot operators in Perl. If the people on my training courses are any guide, most Perl programmers can only name two or three of them. Here’s a list. It’s in approximate order of how well-known I think the operators are (which is, coincidentally, also the order of increasing length).

One Dot

Everyone knows the one dot operator, right? It’s the concatenation operator. It converts its two operands to strings and concatenates them.

# $string contains 'one stringanother string'
my $string = 'one string' . 'another string';

It’s also sometimes useful to use it to force scalar context onto an expression. Consider the difference between these two statements.

say "Time: ", localtime;
say "Time: " . localtime;

The difference between the two statements is tiny, but the output is very different.

There’s not much more to say about the single dot.

Two Dots

Things start to get a little more interesting when we look at two dots. The two dot operator is actually two different operators depending on how it is used. Most people know that it can be used to generate a range of values.

my @numbers = (1 .. 100);
my @letters = ('a' .. 'z');

You can also use it in something like a for loop. This is an easy way to execute an operation a number of times.

do_something() for 1 .. 3;

In older versions of Perl, this could potentially eat up a lot of memory as a temporary array was created to contain the range and therefore something like

do_something() for 1 .. 1000000;

could be a problem. But in all modern versions of Perl, that temporary array is not created and the expression causes no problems.

Two dots acts as a range operator when it is used in list context. In scalar context, its behaviour is different. It becomes a different operator – the flip-flop operator.

The flip-flop operator is so-called because it flip-flops between two states. It starts as returning false and continues to do so until something “flips” it into its true state. It then continues to return true until something else “flops” it back into a false state. The cycle then repeats.

So what causes it to flip-flop between its two states? It’s the evaluation of its left and right operands. Imagine you are processing a file that contains text that you are interested in and other text that you can ignore. The start of a block that you want to process is marked with a line containing “START” and the end of the block is marked with “END”. Between and “END” marker and the next “START”, there can be lots of text that you want to ignore.

A naive way to process this would involve some kind of “process this” flag.

my $process_this = 0;
while (<$file>) {
  $process_this = 1 if /START/;
  $process_this = 0 if /END/;
  process_this_line($_) if $process_this;
}

I’ve often seen code that looks like this. The author of that code didn’t know about the flip-flop operator. The flip-flop operator encapsulates all of that logic for you. Using the flip-flop operator, the previous code becomes this:

while (<$file>) {
  process_this_line($_) if /START/ .. /END/;
}

The flip-flop operator returns false until its left operand (/START/) evaluates as true. It then returns true until its right operand (/END/) evaluates as true. And then the cycle repeats.

The flip-flop operator has one more trick up its sleeve. One common requirement is to only process certain line numbers in a file (perhaps we just want to process lines 20 to 40). If one of the operands is a constant number, then it is compared the the current record number (in $.) and the operand is fired if the line number matches. So processing lines 20 to 40 of a file is a simple as:

while (<$file>) {
  process_this_line($_) if 20 .. 40;
}

Three Dots

It was the three dot operator that triggered the conversation which reminded me to write this article. A three dot operator was added in Perl 5.12. It’s called the “yada-yada” operator. It is used to stand for unimplemented code.

sub reverse_polarity {
  # TODO: Must write this
  ...
}

Programmers have been leaving “TODO” comments in their code for decades. And they’ve been using ellipsis (a.k.a. three dots) to signify unimplemented code for just as long. But now you can actually put three dots in your source code and it will compile and run. So what’s the benefit of this over just leaving a TODO comment? Well, what happens if you call a function that just contains a comment? The function executes and does nothing. You might not realise that you haven’t implemented that function yet. With the yada-yada operator standing in for the unimplemented code, Perl throws an “unimplemented” error and you are reminded that you need to write that code before calling it.

But the yada-yada operator wasn’t the first three dot operator in Perl. There has been another one in the language for a very long time (you might say since the year dot!) And I bet very few of you know what it is.

The original three dot operator is another flip-flop operator. And the difference between it and the two dot version is subtle. It’s all to do with how many tests can be run against the same line. With the two dot version, when the operator flips to true it also checks the right-hand operand in the same iteration – meaning that it can flip from false to true and back to false again as part of the same iteration. If you use the three dot version, then once the operator flips to true, then it won’t check the right-hand operand until the next iteration – meaning that the operator can only flip once per iteration.

When does this matter? Well if our text file sometimes contains empty records that contain START and END on the same line, the difference between the two and three dot flip-flops will determine exactly which lines are processed.

If the two dot version encounters one of these empty records, it flips to true (because it matches /START/) and then flops back to false (because it matches (/END/). However, the flop to false doesn’t happen until after the expression returns a value (which is true). The net effect, therefore is that the line is printed, but the flip-flop is left in the false state and the following lines won’t be printed until one contains a START.

If the three dot version encounters one of these empty records, it also flips to true (because it matches /START/) but then doesn’t check the right-hand operand. So the line gets printed and the flip-flop remains in its true state so the following lines will continue to be printed until one contains an end.

Which is correct? Well, of course, it depends on your requirements. In this case, I expect that the two dot version gives the results that most people would expect. But the three dot version is also provided for the cases when its behaviour is required. As always, Perl gives you the flexibility to do exactly what you want.

But I suspect that the relatively small number of people who seem to know about the three dot flip-flip would indicate that its behaviour isn’t needed very often.

So, there you have them. Perl’s five dot operators – concatenation, range, flip-flop, yada-yada and another flip-flop. I hope that helps you impress people in your next Perl job interview.

Update: dakkar points out that yada-yada isn’t actually an operator. He’s absolutely right, of course, it stands in place of a statement and has no operands. But being slightly loose with our terminology here makes for a more succinct article.

Perl APIs

For a lot of programmers out there, Perl has become largely invisible. They just never come across it. That might seem strange to you as you sit inside the Perl community echo chamber reading the Perl Ironman or p5p, but try this simple experiment.

Think of a web site that you use and that supplies an API. Now go to that API’s documentation and look at the example code. What languages are the examples written in? PHP? Almost certainly. Ruby? Probably. Python? Probably. C#? Quite possibly. Perl? Almost certainly not[1].

Perl has fallen so far off the radar of most people that when web sites write example code for these APIs, they very rarely consider Perl as a language worth including. And because they don’t bother including Perl then any random programmer coming to that API will assume that the API doesn’t support Perl, or (at the very least) that the lack of examples will make using Perl harder than it would be with plenty of examples to copy from.

This then becomes a self-fulfilling prophecy as Github fills with more and more projects using other languages to talk to these APIs. And the chance of anyone who isn’t already a Perl user ever trying to interact with these APIs using Perl falls and falls.

Of course, this is all completely wrong. These APIs are just going to be a series of HTTP requests using REST or XML-RPC (or, if you’re really unlucky, SOAP). Perl has good support for all of that. You might need to use something like OAuth to get access to the API – well Perl does that too.

Of course, in some cases good Perl support exists already – Net::Twitter is a good example. And to be fair to Twitter, their API documentation doesn’t seem to give any examples in specific languages – so Perl isn’t excluded here. But in many other cases, the Perl version languishes unnoticed on CPAN while other languages get mentioned on the API page.

I think that we can try to address this in 2014. And I’d like to ask you to help me. I’ve set up a mailing list called perl-api-squad where we can discuss this. In a nutshell, I think that the plan should be something like this:

  1. Identify useful APIs where there is either no Perl API or no Perl examples
  2. Write CPAN API wrappers where they are missing
  3. Approve API owners and offer them Perl examples to add to their web site

That doesn’t sound too complicated to me. And I think (or, perhaps, hope) that most API owners will be grateful to add more examples of API usage to their site – particularly if it involves next to no effort on their part.

I also expect that the Perl API Squad will produce a web site that lists Perl API support. We might even move towards producing a framework that makes it easy to write a basic Perl wrapper around any new API.

What do you think? Is this a worthwhile project? Who’s interested in joining in?

[1] Yes, I know there are exceptions. But they are just that – exceptions.

Misunderstanding Context

Over the last few days I’ve been involved in a discussion on LinkedIn[1]. It has been interesting as it shows how many people still misunderstand many of the intricacies of context and, in particular, how it ties in with the values returned from subroutines.

The original question asked why these two pieces of code acted differently:

my ($index) = grep {
  $array[$_] == $variable
} 0 .. $#array;
my $index = grep {
  $array[$_] == $variable
} 0 .. $#array;

The first one gives the first index where the element equals $variable, the second one gives the number of indexes where the element equals $variable.

The answer, of course, comes down to context.  And most people who answered seemed to understand that, but many of their explanations were still way off the mark. Here’s the first answer:

grep returns an array – so the first one will return the value of the first match, but 2nd one in scalar context will return the size of the array, the number of matches. I believe.

The first statement here – “grep returns an array” – is wrong, so any explanation built on that fact is going to be fundamentally flawed.

After a couple of similar answers, I jumped in and pointed out that the only way to know how grep will work in different contexts is to read the documentation; which says:

returns the list value consisting of those elements for which the expression evaluated to true. In scalar context, returns the number of times the expression was true.

At that point it got a bit weird. People started telling me all sorts of strange things in order to show that the earlier answers were better than mine. I was told that lists and arrays are the same thing, that there was something called “array context” and that it was possible for function to return arrays.

I think I’ve worked out what most of the misconceptions in the discussion were. Here’s a list.

1. Arrays are not lists

I know this is a very common misunderstanding. I come across it all the time. People use the terms “list” and “array” interchangeably and end up thinking that they are the same thing. They aren’t. A list is a data value and an array is a variable. They act the same way a lot of the time but unless you understand the difference, you will make mistakes.

Mike Friedman wrote a great blog post that explains the difference in considerable detail. But the core difference is this – arrays are persistent (at least while they are in scope) data structures; lists are ephemeral.

On training courses, I tell people that if they understand the difference between an array and a list they’ll be in the top 20% of Perl programmers. In my experience, that’s pretty close to the truth.

2. Subroutines return lists

A subroutine can only ever return a list. Never an array. It can be an empty list. It can be a list with only one item. But it’s always a list.

There’s one small “gotcha” here. I know this bit me a few times in the past. If you read the documentation for return, it says this:

Evaluation of EXPR may be in list, scalar, or void context, depending on how the return value will be used

This means that when you have code like return @array, it’s @array that is evaluated in the context of the subroutine call. So, depending on  the context of the call, this will return either a list consisting of all the elements in @array, or a list with one element which is the number of elements in @array.

3. There is no “array context”

When you confuse lists and arrays, then it’s not surprising that you might also decide that you can also invent a new context called “array context”. There are only list context, scalar context and void context. Ok, actually there are a few more specialised contexts that you can use (see the Want module for details) but most of the time you’ll only be dealing with those three.

Of course, the name of the wantarray function doesn’t help at all here. But it’s worth noting that the documentation for wantarray ends by saying:

This function should have been named wantlist() instead.

4. You  can’t guess contextual behaviour

One common argument I got when pointing this out on LinkedIn ran along the lines of “but grep acts like it returns an array, so it’s a useful mental model – it helps people to understand context”.

The first part of this is true – grep does act like it returns an array. It returns a list (which you might often store in an array) in list context and it returns the length of that list in scalar context. I can see how you would mistake that for returning an array. But  see my point 2 above. Subroutines do not ever return arrays.

But is it a useful mental model? Does it help people understand how subroutines work in different contexts?

No. It helps people to understand how this particular function works. And there are several other Perl functions that work the same way (keys is one example). But there are plenty of other functions  that don’t work that way. The canonical example is localtime. In list context, it returns a list of nine values; in scalar context it returns a single value (which isn’t the number 9).  Another good example is caller. In list context, it returns a list of items (which can contain either three or eleven elements); in scalar context it returns the first item of that list. There are many more examples.

So the problem with a mental model that assumes that functions work as though they return arrays is that it only works for a subset of functions. And you’d need to waste effort remembering which functions it works for and which ones it doesn’t work for. You’d be far better off (in my opinion) realising that there is no rule that works for all functions and just remembering how each function works (or, more practically, remembering to check the documentation whenever you need to know).

 

If you’ve seen me giving a lightning talk at a conference this  year, you’ll know that I’ve been encouraging more people from the Perl community to get involved in discussions in places outside the echo chamber, places like LinkedIn. It’s interesting because you see how people outside of the community see Perl. You see the misunderstandings that they work with and you might see how we can improve the Perl documentation to help them get around those misunderstandings.

This morning, this comment was posted to the LinkedIn discussion that I’ve been talking about:

This stuff should be included in a default perl tutorial to avoid common mistake.

And, you know, I think he’s probably right.

[1] Because of the way LinkedIn works, you won’t be able to see this discussion unless you’re a member of LinkedIn and, probably, a member of the Perl group as well.

Two Books

I’ve recently received review copies of a couple of new books. Here are the reviews of those books that I have submitted to Amazon.

Designing the Internet of Things – Adrian McEwen & Hakim Cassimally (Wiley)

I’ve been hearing people talking about “the internet of things” for a few years now. And I’ve always meant to find out what they meant by the term. Well now, thanks to this book, I have a really good idea. I might even think about building something for the internet of things myself (you know, in my copious free time!)

Adrian and Hakim obviously know what they are talking about. They both have experience of working on these kinds of projects. And, crucially, they also have the writing skills to pass their expertise on to the reader.

Some of the stuff in the earlier chapters, I already knew. But things like an introduction to TCP/IP and networking will be useful to many people who don’t have such a technical background. Software, I can do – it was the chapters about hardware that I found most useful. I now know far more than I did about the different toolkits that are available for building internet-connected devices.

Part I is about prototyping your device and part 2 is about building it. I’m sure that other books cover these topics – although, perhaps, not in the focussed way that this book does. But it’s part 3 where this book really shines. These three chapters cover business models, scaling your manufacturing process and some of the ethical issues that these devices raise. This section really makes the book a “one-stop shop” for finding all the information you’ll need to take your vague idea to a complete product and (hopefully) a profitable company.

Perl One-Liners – Peteris Krumins (No Starch)

Perl one-liners are an important part of its power and flexibility. The ability to process a file quickly without having to write a program is often really useful. Any Perl programmers should take the time to get to know the command like switches that make this possible. This book is a pretty good introduction to this way of using Perl.

So why only three stars? Well, I have a couple of reservations about the book. Firstly, there are a few technical errors which the editors should have caught. For example, a few times the author refers to “array context” where he means “list context”. The difference between arrays and lists is often difficult for beginners to master and it doesn’t help when books blur the distinction.

My other reservation is with the programs themselves. The book boasts “130 programs that get things done”. But I think they have had to stretch things a bit to get to that number. One program might be “print lines that match a pattern”. Then the next program will be “print lines that don’t match a pattern”. I’m not sure that inverting the logic in a one-liner is enough of a difference to justify counting it separately. Sometimes you’ll come across two or three pages of examples all of which are only tiny permutations of each other.

But it’s good to see publishers bringing out books on Perl. And this is certainly an area of Perl that hasn’t received much coverage before. I just think it’s a rather thin concept to spin out to a book. Even this stretched, it’s a rather thin book (140 pages – 50 of which are appendices). It might have been better as a cheap Kindle-only publication.

Perl in Banks

An email has flooded in:

I came across your presentation ‘Perl in the Enterprise’ and happen to have a burning concern closely related to one of the bullets ‘Banks see <Perl> as a competitive advantage’.

I am consulting at a major bank in South Africa, and our team have been using Perl very productively to develop and maintain a customer loyalty rewards  application. This application has now suddenly caught the attention of the ‘Architecture Board’ who are questioning whether Perl (or any other scripting language) is appropriate for production use. A presentation will have to be made to this board to present a business case.

Whatever the basis for this concern, some quotable and referencable stories from other banks, such as the list included in your clients on the website, would be gold for us. I have searched extensively on the web and cannot find anything in this light which is also recent, as in the last 5 years. Other than Nvidia and Booking.com I can only find anecdotal, 2nd hand evidence of Perl use for medium-to-large applications in well-known corporates.

Do you by any chance have anything or any contacts in those banks that would be willing to respond to an email to confirm their use of Perl and perhaps repeat the competitive advantage claim?

The presentation he’s talking about is Perl in the Enterprise, which I gave at Linux World Expo in 2005. The particular line he’s interested in is on slide 6 where I say that banks see using Perl as a competitive advantage. I’m pretty sure that I was paraphrasing Phillip Moore of Morgan Stanley who had said something similar in a keynote at OSCON in about 2001.

Obviously I know that banks in the City of London still use Perl. But it’s been several years since I worked at one, so my personal experience is slightly out of date. What my correspondent needs is people who are currently working in banks who are happy to go public and say “we use Perl in our production systems”. It would be even more helpful if they could add “and here’s why…”.

Can anyone out there help?

Perl Search Revisited

A couple of years ago I wrote a blog post about a Google Custom Search that I had set up to create a specialised search engine for Perl.

Recently I’ve revisited this idea. I’ve given the search engine its own subdomain and I’ve added some new sites to the list of sites that it covers. I’ve also given it a simplified look (thanks Bootstrap) and it’s now being hosted on Github pages.

It lives at http://search.perlhacks.com/. Please give it a try and let me know what you think.

Talks from Kiev

It has only been a few weeks since YAPC::Europe in Kiev and already all of the videos are available on YouTube. Here are the recordings of my three talks.

On the first day I spoke about “25 Years of Perl”.

Later that day I was one of the lightning talk speakers. My talk starts at about 52 minutes.

Then on the second day I spoke about “Matt’s PSGI Archive”.