Categories
Programming

Counting Weekends and Wrapping Text

I said that I probably wouldn’t have time to get involved with the Perl Weekly Challenge every week and that has, unfortunately, proven to be the case. But I had a few free minutes earlier in the week so I decided to look at this week’s challenges. I’m glad I did because they seemed to fit the way my brain works pretty well and I had solutions written rather quickly.

Challenge 1: Write a script to display months from the year 1900 to 2019 where you find 5 weekends i.e. 5 Friday, 5 Saturday and 5 Sunday.

This would be simple enough to just brute-force. But when I started to think about it, I realised there’s a bit of a trick we can use which can cut down our search space quite significantly.

If we’re looking for a month with five Fridays, Saturdays and Sundays then we need a month with 31 days (as four weeks is twenty-eight days and we need three extra days). Only seven months ever have 31 days – January, March, May, July, August, October and December. There is no point at all in ever looking in any other month. You might also realise that those three extra days need to be Friday 29th, Saturday 30th and Sunday 31st. And that means that the first day of the month also needs to be a Friday.

So, the problem simplifies to “Find months with 31 days where the 1st is a Friday”. And here’s the code I wrote to do that:
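The listing itself is missing from this copy of the post. As a stand-in, here is a minimal sketch of the simplified search using Time::Piece. The subroutine name and the output format are assumptions, and it relies on a platform whose time functions handle dates before 1970.

```perl
use strict;
use warnings;
use Time::Piece;

# The seven months that always have 31 days.
my @long_months = (1, 3, 5, 7, 8, 10, 12);

# Return "YYYY-MM" for every 31-day month in the given range of years
# whose 1st falls on a Friday.
sub five_weekend_months {
    my ($first_year, $last_year) = @_;
    my @found;
    for my $year ($first_year .. $last_year) {
        for my $month (@long_months) {
            my $date  = sprintf '%04d-%02d-01', $year, $month;
            my $first = Time::Piece->strptime($date, '%Y-%m-%d');
            # day_of_week counts from 0 (Sunday), so Friday is 5.
            push @found, sprintf('%04d-%02d', $year, $month)
                if $first->day_of_week == 5;
        }
    }
    return @found;
}

print "$_\n" for five_weekend_months(1900, 2019);
```

Only 840 dates are examined (seven months in each of 120 years), rather than every day in the range.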

I’ve seen a few other solutions published and people seem to split into one group who spotted the shortcuts and another who didn’t. But the actual solutions seem very similar. Some people used DateTime instead of Time::Piece and others used low-level functions like timelocal().

Challenge 2: Write a script that can wrap the given paragraph at a specified column using the greedy algorithm.

Honestly, I didn’t think very hard about this at all. I just read the Wikipedia description of the algorithm and wrote a pretty much word-for-word Perl translation of that.
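The translation is not included in this copy, so what follows is a sketch of the greedy approach rather than the original code: keep adding words to the current line while they fit, and start a new line when the next word would overflow. The name `wrap_greedy` and the sample text are assumptions.

```perl
use strict;
use warnings;

# Greedy wrap: fit as many words as possible on each line before
# starting a new one. A word longer than $width gets a line to itself.
sub wrap_greedy {
    my ($text, $width) = @_;
    my @lines;
    my $line = '';
    for my $word (split ' ', $text) {
        if ($line eq '') {
            $line = $word;
        }
        elsif (length($line) + 1 + length($word) <= $width) {
            $line .= " $word";
        }
        else {
            push @lines, $line;
            $line = $word;
        }
    }
    push @lines, $line if $line ne '';
    return join "\n", @lines;
}

print wrap_greedy('The quick brown fox jumps over the lazy dog', 15), "\n";
```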

Next week is all about the European Perl Conference so I very much doubt if I’ll have time to try the Perl Weekly Challenges. But I hope to be able to try more of the problems in the coming weeks.

A Subtle Bug

Earlier this week, I saw this code being recommended on Stack Overflow. The code contains a nasty, but rather subtle bug. The version I saw has been fixed now, but I thought there were some interesting lessons to learn by looking at the problems in some detail.

Let’s start by working out what the bug is. Here’s the code:
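The listing did not survive in this copy of the post. Judging by the description that follows, it was along these lines; the file name and the error message are assumptions.

```perl
use strict;
use warnings;

my $file = 'not.there';   # hypothetical file name

open my $fh, '<', $file || die "Can't open $file: $!";
```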

At first glance, it seems fine. It uses the common “open or die” idiom. It uses the modern approach of using a lexical filehandle. It even uses the three-argument version of “open()”. Code like this has appeared in huge numbers of Perl programs for years. What can possibly be the problem?

I’ll give you a couple of minutes to have a closer look and work out what you think the problem is.

[ … time passes … ]

So what do you think? Do you see what the problem is?

The problem is that there is no error checking.

“What do you mean, Dave?” I hear you say. “There’s error checking there – I can see it plainly.” Some of you might even be wondering if I’m going senile.

And, yes, it certainly looks like it checks for errors. But the error checking doesn’t work. Let me prove that to you. We can check it with a simple command line program.

You would expect to see the “die” message there. But it doesn’t appear. Ok, perhaps I’m lying. Perhaps I really do have a file called “not.there”. Let’s try another, slightly different, version of the code.

And there we see the error message. That file really doesn’t exist.
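The two command-line experiments are not reproduced in this copy. A single script along the following lines shows the same contrast. It assumes there is no file called not.there in the current directory, and the `outcome` helper is made up for this illustration.

```perl
use strict;
use warnings;

my $file = 'not.there';   # assumed not to exist in the current directory

# Run a code reference and report whether it died.
sub outcome {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return $ok ? 'no error raised' : "died: $@";
}

# Version 1: no parentheses around the arguments to open().
my $without = outcome(sub {
    open my $fh, '<', $file || die "Can't open $file: $!\n";
});

# Version 2: identical, apart from the parentheses.
my $with = outcome(sub {
    open(my $fh, '<', $file) || die "Can't open $file: $!\n";
});

print "Without parentheses: $without\n";
print "With parentheses:    $with";
```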

So what went wrong with the first version? Of course, a good way to start working that out is to compare the two versions and look at the differences between them. The difference here is that when I put parentheses around the parameters to “open()” it started working. And when you fix things by adding parentheses it’s a pretty sure bet that the problem comes down to precedence.

The order of precedence for Perl operators is listed in perldoc perlop. If you look at that list you’ll see that the “or” operator we used (“||”) is at position 16 on the list. But what other operators are we using in our code? The answer is lurking down at position 21 on the list. When we call a Perl built-in function without using parentheses around the parameters, it’s known as a list operator. And list operators have rather low precedence.

All of which means that our original code is actually parsed as if we had written it like this:

Notice the parentheses that have appeared around $file and (crucially) the whole “or die” clause. That means that the bracketed expression is evaluated and passed to “open()” as its third argument. And when Perl evaluates that expression, it does that clever “Boolean short-circuiting” thing. An expression of “A || B” evaluates A first and if that is true, it returns it. Only if A is false will it go on to evaluate B and return that. In our case, the filename will always be true (well, unless you have a file called “0”) so the second half of the expression (the “or die…” bit) is never evaluated and is, effectively, ignored.
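The reparsed statement is missing from this copy. The grouping described above corresponds to the first statement in this sketch, and the remaining lines look at the third argument in isolation. The file name and messages are assumptions.

```perl
use strict;
use warnings;

my $file = 'not.there';   # hypothetical file name

# How Perl groups the original statement: || binds to $file alone, so
# the die() call is part of the third argument to open().
open(my $fh, '<', ($file || die "Can't open $file: $!"));

# A non-empty file name is true, so || returns it without ever
# evaluating the right-hand side.
my $third_arg = $file || die "never reached";
print "open() receives: $third_arg\n";

# The one exception: the string "0" is false, so die() does run.
my $zero  = eval { '0' || die "reached for a file called 0\n" };
my $error = $@;
print "With '0': $error";
```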

Which is why I said, back at the start, that this code has no error checking at all – that’s literally true as the error checking has no effect at all.

So how do we fix it? Well, we’ve already seen one approach – you can explicitly add parentheses around the arguments to “open()”. But Perl programmers don’t like to use unnecessary punctuation and I’m sure I’ve seen this written without parentheses, so how does that work?

If you take another look at the table of operator precedence and look down below the list operators, you’ll see another “or” operator (the one that’s actually the word “or”, rather than punctuation). It’s right at the bottom of the list – at position 24. And that means we can use that version without needing the parentheses around the parameters to “open()”.
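A minimal sketch of that version, wrapped in a hypothetical helper called `open_for_reading` so that the failure can be caught and shown:

```perl
use strict;
use warnings;

sub open_for_reading {
    my ($file) = @_;
    # "or" binds more loosely than the list operator, so this groups as
    # open(my $fh, '<', $file) or die ...
    open my $fh, '<', $file or die "Can't open $file: $!\n";
    return $fh;
}

# not.there is assumed not to exist.
my $fh = eval { open_for_reading('not.there') };
print "Caught: $@" unless $fh;
```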

And that’s the version that you’ll see in most codebases. But, as we’ve seen, it’s vitally important to use the correct version of the “or” operator.

The worst thing about this bug is that it appears at the worst time. If your file exists and you can open it successfully, then everything works fine. Things only go wrong when… well, when things go wrong. If you can’t open your file for some reason, you won’t know about it. Which is bad.

So it’s important to test that your code works correctly when things go wrong. And that’s why we have modules like Test::Exception. You could write a test program like this:
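The test program is not preserved in this copy. The sketch below uses two hypothetical helpers that differ only in the operator, and it records the behaviour of both, so it passes as written. Test::Exception comes from CPAN rather than core Perl, and not.there is assumed not to exist.

```perl
use strict;
use warnings;
use Test::More;
use Test::Exception;   # from CPAN; not part of core Perl

# Two hypothetical helpers that differ only in the "or" operator used.
sub open_with_pipes {
    my ($file) = @_;
    open my $fh, '<', $file || die "Can't open $file: $!";
    return $fh;
}

sub open_with_or {
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";
    return $fh;
}

my $missing = 'not.there';   # assumed not to exist

lives_ok { open_with_pipes($missing) } '|| version: the failure goes unnoticed';
dies_ok  { open_with_or($missing) }    'or version: the failure is reported';

done_testing;
```

The check that matters for real code is `dies_ok`. Point it at the `||` helper and you have the kind of test described here.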

And it would fail every time. But if you switch to the other “or” operator, it will pass.

There’s one other approach you can take. You can use autodie in your code and just forget about adding “or die” to any of your calls to “open()”.
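A short sketch of the autodie approach. The `eval` is only there so the exception can be printed instead of ending the program, and not.there is assumed not to exist.

```perl
use strict;
use warnings;
use autodie;   # makes open() and friends throw an exception on failure

my $file = 'not.there';   # assumed not to exist

my $ok = eval {
    open my $fh, '<', $file;   # no "or die" needed
    1;
};
my $error = $@;
print "open() threw: $error" unless $ok;
```

The exception is an object of class autodie::exception, which turns into a readable message when printed.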

This is an easy bug to introduce into your code and a hard one to track down. Who’s confident that it doesn’t appear in any of their code?

Please Don’t Use CGI.pm

Earlier this week, the Perl magazine site, perl.com, published an article about writing web applications using CGI.pm. That seemed like a bizarre choice to me, but I’ve decided to use it as an excuse to write an article explaining why I think that’s a really bad idea.

It’s important to start by getting some definitions straight – as, often, I see people conflating two or three of these concepts and it always confuses the discussion.

  • The Common Gateway Interface (CGI) is a protocol which defines one way that you can write applications that create dynamic web pages. CGI defines the interface between a web server and a computer program which generates a dynamic page.
  • A CGI program is a computer program that is written in a manner that conforms to the CGI specification. The program optionally reads input from its environment and then prints to STDOUT a stream of data representing a dynamic web page. Such programs can be (and have been!) written in pretty much any programming language.
  • CGI.pm is a CPAN module which makes it easier to write CGI programs in Perl. The module was included in the Perl core distribution from Perl 5.004 (in 1997) until it was removed from Perl 5.22 (in 2015).

A Brief Introduction to CGI.pm

CGI.pm basically contained two sets of functions. One for input and one for output. There was a set for reading data that was passed into the program (the most commonly used one of these was param()) and a set for producing output to send to the browser. Most of these were functions which created HTML elements like <h1> or <p>. By about 2002, most people seemed to have worked out that these HTML creation functions were a bad idea and had switched to using a templating engine instead. One output function that remained useful was header() which gave the programmer an easy way to create the various headers required in an HTTP response – most commonly the “Content-type” header.

For at least the last ten years that I was using CGI.pm, my programs included the line:

as it was only the param() and header() functions that I used.

I should also point out that there are two different “modes” that you can use the module in. There’s an object-oriented mode (create an object with CGI->new and interact with it through methods) and a function-based mode (just call functions that are exported by the module). As I never needed more than one CGI object in a program, I always just used the function-based interface.
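To make the two modes concrete, here is a small sketch. The import list on the `use` line is an assumption based on the mention of param() and header() above, and CGI.pm has to be installed from CPAN on Perl 5.22 or later.

```perl
use strict;
use warnings;
use CGI qw[param header];   # assumed import list: just these two functions

# Function-based mode: call the imported functions directly.
my $response = header(-type => 'text/plain')
             . 'Hello, ' . (param('name') // 'World') . "\n";

# Object-oriented mode: the same calls as methods on a CGI object.
my $cgi         = CGI->new;
my $oo_response = $cgi->header(-type => 'text/plain')
                . 'Hello, ' . ($cgi->param('name') // 'World') . "\n";

print $response;
```

Both strings hold the same response; the choice between the two modes is purely one of style.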

Why Shouldn’t We Use CGI.pm Today?

If you’re using CGI.pm in the way I mentioned above (using it as a wrapper around the CGI protocol and ignoring the HTML generation functions), then it’s not actually a terrible way to write simple web applications. There are two problems with it:

  1. CGI programs are slow. They start up a Perl process for each request to the CGI URL. This is, of course, a problem with the CGI protocol itself, not the CGI.pm module. This might not be much of a problem if you have a low-traffic application that you want to put on the web.
  2. CGI.pm gives you no help building more complicated features in a web application. For example, there’s no built-in support for request routing. If your application needs to control a number of URLs, then you either end up with a separate CGI program for each URL or you shoe-horn them all into the same program and set up some far-too-clever mod_rewrite magic. And everyone reinvents the same wheels.

Basically, there are better ways to write web applications in Perl these days. CGI.pm was removed from the Perl core distribution in 2015 because people didn’t want to encourage the use of an outdated technology.

What are these better methods? Well, anything based on an improved gateway interface specification called the Perl Server Gateway Interface (PSGI). That could be a web framework like Dancer2, Catalyst or Web::Simple or you could even just use raw PSGI (by using the toolkit in the Plack distribution).

Often when I suggest this to people, they think that the PSGI approach is going to be far more complex than just whipping up a quick CGI program. And it’s easy to see why they might think that. All too often, an introduction to PSGI starts by building a relatively powerful (and, therefore, complicated) web application using Catalyst. And while Catalyst is a fine web framework, it’s not the simplest way to write a basic web application.

But it doesn’t need to be that way. You can write PSGI programs in “raw PSGI” without reaching for a framework. Sure, you’ll still have the problems listed in my point two above, but when you want to address that, you can start looking at the various web frameworks. Even so, you’ll have three big benefits from moving to PSGI.
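For a sense of scale, a complete raw PSGI application can be this small. The greeting and the use of PATH_INFO are just for illustration.

```perl
use strict;
use warnings;

# A PSGI application is a code reference. It receives the request
# environment as a hash reference and returns an array reference of
# [ status, headers, body ].
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} // '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello from $path\n" ],
    ];
};
```

Saved as app.psgi, with the assignment as the last statement so that the file returns the code reference, this can be served with Plack's plackup command.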

The Benefits of PSGI

As I see it, there are three huge benefits that you get from PSGI.

Software Ecosystem

The standard PSGI toolkit is called Plack. You’ll need to install that. That will give you adapters enabling you to use PSGI programs in pretty much any web deployment environment. It also includes a large number of plugins and extensions (often called “middleware”) for PSGI. All of this software can be added to your application really simply. And any bits of your program that you don’t have to write is always a big advantage.

Testing and Debugging

How do you test your CGI program? Probably, you use something like Selenium (or, perhaps, just LWP) to fire requests at the server and see what results you get back.

And how about debugging any problems that your testing finds? All too often, the debugging that I see is warn() statements written to the web server error log. Actually, when answering questions on Stack Overflow, often the poster has no idea where to find the error log and we need to resort to something like use CGI::Carp 'fatalsToBrowser', which isn’t exactly elegant.

A PSGI application is just a subroutine. So it’s trivial for testing tools to call the subroutine with the correct parameters. This makes testing PSGI programs really easy (and all of the tools to do this are part of the Plack distribution I mentioned above). Similarly, there are tools that make debugging a PSGI program far easier than debugging the equivalent CGI program.
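Because the application is just a subroutine, a test can call it directly. This sketch uses only core Test::More, a made-up application and a hand-built, deliberately incomplete environment hash.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical application under test.
my $app = sub {
    my ($env) = @_;
    return [ 404, [ 'Content-Type' => 'text/plain' ], ['Not found'] ]
        unless $env->{PATH_INFO} eq '/';
    return [ 200, [ 'Content-Type' => 'text/plain' ], ['Hello'] ];
};

# Call the subroutine directly; no web server is involved.
my $res = $app->({ REQUEST_METHOD => 'GET', PATH_INFO => '/' });
is $res->[0],    200,     'root returns 200';
is $res->[2][0], 'Hello', 'root returns the greeting';

$res = $app->({ REQUEST_METHOD => 'GET', PATH_INFO => '/missing' });
is $res->[0], 404, 'unknown path returns 404';

done_testing;
```

In practice you would more likely use Plack::Test, whose test_psgi function builds a complete environment from an ordinary HTTP request object.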

Deployment Flexibility

This, to me, is the big one. I talked earlier about the performance problems that the CGI environment leads to. You have a CGI program that is only used by a few people on your internal network. And that’s fine. The second or so it takes to respond to each request isn’t a problem. But it proves useful and before you know it, many more people start to use it. And then someone suggests publishing it to external users too. The one-second responses stretch to five or ten seconds, or even longer and you start getting complaints about the system. You know you should move it to a persistent environment like FastCGI or mod_perl, but that would require large-scale changes to the code and how are you ever going to find the time for that?

With a PSGI application, things are different. You can start by deploying your PSGI code in a CGI environment if you like (although, to be honest, it seems that very few people do that). Then when you need to make it faster, you can move it to FastCGI or mod_perl. Or you can run it as a standalone web service and configure your web proxy to redirect requests to it. Usually, you’ll be able to use exactly the same code in all of these environments.

In Conclusion

I know why people still write CGI programs. And I know why people still write them using CGI.pm – it’s what people know. It’s seen as the easy option. It’s what twenty-five years of web tutorials are telling them to do.

But in 2018 (and, to be honest, for most of the last ten years) that’s simply not the best approach to take. There are more powerful and more flexible options available.

Please don’t write code using CGI.pm. And please don’t write tutorials encouraging people to do that.

Categories
Conferences

Professional Programmer is Professional

(The image above was the first result I got when searching Google Images for a CC-licensed image for “professional programmer”.)

Two weeks ago, I wrote about the SEO workshop I’m running on Tuesday morning just before The Perl Conference in Glasgow this August. Today, I’d like to give a few more details about the other workshop I’m running that day. After lunch, I’m running a workshop called “The Professional Programmer”. What’s that about?

I came into programming through what was a very traditional route. I did a degree in Computer Studies which I finished in 1988. And for the last thirty years I’ve been working as a programmer for a number of different companies from tiny start-ups to huge multi-nationals.

But more and more, I’m working with people who didn’t come through the same route. It’s very common that I’ll be working with people who don’t have a degree. And it’s rare that I’ll work with someone who’s been in the industry as long as I have (for I am an Old Man). I’m not saying for a second that those people aren’t just as capable of doing the job as I am. But I am saying that I know stuff that some of those people won’t have worked out yet.

This certainly isn’t going to be me telling you stuff that I learned on my degree. To be honest, I can’t think of much on my degree that I’ve used in my career. On my degree course, SQL was introduced as a cutting-edge technology (one lecturer even described it as a reporting tool that could be used by end-users!) We also did classes on COBOL and Assembler. No, there’s very little there that would be of much interest to people working in the modern software industry.

A few days ago, I started to sketch out some of the things I might want to talk about. I think the plan is going to be that we start with some of the technologies that sit alongside the programming that we all do every day and slowly move away from hard tech into the fluffier areas of the industry that we work in. Here are some of the topics I hope to cover.

Adjacent Technologies

Ok, we all have a programming language or two under our belts. But what else do we need to know?

How well do you know the operating systems that you work on? What, for example, is the most obscure Unix tool that you know? At what level do you understand the networking features that your code almost certainly makes use of? Can you debug network connectivity problems? To what level of detail do you really know the HTTP request-response cycle?

What data storage systems do you use? How well do you know SQL? Do you use NoSQL systems as part of your technology stack? If not, could you? Do you cache things at the right level in your application? Should you be caching more things? Do you have a CDN? Do you know what a CDN is and what it does for you?

Are you an expert in the tools that you use every day? I don’t care if you prefer vi or emacs (or, I suppose, anything else), but are you an expert in using your editor? I’m happy to admit this is one area where I fall short. I bounce between many different editors and I’ve never really become an expert in any of them.

Are you the person in your team that people come to with git questions? Or do you just know half a dozen commands that seem to do approximately the right thing most of the time? Your source code control system is a vital part of your workflow. Get to know it well.

How well do you know your continuous integration environment? Do you know which buttons to press to get a release built? Or are you the person who is constantly tweaking and improving the Jenkins jobs that power the release process? And what underlies your release process? Are you building RPMs or some other type of package or do you build a new Docker container and deploy that in the cloud? How well do you know the cloud provider that you’re using? Are there new AWS features that could replace parts of your existing infrastructure? (The answer to that question is always yes.)

How good are your tests? What’s your unit test coverage? How many different types of automated testing does your system use? Do you know the difference between unit tests and integration tests? What tools are you using for automated testing? How well do you know how to use them? Is there something better out there?

Software Engineering & Architecture

At what level are you involved in architectural decisions? How do you decide on a design for your application? Are you using largely procedural code or does your system make good use of classes? Is it possible for a system to be too object-oriented? How do you know when you’ve crossed that boundary?

How is your knowledge of design patterns? Do you know what a factory class is? Do you know why you would use one? Have you ever written one? Do you have an opinion on MVC designs? What is good and bad about the frameworks that you use? What would you like to do differently?

Are you maintaining a monolithic codebase from fifteen years ago? Do you have a plan to modernise your code? Have you implemented any microservices yet? How do you go about replacing small parts of a monolith with microservices? What are the advantages of a microservices architecture?

Is your team using an agile software development methodology? Is it Scrum, Kanban, XP or do you just cherry-pick bits from all of them? Is your team really agile or do you just pay lip service to agile techniques? Are you self-managing? How accurate are your estimates? Can you improve that? How well do you know the Agile Manifesto? To what extent do you agree with it?

The Business

What does your company do? What does success look like? How does what you do contribute to that success? How well do you understand the business? Do you have suggestions for improving the business outside of your team?

Do you understand the environment that the company operates in? What do you know about the economic pressures on the company? Is the company publicly or privately owned? Do you have shares in the company? Do you know what they are worth?

Personal Development

What level are you currently at? Do you know what you need to do in order to progress in the company? Do you have a plan to achieve that? Do you have a mentor inside the company who can help you come up with that plan? Will the company give you budget for training and personal development?

Do you need to communicate with business people inside the company? How good is your written and spoken English? Do you know how to use apostrophes? Do you need to give presentations to people in the company? How comfortable are you with public speaking? Can you get better at that?

How well-known are you outside of the company? Can you blog about your technical expertise? (You probably need to be careful if you’re blogging about stuff you do at work.) Do you speak at conferences? Should you start speaking at conferences?

As you can see, when I start writing this stuff down, it can easily all get a bit “stream of consciousness”. Hopefully in the five weeks between now and the workshop, I can tie it down and impose a little more structure on it.

But not too much structure. I’d like to keep this pretty loose. I want the workshop to be very much a two-way discussion.

I hope that sounds interesting to some of you. The workshop will be in the afternoon of Tuesday 14th August. To attend any of the workshops, you’ll need to buy an extra ticket. Tickets for either of my half-day workshops are £75.

I hope to see some of you there. Please let me know in the comments if you have any questions about this workshop.

Line of Succession

I’m a republican. No… wait… come back! That’s not what I mean.

I’m a long way from being a supporter of the Republican Party. I mean “republican” in its older meaning of “someone who thinks their country should be a republic”. That is to say, I’m not a big fan of the British royal family.

But while I believe that the UK should get rid of the royal family, I’m also fascinated by them. In particular, I’m fascinated by the laws that determine the line of succession – that is the list of people in line to take over the throne.

When I was a child I believed that the line of succession was a big list that had every British person’s name on it and that it would only take a single catastrophic event to propel my name to the top of that list. Later on I discovered the Act of Settlement (1701), which is the law which actually defines the line of succession (modulo a few later tweaks). I was disappointed to find that there were only a few thousand people on the list (and that didn’t include me!) and also that a lot of the people on the list weren’t British (largely due to Queen Victoria’s children marrying royalty from all over Europe).

A few months ago, I started to think about building a web site that would allow people to explore the line of succession through time. And over the last few weeks, I have built the site. It’s at lineofsuccession.co.uk. On the main page, you will see the current line of succession. And in the navigation bar is a drop-down menu that allows you to move to a few interesting dates (the days that the last four monarchs came to the throne) and a date picker allowing you to choose any random date.

The code is, of course, on GitHub. The web app is a pretty standard Dancer2 application which really doesn’t do anything clever. Most of the complexity in an application like this is in the data gathering.

Currently I have just over a hundred people in the database. That’s most of the descendants of Edward VII (there are a few lines that I haven’t completely filled out yet), but eventually I want to go back and include all of the descendants of Electress Sophia (the person who the crown was “settled on” in the Act of Settlement). I’ve heard estimates that she has somewhere between 5,000 and 6,000 descendants. So I have a bit of work to do there!

Other than more people, I have a few other things I’d like to add to the site:

  • More names and titles: People on the line of succession tend to have many titles during their lives. The data model already supports the concept of a name that is only valid on a range of dates (see the current queen described as Princess Elizabeth of York on the day before her father became king, Princess Elizabeth when he was king, Duchess of Edinburgh when she got married and, finally, Queen Elizabeth II when she became queen). But tracking down and adding all that data is hard work.
  • Excluded people: Catholics are excluded from the line of succession. And people who married Catholics were also excluded until recently. Oh, and obviously children born out of wedlock (that’s more common than you might expect in some of the more obscure branches of the modern Windsor family). Of course, people can convert to (and from) Catholicism at any time, so supporting that in the app would mean implementing some kind of “exclusions” data. But it would be good to show these people, perhaps in a dimmer font.
  • Tree views: Today I added text to each person showing their relationship to the monarch. That can help a reader to visualise the family tree, but I’d like to make it more explicit. Nested lists could make it easy to see the relationships. And, later, perhaps show the whole tree using SVG.
  • Position changes: Sometimes the line of succession feels a bit like the pop charts. Take Prince Harry, for example. He entered the chart at position 3 and stayed there until Prince George bumped him down to number 4 and Princess Charlotte pushed him down to 5. He’s only likely to fall further in the coming years. The same thing happened to Princess Anne, who was number 3 when she was born, but currently languishes down at number 12. I think it would be interesting to plot those changes over time.
  • Make it prettier: Bootstrap does the job. It allows design dunces like me to get a reasonable-looking site up and running in no time. But it’s not very regal.
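The idea of a name that is only valid for a range of dates, mentioned in the first item above, can be sketched in a few lines. This is not the site's actual schema; the data structure and the `name_on` function are assumptions made for illustration.

```perl
use strict;
use warnings;

# Each entry pairs a name with the ISO 8601 date it came into use.
# ISO dates sort correctly as plain strings, so no date module is needed.
my @names = (
    [ '1926-04-21', 'Princess Elizabeth of York' ],
    [ '1936-12-11', 'Princess Elizabeth' ],
    [ '1947-11-20', 'Duchess of Edinburgh' ],
    [ '1952-02-06', 'Queen Elizabeth II' ],
);

# Return the name in use on the given date, or undef if the date is
# earlier than the first entry.
sub name_on {
    my ($date) = @_;
    my $current;
    for my $entry (@names) {
        last if $entry->[0] gt $date;
        $current = $entry->[1];
    }
    return $current;
}

print name_on('1936-12-10'), "\n";   # the day before her father became king
```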

Anyway, there’s my current itch scratched. And, as in so many cases, it’s just given me more itches. But please let me know if you find the site at all interesting or useful.