Categories
Programming

Please Don’t Use CGI.pm

Earlier this week, the Perl magazine site, perl.com, published an article about writing web applications using CGI.pm. That seemed like a bizarre choice to me, but I’ve decided to use it as an excuse to write an article explaining why I think that’s a really bad idea.

It’s important to start by getting some definitions straight – as, often, I see people conflating two or three of these concepts and it always confuses the discussion.

  • The Common Gateway Interface (CGI) is a protocol which defines one way that you can write applications that create dynamic web pages. CGI defines the interface between a web server and a computer program which generates a dynamic page.
  • A CGI program is a computer program that is written in a manner that conforms to the CGI specification. The program optionally reads input from its environment and then prints to STDOUT a stream of data representing a dynamic web page. Such programs can be (and have been!) written in pretty much any programming language.
  • CGI.pm is a CPAN module which makes it easier to write CGI programs in Perl. The module was included in the Perl core distribution from Perl 5.004 (in 1997) until it was removed from Perl 5.22 (in 2015).

A Brief Introduction to CGI.pm

CGI.pm basically contained two sets of functions. One for input and one for output. There was a set for reading data that was passed into the program (the most commonly used one of these was param()) and a set for producing output to send to the browser. Most of these were functions which created HTML elements like <h1> or <p>. By about 2002, most people seemed to have worked out that these HTML creation functions were a bad idea and had switched to using a templating engine instead. One output function that remained useful was header() which gave the programmer an easy way to create the various headers required in an HTTP response – most commonly the “Content-type” header.

For at least the last ten years that I was using CGI.pm, my programs included the line:

as it was only the param() and header() functions that I used.

I should also point out that there are two different “modes” that you can use the module in. There’s an object-oriented mode (create an object with CGI->new and interact with it through methods) and a function-based mode (just call functions that are exported by the module). As I never needed more than one CGI object in a program, I always just used the function-based interface.

Why Shouldn’t We Use CGI.pm Today?

If you’re using CGI.pm in the way I mentioned above (using it as a wrapper around the CGI protocol and ignoring the HTML generation functions), then it’s not actually a terrible way to write simple web applications. There are two problems with it:

  1. CGI programs are slow. They start up a Perl process for each request to the CGI URL. This is, of course, a problem with the CGI protocol itself, not the CGI.pm module. This might not be much of a problem if you have a low-traffic application that you want to put on the web.
  2. CGI.pm gives you no help building more complicated features in a web application. For example, there’s no built-in support for request routing. If your application needs to control a number of URLs, then you either end up with a separate CGI program for each URL or you shoe-horn them all into the same program and set up some far-too-clever mod_rewrite magic. And everyone reinvents the same wheels.

Basically, there are better ways to write web applications in Perl these days. It was removed from the Perl code distribution in 2015 because people didn’t want to encourage people to use an outdated technology.

What are these better methods? Well, anything based on an improved gateway interface specification called the Perl Server Gateway Interface (PSGI). That could be a web framework like Dancer2, Catalyst or Web::Simple or you could even just use raw PSGI (by using the toolkit in the Plack distribution).

Often when I suggest this to people, they think that the PSGI approach is going to be far more complex than just whipping up a quick CGI program. And it’s easy to see why they might think that. All too often, an introduction to PSGI starts by building a relatively powerful (and, therefore, complicated) web application using Catalyst. And while Catalyst is a fine web framework, it’s not the simplest way to write a basic web application.

But it doesn’t need to be that way. You can write PSGI programs in “raw PGSI” without reaching for a framework. Sure, you’ll still have the problems listed in my point two above, but when you want to address that, you can start looking at the various web frameworks. Even so, you’ll have three big benefits from moving to PSGI.

The Benefits of PSGI

As I see it, there are three huge benefits that you get from PSGI.

Software Ecosystem

The standard PSGI toolkit is called Plack. You’ll need to install that. That will give you adapters enabling you to use PSGI programs in pretty much any web deployment environment. It also includes a large number of plugins and extensions (often called “middleware”) for PSGI. All of this software can be added to your application really simply. And any bits of your program that you don’t have to write is always a big advantage.

Testing and Debugging

How do you test your CGI program? Probably, you use something like Selenium (or, perhaps, just LWP) to fire requests at the server and see what results you get back.

And how about debugging any problems that your testing finds? All too often, the debugging that I see is warn() statements written to the web server error log. Actually, when answering questions on StackOverflow, often the poster has no idea where to find the error log and we need to resort to something like use CGI::Carp 'fatalsToBrowser', which isn’t exactly elegant.

A PSGI application is just a subroutine. So it’s trivial for testing tools to call the subroutine with the correct parameters. This makes testing PSGI programs really easy (and all of the tools to do this are part of the Plack distribution I mentioned above). Similarly, there are tools debugging a PSGI program far easier than the equivalent CGI program.

Deployment Flexibility

This, to me, is the big one. I talked earlier about the performance problems that the CGI environment leads to. You have a CGI program that is only used by a few people on your internal network. And that’s fine. The second or so it takes to respond to each request isn’t a problem. But it proves useful and before you know it, many more people start to use it. And then someone suggests publishing it to external users too. The one-second responses stretch to five or ten seconds, or even longer and you start getting complaints about the system. You know you should move it to a persistent environment like FastCGI or mod_perl, but that would require large-scale changes to the code and how are you ever going to find the time for that?

With a PSGI application, things are different. You can start by deploying your PSGI code in a CGI environment if you like (although, to be honest, it seems that very few people do that). Then when you need to make it faster, you can move it to FastCGI or mod_perl. Or you can run it as a standalone web service and configure your web proxy to redirect requests to it. Usually, you’ll be able to use exactly the same code in all of these environments.

In Conclusion

I know why people still write CGI programs. And I know why people still write them using CGI.pm – it’s what people know. It’s seen as the easy option. It’s what twenty-five years of web tutorials are telling them to do.

But in 2018 (and, to be honest, for most of the last ten years) that’s simply not the best approach to take. There are more powerful and more flexible options available.

Please don’t write code using CGI.pm. And please don’t write tutorials encouraging people to do that.

Categories
Programming

Easy PSGI

When I write replies to questions on StackOverflow and places like that recommending that people abandon CGI programs in favour of something that uses PSGI, I often get some push-back from people claiming that PSGI makes things far too complicated.

I don’t believe that’s true. But I think I know why they say it. I think they say it because most of the time when we say “you should really port that code to PSGI” we follow up with links to Dancer, Catalyst or Mojolicious tutorials.

I know why we do that. I know that a web framework is usually going to make writing a web app far simpler. And, yes, I know that in the Plack::Request documentation, Miyagawa explicitly says:

Note that this module is intended to be used by Plack middleware developers and web application framework developers rather than application developers (end users).

Writing your web application directly using Plack::Request is certainly possible but not recommended: it’s like doing so with mod_perl’s Apache::Request: yet too low level.

If you’re writing a web application, not a framework, then you’re encouraged to use one of the web application frameworks that support PSGI (http://plackperl.org/#frameworks), or see modules like HTTP::Engine to provide higher level Request and Response API on top of PSGI.

And, in general, I agree with him wholeheartedly. But I think that when we’re trying to persuade people to switch to PSGI, these suggestions can get in the way. People see switching their grungy old CGI programs to a web framework as a big job. I don’t think it’s as scary as they might think, but I agree it’s often a non-trivial task.

Even without using a web framework, I think that you can get benefits from moving software to PSGI. When I’m running training courses on PSGI, I emphasise three advantages that PSGI gives you over other Perl web development environments.

  1. PSGI applications are easier to debug and test.
  2. PSGI applications can be deployed in any environment you want without changing a line of code.
  3. Plack Middleware

And I think that you can benefit from all of these features pretty easily, without moving to a framework. I’ve been thinking about the best way to do this and I think I’ve come up with a simple plan:

  • Change your shebang line to /usr/bin/plackup (or equivalent)
  • Put all of your code inside my $app = sub { ... }
  • Switch to using Plack::Request to access all of your input parameters
  • Build up your response output in a variable
  • At the end of the code, create and return the required Plack response (either using Plack::Response or just creating the correct array reference).

That’s all you need. You can drop your new program into your cgi-bin directory and it will just start working. You can immediately benefit from easier testing and later on, you can easily deploy your application in a different environment or start adding in middleware.

As an experiment to find how easy this was, I’ve been porting some old CGI programs. Back in 2000, I wrote three articles introducing CGI programming for Linux Format. I’ve gone back to those articles and converted the CGI programs to PSGI (well, so far I’ve done the programs from the first two articles – I’ll finish the last one in the next day or so, I hope).

It’s not the nicest of code. I was still using the CGI’s HTML generation functions back then. I’ve replaced those calls with HTML::Tiny. And they aren’t very complicated programs at all (they were aimed at complete beginners). But I hope they’ll be a useful guide to how easy it is to start using PSGI.

My programs are on Github. Please let me know what you think.

If you’re interested in modern Perl Web Development Techniques, you might find it useful to attend my upcoming two-day course on the subject.

Update: On Twitter, Miyagawa reminds me that you can use CGI::Emulate::PSGI or CGI::PSGI to run CGI programs under PSGI without changing them at all (or, at least, changing them a lot less than I’m suggesting here). And that’s what I’d probably do if I had a large amount of CGI code that I wanted to to move to PSGI quickly. But I still think it’s worth showing people that simple PSGI programs really aren’t any more complicated than simple CGI programs.

Categories
Speaking

Modern Perl Web Development

Modern Web Development with Perl from Dave Cross

Last Saturday was the annual London Perl Workshop. I’ll have more to say about the day later[1], but I just wanted to take the time to share the slides for the workshop that I ran in the morning. It was a quick guide to modern Perl web development. And, as far as I’m concerned, that basically means PSGI and Plack.

Update: People asked me to put the example PSGI apps from the workshop online somewhere. They’re now all on github. Let me know if you find them at all useful.

[1] Executive summary: a wonderful day, thanks to everyone who was involved – particularly Mark Keating.