Many people’s first experience of Perl comes when they download a free CGI script from the web. In this article Dave Cross discusses why that might be a bad way to start.
This article was originally the lead article on perl.com in January 2002.
Introduction
No matter how much we try to convince people that Perl is a multi-purpose programming language, we’d be deluding ourselves if we didn’t admit that the majority of programmers first come into contact with Perl through their experience with CGI programs. People have a small Web site and one day they decide that they need a guest book, a form mail script or a hit counter. Because these people aren’t programmers, they go out onto the Web to see what pre-written scripts they can find.
And there are plenty to choose from. Try searching on “CGI scripts” at Google. I found about 2 million hits. The first two were those well-known sites – Matt’s Script Archive and the CGI Resource Index. Our Web site owner will visit one of these sites, find the required scripts and install them on his site. What could be simpler? See, the Web is as easy as people make it out to be.
In this article, I’ll take a closer look at this scenario and show that all is not as rosy as I’ve portrayed it above.
CGI Script Quality
An important factor that Google takes into account when displaying search results is the number of links to a given site. Google assumes that if there are a large number of links to a given Web page, then it must be a well-known page and that Google’s visitors will want to visit that site first.
Notice that I said “well-known” in that previous paragraph. Not “useful” or “valuable”. Think about this for a second. The types of people that I described in the introduction are not programmers. They certainly aren’t Perl programmers. Therefore, they are in no position to make value judgments on the Perl code that they download from the Internet. This means that the “most popular” site becomes a self-fulfilling prophecy. The best known site is listed first on the search engines. More people download scripts from that site, assuming that the most popular site must have the highest quality scripts, and so the popular sites end up becoming more popular. At no point does any kind of quality control enter into the process.
OK, so that’s not strictly true. If the scripts from a particular site just didn’t work at all, then word would soon get out and that site’s scripts would become unpopular. But what if the problems were more subtle and didn’t manifest themselves on all sites? Here is a list of some potential problems:
- Not checking the results of an open call. This will work fine if the expected file exists and has the right permissions. But what happens when the file doesn’t exist? Or it exists but the CGI process doesn’t have permissions to read from it or write to it?
- Bad CGI parameter parsing code. CGI parameter parsing is one of those things that is easy to do badly and hard to do well. It’s simple enough to write a parser function that handles most cases, but does it handle both GET and POST requests? What about keys with multiple associated values? And does it process file uploads correctly?
- Lack of security. Installing a CGI program allows anyone with an Internet connection to run a program on your server. That’s quite a scary thing to allow. You’d better be well aware of the security implications. Of course, if people only ever run the script from your HTML form, then everything will probably be fine, but a cracker won’t do that. He’ll fire “interesting” sets of parameters at your script in an attempt to find its weaknesses. Suddenly a form mail script is being used to send copies of vital system files to the cracker.
It’s also worth bearing in mind that because these scripts are available on the Web, crackers can easily get the source code. They can then work out any insecurities in the scripts and exploit them. Recently, a friend’s Web site came under attack from crackers and amongst the traces left in the access log were a large number of calls to well-known CGI scripts. For this reason, it is even more important that you are careful about security when writing CGI scripts that are intended to be used by novice Webmasters.
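As a minimal sketch of the first problem in the list above, the following Perl fragment checks the result of open and reports why it failed. The function name read_guestbook and the file path are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line of a data file, or die with a
# message that includes the system error ($!) if the file cannot be opened.
sub read_guestbook {
    my ($path) = @_;

    # open returns false on failure, so test it rather than assuming success.
    open(my $fh, '<', $path)
        or die "Cannot open '$path' for reading: $!\n";

    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Usage: trap the failure with eval instead of letting the script carry on
# with an unopened filehandle. The path below is illustrative only.
my @entries = eval { read_guestbook('/no/such/dir/guestbook.txt') };
if ($@) {
    print "Could not load guest book: $@";
}
```

A script that skips the check would simply read nothing from the filehandle and go on to show an empty guest book, which is much harder to diagnose than an explicit error message.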
Setting a Good Example
Although the people who are downloading these scripts aren’t usually programmers, there often comes a time when they want to start changing the way a program works and perhaps even writing their own CGI programs. When this time comes, they will go to the scripts they already have for examples of how to write them. If the original script contained bad programming practices, then these will be copied in the new scripts. This is the way that many bad programming practices have become so common among Perl scripts. I, therefore, think that it’s a good idea for any publicly distributed programs to follow best programming practices as much as possible.
Script Quality – A Checklist
So now we have an obvious problem. I said before that the people who are downloading and installing these scripts aren’t qualified to make judgements on the quality of the code. Given that there are some problematic scripts out there, how are they supposed to know whether they should be using a particular script that they find on the Web?
It’s a difficult question to answer, but there are some clues that you can look for that give an idea of how well-written a script is. Here’s a brief checklist:
- Does the script use -w and use strict? The vast majority of Perl experts recommend using these tools when writing Perl programs of any level of complexity. They make any Perl program more robust. Anyone distributing Perl programs without them probably doesn’t know as much Perl as they think they do.
- Does the script use Perl’s taint mode? Accepting external data from a Web browser is a dangerous business. You can never be sure what you’ll get. If you add -T to a program’s shebang line, then Perl goes into taint mode. In this mode Perl distrusts any data that it gets from external sources. You need to explicitly check this data before using it. Using -T is a sign that the author is at least thinking about CGI security issues.
- Does the script use CGI.pm? Since Perl 5.004, CGI.pm has been a part of the standard Perl distribution. This module contains a number of functions for handling various parts of the CGI protocol. The most important one is probably param, which deals with the parsing of the query string to extract the CGI parameters. Many CGI scripts write their own CGI parameter parsing routine that is missing features or has bugs. The one in CGI.pm has been well-tested over many years in thousands of scripts – why attempt to reinvent it?
- How often is the script updated? One reason for a script not to use CGI.pm might be that it hasn’t been updated since the module was added to the Perl distribution. This is generally a bad sign. You should look for scripts that are kept up to date. If there hasn’t been a new version of the script for several years, then you should probably avoid it.
- How good is the support? Any program is of limited use if it’s unsupported. How do you get support for the program? Is there an e-mail address for the author? Or is there a support mailing list? Try dropping an e-mail to either the author or the mailing list and see how quickly you get a response.
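The first two checklist items can be illustrated with a short sketch. It turns on warnings and use strict, and it validates an e-mail address with a regular expression capture, which is the kind of explicit check that taint mode forces on external data. The function name clean_email and the deliberately strict pattern are assumptions made for this example, not code from any particular script; a deployed CGI program would also carry -T on its shebang line.

```perl
use strict;
use warnings;   # lexically scoped counterpart of the -w switch

# Hypothetical validator: return the address if it consists only of
# characters we expect, or nothing (undef in scalar context) otherwise.
# Under taint mode (-T), a value extracted through a regex capture like
# this one is treated as untainted.
sub clean_email {
    my ($input) = @_;
    return unless defined $input;

    # Deliberately conservative pattern (an assumption for this sketch):
    # no whitespace, newlines or shell metacharacters are allowed.
    if ($input =~ /\A([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)\z/) {
        return $1;
    }
    return;
}

# Usage: an ordinary address passes; one with an embedded newline, as might
# be used to smuggle extra mail headers into a form mail script, is rejected.
for my $candidate ('dave@example.com',
                   "x\@example.com\nBcc: victim\@example.org") {
    my $email = clean_email($candidate);
    print defined $email ? "accepted: $email\n" : "rejected\n";
}
```

The pattern is an allow-list: it describes what an acceptable value looks like and rejects everything else. That is generally a safer design than trying to enumerate every dangerous character.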
nms – A New CGI Program Archive
Having spent most of this article being quite negative about existing CGI program archives, let’s now get a bit more positive. In the summer of 2001, a group of London Perl Mongers started to wonder what would be involved in writing a set of new CGI programs that could act as replacements for the ones in common use. After some discussion, the nms project was born. The name nms originally stood for a disparaging remark about one of the existing archives, but we decided that we didn’t want that kind of negativity in the name. By that time, however, the abbreviated name was in common usage so we decided to keep it – but it no longer stands for anything.
The objectives for nms were quite simple. We wanted to provide a set of CGI programs which fulfilled the following:
- As easy (or easier) to use as existing CGI scripts.
- Use best programming practices
- Secure
- Bug-free (or, at least, well supported)
sendmail rather than using one of the e-mail modules. In these cases, we decided that getting people to use the scripts (by not relying on CPAN) was more important to us than following best practices.
nms is a SourceForge project. You can get the latest released versions of the scripts from http://nms-cgi.sourceforge.net or, if you’re feeling braver, then you can get the leading edge versions from CVS at the project page at http://sourceforge.net/projects/nms-cgi/. Both of those pages also have links to the nms mailing lists. We have two lists, one for developers and one for support questions. There is also a FAQ that will hopefully answer any further questions that you have about the project.
Here is a list of the scripts available from nms:
- Countdown: Count down the time to a certain date
- Free For All Links: A simple Web link database
- Formmail: Send e-mails from Web forms
- Guestbook: A simple guest book script
- Random Image: Display a random image
- Random Links: Display a link chosen randomly from a list
- Random Text: Display a randomly chosen piece of text
- Simple Search: Simple Web site search engine
- SSI Random Image: Display a random image using SSI
- Text Clock: Display the time
- Text Counter: Text counter
A Plea for Help
So now we have a source of well-written CGI programs that we can point users to. What more needs to be done? Well, the whole point of writing this article was to ask more people to help. There’s always more work to do :-)
- Peer review. We think we’ve done a pretty good job on the scripts, but we’re not interested in resting on our laurels. The more people that look at the scripts, the more likely it is that we’ll catch bugs and insecurities. Please download the scripts and take a look at them. Pass any bugs on to the developers mailing list.
- Testing. We test the scripts on as many platforms with as many different configurations as we can, but we’ll always miss one or two. Please try to install the scripts on your systems and let us know about any problems you have.
- Documentation. Our documentation isn’t any worse than the documentation for the existing archives, but we think it could be much better. If you’d like to help out with this, then please get in touch with us.
- Advocacy. This is the most important one. Please tell everyone that you know about nms. Everywhere that you see people using other CGI scripts, please explain to them the potential problems and show them where to get the nms scripts. Having written these scripts, we feel it’s important that they get as wide exposure as possible. If you have any ideas for promoting nms, then please let us know.

I am the guy you are talking about in this article. I guess I am going to have to change my formmail scripts. I wish I knew how they did the exploits. It seems someone uses my formmail to send their own emails. The machine I am now on uses WHM for admin. In the logs I am seeing -remote- sends thousands of emails and I don’t have a clue how to stop it. I guess I have stumbled on a clue here. The problem is that, like the article says, I have used MSA scripts as a teacher and have hundreds of scripts that send email using the old formmail.pl code.
Any suggestions where to figure this out would be nice, thanks.
Hi Roger,
The first thing you need to do is to remove the compromised formmail program from your web server.
After that, feel free to email me (dave at this domain name) and I’ll be happy to see what I can do to help you fix this.