Many people are discovering that the scripting language Perl is the most useful language for getting many computing tasks done. Many of them fail to discover the vast amount of documentation that comes with the language. In this article Dave Cross takes an overview of “probably the best set of free documentation for any software currently available”.
This article first appeared in the May 1999 issue of the online Perl magazine PerlMonth.
If you’re reading this edition of PerlMonth then it’s a fair bet that you are already aware of Perl’s usefulness, flexibility and just all-round coolness. It is, however, a little more likely that you are unaware that within its distribution, Perl contains what is probably the best set of free documentation for any software currently available. In this article I want to introduce those of you who are less familiar with the Perl documentation to the huge amount of information that it contains and how it can almost certainly make your coding life easier.
Obviously with such a massive documentation set, this can only be the most cursory of tours, so hang on to your hats.
Where is this documentation then?
By default, when you install Perl (and here I’m talking about downloading and building your own Perl installation as most Unix users do – I’ll come to the ready-made ActivePerl for Win32 in a while) the Perl documentation set is installed in /usr/lib/perl5/pod. The documentation is in the form of a number of POD files. POD stands for plain old documentation and is a simple text-based format. It’s human-readable, so that if you really want to you can just open the files in your favourite text editor. This probably isn’t the best way to access them though and the standard Perl distribution supplies a number of better tools.
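To give a feel for how simple the format is, here is a minimal sketch of a script that carries its own POD. The script name (greet), its sections and the greeting sub are all invented for illustration; only the =head1 and =cut directives are standard POD.

```perl
#!/usr/bin/perl
use strict;
use warnings;

=head1 NAME

greet - print a friendly greeting (hypothetical example script)

=head1 SYNOPSIS

    greet [name]

=head1 DESCRIPTION

Prints a greeting for the given name, or for the world if no name is given.

=cut

# Perl skips everything between =head1 and =cut when it runs the script.

# Build the greeting text; falls back to 'world' when no name is supplied.
sub greeting {
    my ($name) = @_;
    $name = 'world' unless defined $name && length $name;
    return "Hello, $name!";
}

print greeting($ARGV[0]), "\n";
```

The documentation and the code live in the same file, so the POD tools can extract the former while perl runs the latter.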
The simplest, and most often used, of these tools is perldoc. This is a perl script that reads POD files and displays them to the user in a similar fashion to more (or less, if that takes your fancy). Typing
perldoc perl
on your command line will display the contents of the introductory POD file, which is called perl.pod. This is basically a list of all of the other POD files and what they contain. We’ll get back to looking at these files in more detail a little later on.
Another nice feature of perldoc is that it takes a command line switch of -f followed by the name of a Perl function. Called in this mode it will display the documentation for the given function. For example typing
perldoc -f localtime
will tell you all you ever needed to know about the ‘localtime’ function (we will be returning to this example later).
Full details on how to use perldoc can be found by typing
perldoc perldoc
at your command line.
Other POD tools supplied as standard with Perl will convert POD to various other common text formats. The most useful are probably pod2html and pod2man. Hopefully the names of these programs are self-explanatory. Once again, typing
perldoc pod2html
or
perldoc pod2man
will display full documentation for these programs.
There are, I realise, some people who don’t use a full Perl distribution. They write their scripts on a machine that doesn’t have Perl installed and then upload the scripts to their ISP’s web server to run them as CGIs. The best advice to these people would be to install a version of Perl on the machine that they develop on (it is, after all, completely free), but in a worst case scenario where this can’t happen, the documentation is all available on the World Wide Web at http://perldoc.perl.org/
That’s all very well, but I’m on Windows and I can’t find these POD files.
As I said earlier, POD is the standard way to distribute documentation with Perl. Well-behaved module authors will use it to document how to use their modules. However, Windows users are well known for liking things to be non-standard.
Most Windows users who use Perl have downloaded ActivePerl from ActiveState (http://www.activestate.com). ActiveState have built Perl from the standard sources and have bundled it up into a nice self-installing executable. This distribution does not contain the standard PODs. Instead they have used pod2html (or something very similar) to convert them all to HTML files. The ActivePerl installation will place an ‘ActivePerl’ program group on your Start Menu. This program group contains one item labelled ‘Online Help’ and selecting this icon will open your favourite web browser with the ActivePerl help system displayed in it. This consists of the standard Perl documentation together with a number of extensions that ActiveState have added. These are largely the documentation for Win32-specific Perl modules.
I hope my Windows readers won’t mind too much if for the rest of the article I only explain the Unix way of doing things. I’m sure you can work out the nearest Windows equivalent.
So What’s In This Wonderful Documentation Then?
Let’s take this opportunity to take a quick look at some of the files in the Perl documentation set and see what information can be gleaned from them. As mentioned above, typing
perldoc perl
will give a list of the Perl documentation contained within your distribution. Please note that the exact makeup of the documentation set varies slightly from release to release as the functionality of Perl changes (and the documentation is extended and improved). You can however be sure of two things. Firstly the documentation will include all of the files that I will discuss in this article and secondly, if there is ever a discrepancy between your distribution’s documentation and the contents of a Perl book that you are studying, it is a sure bet that the online documentation is more accurate.
The perl.pod file suggests the best order in which to read the rest of the pod files and I intend to follow that sequence in this discussion.
The first four pod files listed are perl (an overview of Perl), perldelta (a description of the changes in versions that have led to your current Perl distribution), perlfaq (the Perl Frequently Asked Questions) and perltoc (the table of contents for the rest of the documentation). Of these the beginner should certainly read the overview and the FAQ.
The FAQ really is very aptly named as any regular reader of the comp.lang.perl.misc newsgroup will readily testify. A huge amount of time is wasted in this newsgroup answering questions that already have a definitive answer in this document. The FAQ has been worked on over a number of years by Tom Christiansen and, latterly, Nathan Torkington and they have produced a document which every Perl programmer should be intimately acquainted with.
The FAQ is divided into nine sections, each of which addresses a different area of Perl programming. The areas covered in each section are:
- perlfaq1 – General Questions About Perl
- perlfaq2 – Obtaining and Learning about Perl
- perlfaq3 – Programming Tools
- perlfaq4 – Data Manipulation
- perlfaq5 – Files and Formats
- perlfaq6 – Regexes
- perlfaq7 – General Language Issues
- perlfaq8 – System Interaction
- perlfaq9 – Networking
Each section contains a large number of questions (and, more importantly, their answers). It’s worth looking in detail at a couple of the most frequently asked questions and seeing how easy it is to find their answers in the FAQ.
Easily the most frequently asked question in comp.lang.perl.misc over the last twelve months has been ‘Is Perl Y2K Compliant?’ This is asked at least once a week. Let’s see if we can find the answer in the FAQ. There are a number of ways to go about this. The simplest is to look in perlfaq.pod which has a complete list of all of the questions in each section of the FAQ. It is simple enough to look through this list to see if anything looks like it might fit the bill. Sure enough, in perlfaq4, in amongst a number of questions about dates, we see a question that says ‘Does Perl have a year 2000 problem? Is Perl Y2K compliant?’ That sounds like a likely candidate to me. Opening perlfaq4 and looking for the question, we find a well-written and informative article about the problem and how Perl programmers can deal with it. Problem solved!
Another very frequently asked question concerns checking the validity of email addresses. A number of people have devoted a lot of time to developing a regular expression that matches valid email addresses. Many of them don’t even bother reading the relevant RFCs first to find out what the full email address specification is. They create regular expressions which match email addresses from their own limited experience and frequently get it horribly wrong. They then post these as ‘solutions’ to the questions asked in comp.lang.perl.misc and the errors are propagated even further.
Let’s look in the FAQs to see whether we can get any more definitive information. Looking in perlfaq.pod and reading the list of questions, there is nothing that looks appropriate in perlfaq4 (data manipulation) or perlfaq6 (regular expressions). However, in the list for perlfaq9 (networking) we see ‘How do I check a valid mail address?’. Opening perlfaq9.pod and reading the article we find the question answered in great detail.
It is so easy to find the answers to these questions (and hundreds more) in the FAQ that it is difficult to know why anyone needing the answers would want to post a message to a newsgroup. They will end up waiting up to three days to read a number of differing opinions when the authoritative answer is already available on their system.
Moving on with our tour of perl.pod, the next few files we come to are perldata, perlsyn and perlop. For a beginner, these cover three of the most important areas of Perl programming: data types, syntax and operators respectively.
You can find a complete description of all of Perl’s data types in perldata.pod. It discusses scalars, lists, hashes and typeglobs in great detail. This is also the place to go if you need more explanation of the concept of context.
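Context is easier to show than to describe. The following is a minimal sketch rather than anything lifted from perldata.pod, and the variable names are my own: the same array, and the same call to localtime, give different results depending on whether a list or a single value is being asked for.

```perl
use strict;
use warnings;

my @pods = qw(perldata perlsyn perlop);

# List context: the array yields its elements.
my ($first) = @pods;

# Scalar context: the same array yields the number of elements it holds.
my $count = @pods;

# localtime also behaves differently in each context.
my @fields = localtime(0);    # list context: separate time fields
my $stamp  = localtime(0);    # scalar context: one human-readable string

print "first=$first count=$count fields=", scalar(@fields), "\n";
print "stamp=$stamp\n";
```

Note that the only difference between the $first and $count assignments is the pair of parentheses on the left-hand side.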
If you need to know more about the details of Perl syntax, then perlsyn.pod is where you need to look. This file covers details of declarations, simple and compound statements, loops and blocks. It also has an introduction to POD syntax (although this is covered in more detail in perlpod.pod). If you want to know how to get round the absence of a switch statement in Perl, this is also the file to read.
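One common way round the missing switch statement is a labelled bare block, which you can leave early with last. This is a sketch in that style; the classify sub and its categories are invented for illustration.

```perl
use strict;
use warnings;

# Sort a documentation file name into a (made-up) category.
sub classify {
    my ($name) = @_;
    my $kind;

    # A bare block acts as a loop that runs once, so 'last' jumps out of it.
    SWITCH: {
        if ($name =~ /^perlfaq\d*$/)        { $kind = 'faq';      last SWITCH; }
        if ($name =~ /^perl(?:re|op|syn)$/) { $kind = 'language'; last SWITCH; }
        if ($name =~ /^pod2/)               { $kind = 'tool';     last SWITCH; }
        $kind = 'other';    # the default case
    }
    return $kind;
}

print "$_ => ", classify($_), "\n" for qw(perlfaq4 perlop pod2html perlvar);
```

Each test that matches sets a value and leaves the block, so the final assignment only runs when nothing else matched.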
Perl has more operators than the average programming language and in order to make sense of them you should read perlop.pod. As well as obvious operations (arithmetic, relational, logical, etc.) other less obvious constructs are considered to be operators in Perl. These include quoting operators ('', "" and `` together with their canonical versions q//, qq// and qx//) and the input operators like <>. One of the most important sections of this file is right at the start where there is a complete list of Perl operators together with their precedence and associativity.
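As a quick illustration of the quoting operators, here is a small sketch. The strings are arbitrary, and the qx example assumes that an echo command is available on your system.

```perl
use strict;
use warnings;

my $file = 'perlop';

# Single quotes and q{} do not interpolate variables or escapes like \n.
my $single = 'see $file\n';
my $q      = q{see $file\n};

# Double quotes and qq{} do interpolate; qq{} lets you embed double quotes freely.
my $double = "see $file";
my $qq     = qq{see "$file"};

# Backticks and qx{} run a command and capture whatever it prints.
my $output = qx{echo hello};

print "$single\n$q\n$double\n$qq\n";
print $output;
```

The q//, qq// and qx// forms accept almost any delimiter, which is handy when the text itself contains quote characters.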
For a beginner Perl programmer the next five files on the list are also very important. They are perlre.pod, perlrun.pod, perlfunc.pod, perlvar.pod and perlsub.pod and they cover regular expressions, running Perl, Perl functions, special Perl variables and subroutines.
Regular expressions form the heart of many Perl programs and no-one can truly consider themselves a Perl programmer until they have a good understanding of this topic. The explanation of regular expressions in perlre.pod is as thorough as you could possibly want.
Perl has a large number of command line arguments, and the correct use of them can make many programs a lot shorter. They are the basis of a number of Perl one-liners. In perlrun.pod these arguments are all explained in detail, together with a number of other run-time considerations. For example, the usual way to run scripts under Unix is to make use of the ‘#!’ line to tell the operating system which interpreter to use. This is not available on many other operating systems (notably Windows) and this file discusses ways to emulate it.
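To see how much typing a switch can save, consider -n, which perlrun.pod describes as wrapping your code in a loop over the input lines. A one-liner such as perl -ne 'print if /perl/i' does roughly what this longer sketch spells out by hand. The sub name and the sample text are my own, and an in-memory filehandle stands in for a real input file.

```perl
use strict;
use warnings;

# Return the lines read from a filehandle that match a pattern.
sub matching_lines {
    my ($fh, $pattern) = @_;
    my @hits;
    while (my $line = <$fh>) {
        push @hits, $line if $line =~ $pattern;
    }
    return @hits;
}

# Sample input held in a string instead of a file on disk.
my $text = "Perl is fun\nshell too\nperldoc rocks\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @hits = matching_lines($fh, qr/perl/i);
close $fh;

print @hits;
```

Everything apart from the pattern test is scaffolding that -n supplies for you.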
You can find a complete list of Perl functions in perlfunc.pod. If you need to know the correct order of arguments to ‘read’ or what values are returned by ‘localtime’, then this is the file to read. Not reading this file in detail can also lead to you adding to the number of frequently asked questions being repeated in comp.lang.perl.misc. For a good example of this, let’s return to the localtime function. Reading the description of this function in perlfunc.pod, it clearly says:
“$year is the number of years since 1900, that is, $year is 123 in year 2023, and not simply the last two digits of the year.”
Why is it then that not a week goes by without someone asking why localtime only returns two digit dates and when this will be fixed? Is it any wonder that some of the regulars can get a little testy?
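In code, the fix that the documentation points to is a single addition. Here is a minimal sketch; the full_year sub is my own name for it, and the fixed epoch value is only there to make the output predictable.

```perl
use strict;
use warnings;

# Return the four-digit year for an epoch time (defaults to the current time).
sub full_year {
    my ($epoch) = @_;
    $epoch = time unless defined $epoch;
    my $year = (localtime($epoch))[5];    # element 5 is years since 1900
    return $year + 1900;
}

# 1_000_000_000 seconds after the epoch falls in September 2001 in every time zone.
print full_year(1_000_000_000), "\n";    # prints 2001
print full_year(), "\n";                 # the current year
```

Printing the raw value with a hard-coded '19' in front of it is exactly the mistake that made so many scripts display the year 2000 as 19100.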
It’s worth reiterating at this point the shortcut that perldoc offers if you just need to read the description of one particular function. Typing
perldoc -f [function name]
will display the required documentation.
The next file on our brief tour is perlvar.pod. This file explains all of Perl’s special variables. It lists them by their ‘Perlish’ name together with the more verbose name you can use if you add ‘use English’ to the top of your script. Everything is here from $_ (or $ARG) to $^X (or $EXECUTABLE_NAME).
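Here is a short sketch of the two naming styles side by side. The list of names being looped over is arbitrary; the -no_match_vars import is an option that the English module provides to avoid pulling in the regex match variables.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

my @seen;
for (qw(perlvar perlsub)) {
    # $ARG is simply another name for $_.
    print "short: $_  long: $ARG\n";
    push @seen, $ARG;
}

# $EXECUTABLE_NAME is the long name for $^X, the path of the running perl.
print "Both names agree on the interpreter\n" if $EXECUTABLE_NAME eq $^X;
```

The long names cost a little typing but can make a script much easier for a newcomer to read.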
The last file that I want to look at in any kind of depth on this tour is perlsub.pod. This file describes the declaration, definition and use of subroutines in Perl. This includes discussion of parameter passing, the differences between ‘my’ and ‘local’ and the relatively new (and still underused) concept of prototypes. If you need to know how to pass lists and hashes to subroutines without the values getting muddled up with your scalars, then this is the place to be.
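The usual answer to the muddling problem is to pass references. This is a minimal sketch of the idea; the describe sub and its data are invented for illustration.

```perl
use strict;
use warnings;

# Takes a plain scalar, an array reference and a hash reference.
sub describe {
    my ($title, $files, $topic_for) = @_;
    my @lines = ($title);
    for my $file (@$files) {
        push @lines, "$file: $topic_for->{$file}";
    }
    return @lines;
}

my @files     = qw(perlre perlrun);
my %topic_for = (
    perlre  => 'regular expressions',
    perlrun => 'running Perl',
);

# \@files and \%topic_for each arrive as a single value in @_.
# Passing @files and %topic_for directly would flatten them into one long list.
my @report = describe('Beginner files', \@files, \%topic_for);
print "$_\n" for @report;
```

Because each reference is a single scalar, the sub can tell exactly where the title ends, where the array begins and where the hash begins.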
We have looked at less than a third of the documentation files that come with the standard distribution of Perl, but already we have covered all of the basics of the language and have enough in-depth knowledge to allow us to write some really quite complex programs. Some of the major topics that we haven’t yet touched on are module usage, references, complex data structures and object oriented programming. All of these are covered in some considerable depth in the rest of the POD files and hopefully the descriptions in perl.pod will allow you to find them.
As I said at the start of the article, it’s impossible to do justice to the Perl documentation set in one short article, so this has been a very fleeting visit. If you only go away with one concept from this article, it should be the fact that the answer to your Perl problem is very likely to be sitting somewhere on your own computer and that looking for it there is likely to be much quicker than asking the question on Usenet.