Modules, References, Data Structures and Objects

Many people are discovering that the scripting language Perl is the most useful language for getting many computing tasks done. Many of them fail to discover the vast amount of documentation that comes with the language. In the second of his articles about “probably the best set of free documentation for any software currently available”, Dave Cross looks in more depth at more complex areas of Perl.

This article first appeared in the June 1999 issue of the online Perl magazine PerlMonth (which no longer seems to be online).

Note: All of the articles in the RTFM series are really old. It’s likely that many of the links no longer work. I’m leaving these articles online for historical reasons, but these days you should visit perl.org for links to the best current Perl resources.

Introduction

In last month’s column I gave you a brief overview of the absolutely essential parts of the Perl documentation set. This month I am going to go a little deeper into Perl and discuss the parts of the documentation that cover modules, references and complex data structures. At the end I’ll also touch on Perl objects. There are fewer files to cover this month so I’ll be able to go into a little more detail on each one.

Please note that in recent versions of Perl there have been a number of changes in the documentation that covers these topics. If you can’t find a particular file that I discuss below, it may be that it wasn’t included with your version of Perl. As always, upgrading to the latest stable version of Perl is highly recommended.

Modules

Most Perl programmers will come across modules by using them before they start to write them. For example, you all (I hope) start each and every one of your scripts with the line

[text]
use strict;
[/text]

This instruction loads and compiles the code within the library module strict.pm. This module enables certain stricter checks on the code that follows.
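To give a feel for what those stricter checks do, here is a minimal sketch of my own (the variable names are made up). It uses a string eval so that the compile error from a misspelt variable name can be reported without stopping the script.

```perl
use strict;
use warnings;

# Under strict, every variable must be declared before use (here with 'my').
my $total = 10;
print "Total is $total\n";

# A misspelt variable name is a compile-time error under strict.
# Compiling the snippet with a string eval lets us catch that error
# instead of letting it kill this script.
my $ok    = eval q{ my $count = 1; $cuont = 2; 1 };
my $error = $@;

print "strict complained: $error" unless $ok;
```

Without 'use strict', Perl would quietly create a new global variable called $cuont and the typo could go unnoticed.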

The standard Perl distribution comes with a very large number of modules. These files encapsulate various pieces of code into chunks that are only loaded when required. Other standard modules that you might use include:

  • Cwd – to load functions to get your current working directory
  • File::Basename – to break a complete path name into its component parts
  • integer – to force Perl to do all maths using integers
  • CGI – to make it far easier to write CGI scripts

The list of standard modules included in your version of Perl (and it does change from version to version) is contained in perlmodlib.pod. Remembering what I discussed last month about reading the Perl documentation, you can read this file by typing:

[text]
perldoc perlmodlib
[/text]

at your command line. This will give you a brief description of all of the standard Perl modules installed on your system (there will be well over a hundred of them). You can then get further information about a given module by typing:

[text]
perldoc [module name]
[/text]

for example, perldoc File::Basename.

One of the joys of working with the Perl community is that we like to share things. One of the things that we most like to share is modules. What this means is that when someone writes a useful module, they will usually make it available for others to use. Obviously not everyone will find every module of use, so it would be a waste of space to include every module in the standard Perl distribution. To get round this problem there is a repository of Perl modules called the Comprehensive Perl Archive Network (or CPAN to its friends). The file perlmodlib.pod also talks about CPAN. It includes a list of all the CPAN mirrors known to the authors at the time of writing together with a list of the major categories that CPAN is split into.

The last major part of perlmodlib.pod is a large section about writing your own modules and submitting them to CPAN. This file talks about high-level concepts (like naming conventions) and administrative tasks (like how you actually get your files onto CPAN). To find out about the nitty gritty of actually writing a module you need to look at perlmod.pod.

In perlmod.pod you’ll find all you need to know about creating your own modules. It begins with a discussion of packages, which define separate namespaces. You use these to prevent your module’s variable and function names from clashing with those in the main script that is using your module. This leads on to a discussion of symbol tables, which are where your package variables are stored. The file then talks about the differences between packages, modules and classes and gives an example of a simple module (classes, or objects, are covered in more detail in perlobj.pod, which we’ll discuss later). It finishes with a discussion of the differences between ‘use’ and ‘require’.
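The namespace idea is easier to see in code. This is only a sketch: Greeting is a hypothetical package that I have written inline so the example fits in one file, whereas a real module would live in its own Greeting.pm file and be loaded with ‘use’ or ‘require’.

```perl
use strict;
use warnings;

{
    package Greeting;          # hypothetical package, for illustration only

    our $name = 'module';      # package variable: $Greeting::name

    sub hello {
        return "Hello from the $name";
    }
}

# Back in the default package, main.
our $name = 'script';          # a completely separate variable: $main::name

sub hello {
    return "Hello from the $name";
}

# The two subroutines and the two variables share names but never clash.
print Greeting::hello(), "\n";
print hello(), "\n";
print "Package variable: $Greeting::name\n";
```

Both subroutines are called hello and both variables are called $name, but because they live in different packages each keeps its own value.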

The last of the three files which discuss modules is perlmodinstall.pod. As you can probably guess from the name, this file explains how to install modules that you’ve downloaded from CPAN. It discusses the ‘old’ way of building modules using the standard Unix ‘make’ utility as well as the ‘new’ way using the CPAN.pm module. It also discusses ways to install modules if you are on a different operating system like Windows, MacOS or VMS.

References

The next topic I want to cover is references. You can do a lot of clever stuff in Perl without using references, but if you really want to unlock the power of Perl you need to understand references. They are as fundamental to Perl as pointers are to C (but they are quite a lot cleverer than pointers!)

In perlref.pod you can find out all there is to know about references. It starts by talking generally about the differences between hard and symbolic references before getting down to the details of how to create references to your data (or subroutines). Even people who think they know references may find it hard to list all of the seven methods that this file gives for creating references. Having created references, the next important thing is to use them to get back to your data (or subroutines). This is the subject of the next section of the file. The file also discusses symbolic references in more detail and why ‘use strict’ makes them illegal (and why this is a good thing!)
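Here is a short sketch of a few of the more common ways to create references and then get back to the data. It is my own example and shows only some of the methods that perlref.pod lists. The last part shows ‘use strict’ refusing a symbolic reference.

```perl
use strict;
use warnings;

my @langs  = ('Perl', 'C');
my %author = (name => 'Dave');

# Creating references
my $array_ref = \@langs;                  # backslash operator on an array
my $hash_ref  = \%author;                 # ... and on a hash
my $anon_list = [1, 2, 3];                # anonymous array
my $anon_hash = { colour => 'red' };      # anonymous hash
my $code_ref  = sub {                     # anonymous subroutine
    my ($n) = @_;
    return 2 * $n;
};

# Using references to get back to the data
print "First language: $$array_ref[0]\n";     # prefix syntax
print "Same element:   $array_ref->[0]\n";    # arrow syntax
print "Whole array:    @{$array_ref}\n";
print "Author:         $hash_ref->{name}\n";
print "Anonymous data: $anon_list->[2], $anon_hash->{colour}\n";
print "Doubled:        ", $code_ref->(21), "\n";

# A symbolic reference uses a string as a variable name.
# With strict in force this is a runtime error, which we catch here.
my $var_name = 'langs';
my $ok       = eval { my @copy = @$var_name; 1 };
my $error    = $@;
print "strict complained: $error" unless $ok;
```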

Finally the file gets a bit deep and starts talking about closures and function templates. Whilst these are undoubtedly extremely powerful Perl features, most beginners would be excused for skipping them on a first read through.
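If you are curious, a closure is less frightening than it sounds. The sketch below is my own and the subroutine names are invented. The second half is a simple function generator in the spirit of the function templates that perlref.pod describes, not a copy of its example.

```perl
use strict;
use warnings;

# A closure: the anonymous subroutine keeps access to $count
# even after make_counter() has returned.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# Each counter has its own private copy of $count.
my @seen = ($from_one->(), $from_one->(), $from_ten->());
print "@seen\n";    # prints "1 2 10"

# A function generator: one piece of code that manufactures
# several similar subroutines, each remembering its own $tag.
sub make_tagger {
    my $tag = shift;
    return sub { return "<$tag>@_</$tag>" };
}

my $bold   = make_tagger('b');
my $italic = make_tagger('i');
print $bold->('hello'), ' ', $italic->('world'), "\n";
```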

The information in perlref.pod is complete and definitive (or, at least, aims to be), but some people have found it all a little hard going. To get round this problem a new file has been added to the most recent Perl versions. This file is called perlreftut.pod and is a tutorial introduction to references. It was written by Mark-Jason Dominus and is based on an article he wrote for The Perl Journal last year. This file covers most of the same topics as perlref.pod but people seem to find it a more friendly introduction to the topic (a llama rather than a camel).

Complex Data Structures

One of the best reasons for getting to know references in Perl is that it makes it easy (actually it makes it *possible*) to use complex data structures. This is because arrays and hashes can only contain scalar values, so it is impossible, for example, to have an array of hashes. What you can have though is an array of references to hashes, and this concept can be extended to create data structures of arbitrary complexity.
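An array of references to hashes might look something like this. It is just a sketch and the contents are sample data of my own.

```perl
use strict;
use warnings;

# Each element of the array is a reference to an anonymous hash.
my @modules = (
    { name => 'Cwd',            purpose => 'current working directory' },
    { name => 'File::Basename', purpose => 'splitting up path names'   },
);

# Adding another record is just a push of a new hash reference.
push @modules, { name => 'CGI', purpose => 'writing CGI scripts' };

# Walk the array, dereferencing each hash reference with the arrow.
for my $module (@modules) {
    print "$module->{name}: $module->{purpose}\n";
}

# The arrow between adjacent subscripts is optional.
print "Second module: $modules[1]{name}\n";
```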

Complex data structures are the subject of our next two files, perldsc.pod and perllol.pod. Of the two, perllol.pod is the simpler as it only discusses one data structure, the list of lists (or array of arrays), although it covers it in some depth. The other file, perldsc.pod, is the Data Structures Cookbook and it covers many different data structures. It starts by discussing general issues about using references to build up complex data structures before giving in-depth examples of a list of lists, a hash of lists, a list of hashes and a hash of hashes, discussing how to populate them from a variety of sources and (just as importantly) how to access the data in them. Finally it gives examples of some even more complex data structures.
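To give a flavour of what these two files cover, here is a small sketch of a list of lists and a hash of lists. The data is my own and is not taken from either file.

```perl
use strict;
use warnings;

# A list of lists: an array whose elements are array references.
my @grid = (
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
);

print "Middle element: $grid[1][1]\n";

for my $row (@grid) {
    print "@$row\n";               # dereference each row in turn
}

# A hash of lists: each value is an array reference.
my %files_for = (
    modules    => ['perlmod', 'perlmodlib', 'perlmodinstall'],
    references => ['perlref', 'perlreftut'],
);

# Perl creates the inner array for us the first time we push onto it.
push @{ $files_for{structures} }, 'perldsc', 'perllol';

for my $topic (sort keys %files_for) {
    print "$topic: @{ $files_for{$topic} }\n";
}
```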

Objects

Perl objects are implemented using modules, by imposing certain rules on how the module is constructed. There are a number of files that discuss these concepts.

In perlobj.pod, you will find the definitive guide to creating Perl objects. It explains how to define a class using a Perl module with methods that are just subroutines. An object is an instance of the class and in Perl terms it is represented by a reference (usually to a hash). It talks a lot about constructors and destructors as well as the two syntaxes for calling methods (direct and indirect) and the problems to be wary of with both.
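A very small class along these lines might look like the sketch below. Counter is a hypothetical class of my own, written inline rather than in its own file, and it only uses the direct method call syntax.

```perl
use strict;
use warnings;

{
    package Counter;    # hypothetical class, for illustration only

    # The constructor: builds a hash, then blesses a reference
    # to it into the class so that Perl knows it is an object.
    sub new {
        my ($class, %args) = @_;
        my $self = { count => $args{start} || 0 };
        return bless $self, $class;
    }

    # Methods are ordinary subroutines that receive the object
    # as their first argument.
    sub increment {
        my $self = shift;
        return ++$self->{count};
    }

    sub count {
        my $self = shift;
        return $self->{count};
    }
}

my $counter = Counter->new(start => 5);   # class method call
$counter->increment;                       # object method call

print "Count is ", $counter->count, "\n";
print "The object is a ", ref($counter), "\n";
```

The object here is a reference to a hash, which is the usual choice, but the same approach works with a reference to any other kind of data.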

As with perlref.pod there was a feeling that this file was a little hard for some beginners to understand so a more friendly tutorial file was written.

This tutorial file is called perltoot.pod. TOOT stands for Tom’s Object Oriented Tutorial (Tom is Tom Christiansen, the author of a great deal of the Perl documentation). This tutorial talks you through creating a simple Perl class and then making various improvements to it.

Another useful file for writing Perl objects is perlbot.pod – the Bag’o Object Tricks. This has a number of very useful tips for designing and implementing objects in Perl. This is well worth a read if you plan on creating objects. Many Perl objects have been written and many lessons have been learnt by programmers before you. This file contains the distilled wisdom of those programmers.

The last two files that I want to talk about this month are about creating Perl modules that are not exclusively written in Perl. Most commonly, these are modules where part of the functionality is written in C. This might be for performance reasons or it might be because the module acts as an interface to an external system and there already exists an interface in C (a good example of this is a database interface like Sybperl where a database connection library in C is provided by the vendor of the database – in this case Sybase).

This kind of module is often called a Perl extension and the linkage between the Perl code and the non-Perl sections is most easily created using an interface language called XS. This is a very advanced topic in Perl, so I don’t want to go into any detail. I should just mention that there are two files called perlxs.pod and perlxstut.pod which explain XS in some detail. Once again, one file (perlxs.pod) is a reference guide and the other (perlxstut.pod) is a tutorial introduction.

Conclusion

We’ve come a lot further in our tour of the Perl documentation this month. All of the major topics that we didn’t cover last month have now had at least a mention.

Once more the important point to take away is that the answers to any questions you have about Perl are already installed on your system and a short time looking for the definitive answer in the standard documentation is far more efficient than waiting a couple of days for a number of contradictory replies to a Usenet post.

You’ll get fewer flames too.
