In this beginner level article, Dave Cross introduces the Template Toolkit.
This article was originally published in Linux Format in April 2004.
Template processing is a method of producing output that takes some fixed (or boilerplate) text and puts variable data values inside it. The most obvious example is that of a form letter. When you get that prize draw win notification from the Reader’s Digest it has been made to look as though it is personal to you when, in fact, a few pieces of information about you have been inserted into a letter template.
The Template Toolkit is a piece of software that allows you to carry out powerful template processing operations. It is written in Perl so it runs on pretty much any computing platform that you can name, but you don’t need to know any Perl in order to use it. If, however, you do know Perl, then a whole extra level of power is opened up to you.
Using the Template Toolkit
Installing TT gives you two new command line utilities - tpage and ttree. You use tpage to process a single template file and write the output to STDOUT. If your needs are more sophisticated, ttree can process a whole directory tree of templates and write the output to another directory. We’ll start by using tpage.
The tpage command takes only one option, --define. This allows you to define data variables to be inserted into your template. You can use many variable definitions at once. You would therefore use it from the command line something like this:
$ tpage --define name='Mr Cross' --define amount='100' \
        --define due='1st April' letter.tt
This defines three variables called “name”, “amount” and “due” which can now be referenced within the template. The template itself is in the file letter.tt and it looks like this:
Dear [% name %],

According to our records you owe us £[% amount %]. Please pay before [% due %] or we will send the boys round.

Regards.
You can see from this simple example that the places where you want your variables to be inserted are marked with [% ... %] tags. These are known as template /directives/. We’ll look at many different types of directives over the next few months, but currently we are looking at the simplest of directives, which just contains the name of a variable. When the template is processed, this directive will be replaced by the current value of the variable.
So, if we process this template using the command line we saw above, we’ll get this output:
Dear Mr Cross,

According to our records you owe us £100. Please pay before 1st April or we will send the boys round.

Regards.
Obviously, if we had to type a command line like that every time we wanted to send a form letter, we wouldn’t have gained much over typing the letters in manually. In a while, we’ll see a way to make this easier, but first let’s take two minutes to tidy up the output a little.
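To make the idea of a directive being replaced by a variable's value concrete, here is a rough sketch of the substitution step. It is written in Python purely as an illustration and is not how TT is implemented; the render function and its handling of unknown names are my own assumptions.

```python
import re

def render(template, variables):
    """Replace each [% name %] tag with the value of that variable."""
    def lookup(match):
        # Assumption for this sketch: unknown variables become an empty string.
        return str(variables.get(match.group(1), ""))
    return re.sub(r"\[%\s*(\w+)\s*%\]", lookup, template)

letter = "Dear [% name %], you owe us [% amount %] pounds before [% due %]."
output = render(letter, {"name": "Mr Cross", "amount": "100", "due": "1st April"})
print(output)
```

The dictionary plays the part of the --define options: each key is a variable name and each value is what gets dropped into the text.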
Currently, our template just takes the values we give it and reproduces them. For the amount due, it might look better if we forced the value to always have two decimal places. We can do this by using the Format plugin.
In TT, plugins are a way for the template processor to provide extra functionality by interacting with external resources. In this case the Format plugin provides an interface to a C-style “printf” feature. We change the template so that it looks like this:
[% USE money=format('%.2f') -%]
Dear [% name %],

According to our records you owe us £[% money(amount) %]. Please pay before [% due %] or we will send the boys round.

Regards.
We’ve added a “USE” directive which loads the Format plugin and we’ve used the plugin to create a formatting function called “money”. Any expression that is passed to this function is formatted according to the format string that was used when “money” was created. In this case, the /%.2f/ ensures that the amount has two decimal places. If we process our template using the same call to tpage as before, we now get slightly changed output.
Dear Mr Cross,

According to our records you owe us £100.00. Please pay before 1st April or we will send the boys round.

Regards.
Another small change that we’ve made to our template is the addition of a ‘-’ character at the end of the USE directive. Generally, all whitespace outside of a directive is passed straight through to the output, but the ‘-’ character tells TT to ignore the whitespace (including newlines) that follows the directive. The net effect of this is that adding the USE directive doesn’t add a blank line to the output.
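If you haven't met printf-style format strings before, this small sketch shows what '%.2f' does to a few values. It uses Python's % operator, which follows the same C conventions; make_formatter is a hypothetical helper that only mimics the way “money” is created from a format string, and is not part of TT.

```python
def make_formatter(spec):
    """Return a function that applies a printf-style format to one value."""
    def formatter(value):
        # float() because values given on the command line arrive as text
        return spec % float(value)
    return formatter

money = make_formatter("%.2f")
print(money("100"))     # 100.00
print(money(10))        # 10.00
print(money("19.999"))  # 20.00 (rounded to two decimal places)
```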
TT comes with a useful set of standard plugins. We’ll see some more of them in the next few sections.
Reading data from a file
As I mentioned before we haven’t really gained much if we have to type in a long command each time we want to process a form letter. It would be far easier if we could read the data in from some external source. And, of course, we can. We can read data from all sorts of external sources. We’ll start by reading the data from a text file.
We’ll assume that our data file has the following format:
name : amount : due
Mr Cross : 10 : 1st April
Mr Smith : 20 : 1st March
Mr Jones : 50 : 1st February
The first line of the file contains the names of the fields and the other lines contain the actual data. We can now change our template to look like this:
[% USE money = format('%.2f') -%]
[% USE debtors = datafile(file) -%]
[% FOREACH debtor = debtors %]
Dear [% debtor.name %],

According to our records you owe us £[% money(debtor.amount) %]. Please pay before [% debtor.due %] or we will send the boys round.

Regards.
[%- END %]
As you’ll see, we’ve added another directive which loads the Datafile plugin. This plugin opens the given data file and returns an iterator object which can be used to access the data in the file. We assign this iterator object to the variable “debtors”. Note that the name of the file to be used is in the variable “file”. This is now the only variable that we need to pass to the template processor.
Having opened the data file we can process it a row at a time using the FOREACH directive. FOREACH iterates across a list, setting the loop variable to each element of the list in turn. In this example the loop variable is called “debtor” and each time round the loop it gets one of the values returned by the “debtors” iterator. These values are objects that contain the individual data items from each row in the data file. You can access these items using a dot notation, for example the name from the current row of data is in “debtor.name”. We have changed the names of the variables used in the rest of the template to reflect this.
The end of the FOREACH loop is marked with an END directive. Notice the ‘-’ at the start of that directive, which strips out any whitespace preceding it (i.e. the newline before it).
Of course, the random format that I chose for the data file just happened to be the default format for the Datafile plugin (using colons as the delimiter) but it’s easy to use an alternative delimiter. For example, if our debtors file had been delimited with pipe characters we would have used code like this:
[% USE debtors = datafile(file, delim => '|') %]
The delimiter can be surrounded by optional whitespace which is removed from the values. The first row must contain the names of the data items and any blank lines or comment lines (which start with a # character) are ignored.
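The rules for the data file can be sketched in a few lines of code. This is a Python illustration of the format as described here, not the Datafile plugin's own code; read_datafile and the sample data are made up for the example.

```python
def read_datafile(text, delim=":"):
    """Parse delimited text into a list of dicts keyed by the header row."""
    rows = []
    fields = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # whitespace around the delimiter is removed from each value
        values = [value.strip() for value in line.split(delim)]
        if fields is None:
            fields = values  # the first real line names the fields
        else:
            rows.append(dict(zip(fields, values)))
    return rows

sample = """\
# debtors, pipe-delimited
name | amount | due

Mr Cross | 10 | 1st April
Mr Smith | 20 | 1st March
"""
debtors = read_datafile(sample, delim="|")
for debtor in debtors:
    print(debtor["name"], debtor["amount"], debtor["due"])
```

Each dictionary here corresponds to one of the objects that the loop variable receives, so debtor["name"] plays the role of debtor.name in the template.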
Splitting the output
The remaining problem is that this prints out all of the letters in a continuous piece of text, but actually we want each letter on a separate page. We can achieve this by inserting a form-feed character (0x0C) at the end of each page like this:
[% USE money = format('%.2f') -%]
[% USE debtors = datafile(file) -%]
[% FOREACH debtor = debtors -%]
Dear [% debtor.name %],

According to our records you owe us £[% money(debtor.amount) %]. Please pay before [% debtor.due %] or we will send the boys round.

Regards.
[% UNLESS loop.last -%]
^L
[%- END %]
[%- END %]
The ^L is the representation of a form-feed character when displayed in many text editors.
There are a couple of other changes that I’ve made to the template. These prevent an extra form-feed being output after the last letter. A printer will automatically perform a form feed at the end of a print job, so if our output contains one as well there will be two form-feeds in the job and an extra (blank) sheet of paper will be used. We can prevent that using the code shown.
The code uses an UNLESS directive to optionally display part of the template. UNLESS works like IF but with the logic reversed: the code within the block is executed only if the UNLESS condition is false.
In this condition we check the “loop” variable. This is a special TT internal variable which contains various interesting pieces of information about the current FOREACH loop. There are boolean flags “first” and “last” which return true only if you are in the first or last iteration respectively. The “size” data item contains the number of elements in the list and the “count” and “index” items contain slightly different views of the current iteration. “count” gives you the number of the iteration (from one to “size”) and “index” gives you one less than that number (useful if you’re a Perl programmer and used to array indexes that start from zero).
In our current example we use “loop.last” to avoid printing the extra form-feed.
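To see how the “loop” items relate to each other, here is a sketch that computes the same five values for a three-element list. It is a Python imitation for illustration only; the dictionary keys simply borrow the names that TT uses.

```python
debtors = ["Mr Cross", "Mr Smith", "Mr Jones"]
size = len(debtors)

loop_states = []
for index, name in enumerate(debtors):
    state = {
        "index": index,            # counts from zero
        "count": index + 1,        # counts from one
        "size": size,              # number of elements in the list
        "first": index == 0,
        "last": index == size - 1,
    }
    loop_states.append(state)
    # Mirror the UNLESS loop.last test: no separator after the final item.
    separator = "" if state["last"] else " <form feed follows>"
    print(f"{state['count']} of {size}: {name}{separator}")
```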
That’s all fine if you have your data in a suitable data file. But maybe it’s stored in a database instead.
Accessing a database
The great thing about TT’s plugin system is that it’s very easy for your template to get data from all sorts of interesting places. For example, there’s a plugin to Perl’s database interface system, DBI. This allows your template to access data stored in most kinds of database.
Assuming that our data is stored in a table called debtors in a MySQL database called accounts, we can change our template to look like this:
[% USE money = format('%.2f') -%]
[% USE DBI(database = 'dbi:mysql:accounts'
           username = 'acc_user'
           password = 'sekrit') -%]
[% FOREACH debtor = DBI.query('select name, amount, due from debtors') -%]
Dear [% debtor.name %],

According to our records you owe us £[% money(debtor.amount) %]. Please pay before [% debtor.due %] or we will send the boys round.

Regards.
[% UNLESS loop.last -%]
^L
[%- END %]
[%- END %]
It’s interesting to note just how few changes we have had to make here. All of the loop code is identical, it’s just the code that sets up the loop that is different.
The new code is simple enough to follow. We load the DBI plugin, passing it the various parameters required to connect to the database. We then run an SQL query against the database. This returns an iterator object that we can use in a FOREACH directive in exactly the same way that we used the iterator that the Datafile plugin created. Each value returned by the iterator is an object which contains data items for each of the columns selected from the database. The names of these data items are given by the column names in the SQL statement.
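The point about data items taking their names from the SQL statement is easy to try out. The sketch below uses Python's built-in sqlite3 module with an in-memory database, so it swaps SQLite in for MySQL and has nothing to do with the DBI plugin itself; the rows and the “owed” alias are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us look up values by column name
conn.execute("create table debtors (name text, amount real, due text)")
conn.executemany(
    "insert into debtors values (?, ?, ?)",
    [("Mr Cross", 10, "2004-04-01"), ("Mr Smith", 20, "2004-03-01")],
)

# The alias "owed" becomes the name used to fetch that column.
rows = conn.execute(
    "select name, amount as owed, due from debtors order by amount"
).fetchall()
conn.close()

for row in rows:
    print(row["name"], f"{row['owed']:.2f}", row["due"])
```

Changing the column list or adding an alias in the query changes the names available on each row, and nothing else in the loop needs to know where the data came from.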
There’s another formatting improvement that we can make at this time – we can reformat the date. If we assume that the due date is stored in a database date column then most databases will return it in the format YYYY-MM-DD which isn’t very user friendly. We can use the Date plugin to fix that. We can load the Date plugin with a directive like this:
[% USE date(format = '%d %B') -%]
And then use it with a directive like this:
[% date.format(debtor.due) %]
Here we have given the Date plugin a format string. This is in the same format as the format strings used by the Unix “date” command. Our “%d %B” format gives us the day of the month followed by the full month name. Having seeded the plugin with that format, any dates that are passed to the date.format function are converted to that format.
If you want to override the default format at any time, you can pass a new format definition as the second argument to the date.format function like this:
[% date.format(debtor.due, '%A, %d %B %Y') %]
As %A is the full weekday name and %Y gives the year, this example will display the due date in the format “Sunday, 01 February 2004”.
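If you want to experiment with these format strings away from TT, most strftime implementations understand the same codes. Here is a small Python sketch (not TT code) that applies both of the formats used above to a sample date of my own choosing; it assumes the default English locale.

```python
from datetime import date

due = date(2004, 2, 1)  # sample due date, chosen for illustration

short_form = due.strftime("%d %B")         # day of month, full month name
long_form = due.strftime("%A, %d %B %Y")   # adds weekday name and year

print(short_form)
print(long_form)
```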
One small problem with the Date plugin is that it is a little fussy about the format of the date. It only accepts dates as either the number of seconds since 1st Jan 1970 (known as the Unix epoch) or in the format h:m:s d/m/y. That’s not the format that we’re currently getting from the database so we’ll need to use MySQL’s date_format function to correct that.
[% FOREACH debtor = DBI.query('select name, amount, date_format(due, "%H:%i:%s %d/%m/%Y") as due from debtors') -%]
Two things to notice here. Firstly, whilst the MySQL date format strings look a lot like the standard Unix ones used by TT, they are actually different (%i for minute instead of %M, for example). Secondly, we’ve added a column alias (“as due”) to the date column so that the DBI plugin continues to use that name for the column.
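To make the target layout concrete, this sketch converts a YYYY-MM-DD string into the h:m:s d/m/y form. It is a Python illustration of the string conversion only, assuming the database hands back a plain date with no time part; to_date_plugin_format is a made-up name.

```python
from datetime import datetime

def to_date_plugin_format(iso_date):
    """Turn 'YYYY-MM-DD' into 'hh:mm:ss dd/mm/yyyy'."""
    parsed = datetime.strptime(iso_date, "%Y-%m-%d")
    # A date with no time part is treated as midnight.
    return parsed.strftime("%H:%M:%S %d/%m/%Y")

converted = to_date_plugin_format("2004-02-01")
print(converted)
```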
Dealing with XML
So we’ve managed to extract our data from data files and databases. What if our data is stored as an XML document? As you’d expect, TT has plugins to handle that too. In this example I’ll use the XML.XPath plugin to access data within a document. Let’s assume that our debtor document looks like this:
<debtors>
  <debtor>
    <name>Mr Cross</name>
    <amount>10</amount>
    <date>1st March</date>
  </debtor>
  <debtor>
    <name>Mr Smith</name>
    <amount>20</amount>
    <date>1st February</date>
  </debtor>
  <debtor>
    <name>Mr Jones</name>
    <amount>50</amount>
    <date>1st February</date>
  </debtor>
</debtors>
Here’s the template that we’ll use to process it.
[% USE money = format('%.2f') -%]
[% USE debtors = XML.XPath(file) -%]
[% FOREACH debtor = debtors.findnodes('/debtors/debtor') -%]
Dear [% debtor.findvalue('name') %],

According to our records you owe us £[% money(debtor.findvalue('amount')) %]. Please pay before [% debtor.findvalue('date') %] or we will send the boys round.

Regards.
[% UNLESS loop.last -%]
^L
[%- END %]
[%- END %]
The template first creates an XML.XPath object by passing the name of the XML file to the XML.XPath plugin in the USE directive. This XML.XPath object can then respond to XPath queries in a number of ways. The first way that we use is the “findnodes” function, which gets a list of all of the nodes matching the query ‘/debtors/debtor’. This returns an iterator that we can use in a FOREACH loop like all of the iterators that we have seen previously. Each object returned by this iterator is another XML::XPath object representing one of the nodes in the set. In this case each element is a ‘debtor’ node that is contained within the main ‘debtors’ node. We can then use the “findvalue” function to get the text value contained in the various nodes that we are interested in.
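The same find-the-nodes-then-read-their-values pattern can be tried with Python's standard ElementTree module. This is only a sketch of the idea, not the XML.XPath plugin: ElementTree supports a limited subset of XPath, so the query is written relative to the root element rather than as ‘/debtors/debtor’, and the document is a shortened copy of the one above.

```python
import xml.etree.ElementTree as ET

document = """<debtors>
  <debtor><name>Mr Cross</name><amount>10</amount><date>1st March</date></debtor>
  <debtor><name>Mr Smith</name><amount>20</amount><date>1st February</date></debtor>
</debtors>"""

root = ET.fromstring(document)          # root is the <debtors> element
letters = []
for debtor in root.findall("debtor"):   # each <debtor> child of the root
    name = debtor.findtext("name")      # text inside the <name> element
    amount = float(debtor.findtext("amount"))
    due = debtor.findtext("date")
    letters.append(f"{name} owes {amount:.2f} by {due}")

print("\n".join(letters))
```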
Getting the Template Toolkit
Where to download the Template Toolkit
If you’re happy installing Perl modules from the Comprehensive Perl Archive Network then you can go to http://search.cpan.org/dist/Template-Toolkit/ and download it from there. It is installed in exactly the same way as most other Perl modules.
If you’d rather not get involved with CPAN then there are also Debian packages and RPMs available. The Debian packages can be downloaded from the Debian packages repository at http://packages.debian.org/unstable/interpreters/libtemplate-perl and http://packages.debian.org/unstable/doc/libtemplate-perl-doc. The RPMs can be found by searching at http://rpmfind.net. The package name is perl-Template-Toolkit.
If you need any more information about installing TT or you want to use the bleeding edge CVS versions, then you can access those at the official TT web site at http://www.template-toolkit.org/.
Template Toolkit Documentation
Where to go for more information
This article has just scratched the surface of the Template Toolkit. We’ll be going into more detail later in this series, but in the meantime you can get more information from a number of sources.
The TT distribution comes with a large number of manual pages. The best place for a beginner to start is probably with “man Template::Manual” which is a guide to all the other TT manual pages and “man Template::Tutorial” which introduces a couple of tutorials that are part of the documentation set.
The official TT web page is at http://template-toolkit.org/ (or http://tt2.org/ if you don’t like typing). Here you’ll find all the man pages online together with a number of talks about TT, the latest version of the software and a number of other interesting things.
There is a mailing list for the discussion of things relating to TT. You can subscribe to it at http://www.template-toolkit.org/mailman/listinfo/templates.
All of the core TT developers are regular posters to this list.
The book “Perl Template Toolkit” by Darren Chamberlain, David Cross and Andy Wardley has recently been published by O’Reilly. You can get more details about it from http://www.oreilly.com/catalog/perltt.