Daily Archives: 10 February, 2010

META.yml and Building RPMs

An email has flooded in. It was in response to my piece about Building RPMs from CPAN Distributions and it was from Andreas Koenig. Andreas runs PAUSE, which is the service CPAN authors use to upload stuff to CPAN, so he knows what he’s talking about when it comes to CPAN (and many other matters). He says this:

It’s not correct that the META.yml contains the exact list of dependencies. The META.yml is not the authoritative source for them. The reason behind is that dependencies do differ across architectures. Exceptions to this rule may declare dynamic_config=0. In order to obtain the real list of dependencies you must run your Makefile.PL or Build.PL. Recent Module::Build provides a MYMETA.yml after Build.PL has run. You could use that instead. MakeMaker always had the dependency as a comment in the Makefile.

He is, of course, right. My previous article skipped blithely over some of the more gnarly corners of this problem. I should point out that Gabor and I discussed some of these over the weekend but it’s almost certainly worthwhile going into a little more detail.

It’s true that a static META.yml file can’t deal with all of the possibilities. Here are a number of examples of areas that need to be looked at in more detail.

Environmental differences
This is the area that Andreas is talking about. And the Padre problem I mentioned on Monday is one example of this. Padre runs on several different platforms. And some dependencies will only be required on certain platforms. For example the Win32::API module is only required if the module is being installed on Windows.

But it’s not just different operating systems or architectures that cause problems like this. If you’re trying to use Plack on a server with Apache 2 installed, you’ll need Apache2::Request. If your server has Apache 1 installed you’ll need Apache::Request. In each case, you won’t need the request module for the Apache version that you aren’t using. As things stand, the META.yml for Plack doesn’t list either of the Apache request modules, but a more intelligent system could work out which one of them is required and add that one to the list of dependencies.

“Choose One” requirements
Some modules exist simply as a way of allowing the user to choose between one of a number of implementations of a feature. A good example is JSON::Any. There are (at least) three different JSON modules on CPAN – JSON, JSON::DWIW and JSON::XS). Different systems will have different ones installed. JSON::Any allows a program to use any JSON module and not care which of them is installed. But how do you model that dependency? If you make any (or all) of the supported modules a required dependency, you rather miss the whole point of the module. JSON::Any’s META.yml ignores the problem and leaves it to the Makefile.PL to work out what to do. The Fedora RPM for this module takes a weird approach and makes JSON::XS a required dependency. Even if META.yml could support this mode of working, RPM doesn’t have this feature.

Added features
Some modules have optional requirements. That is, if certain other modules are installed then the module gains more features. One example is the Template-XML distribution. Template-XML contain a plugin (Template::Plugin::XML) for the Template Toolkit. Template::Plugin::XML is a wrapper around a number of XML processing modules. If a particular module (for example, XML::DOM) is installed then Template::Plugin::XML allows the user to uses XML::DOM for XML processing. It works similarly for XML::LibXML, XML::Parser, XML::RSS, XML::Simple and XML::XPath. None of them are required, different functionality is turned on for each one that is installed. You don’t have to configure Template::Plugin::XML at all to work with these modules. It just works if a particular module is installed. If, at a later date, you remove that module then Template::Plugin::XML removes the features supported by that module.

This seems to be somewhere where I have philosophical differences with the Fedora RPM packaging team. I believe that all of these modules should be seen as optional and there for shouldn’t be listed as dependencies in the META.yml or the RPM. The Fedora team disagrees. They want each RPM to depend on all of the modules it needs in order to have as many features as possible, The Template-XML RPM therefore requires all of the XML processing modules I listed above. That seems wrong to me.

META.yml supports the concept of  “recommended modules”. I think that these optional modules should be listed there. But I don’t believe that RPM has a similar feature.

So there are a few problems that I see with the META.yml approach. In the face of these issues I should probably back down slightly from my previous position that META.yml is the definitive way to get a list of dependencies. What I now believe is that parsing META.yml will give you a better position to start from than parsing the Perl code and extracting all of the “use” statements.

But I hadn’t previously heard of the MYMETA.yml that Andreas mentioned in his email. That’s certainly a way to get round the environmental differences I listed above. I don’t think it solves the other two issues though.

Are there any other corner cases that I’ve missed. Does anyone else have any opinions on building RPMs from CPAN distributions?