META.yml and Building RPMs

An email has flooded in. It was in response to my piece about Building RPMs from CPAN Distributions and it was from Andreas Koenig. Andreas runs PAUSE, which is the service CPAN authors use to upload stuff to CPAN, so he knows what he’s talking about when it comes to CPAN (and many other matters). He says this:

It’s not correct that the META.yml contains the exact list of dependencies. The META.yml is not the authoritative source for them. The reason behind is that dependencies do differ across architectures. Exceptions to this rule may declare dynamic_config=0. In order to obtain the real list of dependencies you must run your Makefile.PL or Build.PL. Recent Module::Build provides a MYMETA.yml after Build.PL has run. You could use that instead. MakeMaker always had the dependency as a comment in the Makefile.

He is, of course, right. My previous article skipped blithely over some of the more gnarly corners of this problem. I should point out that Gabor and I discussed some of these over the weekend but it’s almost certainly worthwhile going into a little more detail.

It’s true that a static META.yml file can’t deal with all of the possibilities. Here are a number of examples of areas that need to be looked at in more detail.

Environmental differences
This is the area that Andreas is talking about. And the Padre problem I mentioned on Monday is one example of this. Padre runs on several different platforms. And some dependencies will only be required on certain platforms. For example the Win32::API module is only required if the module is being installed on Windows.

But it’s not just different operating systems or architectures that cause problems like this. If you’re trying to use Plack on a server with Apache 2 installed, you’ll need Apache2::Request. If your server has Apache 1 installed you’ll need Apache::Request. In each case, you won’t need the request module for the Apache version that you aren’t using. As things stand, the META.yml for Plack doesn’t list either of the Apache request modules, but a more intelligent system could work out which one of them is required and add that one to the list of dependencies.
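To make the environmental problem concrete, here’s a rough sketch of the kind of decision a Makefile.PL or Build.PL makes at configure time and a static file can’t. It’s written in Python purely as an illustration and isn’t Padre’s real build logic. The function name, the rule table and the two "static" modules are all placeholders I’ve made up for the example; only Win32::API comes from the discussion above.

```python
import sys

# Hypothetical static prerequisites, standing in for what a META.yml might list.
STATIC_REQUIRES = {"File::Spec": "0", "YAML::Tiny": "0"}

# Hypothetical platform-specific additions that a static file cannot express.
# Keys are prefixes of Python's sys.platform, used here as a stand-in for Perl's $^O.
PLATFORM_REQUIRES = {
    "win32": {"Win32::API": "0"},
}


def resolve_requires(platform=None):
    """Return the full prerequisite map for the given platform."""
    platform = sys.platform if platform is None else platform
    requires = dict(STATIC_REQUIRES)  # copy, so the static table is never mutated
    for prefix, extra in PLATFORM_REQUIRES.items():
        if platform.startswith(prefix):
            requires.update(extra)
    return requires
```

Calling resolve_requires("win32") adds Win32::API to the list, while resolve_requires("linux") returns only the static entries. The list you get depends on where you ask the question, which is why it can’t be written down once in a file shipped with the distribution.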

“Choose One” requirements
Some modules exist simply as a way of allowing the user to choose one of a number of implementations of a feature. A good example is JSON::Any. There are (at least) three different JSON modules on CPAN: JSON, JSON::DWIW and JSON::XS. Different systems will have different ones installed. JSON::Any allows a program to use any JSON module and not care which of them is installed. But how do you model that dependency? If you make any (or all) of the supported modules a required dependency, you rather miss the whole point of the module. JSON::Any’s META.yml ignores the problem and leaves it to the Makefile.PL to work out what to do. The Fedora RPM for this module takes a weird approach and makes JSON::XS a required dependency. Even if META.yml could support this mode of working, RPM doesn’t have this feature.
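What we’d really like to express is "at least one of these". As a minimal sketch (in Python, for illustration only), the check looks something like this. The function names are hypothetical, and the preference order in JSON_BACKENDS is my own assumption, not necessarily the order JSON::Any itself uses.

```python
def any_of_satisfied(alternatives, installed):
    """True if at least one of the alternative modules is installed."""
    return any(name in installed for name in alternatives)


def first_available(alternatives, installed):
    """Return the first installed alternative in preference order, or None."""
    for name in alternatives:
        if name in installed:
            return name
    return None


# Assumed preference order, for the example only.
JSON_BACKENDS = ["JSON::XS", "JSON", "JSON::DWIW"]
```

With JSON and JSON::DWIW installed, first_available(JSON_BACKENDS, {"JSON", "JSON::DWIW"}) picks JSON, and the requirement counts as satisfied without JSON::XS ever being installed. A flat list of required modules has no way to say that.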

Added features
Some modules have optional requirements. That is, if certain other modules are installed then the module gains more features. One example is the Template-XML distribution. Template-XML contains a plugin (Template::Plugin::XML) for the Template Toolkit. Template::Plugin::XML is a wrapper around a number of XML processing modules. If a particular module (for example, XML::DOM) is installed then Template::Plugin::XML allows the user to use XML::DOM for XML processing. It works similarly for XML::LibXML, XML::Parser, XML::RSS, XML::Simple and XML::XPath. None of them are required; different functionality is turned on for each one that is installed. You don’t have to configure Template::Plugin::XML at all to work with these modules. It just works if a particular module is installed. If, at a later date, you remove that module then Template::Plugin::XML removes the features supported by that module.

This seems to be somewhere where I have philosophical differences with the Fedora RPM packaging team. I believe that all of these modules should be seen as optional and therefore shouldn’t be listed as dependencies in the META.yml or the RPM. The Fedora team disagrees. They want each RPM to depend on all of the modules it needs in order to have as many features as possible. The Template-XML RPM therefore requires all of the XML processing modules I listed above. That seems wrong to me.

META.yml supports the concept of “recommended modules”. I think that these optional modules should be listed there. But I don’t believe that RPM has a similar feature.
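Here’s a rough sketch of how a packaging tool might keep the two kinds of dependency apart. It’s Python, for illustration only, and it assumes the META.yml has already been parsed into a dictionary with top-level "requires" and "recommends" keys. The function names and the version number in the example are made up. Hard requirements become Requires: lines using the usual perl(Module::Name) form; recommended modules are only emitted as comments, since I’m assuming here that there’s nowhere better to put them.

```python
def _capability(module, version):
    """Format one module as an RPM-style perl() capability."""
    cap = "perl(%s)" % module
    # A version of 0 (or empty) means "any version", so no constraint is added.
    if str(version) not in ("0", ""):
        cap += " >= %s" % version
    return cap


def rpm_dependency_lines(meta):
    """Turn parsed META data into spec-file lines, sorted by module name."""
    lines = []
    for module, version in sorted((meta.get("requires") or {}).items()):
        lines.append("Requires: " + _capability(module, version))
    for module, version in sorted((meta.get("recommends") or {}).items()):
        lines.append("# optional: " + _capability(module, version))
    return lines


# Hypothetical input: the version number is invented for the example.
example_meta = {
    "requires": {"Template": "2.15"},
    "recommends": {"XML::DOM": "0", "XML::LibXML": 0},
}
```

The point of the split is that the optional modules stay visible to whoever maintains the package without forcing every user to install all of them.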

So there are a few problems that I see with the META.yml approach. In the face of these issues I should probably back down slightly from my previous position that META.yml is the definitive way to get a list of dependencies. What I now believe is that parsing META.yml will give you a better position to start from than parsing the Perl code and extracting all of the “use” statements.

But I hadn’t previously heard of the MYMETA.yml that Andreas mentioned in his email. That’s certainly a way to get round the environmental differences I listed above. I don’t think it solves the other two issues though.
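Putting Andreas’s advice together, a tool could decide which file to trust along these lines. This is only a sketch in Python under a couple of assumptions: the function names are hypothetical, the metadata has been parsed into a dictionary by some other means, and a missing dynamic_config key is treated conservatively as "dynamic".

```python
import os


def pick_metadata_file(dist_dir):
    """Prefer the post-configuration MYMETA.yml; fall back to the static META.yml."""
    for name in ("MYMETA.yml", "META.yml"):
        path = os.path.join(dist_dir, name)
        if os.path.isfile(path):
            return path
    return None


def is_authoritative(path, meta):
    """Say whether the dependency list in this file can be taken as final.

    MYMETA.yml is written after configuration, so it is trusted as-is.
    A static META.yml is only trusted if it declares dynamic_config as 0.
    """
    if os.path.basename(path) == "MYMETA.yml":
        return True
    # Assumption: an absent key means the distribution may be dynamic.
    return meta.get("dynamic_config", 1) in (0, "0")
```

If is_authoritative returns False, the honest answer is that you need to run Makefile.PL or Build.PL before you can know the real dependencies.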

Are there any other corner cases that I’ve missed? Does anyone else have any opinions on building RPMs from CPAN distributions?

8 thoughts on “META.yml and Building RPMs”

  1. Maybe irrelevant to the discussion, but debian/ubuntu dpkg format does have the “recommends” and the “depends on one of those” notion:
    claudio@amsterdam:~$ apt-cache show libroxen-xmlutils |grep 'Depends:'
    Depends: roxen3, pike (>= 0.6) | pike7 (>= 7.0.36) | pike7.2 | pike7.6
    claudio@amsterdam:~$ apt-cache show libjson-perl |grep 'Recommends:'
    Recommends: libjson-xs-perl (>= 2.24)
    claudio

  2. rpm supports optional dependencies with Suggests:
    i’m currently facing the same problem for (official) mandriva rpms. i’m trying to move the prereqs extraction from using perl.req to a new perl.req-from-meta if meta.yml and/or meta.json is packaged (it defaults to using perl.req otherwise).
    it’s a work in progress though…

  3. “Choose One” requirements
    In RPM (and I guess also in dpkg format) the problem of depending on one of possible implementations is solved via metapackages. JSON::XS, JSON::DWIW and JSON would have ‘Provides: perl-JSON-implementation’ (or something like that), and JSON::Any would require such metapackage, i.e. ‘Requires: perl-JSON-implementation’.
    This is used for example if given program requires web server, or mail daemon, but it doesn’t matter to it what exactly package / program is installed.
    See subsection “Virtual Packages” in /usr/share/doc/rpm-*/dependencies

  4. note also that mymeta.yml is not currently supported by eumm… this is enough to stop the idea of using mymeta (for now).

  5. MYMETA was agreed upon at the Oslo Hackathon as part of the Oslo Consensus. It was called METALOCAL at the time, but subsequent discussions have renamed it MYMETA. The goal is to standardize how the toolchain can communicate post-configuration information instead of tools having to independently scrape Makefile.PL or _build/prereqs. It doesn’t do anything new; it just does it in a standard way. CPAN(PLUS) in Perl 5.10.1 already support MYMETA if it exists after configuration.

    There was a proposal and review process for the definition of a CPAN Meta Spec 2.0. That has been postponed pending other high priority projects from the Pumpking and Perl NOC.

    After 2.0 is finalized, I expect MYMETA to be implemented for ExtUtils::MakeMaker (and/or Module::Install), though CPAN Meta 2.0 will be represented in JSON, not YAML, due to inconsistencies in YAML implementation.

    Hope that clarifies the current state of affairs.

    dagolden

  6. This seems to be somewhere where I have philosophical differences with the Fedora RPM packaging team. I believe that all of these modules should be seen as optional and therefore shouldn’t be listed as dependencies in the META.yml or the RPM. The Fedora team disagrees. They want each RPM to depend on all of the modules it needs in order to have as many features as possible. The Template-XML RPM therefore requires all of the XML processing modules I listed above. That seems wrong to me.

    I don’t think you and the Fedora Team have philosophical differences, you’re just in 2 different places. The CPAN culture is to minimize dependencies. Or our users complain loudly. On the other hand, packagers control the repository, they have all the dependencies available, they can make sure they install properly, and they only have one shot at installing all that’s needed for a package to work, unless they break it up in several packages. So it makes sense for them to err on the side of caution and include as many dependencies as possible. Or their users complain loudly.

    That’s especially true for RPMs, which don’t seem to have (yet?) Debian’s notion of optional but recommended dependencies.

    Of course this doesn’t apply to the cases you describe of alternate dependencies, but the current packaging systems would have to deal with those by creating alternate packages: Plack-Apache1 and Plack-Apache2 for example.

    As a side note I am not sure the CPAN culture of minimizing dependencies is that healthy, and I like the fact that distributions don’t hesitate to install plenty of code at once.

    In the end, it doesn’t really matter what option is chosen, users will still complain loudly ;–)

  7. “The Fedora RPM for this module takes a weird approach and makes JSON::XS a required dependency.”
    How is this a “weird approach”? We needed to pick _something_ to use as a RPM dependency or perl-JSON-Any wouldn’t be guaranteed to be functional after installed, and JSON::XS was the fastest (at the time, anyways). Is there a better approach that we could reasonably have taken?

  8. “Is there a better approach that we could reasonably have taken?”
    Yes, the approach suggested by jnareb here seems to me to be a good way round this issue.
