h1. Version-Expression Transformations in Maven 2.2+


h2. Relevant JIRA Issues
* Main
** [MNG-4167|http://jira.codehaus.org/browse/MNG-4167]
** [MNG-4140|http://jira.codehaus.org/browse/MNG-4140]
** [MNG-3057|http://jira.codehaus.org/browse/MNG-3057]
* Similar
** [MNG-2971|http://jira.codehaus.org/browse/MNG-2971]
** [MNG-2446|http://jira.codehaus.org/browse/MNG-2446]
** [MNG-2412|http://jira.codehaus.org/browse/MNG-2412]
h2. Use cases and correct behavior
* coordinate expression use cases
** jar packaging
*** main concern here is the way transitive
            dependencies will be resolved. dependency POMs
            will be interpolated during the consumer's build,
            which could result in invalid artifact references,
            or at least changed references from how the jar
            dependency was built.
*** NOTE: This is a similar scenario to artifacts that
            don't specify their version in a locked-down
            range...i.e. they are suggestions, not absolute
            requirements.
** parent POM dependencies
*** main concern here is maintaining dynamic artifact
            coordinates in dependencyManagement, plugins, etc.
            so that property references can be overridden by
            child POMs. For example:
**** specifying <mavenVersion> property for a suite
                of maven-related deps in dependencyManagement,
                which specify:
                <version>${mavenVersion}</version>
**** the parent POM may supply a basic default of:
                <mavenVersion>2.0.9</mavenVersion>
**** If a child POM overrides with:
                <mavenVersion>2.1.0</mavenVersion>, it should
                be able to make use of all the maven artifacts
                specified in the dependencyManagement of the
                parent with the newly-overridden version.
*** also shares the same concerns as jar packaging WRT
            transitive dependencies
** environment-specific and user-specific expressions (system properties, environment variables)
*** these will evaluate to the local value. For
            instance:
**** os.name
**** java.version
**** user.dir
**** user.home
**** etc.
*** if a POM uses these in expressions for artifact
            coordinates, it may result in artifact references
            that only resolve on certain environments
**** e.g. if the build runs on JDK 1.6 and no
                corresponding artifact with '1.6' in its
                coordinate exists, but one does for '1.5' and
                '1.4', then the build will fail in that 1.6
                environment
** properties in profiles
*** these may change the artifact coordinate(s)
            according to which profile is active, or whether
            the default properties (in the POM main section)
            are used
*** when loading POMs from the repository:
**** settings profiles are NOT activated here
**** profiles.xml profiles are NOT activated here
**** profiles specified in -P cli options are NOT
                activated here
**** only those that trigger based on the following
                criteria will be activated:
***** system properties
***** cli-specified properties
***** other non-property activators
* plugin requirements for POM information
** release plugin
*** invoked directly from cli, outside of any lifecycle
**** invokes different lifecycle builds on the
                project, but currently uses separate java
                processes to do so
*** must modify the original pom.xml file as it exists
            on disk, WITHOUT ANY OTHER MANIPULATIONS
*** must NOT have new files added that are then
            referenced from project.file
**** this means that transforming the original POM
                file and writing it to a new location, which
                is then set on project.file, WILL NOT WORK
** enforcer plugin
*** normally bound to the lifecycle
*** MAY require access to unaltered, original POM in
            order to execute rules
** gpg plugin
*** normally bound to the lifecycle
*** requires access to the POM file that will
            eventually be installed or deployed. This file
            MUST NOT be changed after GPG runs.
** shade plugin
*** normally bound to the lifecycle
*** requires access to project.originalModel, which
            MUST reflect the information in the POM file that
            will be installed or deployed.
**** this is necessary for the shade plugin to be
                able to generate a dependency-reduced POM.
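The parent-POM use case above can be illustrated with a minimal sketch. It is not Maven's interpolator: the class and method names are hypothetical, and it only assumes that a child's property definition takes precedence over the inherited one when the expression is evaluated.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Maven.
final class PropertyOverrideSketch {
    private static final Pattern EXPRESSION = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with its value; unknown expressions are left untouched. */
    static String interpolate(String text, Map<String, String> properties) {
        Matcher m = EXPRESSION.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = properties.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Assumption: child properties win over inherited parent properties. */
    static Map<String, String> effectiveProperties(Map<String, String> parent,
                                                   Map<String, String> child) {
        Map<String, String> merged = new HashMap<>(parent);
        merged.putAll(child);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> parent = Collections.singletonMap("mavenVersion", "2.0.9");
        Map<String, String> child = Collections.singletonMap("mavenVersion", "2.1.0");
        String managed = "<version>${mavenVersion}</version>";

        // Evaluated in the parent's own context.
        System.out.println(interpolate(managed, parent));
        // Evaluated in the child's context, after the override.
        System.out.println(interpolate(managed, effectiveProperties(parent, child)));
    }
}
```

If the parent POM were deployed with the expression already resolved to 2.0.9, the second call would have nothing left to override, which is why this use case depends on the expression being preserved.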
h2. Implementation strategies
* 2.0.10
** expressions in artifact coordinates are ignored. Users
        have plenty of rope with which to hang themselves
* 2.1.0
** attempted to resolve artifact versions to concrete
        terms during install/deploy process
** implemented as an ArtifactTransformation, just like
        snapshot handling.
*** incidentally, snapshot handling violates the
            requirements for the shade plugin AND the gpg
            plugin.
** Problems:
*** also modifies plugin configurations where the
            element '<version>' is used
*** fails to account for activated profiles that may
            supply/change interpolation values
*** only accounts for artifact versions, not
            artifactId, groupId, classifier, or type
*** modifies the POM after it has been signed by GPG,
            making the signature worthless
*** modifies the POM without reflecting the new
            information in the originalModel for the shade
            plugin to use
**** transformation happens too late for shade
                plugin anyway, though
*** This breaks legitimate use cases for expressions
            in artifact coordinates, like those detailed in
            the 'pom' and 'jar' packaging scenarios, above
* 2.2.0-current
** In the latest attempt to resolve artifact coordinate
        expressions, the solution from 2.1.0 has been:
*** generalized to look at all artifact fields
            (groupId, artifactId, version, classifier, type)
*** moved into DefaultMavenProjectBuilder, to be run
            just before a project is returned to the build
            process
**** this makes the transformed artifact coordinate
                information available in POM-file form to all
                plugins in the build process, such as GPG
**** it also means that the POM transformation
                happens AT ALL TIMES
*** Problems
**** release plugin tries to add the transformed
                version of the POM as a new file to SCM, since
                that's the POM file referenced from the
                project instance
**** shade plugin still cannot gain access to
                transformed information, since it's not
                reflected in project.originalModel
***** since the transformed information isn't
                    original, this may not be appropriate
                    anyway, though...
*** This breaks legitimate use cases for expressions
            in artifact coordinates, like those detailed in
            the 'pom' and 'jar' packaging scenarios, above
* 2.2.0-final
** For this release, we're probably going to have to
        reverse course and remove all POM transformation code
** We need a more comprehensive design review, and much
        more planning on how to introduce this sort of feature
        without breaking the use cases above
*** legitimate/safe coordinate expressions should be
            supported
*** any transformation must be reflected in all
            locations that plugins look for the information
**** either that, or the plugins must migrate to
                any new api we put in place to support
                coordinate transformation
* future

h1. Original Document

h2. Relevant JIRAs

* [MNG-3057|http://jira.codehaus.org/browse/MNG-3057]
* [MNG-4140|http://jira.codehaus.org/browse/MNG-4140]
* [MNG-4167|http://jira.codehaus.org/browse/MNG-4167]

h2. Background

  In Maven 2.0.x, POMs that used an expression when specifying a version in a dependency, plugin, or the project itself (or its parent) resulted in those expressions being preserved in the POM that was installed or deployed. This leads to a variety of problems, the most common of which seems to arise when the expression resolves to a value derived from either the command line or the larger build environment (os.name for instance, or user.name). In order to make these POMs truly reflect the build environment from which they were deployed - and also to make them less apt to change in unpredictable ways based on the user's environment - version expressions needed to be resolved in any POM that gets installed or deployed.
  
  This has been a particular problem for version elements, but in principle it's possible that it could be a problem for any element that forms part of an artifact coordinate, including groupId, artifactId, version, type/packaging, and classifier.

h2. Initial Solution

  Our initial attempt at solving this problem was to introduce a new ArtifactTransformation that runs each time an artifact is installed or deployed into a repository. The new transformation is managed - alongside things like the SnapshotTransformation - from the ArtifactTransformationManager, and is therefore invisible to basically everything outside of maven-artifact/maven-artifact-manager. This new ArtifactTransformation implementation (called VersionExpressionTransformation) performs a tightly focused interpolation step on the POM that's attached to an artifact via a ProjectArtifactMetadata instance. The first version of this transformation, released in Maven 2.1.0, simply did a string search for <version>*</version>, interpolated the element value, and replaced the original value in the POM content. Finally, the modified POM was written to $\{project.build.directory}/pom-transformed.xml and the new file replaced the old in the ProjectArtifactMetadata instance so it would be picked up during artifact installation or deployment.

  Because of [MNG-4140|http://jira.codehaus.org/browse/MNG-4140] (where <version/> elements in plugin configurations were being interpolated inappropriately in places such as parent POMs), the above approach has been revised to use targeted XPath expressions to isolate the element values to interpolate. The essential strategy remains the same, however.
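As a rough sketch of the targeted approach (not the actual VersionExpressionTransformation code), the example below uses the JDK's XPath API to interpolate only selected version elements and leaves a <version> inside a plugin's <configuration> alone. The class name, the list of paths, the releaseVersion property, and the namespace-free POM are all assumptions made for illustration; only whole-value expressions are handled.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only; not the code shipped in Maven.
final class TargetedVersionInterpolation {
    // Assumed set of coordinate locations; the real list is not given in this document.
    static final String[] VERSION_PATHS = {
        "/project/version",
        "/project/parent/version",
        "/project/dependencies/dependency/version",
        "/project/build/plugins/plugin/version"
    };

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse POM", e);
        }
    }

    /** Resolves whole-value ${name} expressions, but only at the targeted paths. */
    static void resolveVersions(Document pom, Map<String, String> properties) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            for (String path : VERSION_PATHS) {
                NodeList nodes = (NodeList) xpath.evaluate(path, pom, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    Node node = nodes.item(i);
                    String text = node.getTextContent().trim();
                    if (text.startsWith("${") && text.endsWith("}")) {
                        String value = properties.get(text.substring(2, text.length() - 1));
                        if (value != null) {
                            node.setTextContent(value);
                        }
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate XPath", e);
        }
    }

    /** Returns the text of the first node matching the path. */
    static String text(Document pom, String path) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(path, pom);
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<project><version>${releaseVersion}</version><build><plugins><plugin>"
                + "<artifactId>example-plugin</artifactId><version>${releaseVersion}</version>"
                + "<configuration><version>${releaseVersion}</version></configuration>"
                + "</plugin></plugins></build></project>";
        Document pom = parse(xml);
        resolveVersions(pom, Collections.singletonMap("releaseVersion", "1.0"));

        System.out.println(text(pom, "/project/version"));
        System.out.println(text(pom, "/project/build/plugins/plugin/configuration/version"));
    }
}
```

A plain string search for <version>*</version> would have rewritten all three elements in this POM, which is the behavior reported in MNG-4140.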

h2. Current Problem

  In spite of the aforementioned problems that are solved by the VersionExpressionTransformation, this strategy of modifying POMs on install or deploy has introduced a new bug. Any plugin that produces an artifact or artifact metadata that is derived from the POM will be based on the unmodified file which hasn't had its version expressions interpolated. In cases like the shade plugin (which modifies the dependency specification of the POM) or the gpg plugin (which produces a signature of the unmodified POM), this will cause the plugin to produce incorrect metadata. Note that the shade plugin is reported as an issue here, but I'm personally not convinced that it is, since the shade plugin should be modifying the POM to reduce the dependencies list or modify their scope...which should probably happen *ahead* of any eventual dependency interpolation (or, at least, it shouldn't really matter which comes first).
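To make the GPG case concrete, here is a minimal sketch that uses a SHA-256 digest as a stand-in for a real signature (the gpg plugin does not work this way internally; the class name and POM strings are hypothetical). It only shows that anything computed over the unmodified POM no longer matches once the expression has been resolved.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; a digest stands in for a GPG signature.
final class StaleSignatureSketch {
    /** Returns the SHA-256 digest of the content as lowercase hex. */
    static String sha256Hex(String content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        String original = "<project><version>${releaseVersion}</version></project>";
        String transformed = "<project><version>1.0</version></project>";

        // "Signed" while the POM still contained the expression.
        String signedDigest = sha256Hex(original);

        // The file that actually gets installed or deployed is the transformed one.
        boolean stillValid = signedDigest.equals(sha256Hex(transformed));
        System.out.println("signature still matches deployed POM: " + stillValid);
    }
}
```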

h2. Some Alternative Solutions

  In order to accommodate plugins that need an accurate POM file *before* they execute, it's important that we take one of three approaches:

h3. Plugin-Contributed ArtifactTransformations.

  One possibility is to provide a mechanism by which plugins could contribute their own ArtifactTransformation implementations, and to guarantee the order in which ArtifactTransformations run. This would allow the GPG plugin to introduce its own signing transformation, which could attach the signature to the artifact *after* the version expressions have been resolved by an earlier ArtifactTransformation.

  IMO, this approach has a couple of serious problems. First, it would complicate classloading for any such plugin: either the plugin must declare itself with <extensions>true</extensions>, or the ArtifactTransformationManager must somehow consume the classpath of every transformation added to it, holding those jars in a classloader that could outlive the plugin that contributed them. That, in turn, means the configuration and component lifecycle of such a transformation could be complicated to maintain.

  The other major problem with this strategy is that it puts the burden on the plugin developer to understand and manipulate the core build process in Maven. Not only would the GPG plugin have to know how to sign the POM file, but it would have to know about the artifact-transformation process in order to introduce a component to actually execute the plugin. IMO, this puts far too much responsibility on the plugin developer to know the intimate details of Maven's core, and it also violates the assumptions that a plugin developer has about Maven delivering accurate information for it to read or manipulate. In addition, it's important to remember that Maven 3 is coming down the line, and we need to think twice before introducing new behaviors into plugins that will eventually have to make the transition to the new Maven. Artifact resolution in particular has been a hotspot for discussion related to this switch, so we need to maintain a stable API and set of behaviors in the 2.x code to help interoperability.
    
h3. Pre-Process All POMs.

  Another way we could solve this issue is to pre-process all POMs, interpolating version expressions and writing a modified POM to the target directory for subsequent use in the build process. The project representation in memory would be unaltered except for the MavenProject.file variable pointing to the modified POM instead of the original, the artifact installation/deployment process would be unaltered, and most plugins would be unaffected.
  
  However, any plugin that reads the POM directly for whatever reason (often this is the most reliable way to get at the original, unmodified POM content) would read the original file that still contains the version expressions, which is different from the file that the entire rest of the build uses. As a workaround, it might be possible to add something like a {code}readRawPOM( File ){code} method to the MavenProjectBuilder component, to allow plugins to read the modified POM seamlessly instead of the original.
    
  IMO, this isn't much of a solution. First, it means that the core build process in Maven - even without a single plugin executing - would modify the project directory by generating a modified POM file. This file would be generated whether or not the POM contains version expressions, and whether or not the build needs the modified POM (the clean lifecycle likely wouldn't). In many cases, this would amount to a performance penalty for IDEs and the like without providing any benefit. Finally, we would still have to re-release any plugin that might get in trouble by reading the POM file directly, in order to take advantage of the newly generated POM file. This has consequences for compatibility, both forward and backward, since Maven < 2.2.0 doesn't provide this file, and (presumably) neither will Maven 3.0.

h2. My Proposed Solution
    
  An alternative to the above, and the approach I much prefer, is to inject a new plugin into the default lifecycle mappings to do the POM transformation during the package phase. Unfortunately, the only manipulation of runtime build files/information that Maven does outside of executing plugins happens before the first plugin is executed. Then, each plugin is executed sequentially with only minimal core-level code running in between. The problem here is that it's no simple matter to inject core-level behavior at the package phase of the build lifecycle. In order to minimize the impact to core components like the LifecycleExecutor while still providing an accurate POM to plugins like GPG that typically execute relatively late in the lifecycle, the simplest solution is to use a plugin.
  
  The POM transformation plugin should probably be embedded in the Maven core if possible, and bound into every lifecycle. The binding can be done in the lifecycle-mapping components - which means that any custom lifecycle mapping would have to be modified to include it - or it could be injected just before the lifecycle is executed. IMO, the best-case scenario would allow the plugin manager to load this new plugin from the core classloader, allowing it to be embedded directly in Maven itself. Finally, it would be good, if possible, to allow POMs to override the execution of this plugin, either by specifying a skip flag to turn it off or by changing the <phase/> element to remap it into an earlier lifecycle phase if necessary.