Version-Expression Transformations in Maven 2.2+

Relevant JIRA Issues

Main

Similar

Use cases and correct behavior

coordinate expression use cases

jar packaging

The main concern here is how transitive dependencies will be resolved. Dependency POMs are interpolated during the consumer's build, which could result in invalid artifact references, or at least in references that differ from those in effect when the jar dependency was built.

NOTE: This is a similar scenario to artifacts that don't specify their version in a locked-down range...i.e. they are suggestions, not absolute requirements.

parent POM dependencies

The main concern here is maintaining dynamic artifact coordinates in dependencyManagement, plugins, etc. so that property references can be overridden by child POMs. For example:

  • specifying <mavenVersion> property for a suite of maven-related deps in dependencyManagement, which specify: <version>${mavenVersion}</version>
  • the parent POM may supply a basic default of: <mavenVersion>2.0.9</mavenVersion>
  • If a child POM overrides with: <mavenVersion>2.1.0</mavenVersion>, it should be able to make use of all the maven artifacts specified in the dependencyManagement of the parent with the newly-overridden version.
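
A minimal sketch of this arrangement follows. The coordinates com.example:example-parent and com.example:example-child are hypothetical, and maven-core merely stands in for the suite of Maven-related dependencies:

```xml
<!-- Parent POM (hypothetical coordinates): supplies the default version -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>example-parent</artifactId>
  <version>1.0</version>
  <packaging>pom</packaging>

  <properties>
    <!-- basic default; child POMs may override it -->
    <mavenVersion>2.0.9</mavenVersion>
  </properties>

  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.maven</groupId>
        <artifactId>maven-core</artifactId>
        <!-- must remain an expression for the override to work -->
        <version>${mavenVersion}</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>

<!-- Child POM (hypothetical coordinates): overrides the property only -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>example-parent</artifactId>
    <version>1.0</version>
  </parent>
  <artifactId>example-child</artifactId>

  <properties>
    <mavenVersion>2.1.0</mavenVersion>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.maven</groupId>
      <artifactId>maven-core</artifactId>
      <!-- no version here: it is taken from the parent's dependencyManagement -->
    </dependency>
  </dependencies>
</project>
```

If the ${mavenVersion} expression in the parent were replaced with 2.0.9 when the parent was installed or deployed, the child's override would no longer affect the managed version.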

The pom packaging also shares the same concerns as jar packaging with respect to transitive dependencies.

environment variables and other user-specific expressions

These will evaluate to the local value. For instance:

  • os.name
  • java.version
  • user.dir
  • user.home
  • etc.

If a POM uses these in expressions for artifact coordinates, the resulting artifact references may resolve only in certain environments. For example, if a coordinate embeds the JDK version, and artifacts exist with '1.5' and '1.4' in their coordinates but none with '1.6', then the build will fail in a JDK 1.6 environment.
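
As an illustration only, a dependency of this kind might look like the fragment below. The artifact com.example:example-native-support is hypothetical, and java.specification.version is used here as an assumption because it yields a short value such as 1.6; the same concern applies to any of the properties listed above:

```xml
<dependency>
  <groupId>com.example</groupId>
  <artifactId>example-native-support</artifactId>
  <version>1.0</version>
  <!-- evaluates against the local JVM, e.g. jdk1.5 on one machine and jdk1.6 on another -->
  <classifier>jdk${java.specification.version}</classifier>
</dependency>
```

Whether this resolves depends entirely on which classifiers were actually published, not on anything recorded in the POM itself.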

properties in profiles

These may change the artifact coordinate(s) according to which profile is active, or whether the default properties (in the main section of the POM) are used.

When loading POMs from the repository:

  • settings profiles are NOT activated here
  • profiles.xml profiles are NOT activated here
  • profiles specified in -P cli options are NOT activated here
  • only those that trigger based on the following criteria will be activated:
    • system properties
    • cli-specified properties
    • other non-property activators
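
A sketch of a POM whose dependency version depends on a property-activated profile. All coordinates, the profile id, and the property names (example.lib.version, example.legacy) are hypothetical:

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>example-app</artifactId>
  <version>1.0</version>

  <properties>
    <!-- default value, used when no profile overrides it -->
    <example.lib.version>1.0</example.lib.version>
  </properties>

  <profiles>
    <profile>
      <id>legacy</id>
      <activation>
        <!-- triggers on a system or cli-specified property -->
        <property>
          <name>example.legacy</name>
          <value>true</value>
        </property>
      </activation>
      <properties>
        <example.lib.version>0.9</example.lib.version>
      </properties>
    </profile>
  </profiles>

  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>example-lib</artifactId>
      <version>${example.lib.version}</version>
    </dependency>
  </dependencies>
</project>
```

Under the rules above, a consumer that passes -Dexample.legacy=true would see this POM's dependency resolve to 0.9 when it is loaded from the repository, whereas naming the profile with -P legacy would have no effect there.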

plugin requirements for POM information

release plugin

  • invoked directly from cli, outside of any lifecycle
    • invokes different lifecycle builds on the project, but currently uses separate java processes to do so
  • MUST modify the original pom.xml file as it exists on disk, WITHOUT ANY OTHER MANIPULATIONS
  • must NOT have new files added that are then referenced from project.file
    • this means that transforming the original POM file and writing it to a new location, which is then set on project.file, WILL NOT WORK

enforcer plugin

  • normally bound to the lifecycle
  • MAY require access to unaltered, original POM in order to execute rules

gpg plugin

  • normally bound to the lifecycle
  • requires access to the POM file that will eventually be installed or deployed. This file MUST NOT be changed after GPG runs.

shade plugin

  • normally bound to the lifecycle
  • requires access to project.originalModel, which MUST reflect the information in the POM file that will be installed or deployed.
    • this is necessary for the shade plugin to be able to generate a dependency-reduced POM.

Implementation strategies

2.0.10

Expressions in artifact coordinates are ignored. Users have plenty of rope with which to hang themselves.

2.1.0

This was the first attempt to clean up the coordinate values in POMs before installing/deploying. Obviously, we didn't really understand the scope of the problem at this point.

  • attempted to resolve artifact versions to concrete terms during install/deploy process
  • implemented as an ArtifactTransformation, just like snapshot handling.
    • incidentally, snapshot handling violates the requirements for the shade plugin AND the gpg plugin.

Problems:

  • also modifies plugin configurations where the element '<version>' is used
  • fails to account for activated profiles that may supply/change interpolation values
  • only accounts for artifact versions, not artifactId, groupId, classifier, or type
  • modifies the POM after it has been signed by GPG, making the signature worthless
  • modifies the POM without reflecting the new information in the originalModel for the shade plugin to use
    • transformation happens too late for shade plugin anyway, though

This breaks legitimate use cases for expressions in artifact coordinates, like those detailed in the 'pom' and 'jar' packaging scenarios above.

2.2.0-current

In the latest attempt to resolve artifact coordinate expressions, the solution from 2.1.0 has been:

  • generalized to look at all artifact fields (groupId, artifactId, version, classifier, type)
  • moved into DefaultMavenProjectBuilder, to be run just before a project is returned to the build process
    • this makes the transformed artifact coordinate information available in POM-file form to all plugins in the build process, such as GPG
    • it also means that the POM transformation happens AT ALL TIMES

Problems:

  • release plugin tries to add the transformed version of the POM as a new file to SCM, since that's the POM file referenced from the project instance
  • shade plugin still cannot gain access to transformed information, since it's not reflected in project.originalModel
    • since the transformed information isn't original, this may not be appropriate anyway, though...

This breaks legitimate use cases for expressions in artifact coordinates, like those detailed in the 'pom' and 'jar' packaging scenarios above.

2.2.0-final

For this release, we're probably going to have to reverse course and remove all POM transformation code. We need a more comprehensive design review, and much more planning on how to introduce this sort of feature without breaking the use cases above.

  • legitimate/safe coordinate expressions should be supported
  • any transformation must be reflected in all locations that plugins look for the information
    • either that, or the plugins must migrate to any new API we put in place to support coordinate transformation

Original Document

Relevant JIRAs

Background

In Maven 2.0.x, POMs that used an expression when specifying a version in a dependency, plugin, or the project itself (or its parent) resulted in those expressions being preserved in the POM that was installed or deployed. This leads to a variety of problems, the most common of which seems to arise when the expression resolves to a value derived from either the command line or the larger build environment (os.name for instance, or user.name). In order to make these POMs truly reflect the build environment from which they were deployed - and also to make them less apt to change in unpredictable ways based on the user's environment - version expressions needed to be resolved in any POM that gets installed or deployed.

This has been a particular problem for version elements, but in principle it's possible that it could be a problem for any element that forms part of an artifact coordinate, including groupId, artifactId, version, type/packaging, and classifier.

Initial Solution

Our initial attempt at solving this problem was to introduce a new ArtifactTransformation that runs each time an artifact is installed or deployed into a repository. The new transformation is managed - alongside things like the SnapshotTransformation - from the ArtifactTransformationManager, and is therefore invisible to basically everything outside of maven-artifact/maven-artifact-manager. This new ArtifactTransformation implementation (called VersionExpressionTransformation) performs a tightly focused interpolation step on the POM that's attached to an artifact via a ProjectArtifactMetadata instance. The first version of this transformation, released in Maven 2.1.0, simply did a string search for <version>*</version>, interpolated the element value, and replaced the original value in the POM content. Finally, the modified POM was written to ${project.build.directory}/pom-transformed.xml and the new file replaced the old in the ProjectArtifactMetadata instance so it would be picked up during artifact installation or deployment.

Because of MNG-4140 (where <version/> elements in plugin configurations were being interpolated inappropriately in places such as parent POMs), the above approach has been revised to use targeted XPath expressions to isolate the element values to interpolate. The essential strategy remains the same, however.
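
To illustrate the distinction, consider the fragment below. The plugin com.example:example-report-plugin, its configuration parameter, and both property names are hypothetical, and the paths named in the comments are only descriptive, not the XPath expressions the implementation actually uses:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>com.example</groupId>
      <artifactId>example-report-plugin</artifactId>
      <!-- artifact coordinate (build/plugins/plugin/version): a legitimate target -->
      <version>${example.report.plugin.version}</version>
      <configuration>
        <!-- plugin parameter that happens to be named "version": not a coordinate,
             yet a plain string search for the version element also matches it -->
        <version>${example.report.format}</version>
      </configuration>
    </plugin>
  </plugins>
</build>
```

A string search treats both elements alike, while a path-based selection can pick out the first and leave the configuration untouched.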

Current Problem

In spite of the aforementioned problems that are solved by the VersionExpressionTransformation, this strategy of modifying POMs on install or deploy has introduced a new bug. Any plugin that produces an artifact or artifact metadata that is derived from the POM will be based on the unmodified file which hasn't had its version expressions interpolated. In cases like the shade plugin (which modifies the dependency specification of the POM) or the gpg plugin (which produces a signature of the unmodified POM), this will cause the plugin to produce incorrect metadata. Note that the shade plugin is reported as an issue here, but I'm personally not convinced that it is, since the shade plugin should be modifying the POM to reduce the dependencies list or modify their scope...which should probably happen ahead of any eventual dependency interpolation (or, at least, it shouldn't really matter which comes first).

Some Alternative Solutions

In order to accommodate plugins that need an accurate POM file before they execute, it's important that we take one of three approaches:

Plugin-Contributed ArtifactTransformations.

One possibility is to provide a mechanism by which plugins could contribute their own ArtifactTransformation implementations, and guarantee the order of operations for ArtifactTransformations. This would allow the GPG plugin to introduce its own signing transformation, which could attach the signature to the artifact after the version expressions have been transformed by a previous ArtifactTransformation.

IMO, this approach has a couple of serious problems. First, it would complicate the classloading for any such plugins, forcing them either to declare themselves with <extensions>true</extensions>, or else forcing the ArtifactTransformationManager to somehow consume the classpath of any transformations that are added to it, including jars in a classloader that could outlive the plugin that contributed them. That, in turn, means the configuration and component lifecycle of such a transformation could be complicated to maintain.
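
For reference, the extensions declaration mentioned above would look roughly like this in a consuming POM. This is only a sketch of the existing <extensions> flag applied to the GPG plugin; it assumes a transformation-contributing version of that plugin, which does not exist:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-gpg-plugin</artifactId>
      <!-- loads the plugin's components as build extensions, so that a
           contributed transformation would be visible to the core -->
      <extensions>true</extensions>
    </plugin>
  </plugins>
</build>
```

Every project that signs its artifacts would need to carry this flag, which is part of the burden described here.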

The other major problem with this strategy is that it puts the burden on the plugin developer to understand and manipulate the core build process in Maven. Not only would the GPG plugin have to know how to sign the POM file, but it would also have to know about the artifact-transformation process in order to introduce a component to actually execute the plugin. IMO, this puts far too much responsibility on the plugin developer to know the intimate details of Maven's core, and it also violates the assumptions that a plugin developer has about Maven delivering accurate information for the plugin to read or manipulate. In addition, it's important to remember that Maven 3 is coming down the line, and we need to think twice before introducing new behaviors into plugins that will eventually have to make the transition to the new Maven. Artifact resolution in particular has been a hotspot for discussion related to this switch, so we need to maintain a stable API and set of behaviors in the 2.x code to help interoperability.

Pre-Process All POMs.

Another way we could solve this issue is to pre-process all POMs, interpolating version expressions and writing a modified POM to the target directory for subsequent use in the build process. The project representation in memory would be unaltered except for the MavenProject.file variable pointing to the modified POM instead of the original, the artifact installation/deployment process would be unaltered, and most plugins would be unaffected.

However, any plugin that reads the POM directly for whatever reason (often this is the most reliable way to get at the original, unmodified POM content) would read the original file that still contains the version expressions, which is different from the file that the entire rest of the build uses. As a workaround, it might be possible to add something like a

readRawPOM( File )

method to the MavenProjectBuilder component, to allow plugins to read the modified POM seamlessly instead of the original.

IMO, this isn't much of a solution. First, it means that the core build process in Maven - even without a single plugin executing - would modify the project directory by generating a modified POM file. This file would be generated whether or not the POM even contains version expressions, and whether or not the build needs the modified POM (the clean lifecycle likely wouldn't). In many cases, this would amount to a performance penalty for IDEs and the like without providing any benefit. Finally, we would still have to re-release any plugin that might get in trouble by reading the POM file directly, in order to take advantage of the newly generated POM file. This has consequences for compatibility, both forward and backward, since Maven < 2.2.0 doesn't provide this file, and (presumably) neither will Maven 3.0.

My Proposed Solution

An alternative to the above, and the approach I much prefer, is to inject a new plugin into the default lifecycle mappings to do the POM transformation during the package phase. Unfortunately, the only manipulation of runtime build files/information that Maven does outside of executing plugins happens before the first plugin is executed. Then, each plugin is executed sequentially with only minimal core-level code running in between. The problem here is that it's no simple matter to inject core-level behavior at the package phase of the build lifecycle. In order to minimize the impact to core components like the LifecycleExecutor while still providing an accurate POM to plugins like GPG that typically execute relatively late in the lifecycle, the simplest solution is to use a plugin.

The POM transformation plugin should probably be embedded in the Maven core if possible, and bound into every lifecycle. The binding can be done in the lifecycle-mapping components - which means that any custom lifecycle mapping would have to be modified to include it - or it could be injected just before the lifecycle is executed. IMO, the best-case scenario would allow the plugin manager to load this new plugin from the core classloader, allowing it to be embedded directly in Maven itself. Finally, it would be good, if possible, to allow POMs to override the execution of this plugin, either by specifying a skip flag to turn it off or by changing the <phase/> element to remap it into an earlier lifecycle phase if necessary.
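
A sketch of what such an override might look like in a project POM. Everything here is hypothetical: no such plugin exists, and the artifactId maven-pom-transform-plugin, the goal name transform, the execution id, and the skip parameter are invented purely for illustration:

```xml
<build>
  <plugins>
    <plugin>
      <!-- hypothetical plugin coordinates, for illustration only -->
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pom-transform-plugin</artifactId>
      <executions>
        <execution>
          <!-- hypothetical id of the execution injected by the lifecycle mapping -->
          <id>default-transform</id>
          <!-- remap from package to an earlier phase -->
          <phase>process-resources</phase>
          <goals>
            <goal>transform</goal>
          </goals>
          <configuration>
            <!-- hypothetical flag; set to true to turn the transformation off -->
            <skip>false</skip>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Reusing the ordinary execution and configuration elements would keep the override within mechanisms that POM authors already know.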
