( short link: https://s.apache.org/reproducible-builds )
See also the Maven - Guide to Configuring for Reproducible Builds
Context
https://reproducible-builds.org/ (see mailing list)
Reproducible builds are a set of software development practices that create a verifiable path from human readable source code to the binary code used by computers.
How?
First, the build system needs to be made entirely deterministic: transforming a given source must always create the same result. Typically, the current date and time must not be recorded and output always has to be written in the same order.
Second, the set of tools used to perform the build and more generally the build environment should either be recorded or pre-defined.
Third, users should be given a way to recreate a close enough build environment, perform the build process, and verify that the output matches the original build.
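The third step, checking that a rebuilt output matches the original, can be sketched as a byte-for-byte comparison through a digest. This is only a minimal illustration: the class name `ArtifactVerifier` and its methods are hypothetical and not part of Maven or any plugin mentioned here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (not a Maven API): compares two build outputs by SHA-256 digest. */
final class ArtifactVerifier {
    private ArtifactVerifier() {}

    /** Returns the lowercase hexadecimal SHA-256 digest of the given content. */
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True when the rebuilt artifact is bit-for-bit identical to the reference one. */
    static boolean sameOutput(byte[] reference, byte[] rebuilt) {
        return sha256Hex(reference).equals(sha256Hex(rebuilt));
    }
}
```

In practice the two byte arrays would come from the published artifact and from the locally rebuilt one, for example read with `java.nio.file.Files.readAllBytes`.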
...
But Maven plugins in the whole ecosystem (not only those provided by the Apache Maven team) sometimes add variable parts that make the problem worse: timestamp text or username in MANIFEST.MF, ...
In 2015, reproducible-build-maven-plugin was created to try to fix issues after packaging, by rewriting the archive and reworking content known to contain variable parts.
The goal of this proposal, started in 2017, is to prepare a set of configurations and practices to get reproducible/verifiable builds at packaging time, both by enhancing Java's natural build behaviour and by removing variability introduced by some Maven plugins (core plugins at first, but also plugins in the wider Maven ecosystem).
In parallel to this proposal, the "Reproducible Maven Builds" site was created to work on prototypes.
...
Sources of unreproducible bits
- Timestamps:
  - Timestamps in ZIP/JAR files: file last modification time/date in central directory and file entry headers + possible optional fields "X5455_ExtendedTimestamp" (plexus-archiver PR #121)
  - Timestamp in pom.properties generated by maven-archiver (MSHARED-494)
  - Timestamp in plugin.xml and plugin-help.xml descriptors generated by maven-plugin-tools-generator (MPLUGIN-326)
  - Timestamp in MANIFEST.MF (Bnd-LastModified) generated by Felix maven-bundle-plugin
  - Timestamps in generated javadoc HTML files (can be disabled with javadoc options "notimestamp" and "bottom")
  - Timestamps in bytecode generated from Groovy code (added by GroovyClassLoader.addTimeStamp())
- Username:
  - UID/GID in tar file entries
  - Username in MANIFEST.MF (Built-By) generated by maven-archiver (MSHARED-661)
- Ordering:
  - Order of the file entries in a ZIP/JAR file (depends on file system order)
  - Order of the entries in the MANIFEST (MSHARED-511)
  - Order of goals in plugin.xml generated by maven-plugin-tools (MPLUGIN-261)
  - Order of the methods of the ObjectFactory.java file generated by JAXB/xjc (JAXB-598)
  - Order of components in META-INF/plexus/components.xml generated by plexus metadata (issue #8)
  - ...
- Tools versions:
  - exact JDK version used to build in MANIFEST.MF (Build-Jdk) generated by maven-archiver (MSHARED-797)
    Notice that keeping the major version of the JDK used still makes sense, since it has an influence on generated bytecode: with the same source code and defined -target version, javac from JDK 6, 7, 8, ... does not produce the same bytecode. If we want to isolate the generated binary from the JDK used, the compiler must not be the javac provided by the running JDK (see Using Non-Javac Compilers)
  - exact Maven version used to build in MANIFEST.MF (Created-By) generated by maven-archiver (MSHARED-799)
  - exact Maven version used to build in META-INF/.../pom.properties generated by maven-archiver (MSHARED-800)
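Two of the sources listed above, entry timestamps and entry order in ZIP/JAR files, can be neutralised when the archive is written. The following is a minimal sketch using only `java.util.zip`; the class `ReproducibleZip` is hypothetical and is not how plexus-archiver implements it. It assumes Java 9+ for `ZipEntry.setTimeLocal` and an entry time between 1980 and 2107, so that only the DOS time field is written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.time.LocalDateTime;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch (not the plexus-archiver implementation) of a reproducible ZIP writer. */
final class ReproducibleZip {
    private ReproducibleZip() {}

    /** Writes the files sorted by name, all carrying the same fixed entry time. */
    static byte[] write(Map<String, byte[]> files, LocalDateTime entryTime) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // TreeMap iterates in key order, so the entry order no longer
            // depends on file system or insertion order.
            for (Map.Entry<String, byte[]> file : new TreeMap<>(files).entrySet()) {
                ZipEntry entry = new ZipEntry(file.getKey());
                // setTimeLocal stores the date/time fields directly in the DOS time
                // field, without converting through the default time zone.
                entry.setTimeLocal(entryTime);
                zip.putNextEntry(entry);
                zip.write(file.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

With this approach, two runs given the same content produce identical bytes regardless of the order in which the files were discovered, as long as the same JDK compression implementation is used.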
Line endings are also a problem: even if we could force given line endings for build-generated text files (MANIFEST, pom.properties...), it would be hazardous to try to change the line endings of the resource files.
Out of scope
Given the variety of sources of unreproducible builds, and balancing their impact against the complexity of fixing them, a few are considered out of scope of this proposal. Once reproducible builds work well within the chosen limitations, and if they prove successful with users, these limitations can be revisited later:
- version ranges in pom.xml: version ranges make version resolution unstable over time. This proposal starts from a stable build.
  Notice that some nice strategies have been discussed on how to introduce stability while maintaining version ranges: see the discussion on Maven dev mailing list...
- line ending (Windows CRLF vs Unix LF): updating plugins that generate content can be easy, but this will require normalizing line endings of resource files, which may be hazardous.
  Notice that building with -Dline.separator='\n' is an easy first step
- JDK version: from initial tests, only the major version has an impact, which makes setting up an environment for a reproducible build manageable.
  Future strategies on easing rebuild management could consider using another compiler than javac, that could be downloaded as a plugin dependency...
Output Archive Entries Timestamp
Packaging plugins that create zip or tar archives will require a parameter defining the timestamp to use for archive entries, independently of the effective build timestamp. This is equivalent to the Reproducible Builds SOURCE_DATE_EPOCH environment variable.
Life would be easier if there were a dedicated POM element like ${project.build.outputTimestamp} (with an ISO 8601 formatted date+time) which could be used to specify the timestamp value once for the entire project. Every plugin could use it as its default value, as has been done with source file encoding:
|
Adding this element to the POM structure without breaking backward compatibility can only happen in a future version, yet to be defined (at least after Maven 3.0, see POM Model Version 5 proposal):
|
For Maven 2.x and 3.x, the value can be defined as an equivalent property:
|
Thus plugins could immediately be modified to use ${project.build.outputTimestamp} as their default value, whatever Maven version is used.
MSHARED-837 has been created to provide plugins with a shared API to parse the timestamp and configure reproducible archive creation in a uniform way.
MRELEASE-1029 has been created to update the timestamp value during release:prepare.
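To illustrate what parsing such a timestamp value could look like, here is a minimal sketch. It is not the MSHARED-837 API: the class `OutputTimestamp` is hypothetical, and accepting a plain integer as seconds since the epoch (in the style of SOURCE_DATE_EPOCH) is an assumption made for this example, in addition to the ISO 8601 format described above.

```java
import java.time.Instant;
import java.time.OffsetDateTime;

/** Hypothetical parser for an outputTimestamp value; not the MSHARED-837 API. */
final class OutputTimestamp {
    private OutputTimestamp() {}

    /**
     * Parses an ISO 8601 date+time with offset, such as 2020-01-01T00:00:00Z.
     * Assumption for this sketch: a value made only of digits is read as
     * seconds since the epoch, like SOURCE_DATE_EPOCH.
     */
    static Instant parse(String value) {
        String trimmed = value.trim();
        if (trimmed.matches("\\d+")) {
            return Instant.ofEpochSecond(Long.parseLong(trimmed));
        }
        // Throws DateTimeParseException when the value is not ISO 8601 with an offset.
        return OffsetDateTime.parse(trimmed).toInstant();
    }
}
```

Converting to an `Instant` removes the offset from the picture, so `2020-01-01T01:00:00+01:00` and `2020-01-01T00:00:00Z` resolve to the same value on every machine.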
Rebuilding
The underlying problem is that the pom file does not capture all the configuration of the build environment: it includes the plugins used during the build with their version number, but it does not include the version of Maven and the JDK, the Operating System and the architecture used to produce the artifact, etc...
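As a rough illustration of recording the parts of the environment that the pom file does not capture, the sketch below collects a few standard Java system properties and renders them in sorted key order. The class `BuildEnvironment`, the chosen keys and the `maven.version` entry name are assumptions for this example and do not describe an existing Maven file format. It deliberately avoids `java.util.Properties.store`, which adds a date comment to its output.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical recorder of build environment details; not an existing Maven format. */
final class BuildEnvironment {
    private BuildEnvironment() {}

    /** Captures JDK, operating system and architecture, plus the given Maven version. */
    static Map<String, String> capture(String mavenVersion) {
        Map<String, String> env = new TreeMap<>();
        for (String key : new String[] {"java.version", "java.vendor", "os.name", "os.arch"}) {
            env.put(key, System.getProperty(key, "unknown"));
        }
        // The Maven version is passed in by the caller; the key name is illustrative.
        env.put("maven.version", mavenVersion);
        return env;
    }

    /** Renders key=value lines sorted by key, with LF line endings and no timestamp. */
    static String render(Map<String, String> env) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : new TreeMap<>(env).entrySet()) {
            out.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }
}
```

A rebuilder could compare such a record with its own environment before attempting to reproduce the artifact.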
...
What are the issues to solve?
issue tracking | description | ||||
---|---|---|---|---|---|
MSHARED-661 ( maven-archiver 3.4.0) | META-INF/MANIFEST.MF: adds "Built-By" and "Build-Jdk" Manifest entries | ||||
MSHARED-796 ( maven-archiver 3.4.0) | META-INF/MANIFEST.MF | ||||
MSHARED-494 ( maven-archiver 3.1.0) | Timestamp in META-INF/maven/$groupId/$artifactId/pom.properties | ||||
support SOURCE_DATE_EPOCH environment variable or equivalent: see https://reproducible-builds.org/docs/timestamps/ | |||||
MSHARED-800 | META-INF/maven/$groupId/$artifactId/pom.properties | ||||
sort zip entries to make zip entries order reproducible | |||||
MPLUGIN-261 ( maven-plugin-plugin 3.3) | generated META-INF/maven/plugin.xml is non-deterministic | ||||
MPLUGIN-326 ( maven-plugin-plugin 3.5.1) | Timestamp in META-INF/maven/plugin.xml and META-INF/maven/$groupId/$artifactId/plugin-help.xml xml descriptors generated by maven-plugin-tools-generator | ||||
plexus-containers issue #8 ( plexus-component-metadata 2.0.0) | META-INF/plexus/components.xml | ||||
plexus-containers issue #27 ( plexus-component-metadata 2.1.0) | META-INF/plexus/components.xml | ||||
bnd-maven-plugin #3521 ( bnd-maven-plugin configuration) | META-INF/MANIFEST.MF | ||||
FELIX-6269 ( maven-bundle-plugin:manifest & bundle 4.2.2) | META-INF/MANIFEST.MF | ||||
FELIX-6203 ( maven-bundle-plugin:bundle 4.2.2) | META-INF/maven/$groupId/$artifactId/pom.properties | ||||
sisu-maven-plugin 5e2377c ( sisu.inject 0.3.4) | META-INF/sisu/javax.inject.Named | ||||
sisu-maven-plugin annotation processor 570e81 ( sisu.inject 0.9.0.M1) | META-INF/sisu/javax.inject.Named | ||||
MRRESOURCES-114 ( maven-remote-resources-plugin 1.7.0) | projectTimespan, as often printed in META-INF/NOTICE | ||||
JDK-8240734 (JDK 15) | module-info.class | ||||
zip entries timestamp and order | |||||
COMPRESS-485 ( commons-compress 1.19) | keep entries order when gathering ParallelScatterZipCreator | ||||
codehaus-plexus/plexus-archiver issue #48 | avoid timestamp issues in archives created by plexus-archiver (widely used in Maven plugins creating jar, zip, war, tar... archives) | ||||
plexus-archiver issue #114 ( plexus-archiver 4.2.0) | To enable reproducible builds `AbstractArchiver#addFileSet` should add the files in order | ||||
MSHARED-837 ( maven-archiver 3.5.0) | support => see "Output Archive Entries Timestamp" section of the proposal | ||||
plexus-archiver #271 ( plexus | -containers-archiver 2.7.0) | remove variation based on user's umask on Unixes (particularly group write) | |||
plexus-archiver #124 ( plexus-archiver 4.2.0) | remove variation based on uid/gid & userName/groupName in tar | ||||
MSOURCES-120 ( maven-source-plugin 3.2.0) | apply reproducible zip (entries order and timestamp) to maven-source-plugin | ||||
MSOURCES-137 ( maven-source-plugin 3.3.1) | apply 022 umask (to ease RB check: independent from env) | ||||
MASSEMBLY-921 ( maven-assembly-plugin 3.2.0) | apply reproducible archive (entries order and timestamp) to maven-assembly-plugin | ||||
MASSEMBLY-989 ( maven-assembly-plugin 3.6.0) | apply 022 umask (to ease RB check: independent from env) | ||||
MJAR-263 ( maven-jar-plugin 3.2.0) | apply reproducible zip (entries order and timestamp) to maven-jar-plugin | ||||
MSITE-851 ( maven-site-plugin 3.9.0) | apply reproducible zip (entries order and timestamp) to site:jar | ||||
MJAVADOC-627 ( maven-javadoc-plugin 3.2.0) | apply reproducible zip (entries order and timestamp) to javadoc:*jar | ||||
MSHADE-347 ( maven-shade-plugin 3.2.2) | apply reproducible zip (entries order and timestamp) to shade:shade | ||||
MSHADE-352 ( maven-shade-plugin 3.2.3) | keep reproducible timestamp when shading with transformer | ||||
MSHADE-420 | mtime from extra field data from shaded dependency makes result builder's timezone sensitive | ||||
ARCHETYPE-590 ( maven-archetype-plugin 3.2.0) | apply reproducible zip (entries order and timestamp) to archetype:jar | ||||
MWAR-432 ( maven-war-plugin 3.3.0) | apply reproducible zip (entries order and timestamp) to war:jar | ||||
MACR-53 ( maven-acr-plugin 3.2.0) | |||||
MEAR-280 ( maven-ear-plugin 3.1.0) | |||||
MEJB-128 ( maven-ejb-plugin 3.1.0) | |||||
MRAR-86 ( maven-rar-plugin 3.0.0) | |||||
MJLINK-75 ( maven-jlink-plugin 3.0.0) | apply reproducible zip to the zip file created by the plugin | ||||
issues fixed in maven-archiver will have been picked up by 9 other plugins managed by the Apache Maven team (acr, ear, ejb, jlink, rar); other plugins creating zip/jar/tar archives that are managed outside the Apache Maven team will probably need to do the same | |||||
FELIX-6304 ( maven-bundle-plugin:bundle 5.1.1) | order and timestamp of jar entries for bundle goal | ||||
FELIX-6404 ( maven-bundle-plugin 5.1.3) | regression in 5.1.2 | ||||
FELIX-6495 ( maven-bundle-plugin 5.1.4) | bundle:manifest adds Bnd-LastModified even if outputTimestamp is defined | ||||
FELIX-6496 ( maven-bundle-plugin 5.1.5) require bnd #5021(6.2) | non-reproducible Export-Package and Private-Package values | ||||
FELIX-6602 ( maven-bundle-plugin 5.1.9) | non-reproducible Include-Resource entry | ||||
FELIX-6681 | non-reproducible Export-Service entry | ||||
spring-boot-maven-plugin:repackage #20176 ( 2.3.0-M4) | timestamp | ||||
org.jboss.jandex:jandex-maven-plugin #26 ( 1.1.1) | unsorted files (notice: not sure that sorting files at plugin level is sufficient: it seems Jandex indexer itself produces non-reproducible output) | ||||
org.jboss.jandex:jandex-maven-plugin #35 1.2.3 replaced by io.smallrye:jandex-maven-plugin #286 ( 3.1.0) | jandex.idx output is not stable/reproducible | ||||
MJAR-275 (maven-jar-plugin 3.3.0) | outputTimestamp not applied to module-info.class | ||||
moditect-maven-plugin:add-module-info #185 ( 1.0.0.Final) | creates module-info.class with non-reproducible timestamp | ||||
moditect #222 | timestamp is dependent on timezone | ||||
SM-5021 ServiceMix depends-maven-plugin:generate-depends-file (1.5.0) | timestamp in generated depends.txt | ||||
XBEAN-335 Geronimo maven-xbean-plugin | timestamp in generated /META-INF/spring-* (Properties format) | ||||
KARAF-7367 karaf-maven-plugin:kar #1492 ( 4.3.7) | does not take outputTimestamp into account | ||||
quarkus-extension-maven-plugin #38364 | timestamp in generated properties META-INF/quarkus-extension.properties, META-INF/quarkus-javadoc.properties and sources *.jdp | ||||
tracking of plugins with issues and fixes is now also done in artifact:check-buildplan |
Debian approach
Debian has had a strong reproducible builds effort working on the topic for a few years: see BuildinfoFiles for environment info recording.
...
- Maven core management of SOURCE_DATE_EPOCH for maven.build.timestamp property: https://salsa.debian.org/java-team/maven/blob/master/debian/patches/reproducible-build-timestamp.patch
- fix for MPLUGIN-326: https://sources.debian.net/src/maven-plugin-tools/3.5-4/debian/patches/04-reproducible-plugin-descriptor.patch/
- dates in javadoc footer: https://sources.debian.net/src/maven-javadoc-plugin/2.10.4-1/debian/patches/reproducible-footer.patch/
- plexus containers issue #8: https://sources.debian.net/src/plexus-containers1.5/1.7.1-4/debian/patches/03-reproducible-metadata.patch/ and old https://sources.debian.net/src/plexus-maven-plugin/1.3.8-10/debian/patches/0005-reproducible-metadata.patch/
- sisu-inject: https://sources.debian.net/src/sisu-ioc/2.3.0-9/debian/patches/reproducible-index.patch/
...