This page describes the release process, validation, and related aspects of the SDKHarness container images.
Overview
The idea is to publish a set of prebuilt SDKHarness images that users can use to run their portable pipelines without building the images themselves, or use as base images for customization.
Background information
- SDKHarness architecture / design docs ?
- Image structure ?
- Source ?
Location/Naming
This section describes the naming scheme for the containers and the location where they are published.
Proposed repository:
gcr.io/beam, created under the apache-beam-testing project, with artifacts publicly accessible
Proposed tagging scheme:
java-SNAPSHOT
java-2.10.1
python2-SNAPSHOT
python2-2.10.1
go-SNAPSHOT
go-2.10.1
For example:
gcr.io/beam:java-SNAPSHOT
gcr.io/beam:java-2.10.1
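The tagging scheme above can be sketched as a small shell helper. This is only an illustration: the function name `image_ref` and the `REPOSITORY` variable are hypothetical, and the list of SDK names is taken from the proposed tags above.

```shell
# Hypothetical helper: builds an image reference following the proposed scheme.
# REPOSITORY defaults to the proposed gcr.io/beam location (an assumption
# that holds only if the proposal is accepted as written).
REPOSITORY="${REPOSITORY:-gcr.io/beam}"

# image_ref SDK [VERSION]
# SDK is one of java, python2, go; VERSION defaults to SNAPSHOT.
image_ref() {
  local sdk="$1"
  local version="${2:-SNAPSHOT}"
  case "$sdk" in
    java|python2|go) ;;
    *) echo "unknown SDK: $sdk" >&2; return 1 ;;
  esac
  printf '%s:%s-%s\n' "$REPOSITORY" "$sdk" "$version"
}
```

For example, `image_ref python2 2.10.1` prints the release reference for the Python 2 SDK, and `image_ref go` prints the snapshot reference for Go.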
Gcr.io
Things to know about gcr.io:
- Quotas:
- ?
- Permissions:
- download access - public
- publish - limited to authorized accounts that have correct permissions under apache-beam-testing:
- publishing snapshots nightly from a Jenkins job might be feasible, similar to how we currently publish nightly Maven snapshots;
- publishing at release time can be another Jenkins job that is triggered manually by the release owner;
- what is the process for triggering the job in that case?
- GCP Project:
- apache-beam-testing
- Troubleshooting:
- If something goes wrong with the release process:
- ping dev@ ?
- If something goes wrong with a customer pipeline from using the prebuilt images:
- ping dev@ ?
Publication Schedule
Snapshots
- how do we publish the snapshots, and how often?
- automatically whenever HEAD is built, nightly (similar to Maven snapshots), or manually?
- when and how do we clean up the snapshot images:
- each publication creates a new version/hash, while all previous versions remain usable. The old versions take up space and count towards the quota, so we need to clean them up periodically:
- make it part of the publish job to find and delete the versions that are more than X days (or X versions) old?
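The "more than X versions old" variant of the cleanup could be sketched as a filter that picks the versions to delete. This is a sketch under assumptions, not an agreed design: the function name `select_stale` and the input format (one `DIGEST TIMESTAMP` pair per line, with ISO-8601 UTC timestamps) are hypothetical, and the actual listing and deletion steps are left out.

```shell
# select_stale KEEP
# Reads "DIGEST TIMESTAMP" lines on stdin and prints the digests of every
# version except the KEEP newest ones. ISO-8601 UTC timestamps are assumed,
# because they sort chronologically when compared as plain strings.
select_stale() {
  local keep="$1"
  # Newest first, then skip the first KEEP lines and print only the digests.
  sort -k2,2r | tail -n "+$((keep + 1))" | awk '{print $1}'
}
```

A publish job could feed this from a listing command such as `gcloud container images list-tags` and pass the resulting digests to a delete step. That wiring depends on decisions still open on this page.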
Release Images
- how should we build and publish the images for release versions of Beam?
- manually, as part of Beam release;
- or by automatically triggering a job at some step of the release?
- should it be a blocker for the release?
- should we make it part of the release and not mark the release as complete until the images are published?
- or can we publish them in a separate process later/earlier?
- should it be done by the same release owner?
- should the validation be part of the release validation?
Commands
This section describes the commands used to build and publish the images, and to run tests and examples against them.
Prerequisites
- docker;
Build
To build an image X run:
$ ./gradlew .... ?
This produces a local image named X. It can be examined by running docker commands:
$ docker image ls
$ docker ...
Run a test against a locally built container
Portability tests are executed with command Y:
$ ./gradlew ... ?
They run against container Z. To use the prebuilt container run this command:
$ ./gradlew ... ?
Publish
Publishing an image to gcr.io/beam requires permissions in the apache-beam-testing project.
This is the command that publishes an image X:
$ ./gradlew ... ?
To publish an image Y to a custom repository run this command:
$ ./gradlew ... ?
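Independently of whichever Gradle task ends up being documented here, a locally built image can also be retagged and pushed with plain Docker commands. The sketch below assumes that approach. The helper names `run` and `publish_image` and the `DRY_RUN` variable are hypothetical, and pushing still requires that Docker is already authenticated against the target registry.

```shell
# Hypothetical helper, not a Beam Gradle task.
# run CMD...: executes the command, or only prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# publish_image LOCAL_IMAGE TARGET_REF
# Retags a local image under TARGET_REF and pushes it to that repository.
publish_image() {
  local local_image="$1"
  local target_ref="$2"
  run docker tag "$local_image" "$target_ref" &&
    run docker push "$target_ref"
}
```

Running `DRY_RUN=1 publish_image <local image> <target reference>` prints the two Docker commands without executing them, which is a cheap way to check the target reference before pushing.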
Release Images Validation
This section describes how to validate a built and/or published image.
Automated test suites
We have these test suites in Beam that utilize portability:
- ...
- ...
To execute a test suite X against container Y run this command:
$ ./gradlew ... ?
Manual testing
To run a custom pipeline against an image:
$ ./gradlew ... ?
Backwards compatibility
- ?
Other verification
- hash, signature ?
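If hash verification is adopted, one possible check is to compare the digest of a digest-pinned image reference (`repository@sha256:<64 hex characters>`) against a digest recorded at release time. The sketch below assumes that approach. The function names `digest_of` and `verify_digest` are hypothetical, and how the expected digest would be recorded and distributed is still an open question.

```shell
# digest_of REF
# Prints the sha256 digest of a digest-pinned reference, or fails if REF
# is not of the form repository@sha256:<64 lowercase hex characters>.
digest_of() {
  local ref="$1"
  local digest="${ref##*@}"
  if [[ "$ref" == *@* && "$digest" =~ ^sha256:[0-9a-f]{64}$ ]]; then
    printf '%s\n' "$digest"
  else
    return 1
  fi
}

# verify_digest REF EXPECTED_DIGEST
# Succeeds only if REF is digest-pinned and its digest equals EXPECTED_DIGEST.
verify_digest() {
  local actual
  actual="$(digest_of "$1")" || return 1
  [ "$actual" = "$2" ]
}
```

For a pulled image, the digest-pinned reference could be obtained with `docker image inspect --format '{{index .RepoDigests 0}}' <image>`. A tag-only reference is rejected, because a tag can be moved to a different image after publication.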
Support Story
How do we allow the images to be used?
What's the process of reporting issues?
Any special licensing?