...

The Transform service allows Beam pipeline authors to upgrade specific transforms within their pipelines to a newer Beam version without upgrading the full pipeline. See the Beam programming guide for details on this feature and on the supported SDKs and Beam versions.

...

Code Block
export BEAM_VERSION=<Beam version> # Needs to be Beam 2.53.0 or later.

mvn archetype:generate \
    -DarchetypeGroupId=org.apache.beam \
    -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
    -DarchetypeVersion=$BEAM_VERSION \
    -DgroupId=org.example \
    -DartifactId=beam-transform-upgrade \
    -Dversion="0.1" \
    -Dpackage=org.apache.beam.examples \
    -DinteractiveMode=false

cd beam-transform-upgrade
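
Because the archetype command above requires Beam 2.53.0 or later, a small guard can catch an unsupported version before Maven runs. The following is a minimal sketch, not part of Beam: the function names `version_at_least` and `check_beam_version` are hypothetical, and the comparison assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, e.g. from GNU coreutils.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical guard: 2.53.0 is the minimum Beam version stated above.
check_beam_version() {
  if version_at_least "$1" "2.53.0"; then
    echo "ok"
  else
    echo "too old"
    return 1
  fi
}
```

It could be called as `check_beam_version "$BEAM_VERSION"` right after the `export` line, and again with `$TRANSFORM_BEAM_VERSION` if that variable is used later.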


  • Add the Beam examples Java dependency to the <dependencies> section of the Maven pom.xml file.

    Code Block
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-examples-java</artifactId>
      <version>${beam.version}</version>
    </dependency>


  • Execute the pipeline using a portable Beam runner. The following example uses Dataflow Runner v2.

Code Block
export GCP_PROJECT=<GCP project>
export GCP_BUCKET=<GCP bucket>
export GCP_REGION=<GCP region>
export OUTPUT_BIGQUERY_TABLE=<A BigQuery table to write the output to>
export TRANSFORM_BEAM_VERSION=<Beam version to upgrade transforms to> # Needs to be Beam 2.53.0 or later.

mvn compile exec:java \
    -Dexec.mainClass=org.apache.beam.examples.cookbook.BigQueryTornadoes \
    -Dexec.args="--runner=DataflowRunner \
                 --project=$GCP_PROJECT \
                 --region=$GCP_REGION \
                 --tempLocation=gs://$GCP_BUCKET/transform-upgrade/tmp \
                 --experiments=use_runner_v2 \
                 --output=$OUTPUT_BIGQUERY_TABLE \
                 --transformsToOverride=beam:transform:org.apache.beam:bigquery_read:v1,beam:transform:org.apache.beam:bigquery_write:v1 \
                 --transformServiceBeamVersion=$TRANSFORM_BEAM_VERSION" \
    -Pdataflow-runner
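
The command above depends on five environment variables, and an unset one would expand to an empty string inside the pipeline arguments. As a hedged sketch of a preflight check, the hypothetical function `require_vars` below (not part of Beam or Maven) reports which of the named variables are unset or empty. It uses plain POSIX shell, so the helper variables `missing`, `name`, and `value` are global.

```shell
# Hypothetical preflight helper: takes variable names as arguments,
# prints any that are unset or empty, and returns non-zero if any are missing.
require_vars() {
  missing=""
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all set"
}
```

For example, running `require_vars GCP_PROJECT GCP_BUCKET GCP_REGION OUTPUT_BIGQUERY_TABLE TRANSFORM_BEAM_VERSION` before the `mvn` command would surface a forgotten export early.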

...