...
- Set up the environment according to the Beam Java quickstart guide.
- Install Docker if it is not already available on the system.
- Check out the Beam examples Maven archetype for the relevant Beam version.
```shell
export BEAM_VERSION=<Beam version>  # Needs to be Beam 2.53.0 or later.

mvn archetype:generate \
    -DarchetypeGroupId=org.apache.beam \
    -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
    -DarchetypeVersion=$BEAM_VERSION \
    -DgroupId=org.example \
    -DartifactId=beam-transform-upgrade \
    -Dversion="0.1" \
    -Dpackage=org.apache.beam.examples \
    -DinteractiveMode=false

cd beam-transform-upgrade
```
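Because the archetype version must be Beam 2.53.0 or later, it can help to check `BEAM_VERSION` before generating the project. The following is a minimal sketch, not part of Beam: `version_at_least` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (version sort), as in GNU coreutils.

```shell
# Hypothetical helper (not part of Beam): succeeds if version $1 >= minimum $2.
version_at_least() {
  # sort -V orders version strings component by component, so 2.9.0 < 2.53.0.
  # If the minimum sorts first (or both are equal), the candidate is new enough.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if version_at_least "${BEAM_VERSION:-}" "2.53.0"; then
  echo "Beam version OK: $BEAM_VERSION"
else
  echo "BEAM_VERSION must be 2.53.0 or later (got '${BEAM_VERSION:-unset}')" >&2
fi
```

The same check could be applied to `TRANSFORM_BEAM_VERSION` before running the pipeline, since it carries the same minimum-version requirement.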
...
- Add the Beam examples Java dependency to the <dependencies> section of the Maven pom.xml file.

```xml
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-examples-java</artifactId>
  <version>${beam.version}</version>
</dependency>
```
- Execute the pipeline using a portable Beam runner. The following example uses Dataflow Runner v2.
```shell
export GCP_PROJECT=<GCP project>
export GCP_BUCKET=<GCP bucket>
export GCP_REGION=<GCP region>
export OUTPUT_BIGQUERY_TABLE=<A BigQuery table to write the output to>
export TRANSFORM_BEAM_VERSION=<Beam version to upgrade transforms to>  # Needs to be Beam 2.53.0 or later.

mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.cookbook.BigQueryTornadoes \
    -Dexec.args="--runner=DataflowRunner --project=$GCP_PROJECT \
                 --region=$GCP_REGION \
                 --tempLocation=gs://$GCP_BUCKET/transform-upgrade/tmp \
                 --experiments=use_runner_v2 --output=$OUTPUT_BIGQUERY_TABLE \
                 --transformsToOverride=beam:transform:org.apache.beam:bigquery_read:v1,beam:transform:org.apache.beam:bigquery_write:v1 \
                 --transformServiceBeamVersion=$TRANSFORM_BEAM_VERSION" \
    -Pdataflow-runner
```
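The `--transformsToOverride` option in the command above takes a comma-separated list of transform URNs, which is easy to mistype when written inline. One possible way to assemble it is sketched below; `join_with_commas` and `TRANSFORMS_TO_OVERRIDE` are hypothetical names introduced only for illustration, and the two URNs are the ones used in the example above.

```shell
# Hypothetical helper (not part of Beam): joins its arguments with commas.
join_with_commas() {
  # "$*" joins all arguments using the first character of IFS as the separator.
  local IFS=,
  echo "$*"
}

# Build the value for --transformsToOverride from one URN per line.
TRANSFORMS_TO_OVERRIDE=$(join_with_commas \
  "beam:transform:org.apache.beam:bigquery_read:v1" \
  "beam:transform:org.apache.beam:bigquery_write:v1")

echo "--transformsToOverride=$TRANSFORMS_TO_OVERRIDE"
```

The resulting variable could then be substituted into `-Dexec.args` in place of the literal list, which keeps each URN on its own line when more transforms are added.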
...