...

  • You can download, build and install CPython from sources.
  • If you are an Ubuntu user, you could add a third-party repository 'Deadsnakes' and install the missing versions via apt. If you install from Deadsnakes, make sure to also install python#.#-dev, python#.#-venv and python#.#-distutils packages.
  • You can use PyEnv to download and install Python versions (Recommended).
    Installation steps may look as follows:
    1. Follow the steps below in How to setup pyenv.
    2. Install a Python interpreter for each supported Python minor version. For example:

      Code Block
      languagebash
      pyenv install 3.7.10
      pyenv install 3.8.9
      pyenv install 3.9.4 
      pyenv install 3.10.7 
      pyenv install 3.11.3

      For the major.minor.patch versions currently used by the Jenkins cluster, see Current Installations.

    3. Make installed interpreters available in your shell by running

      Code Block
      languagebash
      pyenv global 3.8.9 3.7.10 3.9.4 3.10.7 3.11.3


    4. (OPTIONAL) Pyenv sometimes fails to make these interpreters directly available without a local configuration. If you see errors when trying to use python3.x, also run pyenv local:

      Code Block
      languagebash
      pyenv local 3.8.9 3.7.10 3.9.4 3.10.7 3.11.3


After these steps, all python3.x interpreters should be available in your shell. The first version in the list passed to pyenv global will be used as the default python / python3 interpreter if the minor version is not specified.
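One way to confirm the result is to loop over the minor versions and check that each python3.x command resolves on PATH. This is only a minimal sketch: the helper name check_interpreters is hypothetical, and the version list simply mirrors the examples above.

```shell
# Hypothetical helper: report which python3.x commands resolve on PATH.
check_interpreters() {
  for minor in "$@"; do
    if command -v "python${minor}" >/dev/null 2>&1; then
      echo "python${minor}: found"
    else
      echo "python${minor}: missing"
    fi
  done
}

# Minor versions matching the pyenv installs shown above.
check_interpreters 3.7 3.8 3.9 3.10 3.11
```

Any version reported as missing is a candidate for the optional pyenv local step.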

...

  1. Use the following code:

    Code Block
    languagebash
    # Initialize a virtual environment called "env" in ~/.virtualenvs or any other directory. (Consider using pyenv to manage the Python version as well as the packages installed in your virtual environment.)
    $ python3 -m venv ~/.virtualenvs/env
    
    # Activate virtual environment.
    $ . ~/.virtualenvs/env/bin/activate
    
    # Upgrade other tools. (Optional)
    (env) $ pip install --upgrade pip
    (env) $ pip install --upgrade setuptools
    
    # Install setup.py requirements.
    (env) $ pip install -r build-requirements.txt
    
    # Install Apache Beam package in editable mode.
    (env) $ pip install -e .[gcp,test]
    
    

    For certain systems, particularly Macs with M1 chips, this installation method may not generate urns correctly. If running python gen_protos.py  doesn't resolve the issue, consult https://github.com/apache/beam/issues/22742#issuecomment-1218216468 for further guidance.
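Because the pip commands in the virtual environment steps should run inside the activated environment, it can help to check for one first. The sketch below relies on the VIRTUAL_ENV variable that the venv activate script exports; the helper name in_virtualenv is hypothetical.

```shell
# Hypothetical helper: succeed only when a virtual environment is active.
in_virtualenv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
  echo "Active virtualenv: $VIRTUAL_ENV"
else
  echo "No virtualenv active; run: . ~/.virtualenvs/env/bin/activate"
fi
```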

...

How to setup pyenv (with pyenv-virtualenv plugin)

  1. Install prerequisites for your distribution.

  2. curl https://pyenv.run | bash
  3. Add the required lines to ~/.bashrc (as returned by the script).
  4. Note (12/10/2021): You may have to manually modify .bashrc as described here: https://github.com/pyenv/pyenv-installer/issues/112#issuecomment-971964711. Remove this note if no longer applicable.
  5. Open a new shell. If the pyenv command is still not available in PATH, you may need to restart the login session.
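For orientation, the lines the installer asks you to add typically resemble the sketch below, which follows the pyenv README. Treat it as an assumption rather than the exact output: the lines vary between pyenv versions, so prefer whatever the script prints. The command -v guard is an addition made here so that the fragment stays harmless on machines where pyenv is not installed yet.

```shell
# Sketch of a ~/.bashrc fragment; prefer the lines printed by the installer.
export PYENV_ROOT="$HOME/.pyenv"
if [ -d "$PYENV_ROOT/bin" ]; then
  export PATH="$PYENV_ROOT/bin:$PATH"
fi
# Only initialize when pyenv is actually on PATH (guard added for safety).
if command -v pyenv >/dev/null 2>&1; then
  eval "$(pyenv init -)"
  # Enables automatic activation of virtualenvs (pyenv-virtualenv plugin).
  eval "$(pyenv virtualenv-init -)"
fi
```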

...

Code Block
languagebash
# Install pyenv deps
sudo apt-get install -y build-essential libssl-dev zlib1g-dev libbz2-dev \
libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \
xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git

# Install pyenv, and pyenv-virtualenv plugin
curl https://pyenv.run | bash

# Add the commands printed by the installer to .bashrc to initialize pyenv

Example: How to Run Unit Tests with PyCharm Using Python 3.8.9 in a virtualenv

  1. Install Python 3.8.9 and create a virtualenv
    • pyenv install 3.8.9
    • pyenv virtualenv 3.8.9 ENV_NAME
    • pyenv activate ENV_NAME
  2. Upgrade packages (recommended)

    Code Block
    pip install --upgrade pip setuptools


  3. Set up PyCharm
    1. Start by adding a new project interpreter (from the bottom right or in Settings).
    2. Select Existing environment and the interpreter, which should be under ~/.pyenv/versions/3.8.9/envs/ENV_NAME/bin/python or ~/.pyenv/versions/ENV_NAME/bin/python.
    3. Switch interpreters at the bottom right.

...

Code Block
languagebash
cd sdks/python/
pip install build && python -m build --sdist

We will use the tarball built by this command in the --sdk_location parameter.
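By default, python -m build writes its output to a dist/ directory next to setup.py. The sketch below picks the newest tarball from such a directory so that its path can be passed to --sdk_location. The helper name latest_sdist is hypothetical, and the apache-beam-*.tar.gz pattern is an assumption based on the file names used elsewhere on this page.

```shell
# Hypothetical helper: print the most recently modified sdist tarball in a directory.
latest_sdist() {
  dir="${1:-dist}"
  newest=""
  for f in "$dir"/apache-beam-*.tar.gz; do
    # Skip the literal pattern when nothing matches.
    [ -e "$f" ] || continue
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done
  echo "$newest"
}

# Example: run from sdks/python after building.
SDK_LOCATION="$(latest_sdist dist)"
echo "sdist: ${SDK_LOCATION:-not found}"
```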

...

Code Block
languagebash
# Build portable worker
./gradlew :runners:google-cloud-dataflow-java:worker:build -x spotlessJava -x rat -x test
./gradlew :runners:google-cloud-dataflow-java:worker:shadowJar

# Build portable Python SDK harness and publish it to GCP
./gradlew -Pdocker-repository-root=gcr.io/dataflow-build/$USER/beam -p sdks/python/container docker
gcloud docker -- push gcr.io/dataflow-build/$USER/beam/python:latest

# Initialize python
cd sdks/python
virtualenv env
. ./env/bin/activate

# run pipeline
python -m apache_beam.examples.wordcount   --runner DataflowRunner   --num_workers 1   --project <gcp_project_name>   --output <gs://path>   --temp_location <gs://path>   --workersdk_harness_container_image gcr.io/dataflow-build/$USER/beam/python:latest   --experiment beam_fn_api   --sdk_location build/apache-beam-2.12.0.dev0.tar.gz  --debug

...

  1. Click on a recent `Build python source distribution and wheels` job that ran successfully on the github.com/apache/beam master branch from this list.
  2. Click on List files on Google Cloud Storage Bucket on the right-side panel.
  3. Expand List file on Google Cloud Storage Bucket in the main panel.
  4. Locate and download the tar.gz file. For example, apache-beam-2.52.0.dev0.tar.gz from GCS.
  5. Install the downloaded tar.gz file. For example:

    Code Block
    languagebash
    titleSimpleTest
    pip install apache-beam-2.52.0.dev0.tar.gz
    # Or, if you need extra dependencies:
    pip install apache-beam-2.52.0.dev0.tar.gz[aws,gcp]


  6. When you run your Beam pipeline, pass in the --sdk_location flag pointed at the same tar.gz file.


    Code Block
    languagebash
    titleSimpleTest
    --sdk_location=apache-beam-2.52.0.dev0.tar.gz


How to update dependencies that are installed in Python container images 

...

Code Block
CFLAGS="-O2" pyenv install 3.8.9

There have been issues with older Python versions. See here for details.