...

  • The input PCollection that is fed into the cross-language transform on the Python side uses a Python-specific type. In this case, the PTransform that produced the input PCollection has to be updated to produce outputs that use Beam's standard coders, so that the Java side can interpret them.
  • The input PCollection uses standard coders, but Python type inference picks the PickleCoder anyway. This can usually be resolved by annotating the predecessor Python transform with the correct type using with_output_types, as in the sketch below. See this Jira for an example.
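
For illustration, a minimal sketch of the with_output_types fix on the Python side. The element type, transform labels, and the elided cross-language transform are assumptions for the example, not taken from this page:

    import typing

    import apache_beam as beam

    with beam.Pipeline() as p:
        keyed = (
            p
            | "Create" >> beam.Create([("a", 1), ("b", 2)])
            # Without an explicit hint, Python type inference may fall back
            # to Any here, which maps to the non-portable PickleCoder.
            | "ScaleValues" >> beam.Map(
                lambda kv: (kv[0], kv[1] * 10)
            ).with_output_types(typing.Tuple[str, int])
            # The PCollection now uses a standard KV<string, varint> coder,
            # so a downstream Java cross-language transform can interpret it.
            # (The cross-language transform itself is omitted here.)
        )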


"Unknown Coder URN beam:coder:pickled_python:v1" when running a Java pipeline that uses Python cross-language transforms

This usually means that the Python SDK was not able to determine a portable output PCollection type when expanding the cross-language transform, so it fell back to the default PickleCoder.

The Java SDK is unable to interpret this coder, so it fails when parsing the expansion response from the Python SDK.

The solution is to provide a hint to the Python SDK about the element type(s) of the output PCollection(s) of the cross-language transform. This hint can be provided using the withOutputCoder or withOutputCoders methods of the PythonExternalTransform API, as in the sketch below.
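
For illustration, a minimal Java-side sketch. The fully qualified name of the Python transform and the output element type are assumptions for the example; running it also requires a Python environment or expansion service for the actual expansion:

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.coders.StringUtf8Coder;
    import org.apache.beam.sdk.extensions.python.PythonExternalTransform;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.values.PCollection;

    public class PythonTransformWithCoderHint {
      public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create();

        PCollection<String> result =
            pipeline
                .apply(Create.of("a", "b", "c"))
                .apply(
                    PythonExternalTransform
                        .<PCollection<String>, PCollection<String>>from(
                            "my_package.MyPythonTransform") // hypothetical transform
                        // Hint that output elements are UTF-8 strings; without this,
                        // the expansion may report beam:coder:pickled_python:v1.
                        .withOutputCoder(StringUtf8Coder.of()));

        pipeline.run().waitUntilFinish();
      }
    }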


How to set the Java log level from a Python pipeline that uses Java transforms

...