...
- State API between the Runner and the SDK harness, which can be used for state access in Python user-defined functions. (Note: state communication uses a separate channel from the data channel.)
The State API defines five kinds of state in its proto message (refer to StateKey for more details) and three operations for state access (Get, Append, and Clear). The five kinds of state are:
| State Type | Use case |
| --- | --- |
| Runner | Remote references (and other Runner-specific extensions) |
| IterableSideInput | Side input of an iterable |
| MultimapSideInput | Side input of the values of a map |
| MultimapKeyedSideInput | Side input of the keys of a map |
| BagUserState | User state with a primary key (value state, bag state, combining value state) |
Among them, only BagUserState is dedicated to state access in Beam; the others are used for data access.
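
To make the three operations concrete, the following is a minimal in-memory sketch of how Get, Append, and Clear might behave against a bag addressed by a state key. It does not use Beam's actual classes: `BagStateKey`, its fields, and `InMemoryStateHandler` are hypothetical names chosen for illustration, and a real runner would serve these requests over the state channel rather than from a local dictionary.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class BagStateKey(NamedTuple):
    """Hypothetical stand-in for the BagUserState variant of StateKey."""
    transform_id: str
    user_state_id: str
    window: bytes
    key: bytes  # the primary key the user state is scoped to


class InMemoryStateHandler:
    """Toy handler exposing the three State API operations (an assumption,
    not the real runner implementation)."""

    def __init__(self) -> None:
        self._bags: Dict[BagStateKey, List[bytes]] = defaultdict(list)

    def get(self, state_key: BagStateKey) -> List[bytes]:
        # Get: return a copy of every element stored under the key.
        return list(self._bags.get(state_key, []))

    def append(self, state_key: BagStateKey, data: bytes) -> None:
        # Append: add one element to the end of the bag.
        self._bags[state_key].append(data)

    def clear(self, state_key: BagStateKey) -> None:
        # Clear: drop all elements stored under the key.
        self._bags.pop(state_key, None)


handler = InMemoryStateHandler()
state_key = BagStateKey("udf-1", "count", b"global", b"user-42")
handler.append(state_key, b"1")
handler.append(state_key, b"2")
after_append = handler.get(state_key)
handler.clear(state_key)
after_clear = handler.get(state_key)
```

Because a bag only supports these three operations, any richer state type has to be expressed as a combination of them.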
- Building on the BagUserState proto message, Beam’s Python SDK harness supports four kinds of user-facing state API: BagRuntimeState, SetRuntimeState, ReadModifyWriteRuntimeState, and CombiningValueRuntimeState.
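
As an illustration of how richer state types can be layered on a bag, here is a small sketch under stated assumptions. The classes `Bag`, `ReadModifyWriteOnBag`, and `CombiningOnBag` are hypothetical and are not Beam's implementations; they only show that a single-value state and a combining state can be expressed using nothing but get, append, and clear on an underlying bag.

```python
from typing import Callable, Generic, Iterable, List, Optional, TypeVar

T = TypeVar("T")


class Bag(Generic[T]):
    """Hypothetical bag supporting only get / append / clear."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def get(self) -> List[T]:
        return list(self._items)

    def append(self, item: T) -> None:
        self._items.append(item)

    def clear(self) -> None:
        self._items.clear()


class ReadModifyWriteOnBag(Generic[T]):
    """Single-value state: a write is modeled as clear followed by append."""

    def __init__(self, bag: Bag) -> None:
        self._bag = bag

    def read(self) -> Optional[T]:
        items = self._bag.get()
        return items[-1] if items else None

    def write(self, value: T) -> None:
        self._bag.clear()
        self._bag.append(value)

    def clear(self) -> None:
        self._bag.clear()


class CombiningOnBag(Generic[T]):
    """Combining state: inputs are appended, then combined on read."""

    def __init__(self, bag: Bag, combine_fn: Callable[[Iterable[T]], T]) -> None:
        self._bag = bag
        self._combine_fn = combine_fn

    def add(self, value: T) -> None:
        self._bag.append(value)

    def read(self) -> T:
        return self._combine_fn(self._bag.get())


value_state = ReadModifyWriteOnBag(Bag())
value_state.write(10)
value_state.write(20)  # replaces the previous value

sum_state = CombiningOnBag(Bag(), sum)
for n in (1, 2, 3):
    sum_state.add(n)
```

This sketch combines lazily on read; an implementation could instead keep a single accumulator in the bag and rewrite it on each add, trading read cost for write cost.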
...