Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Next, we will propose the operator implementation for the mapPartition API . 

In our implementation, the operator will execute the MapPartitionFunction while receiving records, rather than waiting for the entire window of records to be collected. To achieve this, we add a seperate UDFExecutionThread inside the operator.

The TaskMainThread will cyclically add records to a fixed-size queue. The UDFExecutionThread will invoke user-defined MapPartitionFunction and cyclically poll records from the queue in the Iterator parameter of MapPartitionFunction. If there is no records in the queue, the UDFExecutionThread blocks and waits on the hasNext() and next() methods of the Iterator. Once the UDFExecutionThread has processed all the data, the operator completes its execution. 

...