Proposers

@<proposer1 JIRA username>
@<proposer2 JIRA username>
...

Approvers

@<approver1 JIRA username> : [APPROVED/REQUESTED_INFO/REJECTED]
@<approver2 JIRA username> : [APPROVED/REQUESTED_INFO/REJECTED]
...

Status

Current state: [One of "Under Discussion", "Accepted", "Rejected"]

Discussion thread: here

JIRA: here

Released: <Hudi Version>

Abstract

Hudi's deep integration with computing engines is currently limited mainly to Spark. Apache Flink is a popular stream processing engine, so integrating Hudi with Flink is valuable work. It will let Hudi support more computing engines, and the pluggable design will make its architecture more flexible and open.

Background

The current design of Hudi is tightly coupled to Spark: four of its modules depend on Spark directly, as shown below:

Implementation

<Describe the new thing you want to do in appropriate detail and how it fits into the project architecture. Provide a detailed description of how you intend to implement this feature. This may be fairly extensive and have large subsections of its own, or it may be a few sentences. Use judgement based on the scope of the change.>

Rollout/Adoption Plan

<What impact (if any) will there be on existing users?>
<If we are changing behavior, how will we phase out the older behavior?>
<If we need special migration tools, describe them here.>
<When will we remove the existing behavior?>

Test Plan

<Describe in a few sentences how the RFC will be tested. How will we know that the implementation works as expected? How will we know nothing broke?>

RFC - 13: Integrate Hudi with Flink