Table of Contents

Proposers

@Yanjia Gary Li

Approvers

@Vinoth Chandar : [APPROVED/REQUESTED_INFO/REJECTED]
@lamber-ken : [APPROVED/REQUESTED_INFO/REJECTED]
...

...

Decouple Hudi-related logic from the existing HoodieParquetInputFormat, HoodieRealtimeInputFormat, HoodieRealtimeRecordReader, etc.
Create new classes that use the org.apache.hadoop.mapreduce APIs and wrap the Hudi-related logic in them.
Wrap the FileInputFormat of the query engine to take advantage of its optimizations. Taking Spark SQL as an example, we can create a HoodieParquetFileFormat by wrapping ParquetFileFormat and ParquetRecordReader<Row> from the Spark codebase with Hudi merging logic, and extend support to OrcFileFormat in the future.
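
The wrapping idea in the last item can be illustrated with a minimal, self-contained sketch. It does not use the real Spark or Hudi classes: Rec, MergingReader, and readAll are hypothetical names invented for this illustration, and the merge rule (a log record replaces the base record with the same key, a null value means delete, unmatched log records are appended as inserts) is an assumption about how Merge on Read merging could look, not the RFC's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical sketch only: none of these types exist in Hudi or Spark. */
public class MergingReaderSketch {

    /** A minimal record: a key plus a payload. A null value marks a delete (assumption). */
    public static final class Rec {
        public final String key;
        public final String value;

        public Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Wraps an engine-provided base-file reader and overlays log records on top of it. */
    public static final class MergingReader implements Iterator<Rec> {
        private final Iterator<Rec> base;                               // stands in for the wrapped reader
        private final Map<String, Rec> pending = new LinkedHashMap<>(); // log records not yet applied
        private Iterator<Rec> remaining;                                // log records with no base match
        private Rec next;

        public MergingReader(Iterator<Rec> base, List<Rec> logRecords) {
            this.base = base;
            for (Rec r : logRecords) {
                pending.put(r.key, r); // a later log record for the same key wins
            }
            advance();
        }

        /** Finds the next live record: merged base records first, then unmatched log records. */
        private void advance() {
            next = null;
            while (base.hasNext()) {
                Rec b = base.next();
                Rec update = pending.remove(b.key);
                Rec merged = (update != null) ? update : b;
                if (merged.value != null) { // skip deleted records
                    next = merged;
                    return;
                }
            }
            if (remaining == null) {
                remaining = new ArrayList<>(pending.values()).iterator();
            }
            while (remaining.hasNext()) {
                Rec r = remaining.next();
                if (r.value != null) {
                    next = r;
                    return;
                }
            }
        }

        @Override
        public boolean hasNext() {
            return next != null;
        }

        @Override
        public Rec next() {
            if (next == null) {
                throw new NoSuchElementException();
            }
            Rec out = next;
            advance();
            return out;
        }
    }

    /** Convenience helper: drains the merged view into "key=value" strings. */
    public static List<String> readAll(List<Rec> base, List<Rec> log) {
        List<String> out = new ArrayList<>();
        Iterator<Rec> it = new MergingReader(base.iterator(), log);
        while (it.hasNext()) {
            Rec r = it.next();
            out.add(r.key + "=" + r.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> base = Arrays.asList(new Rec("a", "1"), new Rec("b", "2"), new Rec("c", "3"));
        List<Rec> log = Arrays.asList(new Rec("b", "20"), new Rec("c", null), new Rec("d", "4"));
        System.out.println(readAll(base, log));
    }
}
```

The point of the decorator shape is that the wrapped reader stays untouched, so whatever optimizations the engine's own reader provides are preserved and only the merge step is Hudi-specific.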

Image Added

Implementation

WIP

Rollout/Adoption Plan

There is no impact on existing users because the existing Hive-related InputFormat classes won't be changed, except that some methods were relocated to the HoodieInputFormatUtils class. We will verify that this does not affect Hive queries.
New Spark Datasource support for Merge on Read tables will be added.

Test Plan

Unit tests
Integration tests
Test on a cluster with a larger dataset.
