Decouple Hudi related logic from existing HoodieParquetInputFormat, HoodieRealtimeInputFormat, HoodieRealtimeRecordReader, e.t.c
Create new classes to use org.apache.hadoop.mapreduce APIs and warp Hudi related logic into it.
Warp the FileInputFormat from the query engine to take advantage of the optimization. As Spark SQL for example, we can create a HoodieParquetFileFormat by wrapping ParquetFileFormat and ParquetRecordReader<Row> from Spark codebase with Hudi merging logic. And extend the support for OrcFileFormat in the future.

Image RemovedImage Added

Implementation

Image Added

Rollout/Adoption Plan

...