RFC-1 : CSV Source Support for Delta Streamer
Proposer
- @rahuledavalath
- Ethan Guo
Approver
- Vinoth Chandar : [APPROVED/REQUESTED_INFO/REJECTED]
- Balaji Varadarajan : REQUESTED_INFO
Status
Current state: "Under Discussion"
Discussion thread: here
JIRA: here
Released: N/A
Prior doc link : https://docs.google.com/document/d/1bj-xpkRomVtbzvLb_4BRngDIGkkMR5yzxXRRzkA7QVo/edit#heading=h.di66rda5xhp2
...
Hudi Delta Streamer does not have direct support for pulling data in CSV format from Kafka/HDFS logs. The only alternative for ingesting CSV data into a Hudi dataset is to first convert it to JSON/Avro before pulling it in through Delta Streamer. This RFC proposes a mechanism to directly support sources in CSV format.
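To make the current workaround concrete, here is a minimal sketch of the kind of pre-conversion step users have to run today, turning CSV records into JSON lines before ingestion. It uses only the Python standard library and is not part of Hudi. The function name `csv_to_json_lines` and the sample columns (`id`, `name`, `ts`) are hypothetical and chosen for illustration only.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text):
    """Convert CSV text with a header row into one JSON string per record.

    Illustrative only: this stands in for the extra conversion step that
    is needed when the ingestion tool cannot read CSV directly.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every value stays a string here; a real pipeline would also need
    # to cast columns to the types expected by the target schema.
    return [json.dumps(row) for row in reader]


# Hypothetical sample data: a header row followed by two records.
sample = "id,name,ts\n1,alice,1570000000\n2,bob,1570000060\n"
for line in csv_to_json_lines(sample):
    print(line)
```

Note that the conversion loses type information unless a schema is applied separately, which is one reason a direct CSV source is attractive.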
Background
Introduce any background context that is relevant or necessary to understand the feature and design choices.
...