...
Hudi provides custom input format implementations to work with Hive tables. These classes are also affected by the change in package namespace. In addition, the input formats have been renamed to indicate that they primarily operate on Parquet datasets. The relocations and name changes are listed below:
| View Type | Pre v0.5.0 Input Format Class | v0.5.0 Input Format Class |
| --- | --- | --- |
| Read Optimized View | com.uber.hoodie.hadoop.HoodieInputFormat | org.apache.hudi.hadoop.HoodieParquetInputFormat |
| Realtime View | com.uber.hoodie.hadoop.HoodieRealtimeInputFormat | org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat |
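Existing Hive tables that were registered with a pre-v0.5.0 input format class need their table definition updated to point at the renamed class. A minimal HiveQL sketch, assuming a read-optimized table named hudi_table (the table name, output format, and serde below are illustrative assumptions; for a partitioned table, each existing partition must also be altered with ALTER TABLE ... PARTITION (...) SET FILEFORMAT):

```sql
-- Repoint an existing Hive table at the renamed v0.5.0 input format.
-- Table name, output format, and serde are illustrative assumptions.
ALTER TABLE hudi_table
  SET FILEFORMAT
  INPUTFORMAT  'org.apache.hudi.hadoop.HoodieParquetInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
  SERDE        'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe';
```

For a realtime view table, the same statement would use org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat as the input format class.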
Changes in Spark DataSource Format Name:
With the package renaming, Hudi's Spark Data Source is now accessed for both reading and writing using the format name "org.apache.hudi".
| Data Source Type | Pre v0.5.0 Format (e.g. in Scala) | v0.5.0 Format (e.g. in Scala) |
| --- | --- | --- |
| Read | spark.read.format("com.uber.hoodie").xxxx | spark.read.format("org.apache.hudi").xxxx |
| Write | spark.write.format("com.uber.hoodie").xxxx | spark.write.format("org.apache.hudi").xxxx |
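For Spark users, the only change is the format string passed to the reader and writer. A before/after sketch in Scala, assuming an existing SparkSession named spark (the dataset path and DataFrame name are illustrative assumptions):

```scala
// Reading: replace the old format name with "org.apache.hudi".
val df = spark.read
  .format("org.apache.hudi")            // was: .format("com.uber.hoodie")
  .load("/path/to/hudi_dataset/*/*")    // illustrative base path

// Writing: the same rename applies on the DataFrameWriter.
df.write
  .format("org.apache.hudi")            // was: .format("com.uber.hoodie")
  .mode("append")
  .save("/path/to/hudi_dataset")
```

No other reader or writer options need to change; only the format name is affected by the package rename.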
Migrating Existing Hudi Datasets:
...