...

  • Rebootstrap the table using the new writer and new configs. The advantage of this approach is that it is a well-understood process.
  • Create a new rewrite tool in Hoodie. This tool would read existing Hudi tables and write a new version of each file_id in the new format. This takes more effort, but the tool might also be useful for future Hudi format changes.

Contingency - Rollback 

If we are forced to roll back, we may have a serious problem: the newly written parquet files will no longer contain the _hoodie_record_key field, and older clients may not be able to read them. To address this, I believe we should continue writing the _hoodie_record_key field to disk for some weeks after the rollout. We should also add a config that tells the reader to ignore the _hoodie_record_key field and use the virtual key instead. Doing this also allows reader and writer clients to be rolled out independently.
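The reader-side behavior described above can be illustrated with a minimal sketch. This is not Hudi code: the function name, the `ignore_stored_key` flag (standing in for the proposed reader config), and the `field:value` key format are all assumptions made for illustration only.

```python
from typing import Any, Mapping, Sequence

# Name of the meta field that is still written to disk during the transition.
RECORD_KEY_FIELD = "_hoodie_record_key"


def resolve_record_key(
    record: Mapping[str, Any],
    key_fields: Sequence[str],
    ignore_stored_key: bool,
) -> str:
    """Return the record key for one row (hypothetical helper).

    ignore_stored_key models the proposed reader config: when True, the
    reader never looks at the stored meta field and always derives the
    virtual key from the data columns.
    """
    if not ignore_stored_key and record.get(RECORD_KEY_FIELD) is not None:
        # Old-style path: trust the key that was materialized on disk.
        return str(record[RECORD_KEY_FIELD])

    # Virtual-key path: derive the key from the configured data columns.
    missing = [f for f in key_fields if record.get(f) is None]
    if missing:
        raise KeyError(f"cannot build virtual key, missing fields: {missing}")
    # Assumed key format for this sketch: "field:value" pairs joined by commas.
    return ",".join(f"{f}:{record[f]}" for f in key_fields)


# A file written after the change, with no stored key, is still readable.
new_style_row = {"uuid": 42}
key = resolve_record_key(new_style_row, ["uuid"], ignore_stored_key=False)
```

Under these assumptions, a reader with the flag off handles both old and new files, while a reader with the flag on no longer depends on the stored field, which is what would let the writer eventually stop emitting it.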

Test Plan

<Describe in few sentences how the RFC will be tested. How will we know that the implementation works as expected? How will we know nothing broke?>

...