...

Approvers

Current State

  • Under Discussion (tick)
  • In Progress
  • Abandoned
  • Completed (tick)
  • Inactive

...

For tables or partitions where the majority of records change every cycle, it is inefficient to do an upsert or merge. We want to provide a Hive-like 'insert overwrite' API that ignores all existing data and creates a commit with just the new data provided. Doing this in Hoodie provides better snapshot isolation than Hive because of atomic commits. This API can also be used for certain operational tasks, such as fixing a specific corrupted partition: we can do an 'insert overwrite' on that partition with records from the source, which can be much faster than restore and replay for some data sources.

Implementation


Hoodie supports multiple write operations such as insert, upsert, and bulk_insert on the target table. At a high level, we would like to add two new operations:
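To illustrate how the new operation differs from upsert, here is a minimal toy model of the two semantics on a single partition's records. This is not the Hudi API; the class and method names are assumptions used purely to show that 'insert overwrite' discards all existing data while upsert merges by key.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of write semantics on one partition: record key -> value.
// Illustrative only; not the actual Hudi write client API.
public class WriteOps {

    // upsert: merge incoming records into existing data by record key
    public static Map<String, String> upsert(Map<String, String> existing,
                                             Map<String, String> incoming) {
        Map<String, String> result = new HashMap<>(existing);
        result.putAll(incoming);
        return result;
    }

    // insertOverwrite: ignore all existing data; the partition ends up
    // containing exactly the incoming records
    public static Map<String, String> insertOverwrite(Map<String, String> existing,
                                                      Map<String, String> incoming) {
        return new HashMap<>(incoming);
    }
}
```

With existing records {k1, k2} and incoming records {k2, k3}, upsert yields {k1, k2, k3} while insert overwrite yields only {k2, k3}.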

...

  • Fits in well with the existing Hudi implementation. Because the same set of fileIds is used, it is easy to reason about versioning. Rollback and restore will continue to work as is.

Disadvantages:

  • This means the new parquet file size can be smaller than the previous version of the same file. We saw this happen earlier because of a parquet version incompatibility, and we added guard checks against it. So we may have to add exceptions to these rules.
  • In some cases, we create empty parquet files. In my testing, the smallest file we could create was 400KB, because we store metadata, including the schema, even in an empty parquet file. These empty parquet files can be reused if the partition grows in subsequent writes. Otherwise, we need a strategy to cleanly delete these empty file groups to reclaim the space.
    • One option to reclaim this space is to extend BaseFileOnlyView to inspect small parquet files and ignore them when listing splits. (Appreciate any other suggestions here, as I do not have a lot of experience reading this code.)
    • The other option is to extend metadata to mark these file groups as invalid, and update all operations to read the metadata first to discard the empty files. There can be race conditions if this is not done carefully. (More details can be discussed after we finalize an approach.)
  • For MOR tables, this is somewhat awkward.
    • We have to schedule compaction before writing new data.
    • For ‘insert overwrite’, we are forced to write parquet files instead of writing inserts into log files, for performance reasons. Additional work is required to support inserts going into log files.
  • An external index such as the HBase index needs to be updated to ‘delete’ keys that are no longer present in the partitions. Otherwise, the index can become inconsistent and may affect record distribution.
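The first option discussed above, ignoring near-empty base files when listing splits, can be sketched with a simple size filter. The threshold and the map-based listing below are assumptions for illustration; the real change would live in the file-system view, not in standalone code like this.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of the "ignore near-empty base files" option: when listing splits,
// drop base files whose size is at or below the metadata-only footprint
// (~400KB observed in testing). Threshold and types are illustrative.
public class EmptyFileFilter {

    // Smallest parquet file observed when it contains only metadata/schema.
    static final long EMPTY_PARQUET_BYTES = 400 * 1024L;

    // fileSizes maps fileId -> base file size in bytes; returns the fileIds
    // that carry real data and should be visible to readers.
    public static List<String> listNonEmptyFileIds(Map<String, Long> fileSizes) {
        return fileSizes.entrySet().stream()
                .filter(e -> e.getValue() > EMPTY_PARQUET_BYTES)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

A file exactly at the threshold is treated as empty here; in practice the cutoff would need to be chosen conservatively so real data files are never dropped.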

...

After weighing the trade-offs, we are inclined towards Option 2. Option 1 has a lot of complexity, especially with MOR tables, and Option 3 would also add a lot of complexity. Option 2 introduces new behavior of ignoring file groups, but is a cleaner approach in general. With Option 2, we can take multiple steps:

  1. Before consolidated metadata launches, in the initial version, we keep the list of file groups to be filtered out in active commit metadata. We change the reader to query active commits and filter out the appropriate file groups. We also change the cleaner to archive active commits only after removing the corresponding file groups from disk. (This approach can also be used for RFC-19.)
  2. After consolidated metadata launches, we can change the metadata lookup to leverage it instead of reading from active commit files.
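Step 1 above can be sketched as follows: each active commit records which file groups it replaced, and the reader filters those out of its listing. The class and method names here are assumptions; the real implementation would hang off the commit metadata and file-system view classes.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of step 1: active commit metadata carries the set of replaced
// (to-be-ignored) file groups, and the reader filters them out while those
// commits remain in the active timeline. Names are illustrative only.
public class ReplacedFileGroupView {

    // active commit time -> file group ids that commit marked as replaced
    private final Map<String, Set<String>> replacedByCommit = new HashMap<>();

    // writer path: an 'insert overwrite' commit records the file groups
    // whose data it superseded
    public void recordReplacement(String commitTime, Set<String> fileGroupIds) {
        replacedByCommit.computeIfAbsent(commitTime, k -> new HashSet<>())
                        .addAll(fileGroupIds);
    }

    // union of replaced file groups across all active commits
    public Set<String> allReplaced() {
        Set<String> replaced = new HashSet<>();
        replacedByCommit.values().forEach(replaced::addAll);
        return replaced;
    }

    // reader path: list only file groups not replaced by any active commit
    public List<String> visibleFileGroups(Collection<String> allFileGroups) {
        Set<String> replaced = allReplaced();
        return allFileGroups.stream()
                .filter(fg -> !replaced.contains(fg))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

The cleaner-side invariant from step 1 follows from this structure: a commit's entry may only leave `replacedByCommit` (i.e., be archived) once the replaced file groups have actually been deleted from disk, otherwise readers would see the stale data again.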

Rollout/Adoption Plan

  • What impact (if any) will there be on existing users? None; these are new APIs.
  • If we are changing behavior, how will we phase out the older behavior? NA
  • If we need special migration tools, describe them here. NA
  • When will we remove the existing behavior?  NA

...