THIS IS A TEST INSTANCE. ALL YOUR CHANGES WILL BE LOST!!!!
...
We might need a way to standardize the way users think about querying hudi tables using Spark/Hive/Presto. At the moment, I'm proposing another VIEW_TYPE to be introduced in Spark POINT_IN_TIME which will work in conjunction with an already present config END_INSTANTTIME_OPT_KEY to provide the snapshot view of a table at a particular instant in time (similar to select * from table where _hoodie_commit_time <= timeAsOf (commit_time))
Caveats/Open Items
- The number of versions to keep should match the number of commits the client wants to travel. Need a way to enforce this.
- Proposed approach pushed the client to perform some work and enforces some limitations
- Can only time-travel based on the commit times of a hudi dataset. The clients have to figure out a way to map the timestamp they want to travel against the commit time that matches closes to it.
- Clients have to get a list of the valid timestamps (hudi commit times) to time travel against
...