We have seen a lot of interest for an efficient and reliable solution to provide the mutation and transaction capability into the data lakes. In the data lake, it is very common that users generate reports based on a single set of data. As various types of data flow into data lake, the state of data cannot be immutable. Various use cases requiring mutating data includes data changes with time, late arriving data, balancing real time availability and backfilling, state changing data like CDC, data snapshotting, data cleansing etc, While generating reports, these will result in write/update the same set of tables.


Read the complete blog here.

  • No labels