
Abstract



Background



System overview

Bytelake is a metadata management system that is compatible with the Hive metastore and manages data lake metadata. It stores metadata in pluggable storage, which can be a distributed file system, a KV store or a database.
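
To illustrate what pluggable storage could look like, here is a minimal sketch. It is not Bytelake's actual API: the names MetadataStorage, InMemoryStorage, Metastore, save_table and load_table, and the "table/<name>" key layout, are all assumptions made for this example. The idea is only that the metastore talks to a small interface, so a file-system, KV or database backend can be swapped in without touching the metastore logic.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class MetadataStorage(ABC):
    """Hypothetical storage interface (an assumption, not Bytelake's API)."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the stored value, or None if the key is absent."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        """Store or overwrite the value under the key."""


class InMemoryStorage(MetadataStorage):
    """Stand-in backend; a DFS, KV or database backend would
    implement the same two methods."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value


class Metastore:
    """Depends only on the interface, never on a concrete backend."""

    def __init__(self, storage: MetadataStorage) -> None:
        self._storage = storage

    def save_table(self, name: str, schema: str) -> None:
        # The "table/<name>" key layout is hypothetical.
        self._storage.put(f"table/{name}", schema.encode("utf-8"))

    def load_table(self, name: str) -> Optional[str]:
        raw = self._storage.get(f"table/{name}")
        return None if raw is None else raw.decode("utf-8")
```

Under these assumptions, changing the backend means passing a different MetadataStorage implementation to Metastore; callers of save_table and load_table are unaffected.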

Features

  • Pluggable: The storage layer is pluggable and does not depend on any particular system. It can be a distributed file system, a KV store, a database or any other storage system.
  • Snapshot Management:
  • Concurrency Control: Each Hudi table has a version that is updated atomically and increases monotonically, which is used to achieve snapshot isolation and optimistic concurrency control. The version is maintained in the metastore and persisted in the storage layer.
  • Multiple Conflict Resolution Strategies: Conflicts can be checked at the partition, file group or column level.
  • Compatible with Hive:
  • Lightweight and Easy to Scale: The metastore server is a stateless service that can be scaled horizontally.
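
The versioning and conflict-check features above can be sketched as follows. This is a simplified, single-process illustration under stated assumptions, not the actual implementation: VersionedTable, CommitConflict and commit are hypothetical names, a lock stands in for whatever atomic update the metastore and storage layer provide, and only the partition-level strategy is shown. A writer records the version it read, and its commit is rejected if any commit that landed after that version touched the same partitions.

```python
import threading
from typing import Dict, Set


class CommitConflict(Exception):
    """Raised when a concurrent commit touched the same partitions."""


class VersionedTable:
    """Hypothetical table handle with a monotonically increasing version."""

    def __init__(self) -> None:
        self._lock = threading.Lock()  # stand-in for an atomic update
        self.version = 0
        # version -> partitions written by the commit that produced it
        self._history: Dict[int, Set[str]] = {}

    def commit(self, base_version: int, partitions: Set[str]) -> int:
        """Try to commit changes prepared against base_version.

        Returns the new table version, or raises CommitConflict.
        """
        with self._lock:
            # Inspect every commit that landed after the writer's snapshot.
            for v in range(base_version + 1, self.version + 1):
                overlap = self._history[v] & partitions
                if overlap:
                    raise CommitConflict(
                        f"version {v} already wrote {sorted(overlap)}"
                    )
            # No overlap: bump the version and record what was written.
            self.version += 1
            self._history[self.version] = set(partitions)
            return self.version
```

In this sketch, two writers that both start from version 0 and write disjoint partitions both succeed, producing versions 1 and 2, while a third writer that starts from version 0 and writes a partition already committed raises CommitConflict and would need to retry against a newer snapshot. A file group or column level strategy would follow the same shape with a finer-grained key than the partition name.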



Design




Implementation

