Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The system also have to be horizontally scalable. To achieve scalability, the data will be partitioned using hash or range partitioning method. Each partition is a chunk of parallelism.   The exact partitioning method is not important for the purpose of this document. We treat a data partition here as a synonym for a data shard. Each partition is assigned to a cluster node and replicated to a predefined number of replicas to achieve high availability. Adding more nodes increases a number of partitions in the cluster (or reduces a number of partitions per node), thus increasing the scalability. A distributed transaction protocol makes cross-partition transactions preserve ACID guaranties guaranties.

Note that a correct data partitioning is a key factor for cluster efficiency and scalability. 

Providing atomicity on a commit is an additional difficulty in distributed environment. Typically this is achieved by using two-phase commit protocol or it's improved consensus based version https://www.microsoft.com/en-us/research/uploads/prod/2004/01/twophase-revised.pdf.

...

We aim to reuse common replication infrastructure. This means partition records will be durably replicated using a consensus based protocol, like RAFT. This approach tolerates f failed nodes from n total nodes, where n >= 2f + 1. Other products can do better, for example FoundationDB tolerates f failed nodes from n, where n >= f + 1 (but the consensus is still required). A CC protocol is not tied to the underliying replication protocol.

...