
ID: IEP-91
Author: Alexey Scherbakov
Sponsor: Alexey Scherbakov
Created:
Status: DRAFT


If I have seen further it is by standing on ye sholders of Giants

Isaac Newton

Motivation

One of the major features of AI3, as a database, is the ability to execute multiple table operations as a single atomic operation, known as a transaction. We need to design a modern and robust transaction protocol that takes current best practices into account. Both the key-value and SQL database access methods will rely on it. Compared to AI2, we aim to support transactional SQL from the beginning and to remove limitations such as the maximum transaction size. Our transactions may span several nodes in the cluster, which makes them distributed.

Definitions

In this section I'll define some terms used throughout the text, for easier understanding.

Record (aka Row, Tuple, Relation) - a collection of attribute-value pairs.

Transaction - a sequence of logically related partially ordered actions (reads or writes) over the database objects.

Atomicity - a transaction property which declares: either all actions are carried out or none are.

Consistency - a transaction property which guarantees that a transaction moves the database from one consistent state to another when it finishes. The meaning of a consistent state is defined by the user.

Isolation - a measure of mutual influence between interleaved transactions.

Durability - a transaction property which guarantees that the changes made by a committed transaction are preserved despite any subsequent failures.

Schedule - a way of executing interleaved transactions.

Serial schedule - a schedule where all transactions are executed sequentially.

Serializable schedule - a schedule which is equivalent to some serial execution of interleaved transactions.

Concurrency control (CC) - a technique to preserve database consistency in case of interleaved transactions.

Multi-version concurrency control (MVCC) - a family of concurrency control techniques based on writing multiple record versions (copy-on-write).

Recoverable schedule - a schedule in which aborting some of the involved transactions does not affect the transactions that have already committed. One way to achieve this is for transactions to read only committed values.

Interactive transaction - a transaction whose set of operations is not known a priori. It can be aborted at any time before it is committed.

Cascading abort - a situation in which the abort of one transaction causes the abort of another dependent transaction to avoid inconsistency.
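To make the schedule definitions above more concrete, here is a minimal sketch of the textbook precedence-graph test for conflict serializability. It is an illustration only and not part of the protocol proposed here. Conflict serializability is a sufficient (not a necessary) condition for a schedule to be serializable. The schedule encoding as `(transaction, action, key)` tuples and all function names are assumptions made for this example.

```python
from itertools import combinations

# A schedule is a list of (transaction_id, action, key) tuples in execution
# order, where action is "r" (read) or "w" (write). This encoding is an
# assumption made for illustration only.


def precedence_graph(schedule):
    """Return edges (Ti, Tj) for each pair of conflicting actions where Ti acts first."""
    edges = set()
    # combinations() preserves input order, so the first tuple always precedes the second.
    for (t1, a1, k1), (t2, a2, k2) in combinations(schedule, 2):
        # Two actions conflict if they belong to different transactions,
        # touch the same key, and at least one of them is a write.
        if t1 != t2 and k1 == k2 and "w" in (a1, a2):
            edges.add((t1, t2))
    return edges


def has_cycle(nodes, edges):
    """Detect a cycle in a directed graph with a depth-first search."""
    adjacent = {n: [] for n in nodes}
    for src, dst in edges:
        adjacent[src].append(dst)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {n: WHITE for n in nodes}

    def visit(n):
        color[n] = GREY
        for m in adjacent[n]:
            if color[m] == GREY:  # back edge: m is on the current path
                return True
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


def is_conflict_serializable(schedule):
    """A schedule is conflict serializable iff its precedence graph is acyclic."""
    nodes = {t for t, _, _ in schedule}
    return not has_cycle(nodes, precedence_graph(schedule))


# T1 finishes its work on x before T2 starts: equivalent to the serial order T1, T2.
ordered = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]

# Classic lost update: both transactions read x before either one writes it.
lost_update = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
```

With these inputs, `is_conflict_serializable(ordered)` holds, while `lost_update` produces edges in both directions between T1 and T2, so no equivalent serial order exists under this test.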

Design Goals

To define the key points of the protocol design, let's look at some features that the product could provide and rate each of them from 1 to 3, where 3 means maximum importance for product success.

  1. Strong transaction isolation
  2. Avoid cascading aborts
  3. Support for interactive transactions
  4. Avoid tx restarts
  5. Long lived lightweight read-only transactions
  6. Consistent replica reads
  7. Optimized for fast path execution
  8. Geo-distribution friendly when replicas are in different regions
  9. Unlimited or very large transaction size
  10. Transactional DDL
  11. How many node failures we can tolerate without data loss

Let's take a look at each feature in detail and give it a value.

Strong transaction isolation

Here we take into account the isolation property of a transaction. The strongest isolation level is known to be Serializable, which implies that all transactions appear to execute sequentially. This is very convenient for the user because it can prevent hidden data corruption and avoid security issues (TBD: link to paper). The price for this can be reduced throughput and increased latency due to the added overhead of the CC protocol. Another option is to allow the user to choose among multiple isolation levels. I rate it 2.

Avoid cascading aborts

- 1

Support for interactive transactions

- 3

Avoid tx restarts (serialization conflicts, unstable topology)

- 1

Long lived lightweight read-only transactions (enough to build some complex report - several minutes duration maybe - good for OLAP cases - guaranteed to commit on a stable topology)

- 3

Consistent replica reads

- 3

Optimized for fast path execution (short transactions, low contention, whatever ?)

- 1

Geo-distribution friendly when replicas are in different regions - reduce a number of cross region IO

- 2

Unlimited or very large transaction size

- 3

Transactional DDL

- 1

How many node failures we can tolerate without data loss

- 1

Description

// Provide the design of the solution.

Risks and Assumptions

// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.

Discussion Links

// Links to discussions on the devlist, if applicable.

Reference Links

// Links to various reference documents, if applicable.

Tickets

// Links or report with relevant JIRA tickets.
