ID	IEP-91
Author	Alexey Scherbakov
Sponsor	Alexey Scherbakov
Created	24 May 2022
Status	DRAFT

If I have seen further it is by standing on ye sholders of Giants

Isaac Newton

Motivation

One of the major features of AI3, as a distributed database, is the ability to execute multiple table operations as a single atomic operation, known as a transaction. We need to design a modern and robust distributed transaction protocol, taking into account current best practices. Both key-value and SQL database access methods will rely on it. Compared to AI2, we aim to support transactional SQL from the beginning and to remove limitations such as the limit on transaction size.

Definitions

In this section I'll give definitions of some terms used throughout the text, for easier understanding.

Record (aka Row, Tuple) - a collection of attribute-value pairs.

Transaction - a sequence of logically related partially ordered actions (reads or writes) over the database objects.

Atomicity - a transaction property which declares: either all actions are carried out or none are.

Consistency - a transaction property which guarantees that a transaction moves the database from one consistent state to another once it finishes. The meaning of a consistent state is defined by the user.

Isolation - a measure of mutual influence between interleaved transactions.

Durability - a transaction property which guarantees that the changes made by a committed transaction are not lost, despite any failures.

Schedule - a way of executing interleaved transactions.

Serial schedule - a schedule where all transactions are executed sequentially.

Serializable schedule - a schedule which is equivalent to some serial execution of interleaved transactions.

Concurrency control (CC) - a technique to preserve database consistency in case of interleaved transactions.

Multi-version concurrency control (MVCC) - a family of concurrency control techniques based on writing multiple record versions (copy-on-write).

Recoverable schedule - a schedule which is not affected by aborting some of the involved transactions. To achieve this, a transaction reads only committed values.

Interactive transaction - a transaction whose operation set is not known a priori. It can be aborted at any time before it is committed.

Cascading abort - a situation in which the abort of one transaction causes the abort of another dependent transaction to avoid inconsistency.
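To make the difference between a serial and a serializable schedule more concrete, here is a minimal sketch that tests a schedule for conflict serializability, which is one common and conservative notion of equivalence to a serial execution. The representation of a schedule as a list of (transaction, operation, key) tuples and the function names are assumptions made for this illustration only, they are not part of the protocol.

```python
from itertools import combinations


def precedence_graph(schedule):
    """Return edges (Ti, Tj): Ti performed a conflicting action before Tj.

    Two actions conflict when they belong to different transactions,
    touch the same key, and at least one of them is a write.
    """
    edges = set()
    # combinations() keeps positional order, so the first action of each pair is earlier.
    for (tx_a, op_a, key_a), (tx_b, op_b, key_b) in combinations(schedule, 2):
        if tx_a != tx_b and key_a == key_b and "w" in (op_a, op_b):
            edges.add((tx_a, tx_b))
    return edges


def is_conflict_serializable(schedule):
    """A schedule is conflict serializable iff its precedence graph has no cycle."""
    edges = precedence_graph(schedule)
    remaining = {tx for tx, _, _ in schedule}
    while remaining:
        # Transactions with no incoming edge from a still remaining transaction.
        free = {
            tx for tx in remaining
            if not any(dst == tx and src in remaining for src, dst in edges)
        }
        if not free:
            return False  # every remaining transaction is on a cycle or behind one
        remaining -= free
    return True


# Serial schedule: T1 runs completely before T2.
serial = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]

# Interleaved, but the transactions touch different keys, so no conflicts arise.
interleaved_ok = [("T1", "r", "x"), ("T2", "r", "y"), ("T1", "w", "x"), ("T2", "w", "y")]

# Lost update: both transactions read x before either of them writes it.
lost_update = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
```

With these inputs, is_conflict_serializable returns True for serial and interleaved_ok and False for lost_update, because the last schedule produces both a T1 -> T2 and a T2 -> T1 edge.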

Design Goals

To define the key points of the protocol design, let's look at some features which the product can provide and rate each of them from 1 to 3, where 3 means maximum importance for product success.

Strong transaction isolation
Cascading aborts avoidance
Support for interactive transactions
Restart avoidance
Long lived lightweight read-only transactions
Consistent replica reads
Optimized for fast path execution
Geo-distribution friendly when replicas are in different regions
Unlimited or very large transaction size
Transactional DDL
How many node failures we can tolerate without data loss

Let's take a look at each feature in detail and give it a value.

Strong transaction isolation

Here we take into account the isolation property of a transaction. The strongest isolation level is known to be Serializable, meaning that all transactions appear to execute sequentially. This is very convenient for a user, because it prevents hidden data corruption https://pmg.csail.mit.edu/papers/adya-phd.pdf and avoids security issues http://www.bailis.org/papers/acidrain-sigmod2017.pdf. The price can be reduced throughput and increased latency, due to the overhead of the CC protocol. Another option is to allow a user to choose a weaker isolation level, like SNAPSHOT. The ultimate goal is to implement Serializability without sacrificing too much performance, with Serializable as the default isolation level. I measure it with 2

Cascading aborts avoidance

This is a useful thing to have, reducing the number of transaction restarts. I measure it with 1

Support for interactive transactions

This is the most intuitive way to use transactions. I measure it with 3

Restart avoidance

This is a general property of a transactional protocol, defining how many transactions will be restarted, losing the work already done, in case of a serialization conflict. For example, optimistic CC causes more frequent restarts, because the conflict check is delayed until commit. I measure it with 1
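The restart behaviour of optimistic CC can be illustrated with a small single-threaded sketch. It is a generic illustration of commit-time validation, not the protocol proposed here, and the class names Store and OptimisticTx are hypothetical. A transaction remembers the version of every record it reads, buffers its writes, and validates the remembered versions only at commit.

```python
class Store:
    """In-memory store mapping key -> (value, version). Hypothetical, for illustration."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        # A missing key is reported as value None with version 0.
        return self.data.get(key, (None, 0))


class OptimisticTx:
    """Transaction that detects conflicts only when commit() is called."""

    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version observed on first read
        self.writes = {}         # key -> buffered new value

    def get(self, key):
        if key in self.writes:
            return self.writes[key]  # read your own buffered write
        value, version = self.store.read(key)
        self.read_versions.setdefault(key, version)
        return value

    def put(self, key, value):
        self.writes[key] = value  # nothing is visible to others until commit

    def commit(self):
        # Validation: every record read must still have the version that was observed.
        for key, seen in self.read_versions.items():
            if self.store.read(key)[1] != seen:
                return False  # conflict: all work is lost, the caller has to restart
        # Install buffered writes, bumping each record version.
        for key, value in self.writes.items():
            _, version = self.store.read(key)
            self.store.data[key] = (value, version + 1)
        return True


store = Store()
store.data["x"] = (0, 1)

t1 = OptimisticTx(store)
t2 = OptimisticTx(store)
t1.put("x", t1.get("x") + 1)  # both transactions read x = 0 at version 1
t2.put("x", t2.get("x") + 1)

first = t1.commit()   # True: nobody changed x since t1 read it
second = t2.commit()  # False: x is now at version 2, so t2 must be restarted
```

Note that t2 learns about the conflict only after it has done all of its work, which is exactly why the number of restarts grows under contention. A lock-based protocol would instead make t2 wait on its first access to x.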

Read-only long lived transactions

Such transactions can be used to build complex OLAP reports, without affecting concurrent OLTP load. Any SQL read query is naturally mapped to this type of transaction. Very useful feature, I measure it with 3
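One common way to get such transactions is MVCC (see Definitions): a read-only transaction picks a timestamp once and then reads the newest record version that is not newer than that timestamp, so it never blocks writers and is never blocked by them. The sketch below only illustrates that idea under simplifying assumptions (a single node, a logical counter as the clock, every write committed immediately). The VersionedStore class and its methods are hypothetical names, and whether the final design uses this exact scheme is not decided in this section.

```python
class VersionedStore:
    """Keeps every committed version of a record. Hypothetical, for illustration."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), ascending by commit_ts
        self.clock = 0      # logical clock, an assumption of this sketch

    def write(self, key, value):
        # Copy-on-write: append a new version instead of overwriting the old one.
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read_at(self, key, ts):
        # Return the newest version with commit_ts <= ts, or None if there is none.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None


mvcc = VersionedStore()
mvcc.write("balance", 100)

snapshot_ts = mvcc.clock    # a long-lived read-only transaction starts here

mvcc.write("balance", 250)  # concurrent OLTP update, not blocked by the reader

old_view = mvcc.read_at("balance", snapshot_ts)  # 100, the reader keeps its snapshot
new_view = mvcc.read_at("balance", mvcc.clock)   # 250, a fresh reader sees the update
```

The cost of this approach is that old versions have to be kept for as long as some reader may still need them, so a real implementation also needs a garbage collection policy for version chains.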

Consistent replica reads

Very useful feature for load-balancing. I measure it with 3

Optimized for fast path execution (short transactions, low contention, whatever ?)

- 1

Geo-distribution friendly when replicas are in different regions - reduce the number of cross-region I/O operations

- 2

Unlimited or very large transaction size

- 3

Transactional DDL

- 1

How many node failures we can tolerate without data loss

- 1

The protocol has two main parts - CC and atomic commitment.

High level overview

Description

// Provide the design of the solution.

Risks and Assumptions

// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.

Discussion Links

// Links to discussions on the devlist, if applicable.

Reference Links

// Links to various reference documents, if applicable.

Tickets

// Links or report with relevant JIRA tickets.

Page tree

IEP-91: Transaction protocol