Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Version Coordinator recovery

When MVCC coordinator node fails, a new one is elected among the live nodes – usually the oldest one.

The main goal of the MVCC coordinator failover is to restore an internal state of the previous coordinator in the new one. The internal state of MVCC coordinator consists of two main parts:

  • Active transactions list.
  • Active queries list.

Due to Ignite partition map exchange design all write transactions should be finished before topology version is changed. Therefore there is no need to restore active transactions list on the new coordinator because all old transactions are either committed or rolled back during topology changing.

The only thing we have to do – is to recover the active queries list. We need this list to avoid old versions cleanup when there are any old queries are running over this old data because it could lead to query result inconsistency. When all old queries are done we can safely continue cleanup old versions.

To restore active queries at the new coordinator the MvccQueryTracker object was introduced. Each tracker is associated with a single query. The purpose of the tracker is:

  • To mark each query with an unique id for a solid query tracking.
  • To hold a query MVCC snapshot.
  • To report to the new MVCC coordinator about the associated active query in case of old coordinator failure.
  • To send acks to the new coordinator when the associated query is completed.

Active queries list recovery on the new coordinator looks as follows:

  1. When old coordinator fails, an exchange process started and the new coordinator is elected.
  2. During this process each node sends a list of active query trackers to the new coordinator.
  3. New coordinator combine all those lists to the global one.
  4. When an old query finishes, the associated query tracker sends an ack to the new coordinator.
  5. Coordinator removes this tracker from the global list when ack is received.
  6. When global list becomes empty, this means that all old queries are done and we do not have to hold old date versions in our store – cleanup process begins.

Read (get and SQL)

Each read operation outside an active transaction or in scope of an optimistic transaction gets or uses a previously received Query Snapshot (which considered as read version for optimistic Tx. Note: optimistic transactions cannot be used in scope of DML operations).

...