...
When the MVCC coordinator node fails, a new coordinator is elected among the live nodes, usually the oldest one.
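The election rule can be illustrated with a minimal sketch. It assumes that "oldest" means the smallest join order among live nodes and that the oldest node is always chosen, which the text states only as the usual case. `CoordinatorElection` and `Node` are hypothetical names for illustration, not Ignite classes.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration of "elect the oldest live node"; not Ignite code. */
class CoordinatorElection {
    /** Hypothetical node descriptor: a lower order means the node joined earlier. */
    static final class Node {
        final String id;
        final long order;
        final boolean alive;

        Node(String id, long order, boolean alive) {
            this.id = id;
            this.order = order;
            this.alive = alive;
        }
    }

    /** Returns the live node with the smallest join order, or empty if no node is alive. */
    static Optional<Node> electCoordinator(List<Node> nodes) {
        return nodes.stream()
            .filter(n -> n.alive)
            .min(Comparator.comparingLong(n -> n.order));
    }
}
```

Because every node can evaluate the same deterministic rule over the same topology, a rule of this kind lets all nodes agree on the new coordinator without an extra voting round.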
The main goal of MVCC coordinator failover is to restore the internal state of the previous coordinator on the new one. The internal state of the MVCC coordinator consists of two main parts: the active transactions list and the active queries list.
Due to the design of Ignite partition map exchange, all write transactions must finish before the topology version changes. Therefore, there is no need to restore the active transactions list on the new coordinator: all old transactions are either committed or rolled back during the topology change.
The only thing left to do is to recover the active queries list. This list is needed to prevent cleanup of old versions while old queries are still running over that data, because removing those versions could make query results inconsistent. Once all old queries are done, cleanup of old versions can safely continue.
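To show why the active queries list matters for cleanup, here is a minimal sketch. It assumes that versions are plain `long` counters and that each query reads at a single snapshot version. `ActiveQueryRegistry` and all of its methods are hypothetical names for illustration, not Ignite's actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of running queries; not Ignite's actual class. */
class ActiveQueryRegistry {
    /** Query id mapped to the snapshot version that the query reads at. */
    private final Map<Long, Long> activeQueries = new ConcurrentHashMap<>();

    /** Records a running query together with its snapshot version. */
    void onQueryStarted(long queryId, long snapshotVersion) {
        activeQueries.put(queryId, snapshotVersion);
    }

    /** Forgets a query once it is done. */
    void onQueryFinished(long queryId) {
        activeQueries.remove(queryId);
    }

    /**
     * Returns the smallest snapshot version still used by an active query,
     * or currentVersion when no queries are running.
     */
    long cleanupVersion(long currentVersion) {
        return activeQueries.values().stream()
            .mapToLong(Long::longValue)
            .min()
            .orElse(currentVersion);
    }
}
```

In this sketch a cleanup task would treat the returned value as a bound and keep every row version that a snapshot at that bound could still read. If the new coordinator started with an empty registry, the bound would jump straight to the current version while old queries were still reading, which is the inconsistency that recovery of the list prevents.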
To restore active queries on the new coordinator, the MvccQueryTracker object was introduced. Each tracker is associated with a single query. The purpose of the tracker is:
Active queries list recovery on the new coordinator looks as follows:
Each read operation outside an active transaction, or in the scope of an optimistic transaction, either gets a Query Snapshot or uses a previously received one (for an optimistic transaction, the snapshot is considered the read version). Note: optimistic transactions cannot be used in the scope of DML operations.
...