...

Here are the types of the QuorumPackets:

| type | zxid | data | notation | meaning |
|---|---|---|---|---|
| FOLLOWERINFO(11) | acceptedEpoch | LearnerInfo | FOLLOWERINFO(acceptedEpoch) | The follower has accepted epoch acceptedEpoch. |
| DIFF(13) | last committed zxid | n/a | DIFF(lastCommittedZxid) | lastCommittedZxid is the last zxid committed by the leader. |
| TRUNC(14) | truncZxid | n/a | TRUNC(truncZxid) | Truncate the history to truncZxid. |
| SNAP(15) | lastZxid | n/a | SNAP | A state transfer (aka snapshot) will be sent to the follower. This will be a fuzzy state transfer that may include zxids being sent to the follower. The state transfer will immediately follow this packet. |
| OBSERVERINFO(16) | last zxid learned | LearnerInfo | OBSERVERINFO(lastZxid) | The observer has accepted up to lastZxid. |
| LEADERINFO(17) | proposed epoch | protocol | LEADERINFO(e) | The new proposed epoch. |
| ACKEPOCH(18) | lastZxid | currentEpoch | ACKEPOCH(lastZxid, currentEpoch) | Acknowledge the acceptance of the new epoch. If the follower has already acknowledged the given epoch, it passes -1 as the currentEpoch to signal that the new epoch is not acknowledged. (We still need to send lastZxid for syncing if necessary.) |
| NEWLEADER(10) | e << 32 | n/a | NEWLEADER(e) | Accept this leader as the leader of the epoch e. |
| UPTODATE(12) | n/a | n/a | UPTODATE | The follower is now up to date enough to begin serving clients. |
| PROPOSAL(2) | zxid of proposal | proposed message | PROPOSAL(zxid, data) | Propose a message. (Request that it be accepted into a follower's history.) |
| ACK(3) | zxid of proposal to ack | n/a | ACK(zxid) | Everything sent to the follower by the leader up to zxid has been accepted into its history (logged to disk). |
| COMMIT(4) | zxid of proposal to commit | n/a | COMMIT(zxid) | Everything in the follower's history up to zxid should be committed (aka delivered). |
| INFORM(8) | zxid of proposal | data of proposal | INFORM(zxid, data) | Deliver the data. (Only used with observers.) |
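
As a quick reference, the numeric codes listed above can be collected into an enumeration. This is only an illustrative sketch: the names `PacketType` and `newleader_zxid` are hypothetical and are not taken from the ZooKeeper code base; only the numbers and the `e << 32` encoding come from the table.

```python
from enum import IntEnum


class PacketType(IntEnum):
    """QuorumPacket type codes as listed in the table (hypothetical class name)."""
    PROPOSAL = 2
    ACK = 3
    COMMIT = 4
    INFORM = 8
    NEWLEADER = 10
    FOLLOWERINFO = 11
    UPTODATE = 12
    DIFF = 13
    TRUNC = 14
    SNAP = 15
    OBSERVERINFO = 16
    LEADERINFO = 17
    ACKEPOCH = 18


def newleader_zxid(epoch: int) -> int:
    """zxid field of NEWLEADER(e): the epoch shifted into the high 32 bits."""
    return epoch << 32
```

For example, `PacketType(13)` resolves to `PacketType.DIFF`, and `newleader_zxid(3)` yields `3 << 32`.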

Zab servers have the following state:

| Name | Meaning |
|---|---|
| history | an on-disk log of accepted proposals |
| lastZxid | zxid of the last proposal in the history |
| acceptedEpoch | the epoch number of the last NEWEPOCH packet accepted |
| currentEpoch | the epoch number of the last NEWLEADER packet accepted |
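
The per-server state can be pictured as a small record. The sketch below is an assumption-laden illustration, not the actual implementation: `ZabState`, `epoch_of`, and `counter_of` are hypothetical names, the history is an in-memory list standing in for the on-disk log, and an empty history is assumed to report a lastZxid of 0. The split of a zxid into a high 32-bit epoch and a low 32-bit counter follows the `e << 32` notation used for NEWLEADER.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def epoch_of(zxid: int) -> int:
    """High 32 bits of a zxid: the epoch."""
    return zxid >> 32


def counter_of(zxid: int) -> int:
    """Low 32 bits of a zxid: the counter within the epoch."""
    return zxid & 0xFFFFFFFF


@dataclass
class ZabState:
    # (zxid, data) pairs; an in-memory stand-in for the on-disk log.
    history: List[Tuple[int, bytes]] = field(default_factory=list)
    accepted_epoch: int = 0  # epoch of the last NEWEPOCH packet accepted
    current_epoch: int = 0   # epoch of the last NEWLEADER packet accepted

    @property
    def last_zxid(self) -> int:
        """zxid of the last proposal in the history (0 if empty, by assumption)."""
        return self.history[-1][0] if self.history else 0
```

Deriving `last_zxid` from the history rather than storing it separately keeps the two from drifting apart in this sketch; a real server may well cache it.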

Implementation assumptions

...

There are two noticeable differences between this description of phase 1 and the one found in DISSECTING ZAB (technical report YL-2010-007):

  1. the leader does not sync with the most up-to-date follower
  2. the followers do not send their histories in the ACK of the NEWEPOCH message.

...

  1. l The leader does the following with each follower connected to it:
    1. adds the follower to the list of connections to send new proposals, so while the server is performing the next steps, it is queuing up any new proposals sent to the follower.
    2. does one of the following:
      • SNAP if the follower is so far behind that it is better to do a state transfer than send missing transactions.
      • TRUNC(zxid) if the follower has transactions that the leader has chosen to skip. The leader sets zxid to the last zxid in its history for the epoch of the follower. The leader then sends the transactions that the follower is missing.
      • DIFF if the leader is sending transactions that the follower is missing. The leader sends missing messages to the follower.
    3. sends NEWLEADER(e).
    4. releases any queued messages to the follower.
  2. f The follower syncs with the leader, but doesn't modify its state until it receives the NEWLEADER(e) packet. Once it receives NEWLEADER(e) it atomically applies the new state and sets f.currentEpoch = e. It then sends ACK(e << 32).
  3. l Once the leader has received acknowledgements from a quorum of followers, it takes leadership of epoch e and queues UPTODATE to all followers.
  4. f When a follower receives the UPTODATE message, it starts accepting client connections and serving new state.
  5. l Leader starts accepting connections from followers again. The variable nextZxid is set to (e << 32) + 1.
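
The leader's choice between SNAP, TRUNC, and DIFF in step 1 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual implementation: `choose_sync` and `snap_threshold` are hypothetical, the leader's history is modelled as a sorted list of zxids that are all treated as committed, "so far behind" is reduced to a count of missing transactions, a follower zxid of 0 means an empty history, and the sketch falls back to SNAP when the leader has no transactions from the follower's epoch.

```python
from typing import List, Optional, Tuple


def choose_sync(
    leader_zxids: List[int],
    follower_last_zxid: int,
    snap_threshold: int = 500,  # hypothetical cut-off for "too far behind"
) -> Tuple[str, Optional[int], List[int]]:
    """Return (packet name, zxid argument, zxids to send afterwards)."""
    missing = [z for z in leader_zxids if z > follower_last_zxid]
    # The follower's last zxid is known to the leader (or the follower is empty).
    known = follower_last_zxid == 0 or follower_last_zxid in leader_zxids

    if known:
        if len(missing) > snap_threshold:
            # Cheaper to transfer the whole state than to replay the log.
            return ("SNAP", None, [])
        last = leader_zxids[-1] if leader_zxids else 0
        return ("DIFF", last, missing)

    # The follower holds transactions the leader has chosen to skip:
    # truncate to the leader's last zxid in the follower's epoch.
    follower_epoch = follower_last_zxid >> 32
    same_epoch = [z for z in leader_zxids if z >> 32 == follower_epoch]
    if not same_epoch:
        return ("SNAP", None, [])  # assumption: nothing to truncate to
    trunc_zxid = same_epoch[-1]
    return ("TRUNC", trunc_zxid, [z for z in leader_zxids if z > trunc_zxid])
```

With a leader history of three transactions in epoch 1 and one in epoch 2, a follower whose last zxid is the second transaction of epoch 1 gets a DIFF followed by the two missing transactions, while a follower that logged a fifth transaction in epoch 1 gets a TRUNC back to the third one followed by the epoch 2 transaction.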

...