...

Currently, the first LeaderAndIsrRequest sent to a broker by a controller contains all TopicPartitions that the broker is a replica for. We will formalize this behavior by also including an is_every_partition flag to denote when this is one such request. When is_every_partition = true, the broker can reconcile its local state on disk with the request, and safely stage deletions for any partitions that are present on disk and are not contained in the request. A partition counts as not contained either because the TopicPartition is absent from the LeaderAndIsrRequest, or because the request carries a topic ID that does not match the local topic partition stored on the broker. Such reconciliation may also be necessary when is_every_partition = false, if the topic ID stored for a local partition does not match the ID contained in the request.
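
The reconciliation rule above can be sketched as follows. This is a minimal illustration, not the broker implementation: the function name find_stale_partitions, the dictionary-based state, and the short string topic IDs (standing in for UUIDs) are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# (topic name, partition index); a hypothetical stand-in for TopicPartition.
TopicPartition = Tuple[str, int]


def find_stale_partitions(
    local: Dict[TopicPartition, str],
    requested: Dict[TopicPartition, str],
    is_every_partition: bool,
) -> List[TopicPartition]:
    """Return the local partitions that should be staged for deletion.

    `local` maps each partition on disk to its stored topic ID and
    `requested` maps each partition in the request to its topic ID.
    """
    stale = []
    for tp, local_id in sorted(local.items()):
        if tp in requested:
            # Same name and partition but a different topic ID: the local
            # copy belongs to an older incarnation of the topic.
            if requested[tp] != local_id:
                stale.append(tp)
        elif is_every_partition:
            # A full request omits this partition, so the broker is no
            # longer a replica for it.
            stale.append(tp)
    return stale


local = {("A", 0): "id0", ("B", 0): "id7", ("C", 1): "id9"}
request = {("A", 0): "id1", ("B", 0): "id7"}
full = find_stale_partitions(local, request, is_every_partition=True)
partial = find_stale_partitions(local, request, is_every_partition=False)
```

In this sketch, a missing partition is only treated as stale when the request is flagged as complete, while a topic ID mismatch is treated as stale in both cases.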

Stale partition deletions resulting from LeaderAndIsrRequest(s) will:

  1. Log a warning
  2. Move the partition's directory to log.dir/deleting/{topic_id}_{partition}.
  3. Schedule a final deletion event for X ms after the LeaderAndIsrRequest was first received. This will clear the deleting directory of the partition's contents.
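
The three steps above can be sketched as below. This is an illustration under stated assumptions: the helper names stage_deletion and purge_if_due are hypothetical, the delay (X ms in the text) is passed in as delay_ms because no value is specified here, and the scheduling is reduced to an explicit "is it due yet" check rather than a real scheduler.

```python
import logging
import shutil
from pathlib import Path

log = logging.getLogger("stale-partition-cleanup")


def stage_deletion(log_dir: Path, partition_dir: Path, topic_id: str, partition: int) -> Path:
    """Steps 1 and 2: log a warning, then move the partition directory
    to log_dir/deleting/{topic_id}_{partition}."""
    log.warning("Staging stale partition %s for deletion", partition_dir.name)
    staged = log_dir / "deleting" / f"{topic_id}_{partition}"
    staged.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(partition_dir), str(staged))
    return staged


def purge_if_due(staged: Path, received_at_ms: int, delay_ms: int, now_ms: int) -> bool:
    """Step 3: remove the staged contents once delay_ms has elapsed since
    the request was first received. Returns True if the purge happened."""
    if now_ms - received_at_ms < delay_ms:
        return False
    shutil.rmtree(staged)
    return True
```

Keeping the move and the final purge as separate steps means the data stays recoverable from the deleting directory until the delay expires.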

LeaderAndIsrResponse v3

LeaderAndIsr Response (Version: 3) => error_code [partitions]
  error_code => INT16
  partitions => topic topic_id* partition error_code
    topic => STRING
    topic_id* => UUID
    partition => INT32
    error_code => INT16

...

To avoid issues where requests are made to stale partitions, a topic_id field will be added to fence reads from deleted topics. Note that the leader epoch is not sufficient for preventing these issues, as the leader epoch will be reset after a topic is deleted and recreated.
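
To illustrate why the leader epoch alone does not fence such requests, the sketch below compares an epoch-only check with a check that also compares topic IDs. The PartitionState type, both function names, and the simplified equality test on the epoch are assumptions for the example and do not reflect the broker's actual validation logic.

```python
from typing import NamedTuple


class PartitionState(NamedTuple):
    """Hypothetical view of what a broker stores for one partition."""
    topic_id: str
    leader_epoch: int


def epoch_only_check(local: PartitionState, request_epoch: int) -> bool:
    # Accepts the request whenever the epochs agree.
    return request_epoch == local.leader_epoch


def topic_id_check(local: PartitionState, request_topic_id: str, request_epoch: int) -> bool:
    # Additionally requires the request to name the same topic incarnation.
    return request_topic_id == local.topic_id and request_epoch == local.leader_epoch


# The topic was deleted and recreated, so its leader epoch was reset.
recreated = PartitionState(topic_id="id1", leader_epoch=0)

# A stale client still refers to the deleted incarnation "id0" at epoch 0.
accepted_by_epoch = epoch_only_check(recreated, request_epoch=0)
accepted_by_topic_id = topic_id_check(recreated, "id0", request_epoch=0)
```

Here the stale request passes the epoch-only check because both incarnations are at the same epoch, and it is rejected only once the topic ID is compared.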

...

To avoid issues where requests are made to stale partitions, a topic_id field will be added to fence reads from deleted topics.

ListOffsetsRequest v6

...

To avoid issues where requests are made to stale partitions, a topic_id field will be added to fence reads from deleted topics.

OffsetForLeaderRequest v4

...

We need the changes to FetchRequest/ListOffsetsRequest/OffsetsForLeaderEpochRequest described above to make this scenario safe. By including the topic ID in these requests, we can prevent a broker from accidentally replicating from a deleted topic with the same name.

Scenario 2:

  1. Broker B1 is a replica for A_p0_id0.
  2. Topic A id0 is deleted.
  3. B1 does not receive a StopReplicaRequest for A_p0_id0.
  4. Topic A id1 is created.
  5. Broker B1 receives a LeaderAndIsrRequest containing partition A_p0_id1.

...