Status

Current state: Discarded in favour of KIP-112 and KIP-113

Discussion thread: here

JIRA: here

...

1. Identify the logs and partitions that were stored in the unavailable directory

2. Abort and pause all future cleaning for the identified partitions

3. Update the recovery checkpoints list to remove the respective directory

4. Remove the identified logs from the logs pool and update logDirs (so that the scheduled jobs kafka-log-retention, kafka-log-flusher and kafka-recovery-point-checkpoint are not executed on logs that have been taken offline)

Note: the scheduled jobs currently do not run under a lock, and the logs pool is not protected by a lock, so these changes make data races possible. The performance impact of changing the jobs to run under a lock should be evaluated.
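The four steps and the locking concern above can be illustrated with a minimal sketch. It is not Kafka's actual LogManager: the class OfflineDirSketch, its fields, and its method names are all hypothetical, and a single ReentrantLock stands in for whatever synchronization the real change would use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the broker's log manager (illustration only).
class OfflineDirSketch {
    private final ReentrantLock lock = new ReentrantLock();
    // The "logs pool": partition name -> directory that holds its log.
    private final Map<String, String> dirByPartition = new HashMap<>();
    private final Set<String> logDirs = new HashSet<>();
    private final Set<String> checkpointedDirs = new HashSet<>();
    private final Set<String> cleaningPaused = new HashSet<>();

    void addLog(String partition, String dir) {
        lock.lock();
        try {
            dirByPartition.put(partition, dir);
            logDirs.add(dir);
            checkpointedDirs.add(dir);
        } finally {
            lock.unlock();
        }
    }

    // Takes a directory offline and returns the affected partitions, sorted by name.
    List<String> handleOfflineDir(String dir) {
        lock.lock();
        try {
            // Step 1: identify the partitions stored in the unavailable directory.
            List<String> affected = new ArrayList<>();
            for (Map.Entry<String, String> e : dirByPartition.entrySet()) {
                if (e.getValue().equals(dir)) {
                    affected.add(e.getKey());
                }
            }
            Collections.sort(affected);
            // Step 2: abort and pause cleaning for those partitions.
            cleaningPaused.addAll(affected);
            // Step 3: stop checkpointing recovery points for the directory.
            checkpointedDirs.remove(dir);
            // Step 4: remove the logs from the pool and update logDirs.
            for (String p : affected) {
                dirByPartition.remove(p);
            }
            logDirs.remove(dir);
            return affected;
        } finally {
            lock.unlock();
        }
    }

    // What a scheduled job (e.g. the flusher) would iterate over; it takes the
    // same lock, so it never sees a log that is being taken offline.
    List<String> partitionsForScheduledJobs() {
        lock.lock();
        try {
            List<String> live = new ArrayList<>(dirByPartition.keySet());
            Collections.sort(live);
            return live;
        } finally {
            lock.unlock();
        }
    }

    boolean isCleaningPaused(String partition) {
        lock.lock();
        try {
            return cleaningPaused.contains(partition);
        } finally {
            lock.unlock();
        }
    }

    boolean hasLogDir(String dir) {
        lock.lock();
        try {
            return logDirs.contains(dir);
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch every scheduled job pays for one lock acquisition per run, which is the cost the note asks to be measured. A read-write lock or a concurrent map might reduce contention, but that is a design choice this proposal leaves open.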

Partitions Restart

Partitions restart means re-electing the leader, the in-sync replicas and the assigned replicas so that partitions lost on a broker due to an IO error are re-replicated on that broker.

...

All edge cases (such as an empty new ISR set) are handled in the same way as in OfflinePartitionLeaderSelector.
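As a rough illustration of what a restart might compute for a single partition, the sketch below drops the broker that reported the IO error from the ISR and elects a new leader. The details of the procedure are elided in this page, so this is an assumption, loosely modelled on how Kafka's offline partition leader selector treats an empty ISR. The class PartitionRestartSketch, the type LeaderAndIsr, and the allowUnclean flag are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; not the controller's actual code.
class PartitionRestartSketch {
    static final class LeaderAndIsr {
        final Optional<Integer> leader; // empty when no replica can lead
        final List<Integer> isr;

        LeaderAndIsr(Optional<Integer> leader, List<Integer> isr) {
            this.leader = leader;
            this.isr = isr;
        }
    }

    static LeaderAndIsr restart(List<Integer> assignedReplicas,
                                List<Integer> currentIsr,
                                List<Integer> liveBrokers,
                                int failedBroker,
                                boolean allowUnclean) {
        // The broker that hit the IO error lost its copy, so it leaves the ISR.
        List<Integer> newIsr = new ArrayList<>();
        for (int b : currentIsr) {
            if (b != failedBroker && liveBrokers.contains(b)) {
                newIsr.add(b);
            }
        }
        if (!newIsr.isEmpty()) {
            // Normal case: the first surviving in-sync replica becomes leader.
            return new LeaderAndIsr(Optional.of(newIsr.get(0)), newIsr);
        }
        // Edge case: the new ISR set is empty.
        if (allowUnclean) {
            // Fall back to any other live assigned replica (possible data loss).
            for (int b : assignedReplicas) {
                if (b != failedBroker && liveBrokers.contains(b)) {
                    return new LeaderAndIsr(Optional.of(b), Collections.singletonList(b));
                }
            }
        }
        // No eligible replica: the partition stays without a leader.
        return new LeaderAndIsr(Optional.empty(), Collections.<Integer>emptyList());
    }
}
```

The failed broker stays in assignedReplicas in this sketch, which is what would let it fetch from the new leader and re-replicate the partition as described above.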

Open questions

1. Disk availability check operation

...

Does it make sense to retry the operation before triggering a partitions restart?

Compatibility, Deprecation, and Migration Plan

No public interface changes. Users will no longer have to restart brokers after IO errors (e.g. after a disk becomes unavailable).

...
