Status

Current state:  "Under Discussion"

Discussion thread:

JIRA: KAFKA-7362

Motivation

When a partition reassignment removes topic partitions from an offline broker, those partitions become orphan partitions on that broker. When the broker comes back online, it cannot clean up either the data or the folders that belong to the orphan partitions. The log manager scans all log dirs during startup, but the time-based retention policy on a topic partition does not kick in until replicaHighWatermark has been set. The replicaHighWatermark of a partition is set when the broker becomes either the leader or a follower of the partition, so orphan partitions never get a chance to have replicaHighWatermark set. In addition, there is currently no logic to delete the folders that belong to orphan partitions. This KIP provides a mechanism for brokers to remove orphan partitions automatically.

Public Interfaces

  • No change to public interfaces. This feature will be turned on automatically.


Proposed Changes

a) Provide a mechanism to remove orphan partitions automatically.

Orphan partition removal works in three phases.

  1. Initialization phase
    During broker startup, the broker calculates the initial set of orphan partitions based on the partition information from the first LeaderAndIsr request.
  2. Timeout/correction phase (for example, a 2-hour timeout, defined by an internal timer)
    During the timeout phase, the broker updates its knowledge about partitions over time. The first LeaderAndIsr request the broker receives might be outdated (due to dual controllers, network partitions, etc.). However, during the timeout phase the broker receives more LeaderAndIsr requests and uses the partition information in them to remove the partitions it is responsible for from the initial orphan partition set. This relies on the assumption that, before the timer expires, the broker receives at least one valid LeaderAndIsr request for every partition it hosts.

  3. Deletion phase
    The broker removes only those orphan partitions (including their partition folders) whose log segments are all older than the broker default retention period. This ensures the broker does not delete new data. The broker does not distinguish between log-compacted topics and time-retention topics for partitions in the orphan partition set: the broker default retention period is used for all orphan partitions. If some orphan partitions cannot be removed immediately because the retention period has not been reached, another deletion is scheduled to retry later.
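
The three phases can be sketched as a small state holder. This is only an illustration under stated assumptions, not the broker's actual code: the class OrphanPartitionTracker, its method names, and the use of plain strings for partitions and of the newest segment timestamp as the "all segments are older" test are all hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the three-phase orphan partition removal; not Kafka code. */
final class OrphanPartitionTracker {
    private final Set<String> orphans = new HashSet<>();
    private final long correctionDeadlineMs;
    private boolean initialized = false;

    OrphanPartitionTracker(long startupMs, long correctionTimeoutMs) {
        this.correctionDeadlineMs = startupMs + correctionTimeoutMs;
    }

    /**
     * Phases 1 and 2. The first request seeds the orphan set with every partition
     * found on disk; every request (including the first) removes the partitions
     * that the request assigns to this broker.
     */
    void onLeaderAndIsr(Set<String> partitionsOnDisk, Set<String> assignedPartitions) {
        if (!initialized) {
            orphans.addAll(partitionsOnDisk);
            initialized = true;
        }
        orphans.removeAll(assignedPartitions);
    }

    /**
     * Phase 3. Returns the orphan partitions that may be deleted now: the timeout
     * must have expired, and the newest segment must be older than the broker
     * default retention period (so all segments are older than it).
     */
    Set<String> deletable(long nowMs, long retentionMs, Map<String, Long> newestSegmentMs) {
        Set<String> result = new HashSet<>();
        if (!initialized || nowMs < correctionDeadlineMs) {
            return result; // still in the timeout/correction phase
        }
        for (String partition : orphans) {
            Long newest = newestSegmentMs.get(partition);
            if (newest != null && nowMs - newest > retentionMs) {
                result.add(partition);
            }
        }
        return result;
    }
}
```

Partitions that are still in the orphan set but not yet old enough are simply left out of the result, so a caller would invoke deletable again on a later scheduled run.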

b) Add metrics to track the number of orphan partitions and the size of these orphan partitions.

  • kafka.log:type=LogManager,name=OrphanLogPartitionCount  
    type = gauge
    value = the number of orphan partitions

  • kafka.log:type=LogManager,name=OrphanLogPartitionSize
    type = gauge
    value = the size of orphan partitions
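
As a rough illustration of how the two gauges could be derived from the orphan partition set, the sketch below computes both values from a map of orphan partition to on-disk size. The class OrphanPartitionMetrics, the map-based input, and the choice of bytes as the size unit are assumptions made for this example; the KIP does not specify them, and registration with the broker's metrics registry is omitted.

```java
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical gauge suppliers for the two orphan partition metrics; not Kafka code. */
final class OrphanPartitionMetrics {

    /** Gauge value for OrphanLogPartitionCount: number of orphan partitions. */
    static Supplier<Integer> countGauge(Map<String, Long> orphanSizesBytes) {
        return orphanSizesBytes::size;
    }

    /** Gauge value for OrphanLogPartitionSize: summed size, assumed to be in bytes. */
    static Supplier<Long> sizeGauge(Map<String, Long> orphanSizesBytes) {
        return () -> orphanSizesBytes.values().stream()
                .mapToLong(Long::longValue)
                .sum();
    }
}
```

Because each supplier reads the map every time it is polled, the reported values follow the orphan set as partitions are removed from it.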

Compatibility, Deprecation, and Migration Plan

  • There is no compatibility issue.  

Rejected Alternatives

1) Manual deletion of orphan partitions via a provided API. Kafka would provide an API through which the user specifies which topic partitions to delete and which time retention rule to apply. Kafka would remove a partition only if all of the following conditions are met:

  • The current broker is neither a follower nor a leader for the partition.
  • The partition satisfies the specified time retention rule. For example, all log segments are more than a certain number of days old.
  • The current broker has received the first LeaderAndIsr request.

The problem with this solution is that it requires human intervention. An operator would need to use an admin tool to find the orphan partitions and then call the provided API to delete them.
