
Status

Current state: Under Discussion

Discussion thread: here [Change the link from the KIP proposal email archive to your own email thread]

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Since the early days of Apache Kafka, the way to obtain information about a running cluster has been to query ZooKeeper state. One of the pieces of information available from ZooKeeper is the timestamp of each Kafka broker, indicating the time at which the broker started. This information is useful when building automation that provides functionality such as rolling restarts, since it can be used to determine whether a broker has successfully restarted.

Public Interfaces

We propose adding a new field, 'timestamp', to the Node class that is returned in the DescribeClusterResult value from AdminClient.describeCluster(). This is a fully backwards-compatible change and a logical evolution of the interface, requiring no changes to existing code that uses AdminClient. A new method

public long timestamp();

would be introduced to the Node class, returning the start time of the corresponding broker in milliseconds since the start of the Unix epoch. If a client with this feature implemented connects to a cluster that does not yet implement this functionality, the special value 0L would be returned.
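For illustration, the sketch below models how a caller could consume the proposed accessor and handle the 0L sentinel. Note that the Node class here is a self-contained stand-in for org.apache.kafka.common.Node, and timestamp() is the method proposed above, not an existing API.

```java
import java.time.Instant;

public class BrokerStartTimeExample {

    // Stand-in for org.apache.kafka.common.Node, extended with the proposed accessor.
    static final class Node {
        private final int id;
        private final long timestamp; // broker start time in epoch millis; 0L if unavailable

        Node(int id, long timestamp) {
            this.id = id;
            this.timestamp = timestamp;
        }

        public int id() { return id; }

        // Proposed accessor: start time in ms since the Unix epoch, or 0L
        // when the cluster does not yet implement this functionality.
        public long timestamp() { return timestamp; }
    }

    // Render the start time, honouring the 0L "not available" sentinel.
    static String describeStartTime(Node node) {
        if (node.timestamp() == 0L) {
            return "broker " + node.id() + ": start time unknown (pre-KIP broker)";
        }
        return "broker " + node.id() + ": started at " + Instant.ofEpochMilli(node.timestamp());
    }

    public static void main(String[] args) {
        System.out.println(describeStartTime(new Node(1, 1609459200000L)));
        System.out.println(describeStartTime(new Node(2, 0L)));
    }
}
```

In real client code the Node instances would come from AdminClient.describeCluster().nodes(), but the sentinel-handling logic would be the same.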

Proposed Changes

The current Unix timestamp is already written to the BrokerIdZNode in ZooKeeper but is never read back by Kafka code. We propose the following changes to propagate this piece of information.
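For reference, the broker registration znode already carries the timestamp today; its JSON payload is roughly of the following shape (field set and values are illustrative and vary by broker version):

```json
{
  "version": 4,
  "host": "broker1.example.com",
  "port": 9092,
  "jmx_port": 9999,
  "timestamp": "1609459200000",
  "endpoints": ["PLAINTEXT://broker1.example.com:9092"]
}
```

The proposal is thus not about recording new data, but about reading the existing timestamp field back and propagating it.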

To complicate things a bit, the broker information returned by the describeCluster() API call is read from the metadata cache on the broker handling the request. To be able to provide the timestamp value, this information needs to be propagated from ZooKeeper to the metadata cache. This means that the UpdateMetadataBroker message that is part of the UpdateMetadataRequest, as well as the Broker class, needs to be updated to include the timestamp, so that a version containing the timestamp is cached on all brokers.

The MetadataResponseBroker message that is part of the MetadataResponse, as well as the Node class exposed by the AdminClient, also needs to be updated to hold a timestamp field. An implication of these changes is that the versions of the affected protocol message pairs would be incremented.
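In Kafka's protocol message schema notation, the addition to each affected broker record could look roughly like the fragment below. This is a hypothetical sketch: the field name, the "10+" version bound, and the wording are placeholders, not the final schema.

```json
{ "name": "Timestamp", "type": "int64", "versions": "10+", "default": "0",
  "about": "The broker start time in milliseconds since the Unix epoch, or 0 if not available." }
```

The default of 0 matches the 0L sentinel described in the public interface section, so older brokers omitting the field naturally decode to the "not available" value.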

Compatibility, Deprecation, and Migration Plan

This is a completely backwards-compatible extension of the existing API. The only compatibility consideration is that a client with this change included, connecting to an older cluster, needs to handle that condition as described in the public interface section above, with the timestamp() accessor returning the special value 0L.

Rejected Alternatives

It is certainly possible to use mechanisms outside of Kafka to determine when a broker was started, for example the operating system process table. However, such solutions would be very specific to their execution environment, and it would take significant work to make them perform as well as the solution outlined above.
