
This page describes a proposed Kafka Improvement Proposal (KIP) process for proposing a major change to Kafka.

To create your own KIP, click "Create" in the header and choose the "KIP-Template" rather than "Blank page".

Purpose

We want to make Kafka a core architectural component for users. We also support a large number of integrations with other tools, systems, and clients. Keeping this kind of usage healthy requires a high level of compatibility between releases — core architectural elements can't break compatibility or shift functionality from release to release. As a result, each new major feature or public api has to be done in a way that we can stick with it going forward.

This means that when making this kind of change we need to think through what we are doing as best we can prior to release. And as we go forward we need to stick to our decisions as much as possible. All technical decisions have pros and cons, so it is important we capture the thought process that led to a decision or design to avoid flip-flopping needlessly.

Hopefully we can make these proportional in effort to their magnitude — small changes should just need a couple brief paragraphs, whereas large changes need detailed design discussions.

This process also isn't meant to discourage incompatible changes — proposing an incompatible change is totally legitimate. Sometimes we will have made a mistake and the best path forward is a clean break that cleans things up and gives us a good foundation going forward. Rather, this is intended to avoid accidentally introducing half thought-out interfaces and protocols that cause needless heartburn when changed. Likewise, the definition of "compatible" is itself squishy: small details like which errors are thrown when are clearly part of the contract but may need to change in some circumstances; similarly, performance isn't part of the public contract, but dramatic changes may break use cases. So we just need to use good judgement about how big the impact of an incompatibility will be and how big the payoff is.

What is considered a "major change" that needs a KIP?

Any of the following should be considered a major change:

  • Any major new feature, subsystem, or piece of functionality
  • Any change that impacts the public interfaces of the project

What are the "public interfaces" of the project?

All of the following are public interfaces that people build around:

  • Binary log format
  • The network protocol and api behavior
  • Any class in the public packages under clients
    • org/apache/kafka/common/serialization
    • org/apache/kafka/common
    • org/apache/kafka/common/errors
    • org/apache/kafka/clients/producer
    • org/apache/kafka/clients/consumer (eventually, once stable)
  • Configuration, especially client configuration
  • Monitoring
  • Command line tools and arguments
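
To make the configuration bullet concrete, here is a hypothetical producer config fragment (the key names shown are standard producer settings). Each key name and its semantics are something users build against, so renaming a key, changing its default, or changing its behavior would itself be a public interface change:

```properties
# Hypothetical producer configuration fragment.
# The key names themselves are public interface: renaming one,
# changing its default, or changing its semantics needs a KIP.
bootstrap.servers=broker1:9092,broker2:9092
acks=all
retries=3
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
```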

Not all compatibility commitments are the same. We need to spend significantly more time on log format and protocol as these break code in lots of clients, cause downtime releases, etc. Public apis are next as they cause people to rebuild code and lead to compatibility issues in large multi-dependency projects (which end up requiring multiple incompatible versions). Configuration, monitoring, and command line tools can be faster and looser — changes here will break monitoring dashboards and require a bit of care during upgrades but aren't a huge burden.

For the most part monitoring, command line tool changes, and configs are added with new features so these can be done with a single KIP.

What should be included in a KIP?

A KIP should contain the following sections:

  • Motivation: describe the problem to be solved
  • Proposed Change: describe the new thing you want to do. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences, depending on the scope of the change.
  • New or Changed Public Interfaces: impact to any of the "compatibility commitments" described above. We want to call these out in particular so everyone thinks about them.
  • Migration Plan and Compatibility: if this feature requires additional support for a no-downtime upgrade describe how that will work
  • Rejected Alternatives: What are the other alternatives you considered and why are they worse? The goal of this section is to help people understand why this is the best solution now, and also to prevent churn in the future when old alternatives are reconsidered.
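
Put together, a minimal KIP skeleton following the sections above might look like this (the heading and all body text are placeholders, not a prescribed format):

```
KIP-NNN: Descriptive Heading

Motivation
  What problem are we solving?

Proposed Change
  What is the new behavior or design?

New or Changed Public Interfaces
  Protocol, log format, public api, config, metrics, or tool changes.

Migration Plan and Compatibility
  How does a no-downtime upgrade work, if extra support is needed?

Rejected Alternatives
  What else was considered, and why is it worse?
```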

Who should initiate the KIP?

Anyone can initiate a KIP but you shouldn't do it unless you have an intention of getting the work done to implement it (otherwise it is silly).

Process

Here is the process for making a KIP:

  1. Create a page which is a child of this one. Take the next available KIP number and give your proposal a descriptive heading, e.g. "KIP-42: Allow Infinite Retention With Bounded Disk Usage".
  2. Fill in the sections as described above
  3. Start a [DISCUSS] thread on the Apache mailing list. Please ensure that the subject of the thread is of the format [DISCUSS] KIP-{your KIP number} {your KIP heading}. The discussion should happen on the mailing list, not on the wiki, since the wiki comment system doesn't work well for larger discussions. In the process of the discussion you may update the proposal; you should let people know about the changes you are making.
  4. Once the proposal is finalized, call a [VOTE] to have the proposal adopted. These proposals are more serious than code changes and more serious even than release votes. The criterion for acceptance is lazy majority.
  5. Please update the KIP wiki page, and the index below, to reflect the current stage of the KIP after a vote. This acts as the permanent record indicating the result of the KIP (e.g., Accepted or Rejected). Also report the result of the KIP vote to the voting thread on the mailing list so the conclusion is clear.

KIP round-up

Next KIP Number: 97

Use this number as the identifier for your KIP and increment this value.

Adopted KIPs

KIP | Release
KIP-1 - Remove support of request.required.acks | 0.9.0.0
KIP-2 - Refactor brokers to allow listening on multiple ports and IPs | 0.9.0.0
KIP-3 - Mirror Maker Enhancement | 0.9.0.0
KIP-4 - Command line and centralized administrative operations |
KIP-4 - Metadata Protocol Changes | 0.10.0.0
KIP-8 - Add a flush method to the producer API | 0.9.0.0
KIP-11 - Kafka Authorizer design | 0.9.0.0
KIP-12 - Kafka Sasl/Kerberos and SSL implementation | 0.9.0.0
KIP-13 - Quota Design | 0.9.0.0
KIP-15 - Add a close method with a timeout in the producer | 0.9.0.0
KIP-16 - Automated Replica Lag Tuning | 0.9.0.0
KIP-19 - Add a request timeout to NetworkClient | 0.9.0.0
KIP-20 - Enable log preallocate to improve consume performance under windows and some old Linux file system | 0.9.0.0
KIP-21 - Dynamic Configuration |
KIP-22 - Expose a Partitioner interface in the new producer | 0.9.0.0
KIP-25 - System test improvements | 0.9.0.0
KIP-26 - Add Kafka Connect framework for data import/export | 0.9.0.0
KIP-28 - Add a processor client | 0.10.0.0
KIP-31 - Move to relative offsets in compressed message sets | 0.10.0.0
KIP-32 - Add timestamps to Kafka message | 0.10.0.0
KIP-33 - Add a time based log index | 0.10.1.0
KIP-35 - Retrieving protocol version | 0.10.0.0
KIP-36 - Rack aware replica assignment | 0.10.0.0
KIP-38: ZooKeeper Authentication | 0.9.0.0
KIP-40: ListGroups and DescribeGroup | 0.9.0.0
KIP-41: Consumer Max Records | 0.10.0.0
KIP-42: Add Producer and Consumer Interceptors | 0.10.0.0
KIP-43: Kafka SASL enhancements | 0.10.0.0
KIP-45 - Standardize all client sequence interaction on j.u.Collection | 0.10.0.0
KIP-50 - Move Authorizer to o.a.k.common package | 0.10.1.0
KIP-51 - List Connectors REST API | 0.10.0.0
KIP-52: Connector Control APIs | 0.10.0.0
KIP-55: Secure Quotas for Authenticated Users | 0.10.1.0
KIP-56: Allow cross origin HTTP requests on all HTTP methods | 0.10.0.0
KIP-57 - Interoperable LZ4 Framing | 0.10.0.0
KIP-58 - Make Log Compaction Point Configurable | 0.10.1.0
KIP-60 - Make Java client classloading more flexible | 0.10.1.0
KIP-62: Allow consumer to send heartbeats from a background thread | 0.10.1.0
KIP-63: Unify store and downstream caching in streams | 0.10.1.0
KIP-65: Expose timestamps to Connect | 0.10.1.0
KIP-67: Queryable state for Kafka Streams | 0.10.1.0
KIP-70: Revise Partition Assignment Semantics on New Consumer's Subscription Change | 0.10.1.0
KIP-71: Enable log compaction and deletion to co-exist | 0.10.1.0
KIP-73: Replication Quotas | 0.10.1.0
KIP-74: Add Fetch Response Size Limit in Bytes | 0.10.1.0
KIP-75 - Add per-connector Converters | 0.10.1.0
KIP-78: Cluster Id | 0.10.1.0
KIP-79 - ListOffsetRequest/ListOffsetResponse v1 and add timestamp search methods to the new consumer | 0.10.1.0
KIP-77: Improve Kafka Streams Join Semantics | 0.10.2.0
KIP-85: Dynamic JAAS configuration for Kafka clients | 0.10.2.0
KIP-89: Allow sink connectors to decouple flush and offset commit |

KIPs under discussion

KIP | State
KIP-6 - New reassignment partition logic for rebalancing | Needs more detail
KIP-10 - Running Producer, Consumers and Brokers on Mesos | Discuss
KIP-14 - Tools standardization | Discuss
KIP-17 - Add HighwaterMarkOffset to OffsetFetchResponse | Discuss
KIP-18 - JBOD Support | Discuss
KIP-23 - Add JSON/CSV output and looping options to ConsumerGroupCommand | Discuss
KIP-27 - Conditional Publish | Discuss
KIP-30 - Allow for brokers to have plug-able consensus and meta data storage sub systems | Discuss
KIP-37 - Add Namespaces to Kafka | Discuss
KIP-39: Pinning controller to broker | Discuss
KIP-44 - Allow Kafka to have a customized security protocol | Discuss
KIP-46 - Self Healing | Discuss
KIP-47 - Add timestamp-based log deletion policy | Discuss
KIP-48 - Delegation token support for Kafka | Discuss
KIP-49 - Fair Partition Assignment Strategy | Discuss
KIP-53 - Add custom policies for reconnect attempts to NetworkClient | Discuss
KIP-54: Sticky Partition Assignment Strategy | Discuss
KIP-59: Proposal for a kafka broker command | Discuss
KIP-64 - Allow underlying distributed filesystem to take over replication depending on configuration | Discuss
KIP-66: Add Kafka Connect Transformers to allow transformations to messages | Discuss
KIP-68 - Add a consumed log retention before log retention | Discuss
KIP-69 - Kafka Schema Registry | Draft
KIP-72: Allow putting a bound on memory consumed by Incoming requests | Discuss
KIP-76 - Enable getting password from executable rather than passing as plaintext in config files | Discuss
KIP-81: Bound Fetch memory usage in the consumer | Discuss
KIP-82 - Add Record Headers | Discuss
KIP-83 - Allow multiple SASL authenticated Java clients in a single JVM process | Discuss
KIP-84: Support SASL SCRAM mechanisms | Discuss
KIP-86: Configurable SASL callback handlers | Discuss
KIP-87 - Add Compaction Tombstone Flag | Discuss
KIP-88: OffsetFetch Protocol Update | Discuss
KIP-90 - Remove zkClient dependency from Streams | Discuss
KIP-91 - Provide Intuitive User Timeouts in The Producer | Discuss
KIP-92 - Add per partition lag metrics to KafkaConsumer | Discuss
KIP-93: Improve invalid timestamp handling in Kafka Streams | Discuss
KIP-94 - Session Windows | Discuss
KIP-95: Incremental Batch Processing for Kafka Streams | Discuss
KIP-96 - Add per partition metrics for in-sync and assigned replica count | Discuss
KIP-97: Improved Kafka Client RPC Compatibility Policy | Discuss

Discarded KIPs

KIP Discussion Recordings

Date (link to recording) | Summary
2016-10-19
  • KIP-82 - add record header: We agreed that there are use cases for third-party vendors building tools around Kafka. We haven't reached the conclusion whether the added complexity justifies the use cases. We will follow up on the mailing list with use cases, container format people have been using, and details on the proposal.
2016-09-13
  • KIP-54 (Sticky Partition Assignment): aims to minimise partition movement so that resource reinitialisation (e.g. caches) is minimised. It is partially sticky and partially fair. Some concerns around the fact that user code for partitionsRevoked and partitionsAssigned would have to be changed to work correctly with this assignment strategy. Good: more complex usage of an assigner that takes advantage of the user data field. Vahid will start the vote.

  • KIP-72 (Allow Sizing Incoming Request Queue in Bytes): large requests can kill the broker, no control over how much memory is allocated. Client quotas don't help as damage may already have been done by the time they kick in. There was a discussion on whether it was worth it to avoid the immediate return from select when there was no memory available in the pool. Radai will update the KIP to describe this aspect in more detail as well as the config validation that is performed.

  • KIP-79 (ListOffsetRequest/ListOffsetResponse v1 and add timestamp search methods to the new consumer): we discussed the option of passing multiple timestamps for the same partition in the same request. Becket thinks it's a rare use case and not worth supporting. Gwen said that it would be nice to have, but not essential. We talked about validation of duplicate topics. Becket will check the approach taken by the create topics request and evaluate if it can be adopted here too. PR will be available today and Jason will evaluate if it's feasible to include it in the next release once it's available.

2016-08-30
  • KIP-48 (delegation tokens): Harsha will update the wiki with more details on how to use delegation tokens and how to configure them.
  • KIP-78 (cluster id): There was discussion on adding human readable tags later. No major concerns.
2016-08-23
  • time-based release: No one seems to have objections. Ismael will follow up with a release wiki.
  • KIP-4: We discussed having separate ACL requests of add and delete. No one seems to object to it. We discussed the admin client. Grant will send a PR. We discussed how KStream can use the ACL api.  It seems that we will need some kind of regex or namespace support in ACL to make the authorization convenient in KStream.
  • KIP-50: There is some discussion for further changes in the PR. Ashish will reply to the KIP email thread with the recommended changes. Ashish/Grant plan to look into whether it's possible to make the authorizer api change backward compatible. However, it seems that people are in general ok with a non-compatible api change.
  • KIP-74: No objections on the current proposal.
  • Java 7 support timeline: The consensus is to defer dropping the Java 7 support until the next major release (which will be next year). Ismael will follow up on the email thread.
  • KIP-48 (delegation tokens): Ashish will ping Harsha to see if this is still active.
  • Some of the KIPs have been idle. Grant will send a proposal on tagging them properly (e.g., blocked, inactive, no resource, etc).
2016-05-24
  • KIP-58 - Make Log Compaction Point Configurable: We want to start with just a time-based configuration since there is no good usage for byte-based or message-based configuration. Eric will change the KIP and start the vote.
  • KIP-4 - Admin api: Grant will pick up the work. Initially, he plans to route the write requests from the admin clients to the controller directly to avoid having the broker forward the requests to the controller.
  • KIP-48 - Delegation tokens: Two of the remaining issues are (1) how to store the delegation tokens and (2) how token expiration works. Since Parth wasn't able to attend the meeting, we will follow up on the mailing list.
2016-04-05
  • KIP-4: There is a slight debate on the metadata request schema, as well as the internal ZK based implementation, which we will wait for Jun to comment on the mailing list thread.
  • KIP-52: We decided to start a voting process for this.
  • KIP-35: Decided on renaming the ApiVersionQuery api to ApiVersion. Consensus on using the api in the java client only to check for availability of current versions. ApiVersion api's versions will not be deprecated. The KIP-35 wiki will be updated with the latest info and a vote thread will be initiated.
2016-03-15
  • KIP-33 - Add a time based log index to Kafka: We decided NOT to include this in 0.10.0 since the changes may have performance risks.
  • KIP-45 - Standardize all client sequence interaction on j.u.Collection: There is no consensus in the discussion. We will just put it to vote.
  • KIP-35 - Retrieving protocol version: This gets the longest discussion. There is still no consensus. Magnus thinks the current proposal of maintaining a global protocol version won't work and will try to submit a new proposal.
  • KIP-43 - Kafka SASL enhancements: Rajini will modify the KIP to only support native SASL mechanisms and leave the changes to Login and CallbackHandler to KIP-44 instead.
2016-02-23
  • KIP-33 and KIP-47: No issues. Will start the voting thread.
  • KIP-43: We discussed whether there is a need to support multiple SASL mechanisms at the same time and what's the best way to implement this. Will discuss this in more details in the email thread.
  • KIP-4: Grant gave a comprehensive summary of the current state. We have gaps on how to make the admin request block on the broker, how to integrate admin requests with ACL (especially with respect to client config changes for throttling and ACL changes), how to do the alter topic request properly. Grant will update the KIP with an interim plan and a long term plan.
  • KIP-43: We briefly discussed on to support multiple sasl mechanisms on the broker. Harsha will follow up with more details on the email thread.
  • Everyone seems to be in favor of making the next major release 0.10.0, instead of 0.9.1.
2016-01-26
  • KIP-42: We agreed to leave the broker side interceptor for another KIP. On the client side, people favor the 2nd option in Anna's proposal. Anna will update the wiki accordingly.
  • KIP-43: We discussed whether there is a need to support multiple SASL mechanisms at the same time and what's the best way to implement this. Will discuss this in more details in the email thread.
  • Jiangjie brought up an issue related to KIP-32 (adding timestamp field in the message). The issue is that currently there is no convenient way for the consumer to tell whether the timestamp in a message is the create time or the server time. He and Guozhang propose to use a bit in the message attribute to do that. Jiangjie will describe the proposal in the email thread.
2016-01-12
  • KIP-41: Discussed whether the issue of long processing time between poll calls is a common issue and whether we should revisit the poll api. Also discussed whether the number of records returned in poll calls can be made more dynamic. In the end, we feel that just adding a config that controls the number of records returned in poll() is the simplest approach at this moment.
  • KIP-36: Need to look into how to change the broker JSON representation in ZK w/o breaking rolling upgrades. Otherwise, ready for voting.
2015-10-20
  • KIP-38: No concerns with this KIP. Flavio will initiate the voting on this.
  • KIP-37: There are questions on how ACL, configurations, etc will work, and whether we should support "move" or not. We will discuss the details more in the mailing list.
  • KIP-32/KIP-33: Jiangjie raised some concerns on the approach that Jay proposed. Guozhang and Jay will follow up on the mailing list.
2015-10-13
  • 0.9.0 release: We discussed if KAFKA-2397 should be a blocker in 0.9.0. Jason and Guozhang will follow up on the jira.
  • KIP-32 and KIP-33: We discussed Jay's alternative proposal of just keeping CreateTime in the message and having a config to control how far off the CreateTime can be from the broker time. We will think a bit more on this and Jiangjie will update the KIP wiki.
  • KIP-36: We discussed an alternative approach of introducing a new broker property to designate the rack. It's simpler and potentially can work in the case when the broker to rack mapping is maintained externally. We need to make sure that we have an upgrade plan for this change. Allen will update the KIP wiki.
2015-10-06
  • We only had the time to go through KIP-35. The consensus is that we will add a BrokerProtocolRequest that returns the supported versions for every type of requests. It's up to the client to decide how to use this. Magnus will update the KIP wiki with more details.
2015-09-22
  • KIP-31: Need to figure out how to evolve inter.broker.protocol.version with multiple protocol changes within the same release, mostly for people who are deploying from trunk. Becket will update the wiki.
  • KIP-32/KIP-33: Having both CreateTime and LogAppendTime per message adds significant overhead. There are a couple of possibilities to improve this. Becket will follow up on this.
  • LinkedIn has been testing SSL in MirrorMaker (SSL is only enabled in the producer). So far, MirrorMaker can keep up with the load. LinkedIn folks will share some of the performance results.
2015-09-14
  • KIP-28: Discussed the improved proposal including 2 layers of API (the higher layer is for streaming DSL), and stream time vs processor time. Ready for review.
  • KIP-31, KIP-32: (1) Discussed whether the timestamp should be from the client or the broker. (2) Discussed the migration path and whether this requires all consumers to upgrade before the new message format can be used. (3) Since this is too big a change, it will NOT be included in 0.9.0 release. Becket will update the wiki.
2015-08-18
  • client-side assignment strategy: We discussed concerns about rebalancing time due to metadata inconsistency, especially when lots of topics are subscribed. Will discuss a bit more on the mailing list. 
  • CopyCat data api: The discussions are in KAFKA-2367 for people who are interested.
  • 0.8.2.2: We want to make this a low risk bug fix release since 0.8.3 is coming. So, will only include a small number of critical and small fixes.
  • 0.8.3: The main features will be security and the new consumer. We will be cutting a release branch when the major pieces for these new features have been committed.
2015-08-11