Status
Current state: Accepted
Discussion thread: https://lists.apache.org/thread/mwcljxdnyobthsszy4n2qr2tqcf9cxcf
JIRA: KMETA-1086
Motivation
Some users run KRaft controllers and brokers on the same machine (not containerized, but through tarballs, etc.). Prior to KRaft, when running ZooKeeper and Kafka on the same machine, users could independently stop the ZooKeeper node and the Kafka broker, since there were specific shell scripts for each (zookeeper-server-stop and kafka-server-stop, respectively).
However, in KRaft mode, they can't stop the KRaft controllers independently from the Kafka brokers, or stop any individual process, because there is just a single script that doesn't distinguish between processes and signals all of them. We need to provide a way for users to kill either controllers or brokers and, even better, any individual process.
Public Interfaces
The command line for stopping Kafka nodes will include a pair of optional and mutually exclusive parameters "[--process-role]" OR "[--node-id]" to support identifying a set of processes or a specific process to stop based on the contents of the node's configuration file.
Currently, the only option is to run the script without arguments:
Code Block |
---|
./bin/kafka-server-stop.sh |
The new script will accept an optional parameter [--process-role=value], with the value indicating whether to kill Kafka broker processes or controller processes.
OR
The new script will accept an optional parameter [--node-id=value], with the value indicating the specific process, identified by its node id, that the user wishes to kill.
Example 1: the new command to kill all the broker processes will look like:
Code Block |
---|
./bin/kafka-server-stop.sh --process-role=broker |
Example 2: the command to kill the process with node ID = 1 will look like:
Code Block |
---|
./bin/kafka-server-stop.sh --node-id=1 |
If both parameters are provided, the value of the node-id parameter takes precedence, i.e., the process with the specified node ID will be killed regardless of the process role provided. A log message will also be shown to indicate that precedence.
When neither a "process-role" nor a "node-id" field is provided, the behavior remains unchanged -- all Kafka processes are stopped.
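The parameter handling described above could be sketched roughly as follows. This is an illustration only, not the actual contents of kafka-server-stop.sh: the function name parse_stop_args and the variables FILTER_KEY and FILTER_VALUE are hypothetical, and the mapping of the flags onto the node.id and process.roles configuration keys is an assumption about how the lookup would be done.

```shell
#!/bin/sh
# Sketch only: parse_stop_args, FILTER_KEY and FILTER_VALUE are hypothetical
# names used for illustration; they are not taken from the real script.
parse_stop_args() {
  role=""
  node=""
  for arg in "$@"; do
    case "$arg" in
      --process-role=*) role="${arg#--process-role=}" ;;
      --node-id=*)      node="${arg#--node-id=}" ;;
      *) echo "Unknown argument: $arg" >&2; return 1 ;;
    esac
  done

  # Empty FILTER_KEY means "no filter": every Kafka process is stopped.
  FILTER_KEY=""
  FILTER_VALUE=""
  if [ -n "$node" ]; then
    if [ -n "$role" ]; then
      # node-id wins when both are given, and the user is told so.
      echo "Both --node-id and --process-role given; --node-id takes precedence" >&2
    fi
    FILTER_KEY="node.id"
    FILTER_VALUE="$node"
  elif [ -n "$role" ]; then
    FILTER_KEY="process.roles"
    FILTER_VALUE="$role"
  fi
  return 0
}
```

A caller would invoke parse_stop_args "$@" and then fall back to the existing stop-everything behavior whenever FILTER_KEY is empty.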
Proposed Changes
I'm changing the "kafka-server-stop.sh" file to accept an optional field, either "process-role" or "node-id". When the user specifies a process role or a node id, I will retrieve the absolute path to the configuration file of each running Kafka process and search it for the value of that field. If the value retrieved from the configuration file matches the input, that process will be killed; otherwise it will be skipped.
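The matching step might look something like the sketch below. It assumes the path to a process's configuration file has already been found (how the script discovers it is not shown), and that the file is a plain key=value properties file. The helper names config_value and matches_filter are hypothetical. Because process.roles can hold a comma-separated list such as "broker,controller", the role check tests for membership rather than equality.

```shell
#!/bin/sh
# Sketch only: config_value and matches_filter are hypothetical helpers.

# Print the value of a key from a key=value properties file.
# $1 = key, $2 = path to the configuration file
config_value() {
  awk -F= -v key="$1" '
    index($0, "=") > 0 {
      k = $1
      gsub(/^[ \t]+|[ \t]+$/, "", k)       # trim whitespace around the key
      if (k == key) {
        v = substr($0, index($0, "=") + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", v)     # trim whitespace around the value
        val = v                            # the last definition wins
      }
    }
    END { print val }
  ' "$2"
}

# Succeed if the configuration file matches the requested filter.
# $1 = "node.id" or "process.roles", $2 = requested value, $3 = config file
matches_filter() {
  actual=$(config_value "$1" "$3" | tr -d ' \t')
  case "$1" in
    node.id)
      [ "$actual" = "$2" ] ;;
    process.roles)
      # A combined node lists several roles, so check list membership.
      case ",$actual," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
      esac ;;
    *)
      return 1 ;;
  esac
}
```

A stop script could then loop over the candidate PIDs and signal only those for which matches_filter succeeds, skipping the rest.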
Compatibility, Deprecation, and Migration Plan
Existing users won't need to change anything if they want to continue killing both controller and broker processes. Otherwise, if they want to kill only the brokers or only the controllers, or an individual broker or controller, they'll need to pass the "process-role" or "node-id" parameter to the script.
Test Plan
The change can be tested through the command line.
Rejected Alternatives
One rejected alternative is an optional argument "[--required-config <name=value>]" that would allow users to stop Kafka processes matching any arbitrary name=value pair from the node's configuration file. This argument gives users the freedom to choose any field in the configuration file to identify the process they want to kill. However, upon discussion, we think this config-based approach is a bit too open-ended and not very user-friendly. Therefore, we decided to simply provide flags for the things a user may care about most, i.e., two specific and mutually exclusive parameters, process-role and node-id. We agree that this is more user-friendly than the --required-config argument, and it comes at the possible expense of generality.