
Status

Current state: Accepted

Discussion thread: https://lists.apache.org/thread/mwcljxdnyobthsszy4n2qr2tqcf9cxcf

JIRA: KAFKA-15471

Motivation

Some users run KRaft controllers and brokers on the same machine (not containerized, but through tarballs, etc.). Prior to KRaft, when running ZooKeeper and Kafka on the same machine, users could independently stop the ZooKeeper controller node and the Kafka broker, since there was a specific shell script for each (zookeeper-server-stop and kafka-server-stop, respectively).

However, in KRaft mode, they can't start and stop the KRaft controllers independently from the Kafka brokers, or any individual process, because there is just a single script that doesn't distinguish between processes and signals all of them. We need to provide a way for users to stop either controllers or brokers and, even better, any individual process.

Public Interfaces

The command line for stopping Kafka nodes will include a pair of optional and mutually exclusive parameters "[--process-role]" OR "[--node-id]"  to support identifying a set of processes or a specific process to stop based on the contents of the node's configuration file.

The way of stopping controllers/brokers will change. Instead of simply running the following command, which stops every Kafka process, users will be able to pass optional parameters that target specific processes:

Code Block
./bin/kafka-server-stop.sh

Proposed Changes

The new script will accept two optional, mutually exclusive parameters:

  • [--process-role=value], with the value indicating whether to stop Kafka broker processes or controller processes.

  • [--node-id=value], with the value identifying the specific process, by its node id, that the user wishes to stop.

Example 1: the new command to kill all the broker processes will look like:

Code Block
./bin/kafka-server-stop.sh --process-role=broker

Example 2: the command to kill the process with node ID = 1 will look like:

Code Block
./bin/kafka-server-stop.sh --node-id=1

If both parameters are provided, the node-id parameter takes precedence, i.e., the process with the specified node id will be stopped, regardless of the process role provided. A log message will also be shown to indicate that precedence.

When neither a "process-role" nor a "node-id" field is provided, the behavior remains unchanged -- all Kafka processes are stopped.
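The flag resolution described above (node-id wins over process-role, no flags means stop everything) can be sketched as a small argument-parsing fragment. This is a hypothetical illustration, not the actual kafka-server-stop.sh code; the function and variable names are invented for this sketch:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of how kafka-server-stop.sh could resolve the two
# mutually exclusive flags; --node-id takes precedence when both are given.
resolve_target() {
  local process_role="" node_id=""
  for arg in "$@"; do
    case "$arg" in
      --process-role=*) process_role="${arg#*=}" ;;
      --node-id=*)      node_id="${arg#*=}" ;;
    esac
  done
  if [ -n "$node_id" ] && [ -n "$process_role" ]; then
    echo "Note: --node-id takes precedence over --process-role" >&2
  fi
  if [ -n "$node_id" ]; then
    echo "node-id=$node_id"           # stop only the process with this node id
  elif [ -n "$process_role" ]; then
    echo "process-role=$process_role" # stop all processes with this role
  else
    echo "all"                        # no flags: stop every Kafka process
  fi
}

resolve_target --process-role=broker --node-id=1   # prints "node-id=1"
```

The sketch only decides which processes to target; the actual script would then locate and signal the matching PIDs.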

I'm changing the "kafka-server-stop.sh" file to accept an optional field, either "process-role" or "node-id". When the user specifies a process role or a node id, I will retrieve the absolute path to the configuration file and search for the value of that field. If the value retrieved from the configuration file matches the input, that process will be stopped; otherwise it will be skipped.
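The matching step described above could look roughly like the following. This is a hypothetical sketch, not the shipped script; the property names process.roles and node.id come from the standard KRaft configuration, while the helper function names are invented here:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: decide whether a process should be stopped by
# comparing a property in its configuration file against the user's input.

# get_property CONFIG_FILE KEY -> value (e.g. "node.id" -> "1")
get_property() {
  grep -E "^[[:space:]]*$2=" "$1" | tail -n 1 | cut -d= -f2 | tr -d '[:space:]'
}

# should_stop CONFIG_FILE KEY WANTED -> exit 0 if the process matches
should_stop() {
  [ "$(get_property "$1" "$2")" = "$3" ]
}

# Example: a minimal KRaft configuration file
cat > /tmp/server.properties <<'EOF'
process.roles=broker
node.id=1
EOF

should_stop /tmp/server.properties node.id 1 && echo "stop node 1"
should_stop /tmp/server.properties process.roles controller || echo "skip (not a controller)"
```

In the real script, the configuration file path would be recovered from each running process's command line rather than hard-coded.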

Compatibility, Deprecation, and Migration Plan

Existing users won't need to change any behavior if they want to continue stopping both controller and broker processes. Otherwise, if they want to stop only the brokers/controllers, or an individual broker/controller, they'll need to specify a "process-role" or a "node-id" field when invoking the script.

Test Plan

The change can be tested manually from the command line by starting broker and controller processes and verifying that each new parameter stops only the intended processes, and that running the script with no parameters still stops all of them.

Rejected Alternatives

One rejected alternative is an optional argument "[--required-config <name=value>]" that would allow users to stop Kafka processes matching an arbitrary name=value pair from the node's configuration file. This argument gives users the freedom to choose any field in the configuration file to identify the processes they want to stop. However, upon discussion, we think this config-based approach is a bit too open-ended and not very user friendly. Therefore, we decided to simply provide flags for the things a user cares about the most, i.e., the two specific and mutually exclusive parameters process-role and node-id. We agree that this is more user-friendly than the --required-config argument, and it comes at the possible expense of generality.
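For illustration, the rejected generic matching could have been implemented roughly as follows. This is a hypothetical sketch of the discarded design, not code that was ever shipped; the function name is invented here:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the rejected --required-config <name=value> design:
# any configuration property could be used to select the processes to stop.
matches_required_config() {
  local config_file="$1" required="$2"
  local name="${required%%=*}" value="${required#*=}"
  grep -qE "^[[:space:]]*${name}=${value}[[:space:]]*$" "$config_file"
}

cat > /tmp/kraft.properties <<'EOF'
process.roles=controller
node.id=2
EOF

# Generic, but open-ended: the user must know the exact property names.
matches_required_config /tmp/kraft.properties "node.id=2" && echo "would stop"
```

The generality is visible here: any property works, but the user carries the burden of knowing the configuration schema, which is why the two dedicated flags were preferred.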