Currently we maintain per-topic configuration in the server.properties file. Unfortunately, this requires bouncing the server every time you make a change. This wiki is a proposal for making some of these configurations dynamic.
Scope
The proposed scope is just per-topic configurations. One could argue that we should do this globally for all configuration. I think that is not wise. The reason is that we currently assume configurations are immutable and pass them around in plain Scala variables. This immutability is really nice. Many things cannot be changed dynamically (e.g. socket and other I/O buffer sizes), and for other things making them dynamic is just really hard.
I would argue that having server-level defaults be statically configured is not a major problem, as these change rarely. Furthermore, configuration management systems that maintain versions, track changes, and handle permissions and notifications only work with text files, so moving away from this is not necessarily a good thing.
However, maintaining topic-level settings in this way is a huge pain. These are potentially set every time you add a topic, and with hundreds of topics and users there are lots of changes. Keeping them all in a giant properties file on every server, and bouncing the server on each change, is not a good solution. This proposal is just to move topic-level configuration out of the main server configuration.
Per-Topic Settings
The current pattern is that we maintain default settings and sometimes have per-topic overrides. Here are the relevant settings:
Default Setting | Per-Topic Setting | Notes |
---|---|---|
log.segment.bytes | log.segment.bytes.per.topic | |
log.roll.hours | log.roll.hours.per.topic | |
log.retention.hours | log.retention.hours.per.topic | |
log.retention.bytes | log.retention.bytes.per.topic | |
log.cleanup.policy | topic.log.cleanup.policy.per.topic | in KAFKA-631 |
log.cleaner.min.cleanable.ratio | | in KAFKA-631 |
log.index.interval.bytes | | |
log.flush.interval.messages | | |
log.flush.interval.ms | log.flush.interval.ms.per.topic | |
log.compression.type | | proposed |
Proposal
The proposed approach is that we no longer have the "per.topic" version of configurations. Instead, the server will be configured with a default value for each of these settings. When a topic is created, the user can specify a value for each setting; any setting they do not specify inherits the default. In this proposal each topic will have a complete copy of the configuration, with any missing values taking whatever the default was at the time the topic was created. A later change to a server default therefore does not alter topics that already exist.
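The inheritance rule can be illustrated with a minimal sketch. It is written in Java for brevity and is not part of the proposal or of Kafka's code: the class `TopicConfigResolver` and its `resolve` method are hypothetical names, and representing configuration as `java.util.Properties` is an assumption made only for this example.

```java
import java.util.Properties;

// Hypothetical helper (not a Kafka class): builds the full configuration
// for a topic at creation time.
final class TopicConfigResolver {
    private TopicConfigResolver() {}

    /**
     * Returns a complete, independent copy of the configuration for a new topic.
     * Every server default is copied first; values supplied by the user then
     * replace the copied defaults. The result does not share state with
     * serverDefaults, so later changes to the defaults do not reach the topic.
     */
    static Properties resolve(Properties serverDefaults, Properties overrides) {
        Properties resolved = new Properties();
        // Snapshot the defaults as they are right now.
        for (String key : serverDefaults.stringPropertyNames()) {
            resolved.setProperty(key, serverDefaults.getProperty(key));
        }
        // User-specified values win over the snapshotted defaults.
        for (String key : overrides.stringPropertyNames()) {
            resolved.setProperty(key, overrides.getProperty(key));
        }
        return resolved;
    }
}
```

Copying the values, rather than chaining to the defaults object, is what gives each topic the "complete copy" behavior described above.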
In KAFKA-631, Log.scala has already been changed so that it takes a single LogConfig argument that holds all configurations. As part of this proposal we will add a new setter for updating this config. Log has been changed so that it no longer maintains a local copy of any of these values, so swapping in a new config object takes effect immediately.
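A minimal sketch of the swap-in idea follows, again in Java and under stated assumptions. `LogConfigSketch` and `LogSketch` are hypothetical stand-ins rather than the real LogConfig and Log classes, the two fields shown are illustrative only, and using a `volatile` reference for cross-thread visibility is an assumption of this example, not something the proposal specifies.

```java
import java.util.Objects;

// Hypothetical immutable config holder, standing in for LogConfig.
final class LogConfigSketch {
    final long segmentBytes;
    final long retentionHours;

    LogConfigSketch(long segmentBytes, long retentionHours) {
        this.segmentBytes = segmentBytes;
        this.retentionHours = retentionHours;
    }
}

// Hypothetical stand-in for Log: it holds one reference to the current
// config and never caches individual values from it.
final class LogSketch {
    // volatile (an assumption of this sketch) so that a config swapped in
    // by one thread is visible to threads that read it afterwards.
    private volatile LogConfigSketch config;

    LogSketch(LogConfigSketch initial) {
        this.config = Objects.requireNonNull(initial);
    }

    LogConfigSketch config() {
        return config;
    }

    // The new setter: replaces the whole config object in one assignment.
    void updateConfig(LogConfigSketch newConfig) {
        this.config = Objects.requireNonNull(newConfig);
    }

    // Reads the limit through the current config on every call, so an
    // updated config applies to the very next check.
    boolean shouldRoll(long currentSegmentBytes) {
        return currentSegmentBytes >= config.segmentBytes;
    }
}
```

Because the config object is immutable and replaced as a whole, a reader sees either the old configuration or the new one, never a mix of the two.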
The configuration itself will be maintained in ZooKeeper.