
This page summarizes our past feature proposals and discussions in Kafka Streams. Promoted ideas will be proposed as KIPs. 

Public API Improvements

The public API of Kafka Streams currently has a number of shortcomings. This is a summary of known issues; we want to collect user feedback to improve the API.

Issue | User Impact / Importance | Possible Solution | Solution User Impact
TopologyBuilder and KStreamBuilder
  • leak internal methods
  • no clean separation of abstractions

Might be hard for users to understand the concepts.

Users might be confused by a verbose API (and leaking methods) that they should never see.

Importance: high

KIP-120

medium

  • need to use different imports
  • change pattern to create topology with KStreamBuilder
Too many overloads for methods of KStreamBuilder, KStream, KGroupedStream, KTable, and KGroupedTable

Many methods have more than 6 overloads, and it is hard for users to understand which one to use. Furthermore, with the verbose generics, compiler errors might be confusing and unhelpful if a parameter is specified incorrectly (i.e., I want to use overload X: does the compiler pick the correct overload? If yes, which parameter did I get wrong? If no, which parameter do I need to change so that the compiler picks the correct overload?)

As we add more features, this gets more severe.

Change to Builder Pattern

high

  • need to rewrite large parts of their code
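To make the builder idea more concrete, here is a minimal sketch of how a single options object could replace a family of overloads. It is not the Kafka Streams API: `JoinOptions`, `OverloadSketch`, and every method on them are hypothetical names invented for illustration, and plain strings stand in for serdes.

```java
import java.util.Optional;

/** Hypothetical options object (not part of Kafka Streams). Strings stand in for serdes. */
final class JoinOptions {
    private final String keySerde;
    private final String valueSerde;
    private final String storeName;

    private JoinOptions(String keySerde, String valueSerde, String storeName) {
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
        this.storeName = storeName;
    }

    /** Starts from defaults: every option is unset. */
    static JoinOptions defaults() {
        return new JoinOptions(null, null, null);
    }

    // Each "with" method returns a new immutable copy with one option changed.
    JoinOptions withKeySerde(String serde) {
        return new JoinOptions(serde, valueSerde, storeName);
    }

    JoinOptions withValueSerde(String serde) {
        return new JoinOptions(keySerde, serde, storeName);
    }

    JoinOptions withStoreName(String name) {
        return new JoinOptions(keySerde, valueSerde, name);
    }

    String describe() {
        return "keySerde=" + Optional.ofNullable(keySerde).orElse("<default>")
            + ", valueSerde=" + Optional.ofNullable(valueSerde).orElse("<default>")
            + ", storeName=" + Optional.ofNullable(storeName).orElse("<default>");
    }
}

final class OverloadSketch {
    // One method instead of join(other), join(other, keySerde),
    // join(other, keySerde, valueSerde), join(other, ..., storeName), and so on.
    static String join(String otherStream, JoinOptions options) {
        return "join(" + otherStream + ") with " + options.describe();
    }

    public static void main(String[] args) {
        System.out.println(join("orders", JoinOptions.defaults().withStoreName("orders-store")));
    }
}
```

With this shape, a caller names only the options that differ from the defaults, and a wrong argument produces a compiler error on one small method rather than on an ambiguous overload set.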

 

Inconsistent overloads

Some APIs have inconsistent overloaded methods that might confuse users (why do I need to specify this for overload A, but not for overload B? Why does overload X allow me to do this, but not overload Y?)

Relates to "Too many overloads"; could be resolved with a clean builder abstraction.

medium

  • users might need to rewrite parts of their code if we deprecate some confusing overloads
  • user code might get cleaner

 

DSL limits access to records and/or record metadata

Some interfaces, like ValueJoiner, provide only the values of the two records to be joined, but users might want to read the key, too. If we pass in the key, however, we lose the guarantee that the key is not modified. (There are more, similar examples where the key is not accessible.)

Record metadata (like offset, timestamp, partition, topic) is not accessible in DSL interfaces.

Change interfaces, RichFunctions,

Use process/transform 

low

  • this is more about improving the API and/or adding new features
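The trade-off described in this row can be sketched as follows. This is an illustration under stated assumptions, not the Kafka Streams API: `ValueOnlyJoiner`, `KeyAwareJoiner`, and `JoinerSketch` are hypothetical names. The sketch shows one way to expose the key for reading while the framework keeps control of the output key, because the joiner returns only a value. Note that Java cannot prevent mutation of a mutable key object, so the "read-only" guarantee still rests on convention or on immutable key types.

```java
import java.util.AbstractMap;
import java.util.Map;

/** Mirrors the shape of a value-only joiner: it never sees the key. */
interface ValueOnlyJoiner<V1, V2, R> {
    R apply(V1 left, V2 right);
}

/** Hypothetical key-aware variant: the key is an input, but only a value is returned. */
interface KeyAwareJoiner<K, V1, V2, R> {
    R apply(K readOnlyKey, V1 left, V2 right);
}

final class JoinerSketch {
    /** Adapts an existing value-only joiner, so old user code keeps working. */
    static <K, V1, V2, R> KeyAwareJoiner<K, V1, V2, R> ignoringKey(ValueOnlyJoiner<V1, V2, R> joiner) {
        return (key, left, right) -> joiner.apply(left, right);
    }

    /** Stand-in for the framework: the result is always emitted under the unchanged input key. */
    static <K, V1, V2, R> Map.Entry<K, R> joinOne(K key, V1 left, V2 right,
                                                  KeyAwareJoiner<K, V1, V2, R> joiner) {
        return new AbstractMap.SimpleImmutableEntry<>(key, joiner.apply(key, left, right));
    }

    public static void main(String[] args) {
        Map.Entry<String, String> joined =
            joinOne("user-1", "click", "profile", (k, l, r) -> k + ":" + l + "+" + r);
        System.out.println(joined.getKey() + " -> " + joined.getValue());
    }
}
```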

 

Missing public API

Some very helpful classes that currently live in internal packages could be added to the public API, for example, windows and some serde classes.

Move classes to different package.

low

  • we only add new stuff

 

Window(s) API
  • get rid of minimum retention time (that is a performance improvement that confuses many users).
  • remove some leaking internal APIs
 low
Improve StreamsConfig API

The API is verbose and, with consumer and producer configs intermixed, hard to use correctly.

Builder pattern

medium

  • users need to rewrite the config code
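As a rough illustration of what a builder for the configuration could look like, the sketch below keeps consumer and producer settings apart by construction. `StreamsConfigSketch` and its methods are hypothetical and not part of Kafka Streams; the sketch assumes that client-specific settings are distinguished by the `consumer.` and `producer.` key prefixes and uses `java.util.Properties` as the output format.

```java
import java.util.Properties;

/** Hypothetical config builder (not part of Kafka Streams). */
final class StreamsConfigSketch {
    private final Properties props = new Properties();

    private StreamsConfigSketch() {
    }

    /** The two required settings are constructor-style arguments, so they cannot be forgotten. */
    static StreamsConfigSketch forApplication(String applicationId, String bootstrapServers) {
        StreamsConfigSketch builder = new StreamsConfigSketch();
        builder.props.setProperty("application.id", applicationId);
        builder.props.setProperty("bootstrap.servers", bootstrapServers);
        return builder;
    }

    /** Adds a setting that applies only to the embedded consumer (assumed "consumer." prefix). */
    StreamsConfigSketch consumer(String key, String value) {
        props.setProperty("consumer." + key, value);
        return this;
    }

    /** Adds a setting that applies only to the embedded producer (assumed "producer." prefix). */
    StreamsConfigSketch producer(String key, String value) {
        props.setProperty("producer." + key, value);
        return this;
    }

    /** Returns a defensive copy so later builder calls do not change an already built config. */
    Properties build() {
        Properties copy = new Properties();
        copy.putAll(props);
        return copy;
    }

    public static void main(String[] args) {
        Properties config = forApplication("word-count", "localhost:9092")
            .consumer("max.poll.records", "100")
            .producer("linger.ms", "5")
            .build();
        config.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```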
ProcessorContext too verbose

ProcessorContext gives access to methods that cannot be called. This is hard for users to reason about.

Split ProcessorContext and extract RecordContext

low

  • most users are expected to use mainly the DSL
Low-level API integration into DSL

Currently, the low-level API is integrated into the DSL via process(), transform(), and transformValues(). Those abstractions are not perfectly defined and are confusing to users.

Complete redesign

medium

  • most users are expected to use mainly the DSL
Low-level API in DSL vs. "advanced DSL"

Currently, the low-level API is used to empower the user to do anything within the DSL. This approach is questionable to some extent. For example, if a user wants to do a stateful 1:1 transformation of records, she must implement the Transformer interface, and thus has a lot of boilerplate code to access the actual state via the context and needs to implement unrelated methods like punctuate(). A DSL method like statefulMap with an interface method #map(K key, V value, S state) might be easier to use. The question is whether the DSL can provide more DSL-like methods to allow more advanced computations without forcing the user to go too low-level.

Major redesign

medium

  • it's about adding new methods, so existing code should not be affected
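The statefulMap idea from this row can be sketched as below. This is only an illustration of the proposed shape: `StatefulMapper`, `StatefulMapSketch`, and `statefulMap` are hypothetical, a plain in-memory `Map` stands in for a state store, and a list of entries stands in for a stream. The point is that the user writes a single method that receives the state directly, with no context lookup and no punctuate().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical interface following the proposed #map(K key, V value, S state) shape. */
interface StatefulMapper<K, V, S, R> {
    R map(K key, V value, S state);
}

final class StatefulMapSketch {
    /** Stand-in for the DSL operator: a 1:1 transformation that hands the state to the mapper. */
    static <K, V, S, R> List<R> statefulMap(List<Map.Entry<K, V>> records, S state,
                                            StatefulMapper<K, V, S, R> mapper) {
        List<R> out = new ArrayList<>();
        for (Map.Entry<K, V> record : records) {
            out.add(mapper.map(record.getKey(), record.getValue(), state));
        }
        return out;
    }

    /** Example user code: tag every record with a running count per key. */
    static List<String> runningCounts(List<Map.Entry<String, String>> records) {
        Map<String, Long> counts = new HashMap<>(); // stand-in for a state store
        return statefulMap(records, counts, (key, value, state) -> {
            long seen = state.merge(key, 1L, Long::sum);
            return key + "=" + value + "#" + seen;
        });
    }
}
```

For the input records (a, x), (b, y), (a, z), `runningCounts` returns `a=x#1`, `b=y#1`, `a=z#2`.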

 

 

Many of the above issues are related to each other and/or overlap. This is also reflected in a number of JIRAs that all relate to API changes:

 

Thus, to tackle this, it seems to be a good idea to break the issues down into groups and do one KIP per group to arrive at an overall sound design.

 
