Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. As mentioned briefly in the 'Motivation' section, an alternative to enforcing the static partitioning condition would be to implement some kind of manual data migration via historical reprocessing. While this would enable the feature to be used for all applications, it would have extreme downsides and make the feature itself far less useful by requiring long downtimes, as well as being incredibly resource heavy. Thus we feel it's more practical to introduce the best version of the feature by imposing a constraint: static partitioning.
  2. A second important possibility to consider here is whether to open up the feature to work with applications that are stateless, rather than requiring strict static partitioning. Technically, stateless applications would continue to produce correct results in the face of repartitioning like this, since they by definition do not care about the key's history. However, as mentioned during the KIP discussion, we worry about users getting into trouble should we allow this, if for example the application feeds into a stateful application downstream. Thus we are leaning towards enabling autoscaling for static partitioning cases only (though future versions could expand to include stateless apps as well).
  3. In the 'Proposed Changes' section, we described at a high level how Streams will react to and handle partition expansion events. One of the trickiest aspects of this feature will be getting the synchronization right: in other words, what does Streams do when it detects changes to the partition count of some topics but the expansion has not yet percolated throughout the topology?
    1. One option is to end the rebalance immediately and continue processing until the last (input) topic change is detected. However since we can only detect changes to the partition count of input topics, this wouldn't provide a sure-fire method of ensuring that the entire topology has caught up to the new number of partitions. Furthermore, due to cooperative rebalancing Kafka Streams actually does continue to process during a rebalance, so all StreamThreads except for the leader – who will be waiting for the partition expansion to complete inside the StreamPartitionAssignor's callback – will be able to remain processing existing tasks during this time. So ending the rebalance and triggering a new one will, we believe, be more of a hassle than any benefit it might provide.
    2. Another option is to wait for some constant or configured amount of time and then shut down the entire application, rather than retrying for a while and then ending the rebalance.
    3. We could introduce another new config for the wait/retry time within a rebalance, however we feel that the existing api timeout config is sufficient and given that it is used elsewhere in the rebalance/assignment logic and that exceeding this timeout will not result in a full shutdown but rather a new rebalance attempt, it is of less importance to ensure the partition expansion completes within this timeout
  4. API alternatives:
    1. One possibility was to expose this explicitly via an 'enable/disable' feature flag, rather than implicitly via the static.partitioner.class config. However because of the static partitioning constraint we felt it was best to tie the feature's operation directly to this constraint. In addition to helping Streams ensure the user is doing it, adding a config for the default partitioner makes it considerably easier for the user to implement static partitioning, as the alternative would be to pass in the partitioner all over the topology and hope not to miss anywhere it's needed (which can be difficult to figure out).
    2. Another option we considered was whether or not to enforce that only the default partitioner passed in via config be used, or to allow it to be overridden – though only by other static partitioners – for specific operators. Although there's no technical reason to require the same partitioner be used across the topology, we believe it to be unlikely for inconsistent partitioning to be very common and easier for users to reason about a single partitioner. Of course, the partitioner implementation may itself incorporate multiple different strategies for different topics and/or parts of the topology, thus it is still possible to achieve the same outcome if truly desired.