Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
languagejava
titleRepartitioned.java
public class Repartitioned<K, V> implements NamedOperation<Repartitioned<K, V>> {

	protected final String name;

	protected final Serde<K> keySerde;

	protected final Serde<V> valueSerde;

	protected final Integer numOfPartitions;

	protected final StreamPartitioner<? super K, ? super V> partitioner;

	private Repartitioned(String name,
	                      Serde<K> keySerde,
	                      Serde<V> valueSerde,
	                      Integer numOfPartitions,
	                      StreamPartitioner<? super K, ? super V> partitioner) {
		this.name = name;
		this.keySerde = keySerde;
		this.valueSerde = valueSerde;
		this.numOfPartitions = numOfPartitions;
		this.partitioner = partitioner;
	}

	public static Repartitioned as(final String name) {
		return new Repartitioned<>(name, null, null, null, null);
	}

	@Override
	public Repartitioned<K, V> withName(final String name) {
		return new Repartitioned<>(name, keySerde, valueSerde, numOfPartitions, partitioner);
	}

	public Repartitioned<K, V> withNumOfPartitions(final int numOfPartitions) {
		return new Repartitioned<>(name, keySerde, valueSerde, numOfPartitions, partitioner);
	}

	public Repartitioned<K, V> withKeySerde(final Serde<K> keySerde) {
		return new Repartitioned<>(name, keySerde, valueSerde, numOfPartitions, partitioner);
	}

	public Repartitioned<K, V> withValueSerde(final Serde<V> valueSerde) {
		return new Repartitioned<>(name, keySerde, valueSerde, numOfPartitions, partitioner);
	}

	public Repartitioned<K, V> withStreamPartitioner(final StreamPartitioner<? super K, ? super V> partitioner) {
		return new Repartitioned<>(name, keySerde, valueSerde, numOfPartitions, partitioner);
	}
}


New KStream#groupBy shall be introduced in order to give the user control over parallelism for sub-topologies.

...

Code Block
languagejava
titleKStream.java
public interface KStream<K, V> {

	// .. //
	
	/**
	 * Kafka Streams creates and manages repartition topic. Partitioner and num of partitions configuration will be inherited from source topic.
	 */
	KStream<K, V> repartition();
	
	/**
	 * Kafka Streams creates and manages repartition topic. Partitioner and num of partitions configuration will be inherited from source topic.
	 */
	<KR> KStream<KR, V> repartition(final KeyValueMapper<? super K, ? super V, KR> selector);

	/**
	 * Kafka Streams creates and manages repartition topic. Partitioner and num of partitions configuration is applied according to {@param repartitioned} specification.
	 */
	KStream<K, V> repartition(final Repartitioned<K, V> repartitioned);

	/**
	 * Kafka Streams creates and manages repartition topic. Partitioner and num of partitions configuration is applied according to {@param repartitioned} specification.
	 */
	<KR> KStream<KR, V> repartition(final KeyValueMapper<? super K, ? super V, KR> selector,
	                                final Repartitioned<KR, V> repartitioned);
	// .. //
}

...

1) If we enhance Produced class with this configuration, this will also affect KStream#to. Since KStream#to is the final sink of the topology, it seems to be reasonable assumption that user needs to manually create sink topic in advance. And in that case, having num of partitions configuration doesn’t make much sense.

2) Looking at Produced class, based on API contract, seems like Produced is designed to be something that is explicitly for producer (key serializer, value serializer, partitioner those all are producer specific configurations) and num of partitions is topic level configuration. And mixing topic and producer level configurations together in one class doesn't look semantically correct.3) Looking at KStream interface, seems like Produced class is for operations that work with non-internal (e.g topics that are created and managed internally by Kafka Streams) topics and this contract will break if we gonna enhance it with number of partitions configuration. Current API contract is very clear in that respect - every operation (KStream#to, KStream#through) that use Produced class work with topics that are explicitly managed by the end user.

...