Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The source will use the FLIP-27: Refactor Source Interface to to integrate it with Flink and support both bounded and unbounded jobs.

This proposal does not include any changes to existing public interfaces. A new MultiClusterKafkaSource builder will serve as the public API and all other APIs will be marked as Internal in this proposal

The new source will go into the Kafka connector module and follow any connector repository changes of Kafka Source.

An example of building the new Source in unbounded mode

Code Block
languagejava
titleHybridSource
MultiClusterKafkaSource.<String>builder()
  // some default implementations will be provided (file based, statically defined streams)
  .setKafkaMetadataService(new KafkaMetadataServiceImpl())
  .setStreamIds(List.of("my-stream-1", "my-stream-2"))
  .setGroupId("myConsumerGroup")
  .setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
  .setStartingOffsets(OffsetsInitializer.earliest())
  .setProperties(properties)
  .build();


Basic Idea

There needs to exist a coordination mechanism to manage how the underlying KafkaSourceEnumerators and KafkaSources interact with multiple clusters and multiple topics. This design leverages the source event protocol to sync metadata between source components.

...