Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

For use case (2), the current KafkaSource cannot support this since automatic partition removal is not supported. First, users cannot delete topics from Kafka directly since that would break the Flink jobs referring to the deleted topics. Second, users would need to change Kafka Source uid and redeploy with allowNonRestoreState to remove a Kafka topic from Flink subscription. In addition, the mailing list feedback describes difficulties to remove a topic since there is a risk of accidental data loss due to human operations. These issues are highlighted for users who maintain multiple Flink jobs, in especially latency sensitive applications.

...

Other required functionality leverages and composes the existing KafkaSource implementation for discovering Kafka topic partition offsets and round robin split assignment per Kafka cluster. The diagrams below depicts how the source will reuse the code of the KafkaSource in order to achieve the requirements.

Image Added

To the source more user friendly, a MultiClusterKafkaSourceBuilder will be provided (e.g. batch mode should not turn on KafkaMetadataService discovery, should only be done at startup).

...