Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Metadata is first discovered on job start and new metadata is discovered via polling mechanism (see the source configuration in next section) for streaming jobs. The metadata update is designed with eventual consistency–repeated metadata polling will eventually get the correct metadata and reconcile the job accordingly. For batch jobs, this polling mechanism should be disabled.

Image Modified

NOTE: there is a need from the readers to send the GetMetadataUpdateEvent at startup because the reader state may reflect outdated metadata. Thus, the reader should not start without fresh metadata. With fresh metadata, the reader can filter splits from state--this filtering capability is ultimately how we solve the common issue of "I re-configured my Kafka source and removed some topic, but it refers to the old topic due to state".

...