
Each scenario below is described by: use cases, expected behavior / requirements, available tooling, step-by-step guideline / best practice, limitations / known issues, and external resources.
Scenario: Data reprocessing from scratch

Use cases:
  • development and testing
  • rollback after bug fixes in production
  • A/B testing
  • demoing for customers or other stakeholders
  • replay for new business logic (Kappa architecture)

Expected behavior:
  • After running and stopping an application, you want to reset the application back to "zero".
  • Thus, on restart, the application reprocesses the data the same way as in its original run (assuming the original input data still exists in Kafka in its entirety).

 

Requirements:
  • The application must start consuming its input topics from scratch (no committed offsets).
  • The application's internal state must be empty.
  • Auto-created topics must be empty (or deleted).

Step-by-step guideline / best practice:
  1. Stop all running application instances.
  2. If required:
    a. delete and re-create the output topics manually, or
    b. use different/new output topics.
  3. Run the application reset tool.
  4. Before restart, make sure to call KafkaStreams#cleanUp() on each application instance (see the sketch below).
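
A minimal sketch of step 4, assuming the current Java StreamsBuilder API; the application.id, broker address, and pass-through topology are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class ResetRestartExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");            // hypothetical application.id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic"); // hypothetical pass-through topology

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.cleanUp(); // wipe this instance's local state directory; only valid before start() or after close()
        streams.start();
    }
}
```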

Limitations / known issues:
  • All data from the input topics must still be available (i.e., no input data has been lost due to log retention or compaction).
  • No support for handling output topics:
    • by default, a new application run appends data to the originally used output topics
    • manual fixes:
      • delete and re-create the output topics manually, or
      • change the application to use different/new output topics

Scenario: Data reprocessing with specific starting point (reprocessing from scratch; i.e., empty state)

Use cases:
  • partial rollback after bug fixes in production
  • A/B testing

Expected behavior:
Similar to "Data reprocessing from scratch". However, instead of restarting the application at offset zero, the user wants to specify a particular starting point.

 

Requirements:
  • Same as "Data reprocessing from scratch".
  • Allow the user to specify a (valid/consistent) starting point (offsets? timestamp?).

Available tooling:
  • Missing: API/tooling to set the starting point.

Similar to "Data reprocessing from scratch".

Manual workaround:

Use a consumer client to seek() to desired starting offsets and commit() than. This step must be done after the reset tool was used and before the application gets restarted.
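
A sketch of this workaround, assuming the starting point is given as a plain offset; the broker address, topic, partition, and offset are placeholders, and group.id must equal the application.id of the Streams application:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class SetStartingPoint {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", "my-app");                  // must match the Streams application.id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("input-topic", 0); // hypothetical input partition
            long startOffset = 12345L;                                // desired (valid/consistent) starting point
            consumer.assign(Collections.singletonList(tp));
            consumer.seek(tp, startOffset);
            // committing the offset pins the starting point for the application's consumer group;
            // for a timestamp-based starting point, offsetsForTimes() could be used to look up offsets first
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(startOffset)));
        }
    }
}
```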

  • see "Data Reprocessing from Scratch"

 

Scenario: Data reprocessing using old application state

Use cases:
  • A/B testing with stateful start
  • rollback after bug fix in production (the application was redeployed including a bug at time X; go back to X and reprocess the data with the fixed application)

Requirements:
  • The new application needs the (historical) state of the old application at point X.

Scenario: Processing cold data

Use cases:
  • development
  • A/B testing
  • processing cold/old/offline topics (i.e., topics that do not have active producers)

Expected behavior:
  • The application stops automatically after it has processed all available data.

Requirements:
  • The application should have an auto-stop feature (KIP-95).
 

Workaround:
A manual stop is required at the moment:
  1. Monitor the consumer lag via bin/kafka-consumer-groups.sh.
  2. When the consumer lag is zero, stop the application manually (a sketch for scripting this check follows below).
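
A sketch of the same lag check in Java, in case the manual check should be scripted; it assumes a consumer configured with group.id equal to the application.id and the partitions of all input topics:

```java
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class LagCheck {
    // Returns true when the group's committed offsets have reached the log-end offsets,
    // i.e., the application has processed all available data and can be stopped.
    static boolean lagIsZero(KafkaConsumer<byte[], byte[]> consumer, List<TopicPartition> inputPartitions) {
        Map<TopicPartition, Long> endOffsets = consumer.endOffsets(inputPartitions);
        for (TopicPartition tp : inputPartitions) {
            OffsetAndMetadata committed = consumer.committed(tp); // null if nothing committed yet
            if (committed == null || committed.offset() < endOffsets.get(tp)) {
                return false;
            }
        }
        return true;
    }
}
```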

 

  
Scenario: Incremental processing (time driven)

Use cases:
  • "batch like" processing
  • start the application at regular intervals (like a cron job); the application automatically stops processing after processing data for a specific (wall-clock) time

Available tooling:
Not required.

Step-by-step guideline / best practice:
  • Put a sleep() after application startup and close the application after the sleep time has passed. To make this robust against failures and restarts, sleep() should not be given a hard-coded duration but rather the difference endTime - startupTime (see the sketch after this list).

or

  • Run the application "forever", as in the regular stream processing case, and terminate it from the outside when the "stop time" is reached.
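
A sketch of the first option, assuming the scheduler passes the interval's end time as an argument; the application.id, broker address, and topology are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class TimeDrivenRun {
    public static void main(String[] args) throws InterruptedException {
        // endTime is passed in by the scheduler (epoch millis), so a crash and restart within
        // the same interval sleeps only the *remaining* time instead of a full fixed duration
        long endTime = Long.parseLong(args[0]);

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");            // hypothetical application.id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic"); // hypothetical pass-through topology

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        long remaining = endTime - System.currentTimeMillis(); // endTime - startupTime, not a constant
        if (remaining > 0) {
            Thread.sleep(remaining);
        }
        streams.close();
    }
}
```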

Limitations / known issues:
  • not very precise with regard to event-time processing (i.e., the stopping point is not related to the application's progress)
 
Scenario: Incremental processing (data driven)

Use cases:
  • "batch like" processing
  • start the application at regular intervals (like a cron job)

Expected behavior:
  • the application stops automatically at some point
  • on restart, the application resumes from the previous run
  • while the application is running, new data might be appended to the input topics

Requirements:
  • The application must have an auto-stop feature (KIP-95).
 

Workaround:
  • follow the approach for "Incremental processing (time driven)"
  • the processing elapsed time must be shorter than the start interval (i.e., if processing starts each hour, it must take less than an hour)

Scenario: Offline application upgrade

Use cases:
  • application bug fixes / improvements in production

Expected behavior:
  • the application should be replaced with a newer version
  • the new version resumes where the old version left off
  • no reprocessing of old data

Available tooling:
Not required.

Step-by-step guideline / best practice:
  • Stop all running application instances.
  • Start the new version of the application (same application.id).

The new and old applications must be "compatible".

Compatible changes:
  • changing a filter condition
  • inserting a new filter/map (record-by-record operation); see the sketch below

Incompatible changes:
  • changing the structure of the topology DAG
  • changing the data types of stateful operations (like aggregations / joins)
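
For illustration, a sketch of a compatible change of the second kind; the topic names are hypothetical, and the old version is kept as a comment:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class CompatibleChange {
    static void buildTopology(StreamsBuilder builder) {
        KStream<String, String> input = builder.stream("input-topic"); // hypothetical topic

        // old version:
        //   input.to("output-topic");

        // new version: inserting a stateless record-by-record filter keeps the stateful part
        // of the topology unchanged, so the upgrade can resume with the same application.id
        input.filter((key, value) -> value != null)
             .to("output-topic");
    }
}
```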

Limitations / known issues:
  • works only if application downtime is acceptable
  • the new application must have a similar structure to the old one
  • only newly produced output is "fixed"
 

Scenario: Online application upgrade

Use cases:
  • application bug fixes / improvements in production
  • downstream applications consume the data live and are not interested in "correcting" previous results (the computation has already happened, and there is no interest in "correcting" old output)

Expected behavior:
  • the application should be replaced with a newer version
  • the new application is deployed in parallel
  • when the new application is "ready to take over", the old application is stopped
  • the new application might start from an older offset and reprocess some data (with or without initial state)

Scenario: Reprocessing of "historical" data

Use cases:
  • reprocess all data from yesterday / last week / April
  • "batch like" processing

Expected behavior:
  • old data should be reprocessed (with a new version of the application or a completely different application)
  • the result must be exact with regard to event-time (i.e., it must not include any older data and must also take late arrivals into account)
  • the new result might replace old results (i.e., update a downstream database)
