Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Status

Current state:   "Under DiscussionAccepted"

Discussion threadhttp://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discuss-Proposing-FLIP-25-Support-User-State-TTL-Natively-in-Flink-td20912.html#a22097

JIRA

Jira
serverASF JIRA
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-3089

Released: -1.6

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

  • At time 0, a keyed user state A is created with A.lastModifiedTs=0, a timer is registered for A at time 16
  • At time 2, A’s value is updated, and thus update A.lastModifiedTs=2
  • At time 15, A’s value is read, and thus updateA.lastModifiedTs=15
  • At time 16, the registered timer is invoked, it sees A.lastModifiedTs=15, so it does nothing but registers a new timer at (15+16) = 31

Pros and Cons:

Pros:

...


  1. TTL implementation is independent of state backends, it only relies on the abstraction of TimerService
  2. Because of (1), migration and compatibility can reuse existing solution
  3. This design will support both event time and processing time natively
  4. It has at most one timer for each keyed user state with TTL all the time

...

  1. Where should the code live in? Should it live with operators, state backends, or both? We may also need something in operators or state backends (e.g. operators/StreamingRuntimeContext/KeyedStateStore) to have access to TimerService

    We may need to experiment and prototype before finalizing this decision.

  2. How to distinguish TTL timers from user timers so that user timers are always invoked before TTL timers?

    Upon discussion, we believe that the user timer should always be invoked first. We need a good strategy to ensure that, when a user timer and a TTL timer are configured for the same time, user timer is always triggered first. Currently, timer service does not support timer tags. We have two kinds of timers now and we need a better way of maintaining both of them. 

    Option1: We can add a tag for each timer to InternalTimer

    Option2: let InternalTimerService maintains user timers and TTL timers separately. Implementation classes of InternalTimerService should add two new sets of timers,  e.g. processingTtlTimeTimersQueue and eventTtlTimeTimersQueue for HeapInternalTimerService. Upon onProcessingTime() and advanceWatermark(), they will first iterate through ProcessingTimeTimer and EventTimeTimer before iterating the ProcessingTtlTimeTimers and EventTtlTimeTimer.by adding to it the following new internal APIs:

    We'll also add the following new internal APIs to register TTL timers:
    Code Block
    languagejava
    @Internal
    public void registerTtlProcessingTimeTimer(N namespace, long time);
    
    @Internal
    public void registerTtlEventTimeTimer(N namespace, long time);


    The biggest advantage, compared to option 1, is that it doesn't impact existing timer-related checkpoint/savepoint, restore and migrations.
     


Test Plan

Normal unit, integration, and end-to-end tests

...

  • TtlDB only supports TTL on record creation, not able to extend/refresh TTL upon read and/or write

  • TtlDB only supports one TTL for all records creation in all column families creation in the a single db opening. Flink currently stores one type of keyed user state in one column family, besides it’s inapproperiate to open and close TtlDB too frequently.

  • TtlDB only supports processing time TTL, not event time.

  • The TTL logic heavily rely on state backend implementation. If we add more state backends and rely on their internal feature for TTL, it’s hard to unify their behaviors and will introduce unnecessary system complexity
  • Using external feature to build Flink’s TTL makes migration and compatibility hard

 

...