Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

To satisfy these requirements, we propose a new purgatory implementation based on Hierarchical Timing Wheels and a new watcher list implementation based on WeakReference.

Hierarchical Timing Wheel

...

A simple implementation may use a thread that wakes up every unit time and do the ticking, which checks if there is any task in the bucket. This can be wasteful if requests are sparse. We want the thread to wake up only when when there is a non-empty bucket to expire. We will do so by using java.util.concurrent.DelayQueue similarly to the current implementation, but we will enqueue task buckets instead of tasks. This design has a performance advantage. The number of items in DelayQueue is capped by the number of buckets, which is usually much smaller than the number of tasks, thus the number of offer/poll operations to the priority queue inside DelayQueue will be significantly smaller.

WeakReference based Watcher List

Removing completed requests from watcher lists is a major pain. The problem is that completed requests are not garbage-collected if we don't remove them from the list. Thus, we monitor the number of outstanding watchers and trigger purge operation by it. Unfortunately tuning of it is not trivial for a large system.

We propose to use a list implementation that uses WeakReference, where a list element is pointed by a weak reference from the list entry. Since it is weak references, a request pointed by one can be reclaimed by GC if there are no strong references in the system. while a request is waiting for completion, there is a strong reference to it from the timer. There shouldn't be any other strong reference. Once completed the strong reference will be cleared, and the request will be reclaimed eventually. The cleared reference will be seen as a null element.

Although completed requests are garbage-collected automatically, there remain list entries in watcher lists. How do we know when to purge the lists? One idea is to use ReferenceQueue. JVM puts a reference object into reference queue  when the reference is cleared. By pulling from the reference queue, we know how many cleared weak references are potentially in the lists. We can do the pulling and counting in ExpiredOperationReaper and tryCompleteWatched.

ExpiredOperationReaper pulls weak references from the reference queue and increment the counter until the queue is emptied. If the final count is greater than some threshold, ExpiredOperationReaper reset the counter to zero and traverses the list to remove all null entries. tryCompleteWatched, on the other hand, drains the reference queue at the beginning and set the counter to zero. Then, just like the current implementation, tryCompleteWatched traverses through the list and removes null entries and entries of completed requests. After a traversal, the list is clean except for potential newly cleared entries. Such new cleared entries will be detected through the reference queue.

Parameters

  • the tick size (the minimum time unit)
  • the wheel size (the number of buckets per wheel)

...

In the new design, we use Hierarchical Timing Wheels for the timeout timer, WeakReference based list for watcher lists, and timer and DelayQueue of timer buckets to advance the clock on demand. The advantage is that a completed request is removed from the timer queue immediately and becomes GCable. When a request is completed we will immediately remove it from a timer bucket with O(1) cost. A strong reference to a request is cleared at this point. All remaining references are weak references from watcher lists, so completed requests become GCable right away. The buckets remain in the delay queue. However, the number of buckets is bounded. And, in a healthy system, most of the requests are satisfied before timeout. Many of the buckets become empty before pulled out of the delay queue. Also, the timer should rarely have the buckets of the lower interval. We expect a improvement in GC behavior and lower CPU usage.

...