You are viewing an old version of this page. View the current version.

Compare with Current View Page History

Version 1 Next »

Current Design

The request purgatory consists of a timeout timer and a hash map of watch-lists for event driven processing. A request is put into a purgatory when it is not immediately satisfiable because of unmet conditions. A request in the purgatory is completed later when the conditions are met or is forced to be completed (timeout) when it passed beyond the time specified in the timeout parameter of the request. Currently (0.8.x) it uses Java DelayQueue to implement the timer and Java LinkedList for a watch list.

When a request is completed, the request is not deleted from the timer or watch-lists immediately. Instead, completed requests are deleted as they were found during condition checking. When the deletion does not keep up, the server may exhaust JVM heap and cause OutOfMemoryError. To alleviate the situation, the reaper thread purges completed requests from the purgatory when the number of requests in the purgatory (including both pending or completed requests) exceeds the configured number. The purge operation scans the timer queue and all watch-lists to find completed requests and deletes them.

By setting this configuration parameter low, the server can virtually avoid the memory problem. However, the server must pay a significant performance penalty if it scans all lists too frequently.

New Design

The goal of the new design is to allow immediate deletion of a completed request and eliminate the expensive purge process. It requires cross referencing of entries in the timer and watch-lists. Also it is strongly desired to have O(1) insert/delete cost since insert/delete operation happens for each request/completion.

To satisfy these requirements, we propose a purgatory implementation based on Hierarchical Timing Wheels and own doubly linked list.

Hierarchical Timing Wheel

A simple timing wheel is a circular list of buckets of timer tasks. Let u be the time unit. A timing wheel with size n has n buckets and can hold timer tasks in n * u time interval. Each bucket holds timer tasks that fall into the corresponding time range. At the beginning, the first bucket holds tasks for [0, u), the second bucket holds tasks for [u, 2u), …, the n-th bucket for [u * (n -1), u * n). Every interval of time unit u, the timer ticks and moved to the next bucket after expiring all timer tasks in the current bucket. The emptied bucket is then available for the next round, so if the current bucket is for the time t, it becomes the bucket for [t + u * n, t + (n + 1) * u) after expiration. A timing wheel has O(1) cost for insert/delete (start-timer/stop-timer) whereas priority queue based timers, such as java.util.concurrent.DelayQueue and java.util.Timer, have O(log n) insert/delete cost.

A major drawback of a simple timing wheel is that it assumes that a timer request is within the time interval of n * u from the current time. If a timer request is out of this interval, it is an overflow. A hierarchical timing wheel deals with such overflows. It is a hierarchically organized timing wheels. The lowest level has the finest time resolution. As moving up the hierarchy, time resolutions become coarser. If the resolution of a wheel at one level is u and the size is n, the resolution of the next level should be n * u.  At each level overflows are delegated to the wheel in one level higher. When the wheel in the higher level ticks, it reinsert timer tasks to the lower level. A overflow wheel can be created on-demand.

The insert (start-timer) cost is O(m) where m is the number of wheels, which is usually small, and the delete (stop-timer) cost is still O(1). 

Doubly Linked List in Timer and Watch-Lists

In this design, we propose to use our own implementation of doubly linked list for the buckets in a timing wheel. The advantage of doubly linked list that it allows O(1) insert/delete of a list item if we have access link cells in a list.
A timer task saves a link cell in itself when enqueued to a timer. When a task is completed or canceled, the list of updated using the link cell saved in the task itself.
We propose to use the same doubly linked list implementation for watch-lists and to make a timer task save multiple link cells. This allows us to save all dependencies in one place and clean up a timer queue and watch lists with ease.

Parameters

  • the tick size (the minimum time unit)
  • the wheel size (the number of buckets per wheel)
  • No labels