...

Superseded by: N/A

Related: N/A

Problem
ASF JIRA: GEODE-7121

Geode requires AEQs to be configured before regions are created. If an AEQ listener operates on a secondary region, the listener may run against a region that is not yet created or not fully initialized (for a region with co-located regions). This can result in missing events or in a deadlock between the threads creating a region and its co-located regions and the listener accessing those regions. The scenario is most likely during persistence recovery: when AEQs are created at start-up, the recovered AEQ events are dispatched immediately, which invokes the AEQ listeners.

Anti-Goals

None

Solution

The proposed solution is to provide a way to control the dispatch of AEQ events to the AEQ listeners. This can be done by adding "pause" and "resume" capability to the AEQ, which allows the application to decide when events are dispatched to the listeners.

The proposal is similar to the existing "pause" and "resume" behavior of the GatewaySender, on which the AEQ is based (the AEQ implementation is a wrapper around GatewaySender).
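The intended semantics can be illustrated with a minimal, self-contained model. This is a sketch, not Geode code: `PausableQueue`, `enqueue`, `dispatchPending`, and the `Consumer<String>` listener are hypothetical names, and dispatch is driven by an explicit call instead of a dispatcher thread. It shows only that events keep accumulating while dispatch is paused and reach the listener in order once dispatch is resumed.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical model of a queue whose dispatch to a listener can be paused.
// Not the Geode implementation; all names are illustrative assumptions.
class PausableQueue {
    private final Queue<String> pending = new ArrayDeque<>();
    private final Consumer<String> listener;
    private boolean paused;

    PausableQueue(Consumer<String> listener, boolean startPaused) {
        this.listener = listener;
        this.paused = startPaused;
    }

    // Events are always accepted, whether or not dispatch is paused.
    void enqueue(String event) {
        pending.add(event);
    }

    void pause() { paused = true; }

    void resume() { paused = false; }

    boolean isPaused() { return paused; }

    int size() { return pending.size(); }

    // Stand-in for the dispatcher thread: delivers queued events in order,
    // or does nothing while paused. Returns the number of events delivered.
    int dispatchPending() {
        if (paused) {
            return 0;
        }
        int delivered = 0;
        while (!pending.isEmpty()) {
            listener.accept(pending.remove());
            delivered++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        PausableQueue queue =
            new PausableQueue(e -> System.out.println("listener got " + e), true);
        queue.enqueue("event-1");
        queue.enqueue("event-2");
        queue.dispatchPending(); // paused: nothing reaches the listener
        queue.resume();
        queue.dispatchPending(); // both events are delivered in order
    }
}
```

In this model the application chooses the moment of `resume()`, for example after all regions the listener depends on are initialized.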

Changes and Additions to Public Interfaces

The proposed APIs are:

On "AsyncEventQueueFactory" interface -

AsyncEventQueueFactory pauseEventDispatchToListener();  // Causes the AEQ to be created in a paused state.
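As a rough illustration of how a factory-level setting like this could be used at start-up, the sketch below replaces the Geode types with small stand-ins. `SketchQueueFactory`, `SketchQueue`, `put`, `delivered`, and in particular `resume()` are assumptions made for the example; only the method name `pauseEventDispatchToListener()` comes from the proposal, and the example assumes it returns the factory so that calls can be chained.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the queue; not a Geode class.
class SketchQueue {
    private final List<String> pending = new ArrayList<>();
    private final List<String> delivered = new ArrayList<>();
    private boolean paused;

    SketchQueue(boolean paused) {
        this.paused = paused;
    }

    // Queues an event and delivers it right away unless dispatch is paused.
    void put(String event) {
        pending.add(event);
        flush();
    }

    // Hypothetical resume call, named for this example only.
    void resume() {
        paused = false;
        flush();
    }

    boolean isPaused() { return paused; }

    List<String> delivered() { return delivered; }

    private void flush() {
        if (paused) {
            return;
        }
        delivered.addAll(pending);
        pending.clear();
    }
}

// Stand-in for the factory, mirroring the builder style of the proposed method.
class SketchQueueFactory {
    private boolean startPaused;

    // Assumed to return the factory itself so it can be chained before create().
    SketchQueueFactory pauseEventDispatchToListener() {
        startPaused = true;
        return this;
    }

    SketchQueue create() {
        return new SketchQueue(startPaused);
    }
}

class StartupSketch {
    public static void main(String[] args) {
        // 1. Create the queue paused, before any region exists.
        SketchQueue queue = new SketchQueueFactory().pauseEventDispatchToListener().create();
        // 2. Recovered events arrive; they are queued but not delivered.
        queue.put("recovered-1");
        // 3. Regions (and co-located regions) would be created here.
        // 4. Only now let events reach the listener.
        queue.resume();
        System.out.println("delivered after resume: " + queue.delivered());
    }
}
```

The point of the ordering in `main` is that the listener never sees a recovered event before the application has finished creating the regions it depends on.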

On "AsyncEventQueue" interface -

...

"It should be kept in mind that the events will still be getting queued into the queue. The scope of this operation is the VM on which it is invoked. In case the AEQ is parallel, the AEQ will be paused on individual node where this API is called and the AEQ on other VM's can still dispatch events. In case the AEQ is not parallel, and the running AEQ on which this API is invoked is not primary then primary AEQ will still continue dispatching events."

Performance Impact

This will have performance and resource implications similar to those of the "GatewaySender.pause()" functionality. If the AEQ is never resumed, or is kept in the "pause" state for a long time, the queue may use up its configured memory, overflow to disk, and eventually cause a disk-full condition.

Backwards Compatibility and Upgrade Path

Impact with rolling upgrade:

...

None. If needed, the existing application can be modified to control event dispatch with AEQ listener.

Prior Art

Without this, AEQ listeners operating on other regions could experience missing events or a deadlock if there are co-located regions.

This approach is simple and takes advantage of functionality that is already supported in GatewaySender, on which the AEQ is based.

FAQ

Answers to questions you’ve commonly been asked after requesting comments for this proposal.

Errata

What are minor adjustments that had to be made to the proposal since it was approved?