Avoid the queuing of dropped events by the primary gateway sender when the gateway sender is stopped

To be Reviewed By: July 9th, 2020

Authors: Alberto Gomez (alberto.gomez@est.tech)

Status: Draft | Discussion | Active | Dropped | Superseded

Superseded by: N/A

Related: N/A

Problem

Gateway senders drop all events received when they are stopped. Nevertheless, primary gateway senders, while stopped, store all events received in the tmpDroppedEvents member variable of the AbstractGatewaySender class. These events are stored so that they can be sent later (when the primary gateway sender is started) to the secondary gateway senders in order for them to remove those events from their queues. If it were not so, secondary gateway senders could have events in their queues that would never be removed.

This feature was implemented under a Jira issue (the reference is not available in this version of the page) to prevent secondary gateway senders from leaving undrained events in their queues after GII.

This solution works well when gateway senders remain stopped only briefly, e.g., when they are stopped but in the process of starting. However, if a gateway sender is stopped (for example using gfsh) and left in that state for some time, the incoming events reaching the primary gateway senders keep accumulating in the tmpDroppedEvents member variable of AbstractGatewaySender and will eventually cause a heap exhaustion error. Moreover, events dropped while the gateway sender is stopped are not queued by the secondary gateway senders either, which makes storing them in the primary gateway sender unnecessary.

Stopping a gateway sender is an action that may be used to keep gateway sender queues from filling up in long-lasting split-brain situations. Given the current implementation, however, it is not effective: incoming events are still stored by the primary gateway senders, with higher memory consumption than when the sender is running (queued events may be overflowed to disk, whereas stored dropped events are kept on the heap) and with a very high risk of heap memory exhaustion.

Anti-Goals

As described above, dropped events in the primary gateway sender are stored in a member variable. It is out of the scope of this RFC to change how those events are stored.

Solution

The proposed solution is to not store dropped events when a gateway sender is stopped and not in the process of starting, given that these events can never end up in the queue of any secondary gateway sender and would only consume memory unnecessarily.

To do so, it is proposed to add a new boolean member variable to AbstractGatewaySender that tells whether the primary gateway sender must store dropped events or not.

This flag will be set to false (do not store dropped events) in all gateway sender instances (primary and secondaries) after a gfsh stop gateway sender command has completed successfully. The flag will be set back to true in all gateway sender instances (primary and secondaries) as a step prior to executing the gfsh start gateway sender command.
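The flag and the two-step commands described above could be sketched roughly as follows. This is only a simplified illustration and is not taken from the Geode code base or from the draft PR: the classes SketchGatewaySender and SketchGfshCommands, the flag name mustQueueDroppedEvents, and the use of a String to stand in for an event are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for AbstractGatewaySender (not Geode code). */
class SketchGatewaySender {
  // Mirrors the tmpDroppedEvents member variable described in the Problem section.
  private final List<String> tmpDroppedEvents = new ArrayList<>();
  private boolean running;
  // Proposed flag (name assumed): true = store events dropped while stopped.
  private boolean mustQueueDroppedEvents = true;

  synchronized void setMustQueueDroppedEvents(boolean value) {
    mustQueueDroppedEvents = value;
  }

  synchronized void start() {
    running = true;
    // A real primary would first send the stored events to the secondaries
    // so they can remove them from their queues; the sketch only frees them.
    tmpDroppedEvents.clear();
  }

  synchronized void stop() {
    running = false;
  }

  /** Called for every incoming event; eventId stands in for the event. */
  synchronized void distribute(String eventId) {
    if (running) {
      return; // normal queuing path, omitted in this sketch
    }
    if (mustQueueDroppedEvents) {
      tmpDroppedEvents.add(eventId); // stopped but expected to start soon
    }
    // Otherwise the event is dropped without being remembered.
  }

  synchronized int storedDroppedEvents() {
    return tmpDroppedEvents.size();
  }
}

/** Hypothetical stand-in for the two-step gfsh commands. */
class SketchGfshCommands {
  static void startGatewaySender(List<SketchGatewaySender> members) {
    // Step 1: re-enable storing in every instance before any of them starts.
    for (SketchGatewaySender member : members) {
      member.setMustQueueDroppedEvents(true);
    }
    // Step 2: start the senders.
    for (SketchGatewaySender member : members) {
      member.start();
    }
  }

  static void stopGatewaySender(List<SketchGatewaySender> members) {
    // Step 1: stop the senders.
    for (SketchGatewaySender member : members) {
      member.stop();
    }
    // Step 2: once stopped, disable storing in every instance.
    for (SketchGatewaySender member : members) {
      member.setMustQueueDroppedEvents(false);
    }
  }
}
```

In this sketch, events that arrive between the two steps of the stop command are still stored, so the amount retained is bounded by that short window rather than by how long the sender stays stopped.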

A draft PR of the solution can be found here: https://github.com/apache/geode/pull/5348

Changes and Additions to Public Interfaces

No changes to public interfaces are proposed.

Performance Impact

As the proposal implies changing the implementation of the gfsh start gateway sender and stop gateway sender commands so that each is executed in two steps, these commands may become slightly, but not significantly, slower.

Backwards Compatibility and Upgrade Path

The proposal has no impact on the regular rolling upgrade process.

Prior Art

-

FAQ

Answers to questions you’ve commonly been asked after requesting comments for this proposal.

Errata

What are minor adjustments that had to be made to the proposal since it was approved?
