To be Reviewed By:

Authors: Alberto Bustamante Reyes (alberto.bustamante.reyes@est.tech)

Status: Draft | Discussion | Active | Dropped | Superseded Development

Superseded by: N/A

Related: N/A

...

Geode WAN replication has a problem when all GW receivers are configured with the same hostname-for-senders and port on every server. This setup arises when a Geode cluster is deployed on Kubernetes and all GW receivers are reachable from outside the cluster on the same VIP and port. The alternatives (a different hostname and/or a different port for each GW receiver) are costly in cloud-native environments, both in operation and maintenance (OAM) effort and in resources, and they limit important use cases such as scaling.

...

Example using "cluster-1" and "cluster-2", each with one locator and two servers:

Cluster-1 gfsh>list members
Member Count : 3

Name      | Id
--------- | ------------------------------------------------------------
server-0  | 172.17.0.4(server-0:65)<v1>:41000
locator-0 | 172.17.0.6(locator-0:25:locator)<ec><v0>:41000 [Coordinator]
server-1  | 172.17.0.8(server-1:47)<v1>:41000

Cluster-1 gfsh>list gateways
GatewaySender Section

GatewaySender Id | Member                            | Remote Cluster Id | Type     | Status                | Queued Events | Receiver Location
---------------- | --------------------------------- | ----------------- | -------- | --------------------- | ------------- | --------------------------------------------------------------
sender-to-2      | 172.17.0.4(server-0:65)<v1>:41000 | 2                 | Parallel | Running and Connected | 0             | receiver-site2-service.geode-cluster-2.svc.cluster.local:32000
sender-to-2      | 172.17.0.8(server-1:47)<v1>:41000 | 2                 | Parallel | Running and Connected | 0             | receiver-site2-service.geode-cluster-2.svc.cluster.local:32000



Cluster-2 gfsh>list members
Member Count : 3

Name      | Id
--------- | ------------------------------------------------------------
server-0  | 172.17.0.5(server-0:65)<v1>:41000
locator-0 | 172.17.0.7(locator-0:25:locator)<ec><v0>:41000 [Coordinator]
server-1  | 172.17.0.9(server-1:46)<v1>:41000



Cluster-2 gfsh>list gateways
GatewayReceiver Section

Member                            | Port  | Sender Count | Senders Connected
--------------------------------- | ----- | ------------ | --------------------------------------------------------------------------------------------------------
172.17.0.5(server-0:65)<v1>:41000 | 32000 | 6            | 172.17.0.4(server-0:65)<v1>:41000, 172.17.0.8(server-1:47)<v1>:41000, 172.17.0.8(server-1:47)<v1>:41000,
                                  |       |              | 172.17.0.8(server-1:47)<v1>:41000, 172.17.0.8(server-1:47)<v1>:41000, 172.17.0.4(server-0:65)<v1>:41000
172.17.0.9(server-1:46)<v1>:41000 | 32000 | 8            | 172.17.0.8(server-1:47)<v1>:41000, 172.17.0.4(server-0:65)<v1>:41000, 172.17.0.4(server-0:65)<v1>:41000,
                                  |       |              | 172.17.0.8(server-1:47)<v1>:41000, 172.17.0.4(server-0:65)<v1>:41000, 172.17.0.4(server-0:65)<v1>:41000,
                                  |       |              | 172.17.0.4(server-0:65)<v1>:41000, 172.17.0.8(server-1:47)<v1>:41000


If one server is stopped on "cluster-2", both senders in "cluster-1" are disconnected:

...

And some minutes later, all connections are lost:

Cluster-1 gfsh>list gateways
GatewaySender Section

GatewaySender Id | Member                            | Remote Cluster Id | Type     | Status                 | Queued Events | Receiver Location
---------------- | --------------------------------- | ----------------- | -------- | ---------------------- | ------------- | --------------------------------------------------------------
sender-to-2      | 172.17.0.4(server-0:65)<v1>:41000 | 2                 | Parallel | Running, not Connected | 0             | receiver-site2-service.geode-cluster-2.svc.cluster.local:32000
sender-to-2      | 172.17.0.8(server-1:47)<v1>:41000 | 2                 | Parallel | Running, not Connected | 0             | receiver-site2-service.geode-cluster-2.svc.cluster.local:32000


Cluster-2 gfsh>list gateways
GatewayReceiver Section

Member                            | Port  | Sender Count | Senders Connected
--------------------------------- | ----- | ------------ | -----------------
172.17.0.5(server-0:65)<v1>:41000 | 32000 | 0            |
172.17.0.9(server-1:46)<v1>:41000 | 32000 | 0            |


Checking the logs again, we can see new logs from the ClientHealthMonitor:

...

Cluster-2 gfsh>list gateways
GatewayReceiver Section

Member                            | Port  | Sender Count | Senders Connected
--------------------------------- | ----- | ------------ | -------------------------------------------------------------------------------------------------------
172.17.0.5(server-0:65)<v1>:41000 | 32000 | 0            |
172.17.0.9(server-1:51)<v1>:41000 | 32000 | 3            | 172.17.0.8(server-1:46)<v1>:41000, 172.17.0.8(server-1:46)<v1>:41000, 172.17.0.8(server-1:46)<v1>:41000

Now ClientHealthMonitor is closing connections in server-1, but this time it does not seem to be related to a ping problem:

...

root@server-1:/# grep ClientHealthMonitor server-1/server-1.log

[info 2020/03/10 14:02:34.275 GMT <main> tid=0x1] ClientHealthMonitorThread maximum allowed time between pings: 60000

[warn 2020/03/10 14:15:30.846 GMT <ServerConnection on port 32000 Thread 4> tid=0x4a] ClientHealthMonitor: Unregistering client with member id identity(172.17.0.4(server-0:69)<v1>:41000,connection=1 due to: The connection has been reset while reading the header

Anti-Goals

N/A

Solution

Gw sender failover

The solution consists of refactoring some maps in the LocatorLoadSnapshot class. These maps use ServerLocation objects as keys, which has to change because a ServerLocation is no longer unique per server when all receivers share the same hostname and port. We changed the maps to use InternalDistributedMember objects as keys instead. The ServerLocation information is not lost, as it is contained in the entry value of every map.

The same refactoring is done in EndPointManager, as it holds a map of endpoints that also uses ServerLocation objects as keys.
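
The key collision described above can be illustrated with a minimal, self-contained sketch. It is not the LocatorLoadSnapshot or EndPointManager code: the `Location` class is a hypothetical stand-in for ServerLocation (equality on host and port only), and plain strings stand in for InternalDistributedMember ids.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ReceiverKeySketch {

  /** Hypothetical stand-in for ServerLocation: equality is based on host and port only. */
  static final class Location {
    final String host;
    final int port;

    Location(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Location)) {
        return false;
      }
      Location other = (Location) o;
      return port == other.port && host.equals(other.host);
    }

    @Override
    public int hashCode() {
      return Objects.hash(host, port);
    }
  }

  /** Registers one receiver per member id, keyed by location; returns the number of entries. */
  static int entriesKeyedByLocation(String[] memberIds, String host, int port) {
    Map<Location, String> map = new HashMap<>();
    for (String memberId : memberIds) {
      // All receivers produce an equal key, so each put overwrites the previous receiver.
      map.put(new Location(host, port), memberId);
    }
    return map.size();
  }

  /** Same registration keyed by member id; the location moves into the entry value. */
  static int entriesKeyedByMember(String[] memberIds, String host, int port) {
    Map<String, Location> map = new HashMap<>();
    for (String memberId : memberIds) {
      map.put(memberId, new Location(host, port));
    }
    return map.size();
  }

  public static void main(String[] args) {
    String host = "receiver-site2-service.geode-cluster-2.svc.cluster.local";
    String[] receivers = {"172.17.0.5(server-0:65)<v1>:41000", "172.17.0.9(server-1:46)<v1>:41000"};
    System.out.println("keyed by location: " + entriesKeyedByLocation(receivers, host, 32000));
    System.out.println("keyed by member:   " + entriesKeyedByMember(receivers, host, 32000));
  }
}
```

With two receivers behind the same host and port, the location-keyed map ends up with a single entry, while the member-keyed map keeps one entry per receiver and still carries the location in the value.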

Check this commit for a draft of the proposed solution: https://github.com/apache/geode/pull/4824/commits/b180869c73095e7a810ba2e1c92e243a0220e888

Gw sender pings not reaching gw receivers

When LiveServerPinger runs a PingTask, the task calls PingOp.execute(ExecutablePool pool, ServerLocation server). PingOp uses only the hostname and port (ServerLocation) to obtain the connection over which it sends the ping message. Because all receivers share the same host and port, there is no guarantee that the connection actually points to the server we want to reach.

The solution consists of modifying the ping messages to include information about the server they are meant to reach. If a message is received by a different server, that server can forward it to the intended one.

Another alternative is to add a retry mechanism to PingOp so that a connection can be discarded if its endpoint is not the server we want to connect to.
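
The forwarding idea can be sketched as follows. This is only an illustration under stated assumptions: `Receiver`, `addPeer`, and `onPing` are hypothetical names, the ping is reduced to the id of its intended target, and in-memory method calls stand in for whatever transport the receivers would really use between themselves.

```java
import java.util.HashMap;
import java.util.Map;

public class PingRoutingSketch {

  /** Hypothetical receiver-side handler; memberId stands in for the receiver's unique identity. */
  static final class Receiver {
    final String memberId;
    final Map<String, Receiver> peers = new HashMap<>();
    int pingsHandled;

    Receiver(String memberId) {
      this.memberId = memberId;
    }

    void addPeer(Receiver peer) {
      peers.put(peer.memberId, peer);
    }

    /** Handles a ping addressed to targetMemberId; returns the id of the member that handled it. */
    String onPing(String targetMemberId) {
      if (memberId.equals(targetMemberId)) {
        pingsHandled++;
        return memberId;
      }
      Receiver target = peers.get(targetMemberId);
      if (target == null) {
        throw new IllegalArgumentException("Unknown ping target: " + targetMemberId);
      }
      return target.onPing(targetMemberId); // forward to the intended receiver
    }
  }

  public static void main(String[] args) {
    Receiver server0 = new Receiver("server-0");
    Receiver server1 = new Receiver("server-1");
    server0.addPeer(server1);
    server1.addPeer(server0);
    // The shared VIP delivered the ping to server-0, but it was meant for server-1.
    System.out.println("handled by: " + server0.onPing("server-1"));
  }
}
```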

Solution

Current status of the solution is located on this PR:

Only one test is failing (testExecuteOp from ConnectionPoolImplJUnitTest), which causes the integration test and stress test tasks to fail. We are working on a fix.

Gw sender pings not reaching gw receivers

When LiveServerPinger runs a PingTask, the task calls PingOp.execute(ExecutablePool pool, ServerLocation server). PingOp uses only the hostname and port (ServerLocation) to obtain the connection over which it sends the ping message. Because all receivers share the same host and port, there is no guarantee that the connection actually points to the server we want to reach.

To solve this, we have added a new method, PingOp.execute(ExecutablePool pool, Endpoint endpoint). If the connection obtained does not point to the required Endpoint, it can be discarded and a new one requested.
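
A minimal sketch of the discard-and-retry idea follows. It does not reproduce the PingOp implementation: `Connection`, `ConnectionSource`, `scripted`, `acquireFor`, and the `maxAttempts` bound are all hypothetical names introduced for illustration, and a member id string stands in for the Endpoint check.

```java
public class PingRetrySketch {

  /** Hypothetical connection that knows which member it actually reached. */
  static final class Connection {
    final String endpointMemberId;
    boolean closed;

    Connection(String endpointMemberId) {
      this.endpointMemberId = endpointMemberId;
    }
  }

  /** Hypothetical connection source; behind a shared VIP the caller cannot choose the receiver. */
  interface ConnectionSource {
    Connection connect();
  }

  /** Demo helper: hands out connections to the given members in order, cycling when exhausted. */
  static ConnectionSource scripted(String... memberIds) {
    int[] next = {0};
    return () -> new Connection(memberIds[next[0]++ % memberIds.length]);
  }

  /** Returns a connection to wantedMemberId, or null if none is found within maxAttempts. */
  static Connection acquireFor(String wantedMemberId, ConnectionSource source, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      Connection connection = source.connect();
      if (connection.endpointMemberId.equals(wantedMemberId)) {
        return connection;
      }
      connection.closed = true; // wrong receiver: discard and ask for a new connection
    }
    return null;
  }

  public static void main(String[] args) {
    ConnectionSource source = scripted("server-0", "server-0", "server-1");
    Connection connection = acquireFor("server-1", source, 5);
    System.out.println("reached: " + (connection == null ? "nobody" : connection.endpointMemberId));
  }
}
```

Bounding the number of attempts is a design choice of this sketch: without a limit, a receiver that is down would make the sender loop forever while the VIP keeps routing it to the remaining servers.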

Other alternatives to the retry mechanism that we have not explored could be:

  • Add an option to deactivate the ping mechanism for gw sender/gw receiver communication.
  • Send the ping using only existing connections, without creating new ones.

Changes and Additions to Public Interfaces

...