Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Geode uses UDP messaging through jgroups JGroups for it's peer to peer (P2P) messaging related to membership. Unfortunately , jgroups JGroups does not support rolling upgrades, which makes it more difficult to to upgrade jgroups versionsJGroups. Configuring encryption for UDP adds an additional layer of complexity for securing P2P communications.

We would like to replace the UDP messaging for membership with a TCP based messaging system.

Requirements

  • Set us up for removing jgroups JGroups as dependency in future versions by moving to a protocol that does not require jgroupsjroups
  • Support rolling upgrades from the old jgroups JGroups protocol to the new TCP based protocol
  • Use the existing SSL settings to control how membership messages are encrypted
  • Be as reliable in the face of networking failures as the previous protocol

...

All of messaging related to membership is handled by the JGroupsMessenger class, which implements the Messenger interface. We will create a new implementation of Messenger that uses TCP sockets, rather than jgroups JGroups and UDP sockets.

The key methods on messenger are below

...

The netty server will use the existing cluster SSL configuration, so if cluster SSL is enabled no additional properties will be required. See https://geode.apache.org/docs/guide/latest/managing/security/implementing_ssl.html for information on the relevant properties.

When sending a message , the Messenger will create a connection to all destinations if no connection exists yet. Once a connection is established, the connection will remain open as long as that member is still in the view. If IO errors or timeouts happen when writing messages, the Messenger will close the connection, reestablish a new connection, and retry. Messages will continue to be retried until we receive a view indicating that the member is no longer in the view. We will therefore guarantee the order of message delivery.??? What about data that was already in the socket write buffer but never transmitted? We need to make sure we transmit these messages as well. Does this mean we need our own ack protocol outside of TCP?

Because we are retrying messages, we can end up delivering messages more than once. The old JGroups protocol ensured only once delivery. ??? Do messaging messages need to be sent in the order they are generated, even with retries??only once delivery? What if we end up sending regular messages over this channel?

One concern about switching to a TCP based protocol is that network outages may result in TCP sockets hanging on read or write operations. We will use non-blocking IO whenever possible and test the new protocol with tools to simulate network outagesneed to ensure that if a connection to one members is blocked, that we still send messages to the other members. Each destination will have its own queue of messages to be sent, and adding to one queue will not block the others.

Rolling upgrade concerns

We need to be able to do a rolling upgrade from the old jgroups JGroups based protocol to the new protocol. We will need to continue to support rolling upgrades for a certain range of versions before we can drop jgroupsJGroups.

In order to accomplish this a member will actually need to be listening for connections on both protocols when it initially starts up. We will create a delegating Messenger that contains both a JgroupsMessenger JGroupsMessenger and a NettyMessenger. It can install handlers in both of them, and decide which Messenger to use when sending a message based on the version of the recipient. If a member receives a view that contains no old members that don't support the old protocol it could shut down the jgroups JGroups based Messenger.

There is an issue here with the need to listen on two separate ports, because InternalDistributedMember currently only has support for a single port field. There are a few options we are evaluating:

  1. Just use the same port for jgroups JGroups and netty. Since one is a UDP port and the other is TCP port, they can both be open at the same time.
  2. Encode the second port in some other field in InternalDistributedMember, for example by using part of the UUID bytes. This is kindy hacky
  3. Pass the new membership port around outside of InternalDistributedMember. This would probably involve sending as part of the FindCoordinatorResponse, as well as including it in the NetView.