You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 3 Next »

IDIEP-67
Author
Sponsor
Created

  

Status

ACTIVE


Motivation


According to IEP-61 we need a group membership/discovery service with likely less strict consistency guarantees than the current Discovery SPI. It is pretty convenient to extract such features to the extra low-level module and provide some API for interaction with the network.


There are several problems to have the different interfaces for network communication(DiscoverySPI, CommunicationSPI) including undefined behavior when one port works perfectly while another is not,  duplication of interfaces for sending messages, duplication of the code for network interaction. So for resolving these problems, it is an idea to mix discovery and communication modules into one networking module with one port, API, and implementation.

Description

Terminology

  • endpoint - host(IP)/port which should be configured in discovery configuration. This is the analog 'node' but 'node' is not an appropriate word on such low-level and it represents only host/port rather than the whole node.
  • instanceId/setupId - unique identification of every endpoint which should be bounded with a local process(this identification should be generated on every restart). Something like current nodeId but belongs to network endpoint rather than the whole node.
  • network member/member - connection + instanceId + messages? It can be used for sending messages. It represents a cluster participant which is able to send or receive some messages.

Responsibility

  • Discover and automatically gather all found network endpoints to group membership.
  • Detect the network events - new endpoints appears, the connection fails, etc.
  • Provide cluster membership changes events for subsystems that do not require strict consistency guarantees.
  • Retry to establishing connection if it was lost.
  • Send and receive messages(including custom messages) to any member of the group.
  • Serialize/deserialize messages.
  • Provide API for sending messages to a specific member with/without a delivery guarantee.

Lifecycle

  • Start the process.
  • Bind to the configured local port
  • Iterate over configured ip/port(IP finder?) and connect to endpoint if possible
  • Start the handshake with a successfully connected endpoint for exchange the initial information(instanceId, recoveryMessageId)
  • Notify subscribers about establishing connection to the cluster(Generate network started event)
  • Reconnect to endpoint if connection unexpectedly fails (without any events to subscribers)
  • Notify subscribers about the appearance of a new network member if it finds one.
  • Notify subscribers about the disappearance of network member if connection unrecoverably fails.
  • Stop the process.

API

Draft API
public interface NetworkService {
    static NetworkService create(NetworkConfiguration cfg);

    void shutdown() throws ???;

    NetworkMember localMember();
    
	Collection<NetworkMember> remoteMembers();
    
	void weakSend(NetworkMember member, Message msg);

    Future<?> guaranteedSend(NetworkMember member, Message msg);
    
	void listenMembers(MembershipListener lsnr);
    
	void listenMessages(Consumer<MessagingEvent> lsnr);
}

public interface MembershipListener {
    void onAppeared(NetworkMember member);
    void onDisappeared(NetworkMember member);
    void onAcceptedByGroup(List<NetworkMember> remoteMembers);
}

public interface NetworkMember {
    UUID id();
}


Implementation

  • It needs to implement the group membership over netty.
  • Integrate direct marshaling to netty(via netty handlers).
  • Implement sending idempotent messages with very weak guarantees:
    • no delivery guarantees required;
    • multiple copies of the same message might be sent;
    • no need to have any kind of acknowledgement;
  • Implement sending messages with delivery guarantees::
    • message must be sent exactly once with an acknowledgement that it has actually been received (not necessarily processed);
    • messages must be received in the same order they were sent.
    • These types of messages might utilize current recovery protocol with acks every 32 (or so) messages. This setting must be flexible enough so that we won't get OOM in big topologies.
  • Add ssl support to this module
  • Implement connection retrying algorithms.


Dependencies

Nettyis an asynchronous event-driven network application framework for rapid development of maintainable high performance protocol servers & clients. It will be used for interaction with network(read/write, serialize/deserialize).

Scalecube - is a lightweight decentralized cluster membership, failure detection, and gossip protocol library. It will be used for gathering open network interfaces to cluster membership via gossip protocol.


Risks and Assumptions


Discussion Links

// N/A

Reference Links

// N/A

Tickets

key summary type created updated due assignee reporter priority status resolution

JQL and issue key arguments for this macro require at least one Jira application link to be configured

  • No labels