...

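Peer authentication was built on top of the JGroups 2.2.9 stack by introducing a custom authentication protocol that intercepts 'join' requests and requires authentication checks to pass before a request is allowed to reach the GMS membership protocol.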

In the future Geode servers were also going to rely on JGroups for reliable UDP transmission of messages that are broadcast to the whole membership set, such as StartupMessage, ShutdownMessage, CreateRegionMessage and PDX registrations.  Sending these messages over TCP/IP stream connections is a barrier to increasing the size of the distributed system, especially at startup time when we must create 4M of these connections (M = member count) just to join the distributed system.
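To make the 4M figure concrete, here is a minimal sketch of the arithmetic. The class and method names are hypothetical and are not part of Geode. The cumulative total assumes members start one after another, each joining a system that already contains all earlier members; the text above only states the per-join cost.

```java
// Hypothetical helper for illustration only; not part of Geode.
final class JoinConnectionCost {

    // TCP/IP connections one joining member opens when memberCount
    // members are already present (the 4M figure from the text).
    static long connectionsToJoin(long memberCount) {
        return 4L * memberCount;
    }

    // Assumption: members start sequentially, so the k-th member to start
    // joins a system of k-1 members. Sums the per-join cost over all joins.
    static long connectionsForSequentialStartup(long finalSize) {
        long total = 0;
        for (long existing = 0; existing < finalSize; existing++) {
            total += connectionsToJoin(existing);
        }
        return total;
    }
}
```

Under these assumptions the per-join cost grows linearly with the member count, while the total startup cost grows quadratically, which is why startup is where the TCP/IP approach hurts most.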

Reliable UDP communication is also needed for out-of-band low-priority communication, such as sending alerts to management nodes.  Creating TCP/IP connections to send alerts can block operations during periods when there are already bad things happening.  We recently saw this in a large production system, where an alert that members weren’t acknowledging a membership view change blocked operations because the management node that was to receive the alert was sick and not accepting connections.

Geode integrates a JGroups GossipServer into the Locator service.  GossipServer is used to provide information on who is in the distributed system when a new member is joining the distributed system.

Finally, Geode clients use the membership system's classes to form IDs, and these contain a JGroups IpAddress.

 

...

Requirements

In brief, the membership service must

  1. deliver notification of membership changes to the DistributedSystem’s MembershipManager

  2. allow new members to join without taking down the system

  3. provide identity for each peer in the distributed system and allow clients to have a similar identity.  The identity must be unique for the peer and old identities should not be reused (at least not very quickly)

  4. transmit information about each member’s DistributedSystem characteristics (VM type, DirectChannel port, Groups, Name, etc) in the member’s ID

  5. efficiently and quickly detect loss of a member (failure detection)

  6. support the notion of an Elder member for Geode’s Distributed Lock Service

  7. support Geode’s model of handling network partitions (winning/losing partitions)

  8. allow Geode to give advice on which members might be sick or out of action

  9. support rolling upgrade (old members can’t rejoin once upgrade has begun, and the service itself must support backward compatibility)

  10. integrate with Geode’s authentication service and require authentication before allowing a new member to join
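The identity requirements in the list above (a unique ID that is not reused, and that carries the member's DistributedSystem characteristics) can be sketched as a small value class. Everything here is an assumption made for illustration: the class name, the field names, and in particular the `incarnation` counter used to tell a rejoined member apart from its earlier identity. It is not Geode's actual member ID class.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch of a member identity; names are illustrative only.
final class MemberId {
    final String host;
    final int membershipPort;
    final long incarnation;       // assumed: changes on every (re)join
    final String vmType;          // characteristics carried inside the ID
    final int directChannelPort;
    final Set<String> groups;
    final String name;

    MemberId(String host, int membershipPort, long incarnation, String vmType,
             int directChannelPort, Set<String> groups, String name) {
        this.host = Objects.requireNonNull(host);
        this.membershipPort = membershipPort;
        this.incarnation = incarnation;
        this.vmType = vmType;
        this.directChannelPort = directChannelPort;
        // Defensive copy so the ID stays immutable after construction.
        this.groups = Collections.unmodifiableSet(new LinkedHashSet<>(groups));
        this.name = name;
    }

    // Identity is decided by address plus incarnation, so a process that
    // rejoins from the same host and port does not reuse its old identity.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof MemberId)) return false;
        MemberId that = (MemberId) other;
        return membershipPort == that.membershipPort
            && incarnation == that.incarnation
            && host.equals(that.host);
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, membershipPort, incarnation);
    }
}
```

Keeping the descriptive fields out of `equals` is a design choice in this sketch: two IDs name the same member if and only if they share an address and an incarnation, regardless of how the characteristics were populated.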

A UDP messaging service must

  1. Be compatible with the membership service’s IDs (an ID from membership identifies endpoints in the UDP messaging service)

  2. Support rolling upgrade (on-wire compatibility across releases)

...



Options for replacing JGroups v2.2.9


There are several options open to us.  They are:

Move to a newer version of JGroups

Use Zookeeper

Use Akka

Create a custom solution


One of the nice things about Geode is that membership management is dynamic and has no single point of failure.  Its weakest link is the Locator service: if all Locators go down, clients cannot get information about servers, and new servers can't be added to the cluster until one of the locators is available again.  Even if the locators are down, the server cluster remains viable and available to clients that are already connected to servers.
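The locator dependency described above can be illustrated with a minimal sketch, assuming a client or new server simply tries each configured locator in turn. The class name, the method name, and the reachability check passed in as a predicate are all hypothetical; this is not how Geode's locator client is actually implemented.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch for illustration only; not Geode's API.
final class LocatorSelection {

    // Returns the first locator that answers, or empty if every locator is down.
    // An empty result models the case where new servers cannot join and new
    // clients cannot discover servers until a locator comes back.
    static Optional<String> firstAvailable(List<String> locators,
                                           Predicate<String> isReachable) {
        for (String locator : locators) {
            if (isReachable.test(locator)) {
                return Optional.of(locator);
            }
        }
        return Optional.empty();
    }
}
```

An empty result affects only discovery and joining in this sketch; members and clients that are already connected do not go through this path.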

...