You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 6 Next »

This document introduces the topology of major classes in the cluster module, i.e., how they are connected and their main functionality. The main purpose of the document is to help new comers of the cluster module grasp how each class works, so they would be able to find where to modify if they want to contribute their code. However, any other advice or questions about the organization are welcomed, you may comment within this page or send mail to our mail list to discuss.


General Topology

Fig.1 shows the class diagram of important classes in the cluster module. For simplicity, class members and class methods are omitted here as they are just too many. Helper classes and utility classes are also ignored, so you would find much more classes in the cluster module.

Class Functionalities

Program Entry Point

ClusterMain is the main class in the cluster module, which performs some developer-level customization and start a MetaClusterServer, which will further initialize the Raft logics and the underlying IoTDB.

Servers

Servers are implementations of Thrift RPC servers, their main concerns are network transport and protocol parsing. Each of them listens on a port and an ip address, accept connection from clients, wrap connections with certain types of Thrift transports, read requests from the transports, parse them with specific protocols, and send the requests to associated RaftMembers. Each node only have one each server.

  • MetaClusterServer: listens to internal_meta_port, receives MetaGroup-related Raft requests, and forwards them to MetaMember. It also starts the underlying IoTDB once it is up.
  • MetaHeartbeatServer: listens to internal_meta_port, receives MetaGroup heartbeats, and forwards them to MetaMember.
  • DataClusterServer: listens to internal_data_port, receives DataGroup-related Raft requests, decides which DataMember will process it and forwards to it.
  • DataHeartbeatServer: listens to internal_data_port + 1, only receives DataGroup heartbeats, decides which DataMember will process it and forwards to it.
  • ClientClusterServer: listens to external_client_port, receives database operations from clients, and forwards them to MetaMember.

Raft Memers

Raft Members are implementations and extensions of Raft algorithm, which handles log generation, log replication, heartbeat, election, and catch-up of falling behind nodes. Each RaftMember plays one role in exactlly one Raft group. One cluster node has exactlly one MetaMember and multiple DataMembers, depending on the number of replications and partitioning strategy.

  • RaftMember: an abstraction of common Raft procedures, like log generation, log replication, heartbeat, election, and catch-up of falling behind nodes.
  • MetaMember: an extension of RaftMember, representing a member in the MetaGroup, where cluster configuration, authentication info, and storage groups are managed. It holds the partition table, processes operations that may modify the partition table, and also acts as a coordinator to route database opertions to responsible DataMember with the help of its partition table. One node has only one MetaMember.
  • DataMember: an extension of RaftMember, representing a member in one DataGroup, where timeseries schemas and timeseries data of one cluster-level partition are managed. It also handles data queries and thus manages query resources. One node may have multiple DataMembers, depending on how many data partition it manages, which is further decided by the number of replications and partitioning strategy.

Partition Table

Partition Table maintains the mapping from a partition key (storage group name and time partition id in IoTDB) to a DataMember(or a node). As timeseries schemas and data are replicated only in a subset of the cluster (i.e., a DataGroup), opertions involving them can only be applied to specific nodes. When a node receives an operation, and it does not manage the corresponding data or schemas of the operation, the node must forward it to the right node, and this is where a Partition Table applies.

  • SlotPartitionTable: a consistent-slot-hashing-based Partition Table, which divides the whole data into a fixed number of slots (10000 slots by default), distributes the slots evenly to all DataGroups, hashes the partition key to one of the slots, and the data holder will be the holder of the slot. Upon the DataGroups are updated, the slots are re-distributed and the changes are recorded for data migration.

Heartbeat Thread

Heartbeat Thread is an important component of a Raft Member. As the chracter of a Raft Member is alwayls one of LEADER, FOLLOWER, or ELECTOR(CANDIDATE), it utilizes Heartbeat Thread in three ways: as a LEADER, Heartbeat Thread sends heartbeats to followers; as a FOLLOWER, Heartbeat Thread periodicallly checks whether the leader has timed out; as a CANDIDATE, it starts elections and send election requests to other nodes until one of the nodes becomes a valid leader.




  • No labels