IDIEP-73

Author:
Sponsor:
Created:
Status: DRAFT

Motivation

In order to unblock the business-related logic (atomic protocol, transactions, table management, etc.) on top of the service-related logic (network, discovery protocol through meta storage, etc.), it is required to specify the components' communication flow and initialization logic. It seems natural to choose node startup as an entry point for such high-level needs. Thus, node startup should provide:

  • control over component initialization.
  • control over component communication channels.

Description

From a bird's eye view, the set of components and their connections may look like this:

where:

  • The number in front of the component's name shows the order in which the components are initialized. So the very first component to be initialized during node startup is the Vault. There are a few components that should be instantiated before node start-up: cli and ignite-runner; however, they are out of the scope of the node startup process.
  • Arrows show direct method calls. For example, the Affinity component could retrieve the baseline from the Baseline component using some sort of baseline() method. To reduce clutter a bit, two explicit groups of arrows are introduced:
    • Green Arrows show direct method calls to Meta Storage (MS) Component.
    • Blue Arrows show direct method calls to Configuration Component.
  • There's also an upward communication flow through the listeners/watches mechanism. However, within the scope of the alpha 2 release, it's only possible to listen to Vault, Meta Storage, and Configuration updates. A minimal sketch of this upward flow is shown below.
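
For illustration only, the following is a minimal, self-contained sketch of how a lower-level component could push updates to the components registered above it; WatchListener and WatchAggregator are simplified, hypothetical types and not the actual Vault/Meta Storage/Configuration listener APIs.

Upward notification flow (sketch)
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified listener shape; the real watch/listener APIs differ.
interface WatchListener {
    // Invoked for every batch of updates; the revision grows monotonically.
    void onUpdate(long revision, Map<String, byte[]> updatedEntries);
}

// Hypothetical aggregator that a lower-level component (Vault, Meta Storage, Configuration)
// could use to notify the components above it.
class WatchAggregator {
    private final List<WatchListener> listeners = new ArrayList<>();

    // A higher-level component registers its listener during start-up.
    void registerWatch(WatchListener lsnr) {
        listeners.add(lsnr);
    }

    // The lower-level component pushes an update batch upwards.
    void notifyListeners(long revision, Map<String, byte[]> updatedEntries) {
        for (WatchListener lsnr : listeners)
            lsnr.onUpdate(revision, updatedEntries);
    }
}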

A few more words about component responsibilities and inner flow:

VaultManager and LocalConfigurationManager

Vault is responsible for handling local keys, including distributed projections. During initialization, VaultManager checks whether there is any configuration within Vault's PDS; if not, it uses the customer's bootstrap configuration, if provided. The bootstrap configuration goes through the local configuration manager.

Vault and Local Configuration Manager
// Vault Component startup.
VaultManager vaultMgr = new VaultManager();

boolean cfgBootstrappedFromPds = vaultMgr.bootstrapped();

List<RootKey<?, ?>> rootKeys = new ArrayList<>(Collections.singletonList(NetworkConfiguration.KEY));

List<ConfigurationStorage> configurationStorages =
    new ArrayList<>(Collections.singletonList(new LocalConfigurationStorage(vaultMgr)));

// Bootstrap local configuration manager.
ConfigurationManager locConfigurationMgr = new ConfigurationManager(rootKeys, configurationStorages);

if (!cfgBootstrappedFromPds) {
    try {
        locConfigurationMgr.bootstrap(jsonStrBootstrapCfg);
    }
    catch (Exception e) {
        log.warn("Unable to parse user-specific configuration, default configuration will be used", e);
    }
}
else if (jsonStrBootstrapCfg != null)
    log.warn("User-specific configuration will be ignored, because vault was bootstrapped with PDS configuration");
Manager: VaultManager
Depends on: (none)
Used by:
  • LocalConfigurationManager in order to store local configuration and update it consistently through listeners.
  • MetaStorageManager in order to commit processed MS watch notifications atomically with the corresponding applied revision.

Manager: LocalConfigurationManager
Depends on: VaultManager
Used by:
  • NetworkManager in order to bootstrap itself with network configuration, including a sort of IPFinder, and to handle corresponding configuration changes.
  • MetaStorageManager in order to handle meta storage group changes.
  • ConfigurationManager indirectly, through LocalConfigurationStorage, for the purposes of handling local configuration changes.

NetworkManager

It's possible to instantiate the network manager once the local configuration manager, with the vault underneath it, is ready.

Network Manager
NetworkView netConfigurationView =
	locConfigurationMgr.configurationRegistry().getConfiguration(NetworkConfiguration.KEY).value();

// Network startup.
Network net = new Network(
	new ScaleCubeNetworkClusterFactory(
    	localMemberName,
        netConfigurationView.port(),
        Arrays.asList(netConfigurationView.networkMembersNames()),
        new ScaleCubeMemberResolver()));

NetworkCluster netMember = net.start();
Manager: NetworkManager
Depends on: LocalConfigurationManager
Used by:
  • MetaStorageManager in order to handle the cluster init message.
  • RaftManager in order to handle RaftGroupClientService requests and for the purposes of inner raft group communication.
  • BaselineManager in order to retrieve information about current network members.

RaftManager <Loza>

After the network member is started, the Raft Manager is instantiated. The Raft Manager is responsible for handling the life cycle of raft servers and services.

RaftManager
// Raft Component startup.
Loza raftMgr = new Loza(netMember);
Manager: RaftManager
Depends on: NetworkManager
Used by:
  • MetaStorageManager in order to instantiate and handle the distributed metaStorage raft group.
  • TableManager in order to instantiate and handle partitioned/ranged raft groups.

MetaStorageManager and ConfigurationManager

Now it's possible to instantiate MetaStorage Manager and Configuration Manager that will handle both local and distributed properties.

MetaStorage Manager and Configuration Manager
// MetaStorage Component startup.
MetaStorageManager metaStorageMgr = new MetaStorageManager(
	netMember,
    raftMgr,
    locConfigurationMgr
);

// Here distributed configuration keys are registered.
configurationStorages.add(new DistributedConfigurationStorage(metaStorageMgr));


// Start configuration manager.
ConfigurationManager configurationMgr = new ConfigurationManager(rootKeys, configurationStorages);
Manager: MetaStorageManager
Depends on:
  • VaultManager
  • NetworkManager
  • RaftManager
  • LocalConfigurationManager
Used by:
  • ConfigurationManager in order to store and handle distributed configuration changes.
  • BaselineManager in order to watch private distributed keys, because ConfigurationManager handles only public keys.
  • AffinityManager for the same purposes.
  • Probably SchemaManager for the same purposes.
  • TableManager for the same purposes.

Manager: ConfigurationManager
Depends on:
  • LocalConfigurationManager
  • MetaStorageManager
Used by:
  • BaselineManager in order to watch public keys.
  • AffinityManager for the same purposes.
  • Probably SchemaManager for the same purposes.
  • TableManager for the same purposes.
  • IgniteImpl

Business logic components: BaselineManager, AffinityManager, SchemaManager, TableManager, etc.

At this point it's possible to start business logic components like Baseline Manager, Affinity Manager, Schema Manager and Table Manager. The exact set of such components is undefined.

Top Level Managers
// Baseline manager startup.
BaselineManager baselineMgr = new BaselineManager(configurationMgr, metaStorageMgr, netMember);

// Affinity manager startup.
AffinityManager affinityMgr = new AffinityManager(configurationMgr, metaStorageMgr, baselineMgr);

SchemaManager schemaManager = new SchemaManager(configurationMgr);

// Distributed table manager startup.
TableManager distributedTblMgr = new TableManagerImpl(
	configurationMgr,
    netMember,
    metaStorageMgr,
	affinityMgr,
    schemaManager);


// Rest manager also goes here.
Manager: BaselineManager
Depends on:
  • ConfigurationManager
  • MetaStorageManager
  • NetworkManager
Used by:
  • AffinityManager in order to retrieve the current baseline.

Manager: AffinityManager
Depends on:
  • ConfigurationManager
  • MetaStorageManager
  • BaselineManager
Used by:
  • TableManager, directly or indirectly through the corresponding private distributed affinityAssignment key.

Manager: SchemaManager
Depends on:
  • ConfigurationManager
  • Probably MetaStorageManager
Used by:
  • TableManager in order to handle corresponding schema changes.

Manager: TableManager
Depends on:
  • ConfigurationManager
  • MetaStorageManager
  • NetworkManager
  • AffinityManager
  • SchemaManager
  • RaftManager
Used by:
  • IgniteImpl

Deploying watches and preparing IgnitionImpl

Finally, it's possible to deploy the registered watches and create IgniteImpl, injecting the top-level managers into it in order to provide table and data manipulation logic to the user.

Deploy registered watches and create IgniteImpl
// Deploy all registered watches because all components are ready and have registered their listeners.
metaStorageMgr.deployWatches();

return new IgniteImpl(configurationMgr, distributedTblMgr);

Component flow

In general, a node is a collaboration of the components mentioned above. In order to satisfy the needs of:

  1. Node start/stop
  2. Dynamic component start/stop
  3. Dynamic run level changes both forward and backwards

it makes sense to describe component flow in more detail. Here it is:

  1. Component object instantiation.
  2. Dependency injection, either via constructors or a DI framework.
  3. Component start, either at a specified run level or at the default one:
    1. Read and apply related local configuration if any.
    2. Check whether the component's state lags behind the state of the already started components; in other words, check whether the component's applied revision is less than the node's applied revision (by design, all already started components have the same applied revision, so it's possible to denote this set of applied revisions as the node's applied revision). The applied revision here is the latest revision of configuration and meta storage updates that the component has successfully processed and committed to the vault. If the component's applied revision is less than the node's, the component uses either the historical update logic (an optimization, not an option currently) or the full one to promote its state. Both options are described in more detail in the corresponding sections, and a minimal sketch of this catch-up decision is given after this list.
    3. Deploy configuration and meta storage watches in order to be notified about any new updates.
    4. Register message handlers.
    5. Register local components listeners.
    6. Start threads/thread pools.
    7. All other inner related stuff.
    8. Please note that the order and completeness of the above operations are not strictly defined and depend on the design specifics of a particular component.
      Component is ready.
  4. At a given state it's possible to move the component's run level either forward or backwards. Depending on the specified run level, it might be necessary to process the logic specified at step 3.
  5. Component stop. The core point of stopping a component is to safely terminate the component's ongoing operations as fast as possible. That includes the following steps:
    1. Unregister the component's configuration listeners and meta storage watches.
      Prevent any network communication. If network communication is attempted, a ComponentStoppingException should be thrown; catching such an exception should complete any awaiting futures with the corresponding reason.
    2. Unregister local listeners, which will also complete the local listeners' futures with ComponentStoppingException.
    3. Stop threads/thread pools and do all other inner component-related stuff.
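
To illustrate step 3.2, the following is a minimal sketch of the catch-up decision made during component start; the StartableComponent contract and all of its methods are hypothetical and only show the flow, they are not part of the current code base.

Catch-up decision (sketch)
// Hypothetical component contract used by the sketch below; it does not exist in the code base.
interface StartableComponent {
    // Latest revision of configuration/meta storage updates committed to the vault.
    long appliedRevision();

    boolean supportsHistoricalUpdate(long fromRevision, long toRevision);

    void applyHistoricalUpdates(long fromRevision, long toRevision);

    void applyFullUpdate(long toRevision);
}

// Promote a lagging component to the node's applied revision (step 3.2).
void catchUpIfNeeded(StartableComponent component, long nodeAppliedRevision) {
    long componentRevision = component.appliedRevision();

    if (componentRevision >= nodeAppliedRevision)
        return; // The component is already in sync with the node.

    if (component.supportsHistoricalUpdate(componentRevision, nodeAppliedRevision))
        // Historical update (an optimization, not an option currently): replay the missed revisions.
        component.applyHistoricalUpdates(componentRevision, nodeAppliedRevision);
    else
        // Full update: evict local data and pull the up-to-date state.
        component.applyFullUpdate(nodeAppliedRevision);
}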

Historical update

Historical here reflects a certain similarity with the historical rebalance design. It is worth mentioning that the main problem of promoting a component to the node's newer state is the impossibility of doing consistent reads of the components beneath it. In other words, if TableManager starts with appliedRevision 10 and SchemaManager already has appliedRevision 20, schemaManager.getSchema(tableId) will return the schema for revision 20, which is not what a TableManager processing table updates for revision 11 expects. In order to provide consistent reads, the requested component could analyze the caller's context and either recalculate the requested data based on the caller's applied revision or return previously cached historical data. In any case, this logic seems to be non-trivial and might be implemented later as a sort of optimization.
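
The problematic read and a possible revision-aware alternative could look like the following sketch; the two-argument getSchema overload, the SchemaDescriptor return type used here, and the explicit applied revision are hypothetical and only illustrate the idea.

Revision-aware read (sketch)
// TableManager is still processing updates for revision 11, while SchemaManager is already at 20.
long tableMgrAppliedRevision = 10;

// Reading the latest schema would return the revision-20 schema and break consistency:
// schemaManager.getSchema(tableId);

// Hypothetical revision-aware read: the schema is recalculated for, or served from a historical
// cache at, the caller's applied revision. This overload does not exist today.
SchemaDescriptor schema = schemaManager.getSchema(tableId, tableMgrAppliedRevision);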

Full update

Another solution for component updates that preserves consistent cross-component reads includes:

  • The component's data eviction.
  • Synchronous retrieval of the component's state from distributedConfigurationManager. At this point, because all intermediate updates are skipped and only the up-to-date state is retrieved, consistent cross-component reads are guaranteed. A minimal sketch is given below.
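
Under the assumptions above, the full update flow could look like the following sketch; all names here (evictLocalData, stateAt, applyState, commitAppliedRevision) are illustrative only and do not exist in the code base.

Full update (sketch)
// Hypothetical full-update flow: intermediate revisions are skipped entirely, so cross-component
// reads observe a single, up-to-date state.
void fullUpdate(long nodeAppliedRevision) {
    // 1. Evict the component's local data that was built on top of the stale revision.
    evictLocalData();

    // 2. Synchronously retrieve the up-to-date state from the distributed configuration manager.
    ComponentState state = distributedConfigurationManager.stateAt(nodeAppliedRevision);

    applyState(state);

    // 3. Commit the new applied revision to the vault.
    commitAppliedRevision(nodeAppliedRevision);
}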

Risks and Assumptions

// N/A

Discussion Links

http://apache-ignite-developers.2346864.n4.nabble.com/Terms-clarification-and-modules-splitting-logic-td52026.html#a52058

Reference Links

https://github.com/apache/ignite-3/tree/main/modules/runner#readme

Tickets

Umbrella Ticket:

Initial Implementation:



4 Comments

  1. Do we have any plans to add dependency injection (IoC container)?
    As far as I remember there were thoughts to restrict cycles in components dependency and use IoC.

    Also, the component initialization process is not clear. At what point will a component/function become available to the other components?

    E.g. there is a number of use cases:

    • Cluster very first start.
    • Node with no persistent tables.
    • Node with persistence. Rebooted node may need to recover from WAL before join.
    • Client reconnect case. Clients are thin, but still may have different lifecycle.
    • Cluster activation/deactivation.
    • Readonly cluster?
    • Maintenance mode?
    1. > Do we have any plans to add dependency injection (IoC container)? 

      Yes, however I don't know when it'll be implemented. Definitely it's not in the alpha 2 scope.

      > As far as I remember there were thoughts to restrict cycles in components dependency and use IoC.

      Yep, however there should be no cycles in current implementation.

      > Also, the component initialization process is not clear. At what point will a component/function become available to the other components?

      There's a two-phase node/cluster initialization process. In the first phase, components start in the "Zombie" state. At this point any component can locally communicate with any other component below it; however, the node and the components beneath it are not ready to handle user requests.

      After processing the cluster init message (produced by the cli), or a corresponding message from the group membership service via gossip if the given node joins an already initialized cluster, MetaStorageManager instantiates the required RaftGroup, and in this way the node moves from the "Zombie" state to the initialized state. Now it can handle user requests.

  2. In 2.x an Ignite cluster can be in several states with different functionality: inactive, read-only, fully operational. Also, an individual cluster node can enter maintenance mode and stay isolated from the rest of the cluster while still supporting some functionality.

    The same or similar modes are required in the 3.x version, so we need to design how we want to support this at the component level. Also, it would be nice to have a feature where components could perform additional actions based on information from other components from earlier stages of start-up. E.g. in 2.x there was a feature request (not implemented though) to warm up caches on a starting node based on information from the cluster about which caches are already started or already dropped.

    One idea is to organize node (and cluster) state into a hierarchy of runlevels (a term from Unix-like systems) with different functionality available at each level.
    Switching between levels should be supported in both directions (from limited-functionality modes to fully-functional and vice versa).

    Although this model is more flexible than the existing two-phase start procedure in Ignite 2.x, there are open questions about its design:

    1. To what extent are components independent from one another in terms of managing their states? E.g. for Maintenance Mode in 2.x we need to initialize the persistent storage to an almost fully-functional state while other interfaces are completely off. E.g. the cache component depends on persistent storage but shouldn't be initialized even though its dependency (the storage) was initialized. How should these rules be expressed?
    2. What component (or metacomponent) should be responsible for orchestrating the overall process of (most likely parallel) initialization of the others?
    3. How should errors during component initialization be handled?

    Please feel free to propose possible designs or add more questions for consideration to the list.

  3. List<RootKey<?, ?>> rootKeys = new ArrayList<>(Collections.singletonList(NetworkConfiguration.KEY)); List<ConfigurationStorage> configurationStorages = new ArrayList<>(Collections.singletonList(new LocalConfigurationStorage(vaultMgr)));
    I believe that both of these lists will not be modified in the configuration manager, so the right part of each expression could be simplified to List.of(...).