ID: IEP-55
Author:
Sponsor:
Created:
Status: ACTIVE

Motivation

The current approach to Ignite configuration was formed a long time ago (originally inherited from GridGain during the project donation) and has several disadvantages that manifested themselves over time as the project gained usage across different users. There are several orthogonal aspects that need to be addressed.

Configuration representation

Ignite relies on POJO classes as its primary configuration form. At the same time, configuration objects embed pieces of executable code (SPIs, Ignite services, continuous query listeners, etc.), which makes a configuration instance non-portable and hard to serialize: a configuration instance cannot be deserialized without custom user classes, and it cannot be reused to start two Ignite nodes (one needs to make sure that different nodes have different SPI instances).

Configuration files usability

Ignite uses Spring to translate configuration files into the POJO instances consumed by the library, which brings additional disadvantages. It introduces a dependency on a third-party IoC library for the simple task of parsing a configuration file. Spring syntax is verbose and unfamiliar to developers on other platforms (C#, Node.js, C++). For example, independent configuration objects for each platform make it impossible to have a single configuration file, which forces Ignite to reference a Spring configuration file from the C# configuration.

Config files and runtime configuration changes duality

With the addition of Ignite native persistence and the introduction of runtime configuration changes (dynamic caches, dynamic indexes and columns), configuration files lost their original semantics. This led to scenarios where configuration files contradict the runtime state of the cluster, and to unpredictable cluster behavior after a restart. For example, if a user executes an ADD COLUMN / DROP COLUMN statement, the configuration changes are reflected neither in the QueryEntity configuration object nor in the configuration file (which is twice as tricky to do for the Spring representation). The changes, however, are persisted and preserved after a cluster restart for persistent caches. On the other hand, if a user destroys a cache defined in a configuration file, the destroy succeeds, but the cache is re-created after a cluster restart.

Such duality makes it extremely hard to introduce runtime configuration changes as it immediately brings contradiction and undefined behavior in the presence of configuration files.

Persistent and non-persistent behavior duality

Dynamic cache configuration persistence depends on whether the cache is persistent or not. A memory-only dynamic cache will not be recreated on a cluster restart (even if there are other persistent caches), while persistent caches are preserved. 

Description

The purpose of this IEP is to provide a revised, unified approach to configuring an Ignite cluster, which addresses the issues stated above by achieving the following goals:

  • Platform-agnostic configuration representation
  • Clear separation between configuration and any third-party code
  • Predictable configuration lifecycle
  • Consistent runtime configuration change support

Configuration structure and representation

To ease users' transition to the new configuration, it is suggested to keep the tree-like configuration structure induced by the POJO classes. However, instead of classes, we can represent the configuration as a prefix tree, with shared prefixes denoting larger configuration structures and leaves representing configuration properties that have values. Such a structure can be thought of either as a graph of nested objects (JSON-like) or as a list of unfolded, fully-qualified property name-value pairs (Java properties-like).

Both representations are handled by the HOCON configuration file format [1], which we suggest using for the new configuration approach. The library can be easily shaded into the Ignite sources to prevent possible conflicts.

The values of the properties are required to be of primitive types, which enforces strict configuration serialization and portability. The configuration of user-provided objects can be represented via subtrees under prefixes not reserved by Ignite.
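The equivalence of the two views of the prefix tree (nested objects and unfolded fully-qualified properties) can be illustrated with a minimal sketch. It uses plain Python dictionaries rather than a HOCON parser, and the helper names `flatten` and `unflatten` are hypothetical, introduced only for this illustration.

```python
from typing import Any, Dict


def flatten(tree: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Unfold a nested tree into fully-qualified name-value pairs."""
    flat: Dict[str, Any] = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # An inner node: descend, extending the shared prefix.
            flat.update(flatten(value, path))
        else:
            # A leaf: a configuration property with a primitive value.
            flat[path] = value
    return flat


def unflatten(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Fold fully-qualified properties back into a nested tree."""
    tree: Dict[str, Any] = {}
    for path, value in flat.items():
        *parents, leaf = path.split(".")
        node = tree
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


nested = {
    "node": {"rebalance": {"threads": 4}},
    "cluster": {"baseline": {"auto-adjust": {"enabled": False}}},
}
flat = flatten(nested)
```

Here `flat` holds the Java properties-like view of `nested`, and `unflatten(flat)` restores the JSON-like view. The sketch assumes that property names themselves contain no dots.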

Other configuration formats are open for discussion throughout the IEP finalization.

Keys order restriction

Neither HOCON nor JSON preserves key order when parsed; this is part of their specifications. An extra syntax is therefore proposed for elements that require a strict order:

HOCON keys order
// Default syntax.
root.namedList {
  key1 { value = val1 },
  key2 { value = val2 }
}
// Specific syntax for when order must be preserved.
root.namedList = [
  { elementName = key1, value = val1 },
  { elementName = key2, value = val2 }
]
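One way to read the order-preserving syntax is as a list that is converted into a named list while keeping the list order. The sketch below assumes the list has already been parsed into Python dictionaries; the function name `named_list_from_ordered` is hypothetical, and rejecting duplicate names is an assumption rather than something this proposal specifies.

```python
from typing import Any, Dict, List


def named_list_from_ordered(elements: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Build a named list from [{elementName, ...}, ...], keeping list order."""
    named: Dict[str, Dict[str, Any]] = {}
    for element in elements:
        fields = dict(element)  # copy, so the input is left untouched
        name = fields.pop("elementName")
        if name in named:
            # Assumption: element names must be unique within a named list.
            raise ValueError(f"duplicate element name: {name}")
        named[name] = fields
    return named


ordered = [
    {"elementName": "key1", "value": "val1"},
    {"elementName": "key2", "value": "val2"},
]
named = named_list_from_ordered(ordered)
```

Python dictionaries keep insertion order, so iterating over `named` yields `key1` before `key2`, matching the order of the source list.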

Global and local properties

As practice shows, we can split the list of configuration properties into global and local:

  • Global properties denote a cluster-wide state that must be processed consistently by all nodes in the cluster. When the value of such a property differs between nodes, Ignite features work incorrectly. Examples of such properties are a cache configuration (e.g. a different affinity function will break the cache logic) and the baseline auto-adjust enabled flag.
  • Local properties denote a node-local state. If such a state differs on some nodes in the cluster, it does not break the correctness of the cluster features (the difference in local properties may, however, prevent a node from starting or joining the cluster: consider different discovery or communication SPIs, for example). An example of such a property is the number of threads used for rebalancing.

The consistency of global properties is handled by a built-in mechanism which ensures that either all nodes see the most up-to-date values of the global properties (and are notified promptly when the properties change) or a node with contradicting values does not join the cluster (the same invariant is implemented in the distributed metastorage component).
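The join-time part of this invariant can be sketched as a comparison between the cluster's view of the global properties and the view of a joining node. This is only an illustration under simplifying assumptions: both views are flat dictionaries, and the function names `contradicting_properties` and `may_join` are hypothetical, not part of any Ignite API.

```python
from typing import Any, Dict, List


def contradicting_properties(cluster_view: Dict[str, Any],
                             joining_view: Dict[str, Any]) -> List[str]:
    """Names of global properties whose values differ between the two views."""
    return sorted(
        name
        for name, value in joining_view.items()
        if name in cluster_view and cluster_view[name] != value
    )


def may_join(cluster_view: Dict[str, Any], joining_view: Dict[str, Any]) -> bool:
    """A node may join only if it holds no contradicting global values."""
    return not contradicting_properties(cluster_view, joining_view)


cluster_view = {
    "cluster.baseline.auto-adjust.enabled": False,
    "cluster.caches.partitioned.backups": 2,
}
stale_node = {"cluster.caches.partitioned.backups": 1}
fresh_node = {"cluster.baseline.auto-adjust.enabled": False}

conflicts = contradicting_properties(cluster_view, stale_node)
```

In this sketch a property that the joining node does not define is not treated as a contradiction; the node would simply receive the cluster's value.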


Local properties are stored locally, but Ignite provides tooling to change local properties on a given subset of nodes on any given topology (a single broadcast). Additional tooling may be used to control discrepancies of the local configuration properties and notify a user.

Global and local properties are naturally stored in different configuration trees, for example cluster and node respectively:

Local tree
node.rebalance.threads = 4

For a full list of node-local and cluster-wide properties, refer to Appendix A of this document. It is kept up-to-date and contains all standard properties.
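The separation into configuration trees can be sketched as a partition on the first path segment. The sketch assumes the root names `cluster` and `node` from the example above; the helper `split_by_scope`, the `user` bucket for prefixes not reserved by Ignite, and the `myapp.greeting` property are hypothetical.

```python
from typing import Any, Dict

# Assumption: these two roots are the ones reserved for Ignite's own trees.
RESERVED_ROOTS = ("cluster", "node")


def split_by_scope(properties: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Partition fully-qualified properties by their root prefix."""
    scopes: Dict[str, Dict[str, Any]] = {"cluster": {}, "node": {}, "user": {}}
    for name, value in properties.items():
        root = name.split(".", 1)[0]
        scope = root if root in RESERVED_ROOTS else "user"
        scopes[scope][name] = value
    return scopes


scopes = split_by_scope({
    "cluster.baseline.auto-adjust.enabled": False,
    "node.rebalance.threads": 4,
    "myapp.greeting": "hello",
})
```

After the call, `scopes["cluster"]` holds the global properties, `scopes["node"]` the local ones, and `scopes["user"]` everything under unreserved prefixes.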

Configuration lifecycle

Configuration files are not exposed directly to the user for editing. Instead, the user is provided with an interface for initializing a node and a cluster during cluster setup, and with tooling for changing configuration properties. The initial configuration can be provided in either of the forms supported by HOCON.

We suggest that:

  • Global and local properties are always persistent and preserved across restarts, even for purely in-memory clusters. This makes Ignite behavior consistent and eases production cluster usage (there is no need to re-apply the same configuration changes, such as cache creation, when a cluster starts).
  • Property modification tooling is powerful enough to allow local configuration modification in offline mode, as well as to allow overwriting the whole local tree with a predefined HOCON configuration.
  • Most of the properties may be changed at runtime and persisted; however, some of the properties may take effect only after a node or cluster restart. Some of the properties may only be specified during node/cluster initialization (for example, page size).
  • Generally, modification of cluster-wide properties will require some portion of the nodes to be online (similarly to consensus requirements), but we may additionally provide a force option to overwrite the global properties at the user's risk.

Command-line tooling

A unified tool can be used to read and write both global and local configuration properties:

Create cache
ignitectl 'cluster.caches.partitioned={mode=PARTITIONED, backups=2}'
Change baseline auto-adjust
ignitectl cluster.baseline.auto-adjust.enabled=false
Change rebalance threads
ignitectl -node consistent_id node.rebalance.threads=4


Similarly, the tool can read the value of a property or of a whole subtree, export properties to a file, or even update the cluster configuration based on a file.

More examples
# read the property value
ignitectl cluster.baseline.auto-adjust.timeout

# write values from the config file
ignitectl -w custom.properties

# update local properties on all nodes in the current topology
ignitectl -node '*' node.rebalance.threads=4
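How such invocations might be interpreted can be sketched with Python's standard `argparse` module. This is not the implementation of the proposed tool: the function `parse_ignitectl_args` and the returned dictionary layout are hypothetical, and the sketch covers only the `-node` and `-w` options and the `path=value` form shown in the examples above. The value is kept as a raw string; parsing it as HOCON is out of scope here.

```python
import argparse
from typing import Dict, List, Optional


def parse_ignitectl_args(argv: List[str]) -> Dict[str, Optional[str]]:
    """Split an argument list into target node, input file, path and value."""
    parser = argparse.ArgumentParser(prog="ignitectl")
    parser.add_argument("-node", dest="node", default=None)
    parser.add_argument("-w", dest="write_file", default=None)
    parser.add_argument("expression", nargs="?", default=None)
    args = parser.parse_args(argv)

    # "path=value" is a write; a bare "path" is a read of a property or subtree.
    path, sep, value = (args.expression or "").partition("=")
    return {
        "node": args.node,
        "write_file": args.write_file,
        "path": path or None,
        "value": value if sep else None,
    }


write_all = parse_ignitectl_args(["-node", "*", "node.rebalance.threads=4"])
read_one = parse_ignitectl_args(["cluster.baseline.auto-adjust.timeout"])
```

Only the first `=` separates the path from the value, so an object value such as `{mode=PARTITIONED, backups=2}` is passed through intact.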

Appendix A - Full list of configuration properties

Last updated: May 11 2022

Node-local properties
clientConnector.port
clientConnector.portRange
clientConnector.connectTimeout
clientConnector.idleTimeout
compute.threadPoolSize
compute.threadPoolStopTimeoutMillis
network.port
network.portRange
network.shutdownQuietPeriod
network.shutdownTimeout
network.inbound.soBacklog
network.inbound.soReuseAddr
network.inbound.soKeepAlive
network.inbound.soLinger
network.inbound.tcpNoDelay
network.outbound.soKeepAlive
network.outbound.soLinger
network.outbound.tcpNoDelay
network.membership.membershipSyncInterval
network.membership.failurePingInterval
network.membership.scaleCube.membershipSuspicionMultiplier
network.membership.scaleCube.failurePingRequestMembers
network.membership.scaleCube.gossipInterval
network.nodeFinder.type
network.nodeFinder.netClusterNodes
rest.port
rest.portRange

Cluster-wide properties
pageMemory.defaultRegion.pageSize
pageMemory.defaultRegion.persistent
pageMemory.defaultRegion.initSize
pageMemory.defaultRegion.maxSize
pageMemory.defaultRegion.memoryAllocator.type
pageMemory.defaultRegion.evictionMode
pageMemory.defaultRegion.replacementMode
pageMemory.defaultRegion.evictionThreshold
pageMemory.defaultRegion.emptyPagesPoolSize
pageMemory.defaultRegion.checkpointPageBufSize
pageMemory.defaultRegion.lazyMemoryAllocation
pageMemory.regions.<name>.* # same properties as in default region
rocksDb.defaultRegion.size
rocksDb.defaultRegion.writeBufferSize
rocksDb.defaultRegion.cache
rocksDb.defaultRegion.numShardBits
rocksDb.regions.<name>.* # same properties as in default region

Risks and Assumptions

The new configuration approach will require more effort from existing users during migration. Additionally, it will require a thorough revision of the configuration and a clear separation of global and local properties.

Discussion Links

TBD


Reference Links

  1. https://github.com/lightbend/config


Open Tickets


Closed Tickets



1 Comment

  1. Alexey Goncharuk

    I would suggest keeping POJO-based configuration as an option for Java-based setups.

    Otherwise, it looks to me like a prejudice against Java users with their powerful tools like Spring.