ID: IEP-117
Author:
Sponsor:
Created:
Status: IN PROGRESS

Motivation

Apache Ignite 3.x supports multiple storage engines. At the moment, the following storage engines are supported:

  • PageMemory-based in-memory key-value storage engine
  • PageMemory-based persistent key-value storage engine
  • RocksDB-based persistent key-value storage engine

Having several storage engines lets users choose the storage best suited to their use case.

Definitions

Storage engine - a subsystem that provides all data structures and functionality needed to store, manipulate and fetch data. A storage engine implementation must be deployed on all cluster nodes where this storage can be used.

Storage engine bridge interface - because Apache Ignite 3.x is an SQL database, storage engines must implement the storage engine bridge interface. Its main responsibility is translating the low-level data manipulation API into the operations of a specific storage engine. In general, a storage engine knows nothing about the database schema or table relations, so the bridge must map Apache Ignite's knowledge of the data onto storage-specific data structures (e.g. the RocksDB-based storage engine is a pure key-value storage, and the storage engine bridge interface must translate operations expressed in Apache Ignite terms into RocksDB terms).

Schema - the only source of knowledge about database entities such as catalogs and tables. The schema is a versioned entity. The schema version must be incremented on each schema change (addition or deletion of tables, views, etc.). NOTE: additional details about schema versioning and the propagation of schema changes will be defined later in a separate protocol.

Table schema - the only source of knowledge about the data stored in a table (e.g. column types, primary key, constraints). The table schema is a versioned entity. The version must be incremented on each table alteration. NOTE: additional details about schema versioning and the propagation of schema changes will be defined later in a separate protocol.

Storage engine configuration - the part of the Apache Ignite configuration that holds storage engine specific settings. Each storage engine has a unique name that identifies it.

Storage engine registry - a part of the system configuration that provides access to the configuration of all storage engines. The storage engine registry does not allow the same storage engine to be registered more than once.

General Provisions

Every storage engine has a specific configuration that depends on the details of its implementation. This configuration must be exposed to the user in a natural way, that is, through the public API and the CLI tool.

Apache Ignite may use different storage engines for different tables at the same time (e.g. some tables reside in memory while others reside in a persistent storage engine).

Storage Engine Configuration

To configure a storage engine, both its configuration schema and its implementation must be available on the cluster node. A service loader is used to pick up the storage engine implementation and the storage engine specific configuration.
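The discovery step can be sketched as follows, assuming that "service loader" refers to the standard java.util.ServiceLoader mechanism. StorageEngineFactory and StorageEngineDiscovery are hypothetical names used only for illustration and are not the actual Ignite interfaces. The sketch also enforces the registry rule that a storage engine name may appear only once.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

/** Hypothetical SPI that a storage engine module would expose; not the real Ignite interface. */
interface StorageEngineFactory {
    /** Unique name of the storage engine. */
    String name();
}

/** Collects the storage engines found on the classpath, keyed by their unique name. */
final class StorageEngineDiscovery {
    private StorageEngineDiscovery() {
    }

    /** Loads every provider registered for StorageEngineFactory via META-INF/services. */
    static Map<String, StorageEngineFactory> discover() {
        return index(ServiceLoader.load(StorageEngineFactory.class));
    }

    /** Indexes factories by name and rejects a name that is registered twice. */
    static Map<String, StorageEngineFactory> index(Iterable<StorageEngineFactory> factories) {
        Map<String, StorageEngineFactory> byName = new LinkedHashMap<>();
        for (StorageEngineFactory factory : factories) {
            if (byName.putIfAbsent(factory.name(), factory) != null) {
                throw new IllegalStateException("Duplicate storage engine name: " + factory.name());
            }
        }
        return byName;
    }
}
```

A node with no provider on its classpath gets an empty map from discover(), which corresponds to the case where the engine implementation is not available on that node.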

Any storage engine can be configured either during cluster initialization or at runtime, while the cluster is running. Each storage engine implementation listens to the meta storage for configuration changes and reacts to them.

Default Storage Engine

It should be possible to configure the name of the default storage engine that will be used in DDL scripts.

The following scenarios are possible:

  1. Only one storage engine is configured in the system.
  2. More than one storage engine is configured, and
    a. the defaultStorageEngine property is defined.
    b. the defaultStorageEngine property is not defined.

The default storage engine name should be resolved according to the following rules:

  • Scenario 1: The name of the single configured storage engine is used. The defaultStorageEngine property must be ignored.
  • Scenario 2a: The defaultStorageEngine property must be used.
  • Scenario 2b: The choice is ambiguous. An error should be returned to the user.
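The resolution rules can be sketched as a small function. This is only an illustration: DefaultStorageEngineResolver is a hypothetical name, and two cases that the rules do not cover (no engine configured at all, and a defaultStorageEngine value that names an engine that is not configured) are treated as errors here purely as an assumption.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper that applies the default storage engine resolution rules. */
final class DefaultStorageEngineResolver {
    private DefaultStorageEngineResolver() {
    }

    static String resolve(Set<String> configuredEngines, Optional<String> defaultStorageEngine) {
        // Not covered by the rules; treated as an error by assumption.
        if (configuredEngines.isEmpty()) {
            throw new IllegalStateException("No storage engine is configured");
        }

        // Scenario 1: a single engine wins, the property is ignored.
        if (configuredEngines.size() == 1) {
            return configuredEngines.iterator().next();
        }

        // Scenario 2a: several engines, the property decides.
        if (defaultStorageEngine.isPresent()) {
            String name = defaultStorageEngine.get();
            // Not covered by the rules; treated as an error by assumption.
            if (!configuredEngines.contains(name)) {
                throw new IllegalStateException("Unknown default storage engine: " + name);
            }
            return name;
        }

        // Scenario 2b: several engines and no property, so the choice is ambiguous.
        throw new IllegalStateException(
                "Ambiguous default storage engine, candidates: " + configuredEngines);
    }
}
```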

Storage Initialization and Start

A storage engine can be initialized and started during cluster initialization. A new storage engine can also be configured, and hence initialized and started, while the cluster is already running.

DDL support

There must be an ability to define a storage engine for every table using the following DDL syntax:

CREATE TABLE <table_name> (
    <table_schema>
) ENGINE <storage_engine_name>;

It is possible to define a default storage engine in the following manner:

SET default_engine = <storage_engine_name>;

CREATE TABLE <table_name> (
    <table_schema>
);

The default_engine parameter is not a durable property of the cluster. The scope of the SET expression is limited to the execution context of the DDL script.

When the storage engine is not defined explicitly (using ENGINE or SET default_engine), the defaultStorageEngine configuration property should be used (see the resolution rules in the Default Storage Engine section above).

If no storage engine with the given name exists, an error should be returned to the user.
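The order in which the engine name is chosen for a new table can be sketched as follows: the ENGINE clause first, then the script-scoped SET default_engine value, then the configured default. TableEngineSelector and its parameter names are hypothetical, and the configured default is passed in as a supplier so that the sketch does not depend on how that default is resolved.

```java
import java.util.Optional;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical helper that picks the storage engine for a CREATE TABLE statement. */
final class TableEngineSelector {
    private TableEngineSelector() {
    }

    static String select(
            Optional<String> engineClause,      // value of ENGINE, if present
            Optional<String> scriptDefault,     // value of SET default_engine, if present
            Supplier<String> configuredDefault, // resolved defaultStorageEngine property
            Set<String> knownEngines) {
        // Explicit ENGINE wins, then the script-scoped default, then the configuration.
        String name = engineClause
                .or(() -> scriptDefault)
                .orElseGet(configuredDefault);

        // An engine name that does not exist is an error for the user.
        if (!knownEngines.contains(name)) {
            throw new IllegalArgumentException("Unknown storage engine: " + name);
        }
        return name;
    }
}
```

Because the configured default is a supplier, it is evaluated only when neither ENGINE nor SET default_engine is given, so an ambiguous cluster configuration does not affect scripts that name their engine explicitly.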

Different storage engines have different parameters that may need to be taken into account (e.g. DataRegion for the page memory based storage), so it should be possible to pass these parameters to the aforementioned statements using the WITH keyword. For example:

CREATE TABLE <table_name> (
    <table_schema>
) ENGINE <storage_engine_name>
WITH dataRegion = <data_region_name>;

The storage engine of a table cannot be changed after the table is created, because doing so would require moving a large amount of data between storages. A user can, however, copy the table into another table created on a different storage engine. This kind of action may be automated later.

Whether a storage-specific parameter (such as dataRegion) can be changed at runtime depends on the parameter. This must be defined for every parameter in the configuration of the specific storage.
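One way to express this per-parameter rule is for each engine to declare its parameters together with a flag saying whether they may change at runtime. The sketch below is an assumption about how that could look; EngineParameterRules is a hypothetical name, and which parameters are actually mutable is left to each storage engine.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-engine declaration of WITH parameters and their runtime mutability. */
final class EngineParameterRules {
    /** Parameter name mapped to true if the parameter may be changed at runtime. */
    private final Map<String, Boolean> mutableByName;

    EngineParameterRules(Map<String, Boolean> mutableByName) {
        this.mutableByName = Map.copyOf(mutableByName);
    }

    /** Checks parameters passed via WITH at table creation: every name must be declared. */
    void validateOnCreate(Map<String, String> params) {
        for (String name : params.keySet()) {
            if (!mutableByName.containsKey(name)) {
                throw new IllegalArgumentException("Unknown parameter: " + name);
            }
        }
    }

    /** Returns a copy of the current values with one parameter changed, if that is allowed. */
    Map<String, String> alter(Map<String, String> current, String name, String value) {
        Boolean mutable = mutableByName.get(name);
        if (mutable == null) {
            throw new IllegalArgumentException("Unknown parameter: " + name);
        }
        if (!mutable) {
            throw new UnsupportedOperationException("Parameter is fixed at creation: " + name);
        }
        Map<String, String> updated = new HashMap<>(current);
        updated.put(name, value);
        return updated;
    }
}
```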

Storage Engine Bridge Interface

The storage engine bridge interface is provided by Apache Ignite, and every storage engine must provide an implementation of it. The implementation is responsible for translating a set of Apache Ignite operations into storage engine specific operations so that the data can be manipulated efficiently (e.g. for the RocksDB-based storage engine, all SQL statements are eventually translated into some set of key-value operations).

The storage engine bridge interface is a collective term rather than a single interface in the Java sense. In practice it is a family of interfaces that together represent the API to be implemented by the vendor of a specific storage engine.

The current design contains at least the following entities, which could represent the storage bridge interface (together or separately):

  • org.apache.ignite.internal.storage.engine.TableStorage interface
  • org.apache.ignite.internal.table.distributed.storage.VersionedRowStore interface
  • org.apache.ignite.internal.storage.PartitionStorage interface
  • org.apache.ignite.internal.storage.engine.StorageEngine interface
  • org.apache.ignite.configuration.schemas.store.StorageEngineConfigurationSchema interface
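To illustrate the translation role of the bridge, here is a deliberately simplified sketch. RowStoreBridge and KeyValueRowStoreBridge are hypothetical and do not mirror the signatures of the interfaces listed above. A sorted in-memory map stands in for a pure key-value store, and the key layout (one entry per column under a "table/primaryKey/column" key) is an assumption chosen only to keep the example short; it also assumes that keys and column names contain no "/" character.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical, heavily simplified bridge; the real Ignite interfaces differ. */
interface RowStoreBridge {
    /** Writes or overwrites the given columns of a row. */
    void write(String primaryKey, Map<String, String> columns);

    /** Reads all stored columns of a row, if the row exists. */
    Optional<Map<String, String>> read(String primaryKey);
}

/** Translates row operations into key-value operations on a sorted store. */
final class KeyValueRowStoreBridge implements RowStoreBridge {
    private final TreeMap<String, String> kv = new TreeMap<>();
    private final String tableName;

    KeyValueRowStoreBridge(String tableName) {
        this.tableName = tableName;
    }

    @Override
    public void write(String primaryKey, Map<String, String> columns) {
        // One key-value entry per column: "<table>/<pk>/<column>" -> value.
        columns.forEach((column, value) ->
                kv.put(tableName + "/" + primaryKey + "/" + column, value));
    }

    @Override
    public Optional<Map<String, String>> read(String primaryKey) {
        String prefix = tableName + "/" + primaryKey + "/";
        Map<String, String> row = new TreeMap<>();
        // Prefix scan: keys sharing the prefix are contiguous in a sorted store.
        for (Map.Entry<String, String> entry : kv.tailMap(prefix).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break;
            }
            row.put(entry.getKey().substring(prefix.length()), entry.getValue());
        }
        return row.isEmpty() ? Optional.empty() : Optional.of(row);
    }
}
```

The caller works only with rows and primary keys, while the key-value layout stays an internal detail of the bridge implementation.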

Transparent Data Access And Modification

Since different tables can store their data in different storage engines, any data manipulation operation must access the data transparently, using only a table name as input. A user should not have to care about a specific storage engine once the database schema is defined and created. This also means that the SQL query engine must be able to execute any cross-table query, and hence any cross-engine query, transparently.

Open Questions

Default storage engine

It will probably be useful to provide a way to configure a default storage engine. It should be a cluster-wide property, because different values of this property on different nodes would lead to unpredictable behavior.

Such a configuration property could also cause problems when the same DDL script is applied to clusters with different configurations (e.g. test and production environments). Explicitly using the SET default_engine = <storage_engine_name> statement is therefore the safe choice for a user.

On the other hand, in many cases only one storage engine exists in the system. The system can then infer the default storage engine name from the configuration, which is easy because only one storage engine is configured.

Usage of storage specific parameters in DDL

Every storage engine has a different configuration because of differences in implementation. A user should be able to configure a storage engine in that engine's own well-known terms rather than in terms invented by Apache Ignite developers. For example, there is no “data region” term in RocksDB. Moreover, it is hard (maybe impossible) to invent abstractions that fit every storage engine.

So we need some way for defining storage specific configuration properties that could be used from DDL (e.g. data region name for page memory storage).

