| ID | IEP-54 |
|---|---|
| Author | |
| Sponsor | |
| Created | |
| Status | DRAFT |
The way Ignite works with data schemas is inconsistent, and this creates multiple usability issues.
The general idea is to have a one-to-one mapping between data schemas and caches/tables: every cache has a single unified schema, which is applied both to the data storage itself and to SQL.
When a cache is created, it is configured with a corresponding data schema. There must be an API and a tool to inspect the current version of the schema for any cache, as well as to make updates to it. Schema updates are applied dynamically, without downtime.
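A minimal sketch of what such an API might look like follows; all names (SchemaRegistry, SchemaChange, and so on) are illustrative assumptions rather than a proposed concrete interface.

```java
// Illustrative only: these types do not exist in Ignite. They merely show the
// operations the text calls for: inspect the schema, read its version, update it.
interface SchemaRegistry {
    Schema currentSchema(String cacheName);                    // current schema of a cache
    int currentVersion(String cacheName);                      // current schema version
    void createCache(String cacheName, Schema initial);        // a cache is created with a schema
    void updateSchema(String cacheName, SchemaChange change);  // applied dynamically, no downtime
}

interface Schema {
    SchemaChange change();  // start describing an update to this schema
}

interface SchemaChange {
    SchemaChange addColumn(String name, String type, Object defaultValue);
    SchemaChange dropColumn(String name);
}
```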
DDL should work on top of this API, providing similar functionality. For example, a CREATE TABLE invocation translates into the creation of a cache with the schema described in the statement.
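For instance, the translation could look as follows (the registry and SchemaBuilder names are assumed, as above):

```java
// CREATE TABLE Person (id INT, name VARCHAR(32))
// translates into the creation of a cache configured with the equivalent schema:
registry.createCache("Person", SchemaBuilder.create()
        .column("id", "INT")
        .column("name", "VARCHAR(32)")
        .build());
```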
Anything stored in a cache/table must be compliant with the current schema. An attempt to store incompatible data should fail.
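For example (SchemaViolationException and record() are assumed names; the proposal only requires that such an operation fails):

```java
// The schema of "Person" has columns (id, name); a record with an unknown
// column must be rejected before it reaches storage.
try {
    personTable.put(1, record().set("id", 1).set("surname", "Doe")); // no such column
} catch (SchemaViolationException e) {
    // the write fails; nothing incompatible is stored
}
```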
The binary protocol should be used only as the data storage format. All serialization that happens for communication purposes only should be performed by a different protocol. The data storage format will be coupled with the schemas, while the communication format is independent of them. As a bonus, this will likely allow for multiple optimizations on both sides, as the serialization protocols will become more narrowly purposed.
The BinaryObject API should be reworked, as it will no longer represent actual serialized objects. It should be replaced with something like BinaryRecord or DataRecord, representing a record in a cache or table. Similarly to the current binary objects, records will provide access to individual fields. A record can also be deserialized into a class with any subset of the fields represented in the record.
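A possible shape for such a record API is sketched below; BinaryRecord is the name suggested above, while the method names are assumptions.

```java
// Illustrative sketch of a record API replacing BinaryObject.
interface BinaryRecord {
    int schemaVersion();                 // schema version this record was written with
    <T> T field(String name);            // access to an individual field
    <T> T deserialize(Class<T> target);  // map onto a class with any subset of the record's fields
}
```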
The schema-first approach imposes certain natural requirements that are stricter than those of the binary object serialization format.
Unlike the Ignite 2.x approach, where the binary object schema ID is defined by the set of fields present in a binary object, the schema-first approach assigns a monotonically growing identifier to each version of the cache schema. The ordering guarantees should be provided by the underlying metadata storage layer (for example, the current distributed metastorage implementation or a consensus-based metadata storage). The schema identifier should be stored together with the data tuples (though not necessarily with each tuple individually: the schema ID can be stored along with a page or a larger chunk of data). The history of schema versions must be kept for long enough to allow upgrading all existing data stored in a given cache.
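A compact sketch of the data structures this implies is given below; the type names are assumptions for illustration.

```java
import java.util.List;

// Illustrative sketch of the per-cache schema history.
record Column(String name, String type, Object defaultValue) {}
record Delta(boolean added, Column column) {}  // a column added to or dropped from the schema

// Each version carries a monotonically growing ID assigned by the metadata storage.
record SchemaVersion(int id, List<Column> columns, List<Delta> delta) {}

// The ordered version list is kept long enough to upgrade any tuple
// still stored under an old schema version.
record SchemaHistory(String cacheName, List<SchemaVersion> versions) {}
```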
Given the schema evolution history, migrating a tuple from version N-k to version N is a straightforward operation: we identify the fields that were dropped and the fields that were added during the last k schema changes (taking default field values into account) and update the tuple accordingly. Afterwards, the updated tuple is written in the version-N layout. The upgrade may happen on read (with an optional writeback), on the next update, or as a background process; a code sketch follows the worked example below.
For example, consider the following sequence of schema modifications expressed in SQL-like terms:
```sql
CREATE TABLE Person (id INT, name VARCHAR(32), lastname VARCHAR(32), taxid INT);
ALTER TABLE Person ADD COLUMN residence VARCHAR(2) DEFAULT 'GB';
ALTER TABLE Person DROP COLUMN lastname, taxid;
ALTER TABLE Person ADD COLUMN lastname VARCHAR(32) DEFAULT 'N/A';
```

This sequence of modifications results in the following schema history:
| ID | Columns | Delta |
|---|---|---|
| 1 | id, name, lastname, taxid | N/A |
| 2 | id, name, lastname, taxid, residence | +residence ("GB") |
| 3 | id, name, residence | -lastname, -taxid |
| 4 | id, name, residence, lastname | +lastname ("N/A") |
With this history, upgrading a tuple (1, "John", "Doe", null) of version 1 to version 4 means erasing the columns lastname and taxid and adding the column residence with default "GB" and the column lastname (the column is returned back) with default "N/A", resulting in the tuple (1, "John", "GB", "N/A").
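The following minimal sketch hard-codes the deltas of this example, representing a tuple as a plain map; a real implementation would drive the same steps from the stored schema history.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TupleUpgradeExample {
    public static void main(String[] args) {
        // A version 1 tuple: (id, name, lastname, taxid).
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("id", 1);
        tuple.put("name", "John");
        tuple.put("lastname", "Doe");
        tuple.put("taxid", null);

        // v1 -> v2: +residence ("GB")
        tuple.put("residence", "GB");
        // v2 -> v3: -lastname, -taxid
        tuple.remove("lastname");
        tuple.remove("taxid");
        // v3 -> v4: +lastname ("N/A")
        tuple.put("lastname", "N/A");

        // Prints {id=1, name=John, residence=GB, lastname=N/A}
        System.out.println(tuple);
    }
}
```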
It is clear that, given a fixed schema, we can generate an infinite number of classes that match the columns of this schema. This observation can be used to simplify ORM for end users: for the APIs that return Java objects, the mapping from schema columns to object fields can be constructed dynamically, allowing a single tuple to be deserialized into instances of different classes.
For example, let's say we have a schema PERSON (id INT, name VARCHAR(32), lastname VARCHAR(32), residence VARCHAR(2), taxid INT). Each tuple of this schema can be deserialized into either of the following classes:
```java
class Person { int id; String name; String lastName; }

class RichPerson { int id; String name; String lastName; String residence; int taxId; }
```
For each table, a user may specify a default Java class binding, and for each individual operation a user may provide a target class for deserialization:
```java
Person p = table.get(key, Person.class);
```
Given the set of fields in the target class, Ignite may optimize the amount of data sent over the network by skipping fields that would be ignored during deserialization.
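For instance, reading the same key into the two classes above would fetch different amounts of data (the skipping behavior is the intended optimization, not a promise of the current API):

```java
Person p = table.get(key, Person.class);           // only id, name, lastName need to travel
RichPerson rp = table.get(key, RichPerson.class);  // all five columns are fetched
```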