...

BinaryObject API should be reworked, as it will not represent actual serialized objects anymore. It should be replaced with something like BinaryRecord or DataRecord representing a record in a cache or table. Similar to the current binary objects, records will provide access to individual fields. A record can also be deserialized into a class with any subset of fields represented in the record.
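
Purely as an illustration of the direction described above, a record-oriented replacement could look like the following sketch; the BinaryRecord name and all method signatures are hypothetical, not a committed API:

// Hypothetical record API sketch; names and signatures are illustrative only.
public interface BinaryRecord {
    /** Version of the schema this record was serialized with. */
    int schemaVersion();

    /** Typed access to an individual field by column name. */
    <T> T field(String columnName);

    /** Deserializes the record into a class with any subset of the fields represented in the record. */
    <T> T deserialize(Class<T> recordClass);
}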


Introducing a versioned schema makes it possible to upgrade rows to the latest version on the fly and even to update a schema automatically in some simple cases, e.g. adding a new column.
So, a user may choose between two modes: Strict and Live, for manual schema management and dynamic schema expansion respectively.

Schema Definition API

There are several ways a schema can be defined. The initial entry point to the schema definition is the SchemaBuilder Java API:

TBD (see SchemaBuilders class for details)
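
While the builder API itself is still TBD, a schema definition via SchemaBuilders could look roughly like the sketch below; the exact method and type names (tableBuilder, column, ColumnType, withPrimaryKey, SchemaTable) are assumptions for illustration only:

// Illustrative sketch only: the builder API is TBD, all names are assumptions.
SchemaTable table = SchemaBuilders.tableBuilder("PUBLIC", "Person")
    .columns(
        SchemaBuilders.column("id", ColumnType.INT64).build(),
        SchemaBuilders.column("name", ColumnType.string()).asNullable().build(),
        SchemaBuilders.column("salary", ColumnType.DOUBLE).withDefaultValue(0.0d).build())
    .withPrimaryKey("id")
    .build();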

The schema builder calls are transparently mapped to DDL statements so that all operations possible via a builder are also possible via DDL and vice versa.

Additionally, we may introduce an API that will infer the schema from a key-value pair using class fields and annotations. The inference happens at the call site of the node invoking the table modification operation.
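
For example, the inferred mapping could be driven by annotations along the lines of the following sketch; the @Id and @Column annotations are hypothetical and only illustrate the idea:

// Hypothetical annotations; the inference API is not defined yet.
class Person {
    @Id
    long id;                 // inferred as a key column

    @Column(nullable = true)
    String name;             // inferred as a nullable value column

    transient int cachedAge; // not part of the inferred schema
}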

Table schema should be automatically exposed to the table configuration subtree so that simple schema changes are available via the Ignite CLI and the schema can be defined during table creation via the Ignite CLI.

Data restrictions

The Schema-first approach imposes certain natural requirements which are stricter than the binary object serialization format:

  • The column type must be one of a predefined set of available 'primitives' (including Strings, UUIDs, date & time values)
  • Arbitrary nested objects and collections are not allowed as column values. Nested POJOs should either be inlined into a schema or stored as BLOBs
  • Date & time values should be compressed while preserving natural order, and decompression should be a trivial operation (like applying a bitmask).

The suggested set of supported built-in data types is listed in the table below:

Type | Size | Description
Bitmask(n) | n/8 bytes | A fixed-length bitmask of n bits
Int8 | 1 byte | 1-byte signed integer
Uint8 | 1 byte | 1-byte unsigned integer
Int16 | 2 bytes | 2-byte signed integer
Uint16 | 2 bytes | 2-byte unsigned integer
Int32 | 4 bytes | 4-byte signed integer
Uint32 | 4 bytes | 4-byte unsigned integer
Int64 | 8 bytes | 8-byte signed integer
Uint64 | 8 bytes | 8-byte unsigned integer
Float | 4 bytes | 4-byte floating-point number
Double | 8 bytes | 8-byte floating-point number
Number([n]) | Variable | Variable-length number (optionally bound by n bytes in size)
Decimal | Variable | Variable-length floating-point number
UUID | 16 bytes | UUID
String | Variable | A string encoded with a given Charset
Date | 3 bytes | A timezone-free date encoded as a year (15 bits), month (4 bits), day (5 bits)
Time | 4 bytes | A timezone-free time encoded as padding (5 bits), hour (5 bits), minute (6 bits), second (6 bits), millisecond (10 bits)
Datetime | 7 bytes | A timezone-free datetime encoded as (date, time)
Timestamp | 8 bytes | Number of milliseconds since Jan 1, 1970 00:00:00.000 (with no timezone)
Binary | Variable | Variable-size byte array
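
To illustrate the order-preserving encoding of the Date type from the table above, a minimal packing sketch could look as follows; placing the year in the most significant bits keeps the natural date order under plain integer comparison (method names are illustrative):

// Packs a date into 24 bits (3 bytes): year (15 bits), month (4 bits), day (5 bits).
static int packDate(int year, int month, int day) {
    return (year << 9) | (month << 5) | day;
}

// Decompression is a trivial bitmask application, as required above.
static int year(int packed)  { return packed >>> 9; }
static int month(int packed) { return (packed >>> 5) & 0xF; }
static int day(int packed)   { return packed & 0x1F; }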

...

Field | Size
Schema version | 2 bytes
Flags | 2 bytes
Key columns hash | 4 bytes
Key chunk:
Key chunk size | 4 bytes
Null-map | number of columns / 8
Variable-length columns offsets table size | 2 bytes
Variable-length columns offsets table | Variable (number of non-null varlen columns * 4)
Fix-sized columns values | Variable
Variable-length columns values | Variable
Value chunk:
Value chunk size | 4 bytes
Null-map | number of columns / 8
Variable-length columns offsets table size | 2 bytes
Variable-length columns offsets table | Variable (number of non-null varlen columns * 4)
Fix-sized columns values | Variable
Variable-length columns values | Variable
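
A minimal sketch of reading the fixed part of this header is shown below; the little-endian byte order is an assumption, and the field offsets simply follow the table above:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

static void readRowHeader(byte[] rowBytes) {
    ByteBuffer row = ByteBuffer.wrap(rowBytes).order(ByteOrder.LITTLE_ENDIAN); // byte order is an assumption
    short schemaVersion = row.getShort(0); // Schema version: 2 bytes
    short flags         = row.getShort(2); // Flags: 2 bytes
    int keyColumnsHash  = row.getInt(4);   // Key columns hash: 4 bytes
    int keyChunkSize    = row.getInt(8);   // First field of the key chunk
}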

For small rows, the metadata sizes may introduce a very noticeable overhead, so it looks reasonable to write them in a more compact way using different techniques:

  • VarInt - a variable-size integer encoding for sizes (see the sketch after this list)
  • different VarTable formats with byte/short/int offsets
  • skip writing the VarTable and/or Null-map if possible.
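
As a sketch of the first technique, a LEB128-style VarInt stores 7 payload bits per byte and uses the high bit as a continuation marker, so small sizes take a single byte:

// Writes 'value' as a VarInt at 'pos' and returns the new write position.
static int writeVarInt(byte[] buf, int pos, int value) {
    while ((value & ~0x7F) != 0) {
        buf[pos++] = (byte) ((value & 0x7F) | 0x80); // more bytes follow
        value >>>= 7;
    }
    buf[pos++] = (byte) value; // final byte, high bit clear
    return pos;
}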

The flags field can be used to detect the format. 

IMPORTANT: with multiple formats, it MUST be guaranteed that the key (as well as the value) chunk is always written in exactly one way, to allow comparing chunks of rows of the same version as plain byte arrays.

The flags field is a bitmask with each bit treated as a flag, with the following flags available (from flag 0 being the LSB to flag 15 being the MSB; see the sketch after this list):

  • Flag 0: no value. If the flag is set, the value chunk is omitted, e.g. the row represents a tombstone
  • Flag 1: skip key Null-map. If the flag is set, all column values in the key chunk are non-null, so the Null-map for the key chunk is omitted
  • Flag 2: skip value Null-map. If the flag is set, all column values in the value chunk are non-null, so the Null-map for the value chunk is omitted
  • Flag 3: skip key VarTable. If the flag is set, all column values in the key chunk are either of a fix-sized type or null, so the VarTable for the key chunk is omitted.
  • Flag 4: skip value VarTable. If the flag is set, all column values in the value chunk are either of a fix-sized type or null, so the VarTable for the value chunk is omitted.
  • Flags 5-15: Reserved for future use.
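
A sketch of the corresponding flag constants and a typical check (constant names are illustrative):

static final int NO_VALUE            = 1 << 0; // row is a tombstone
static final int SKIP_KEY_NULL_MAP   = 1 << 1;
static final int SKIP_VALUE_NULL_MAP = 1 << 2;
static final int SKIP_KEY_VARTABLE   = 1 << 3;
static final int SKIP_VALUE_VARTABLE = 1 << 4;

static boolean isTombstone(short flags) {
    return (flags & NO_VALUE) != 0;
}
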
Hash calculation and key comparison

A row hash can be calculated from the affinity field values while marshaling to a byte array. Because the field order is defined by the schema, a key hash can be calculated consistently regardless of the column order.

Keys can be compared as byte[] for compatible schemas (those that have the same key column set); otherwise, the older row should be upgraded first.
If the same key can be serialized in more than one way (e.g. if some kind of compression is supported and compressed rows are marked with a flag), it is possible to compare keys column-by-column with respect to the schema.
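
A sketch of the fast-path lookup this enables, using the inlined 4-byte hash as a cheap negative check before the byte comparison (an ordered comparison of compatible keys could use Arrays.compareUnsigned the same way):

import java.util.Arrays;

static boolean keysEqual(int leftHash, byte[] leftKeyChunk, int rightHash, byte[] rightKeyChunk) {
    // A hash mismatch rules the pair out without touching the chunks.
    return leftHash == rightHash && Arrays.equals(leftKeyChunk, rightKeyChunk);
}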

...

Given the schema evolution history, a row migration from version N-k to version N is a straightforward operation. We identify fields that were dropped during the last k schema operations and fields that were added (taking into account default field values) and update the row based on the field modifications. Afterward, the updated row is written in the schema version N layout format. The row upgrade may happen on read, with an optional writeback, or on the next update. Additionally, a row upgrade in the background is possible.
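
A sketch of this upgrade procedure is below; Row, Column and SchemaDescriptor, together with the droppedColumns/addedColumns diff accessors, are hypothetical names used only to illustrate the replay of the schema history:

import java.util.List;
import java.util.Map;

static Row upgrade(Row row, List<SchemaDescriptor> history) {
    Map<String, Object> fields = row.fields(); // field name -> value
    int latest = history.size() - 1;

    // Replay the k schema operations between the row's version and version N.
    for (int v = row.schemaVersion() + 1; v <= latest; v++) {
        SchemaDescriptor schema = history.get(v);
        for (String dropped : schema.droppedColumns())
            fields.remove(dropped);
        for (Column added : schema.addedColumns())
            fields.putIfAbsent(added.name(), added.defaultValue());
    }

    return Row.write(history.get(latest), fields); // serialized in the version N layout
}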

Since the row key hashcode is inlined into the row data for quick key lookups, we require that the set of key columns does not change during the schema evolution. In the future, we may remove this restriction, but this will require careful hashcode calculation adjustments, since the hash code value should not change after adding a new column with a default value. Removing a column from the key columns does not seem possible, since it may produce duplicates, and checking for duplicates may require a full scan.

...

If one tries to serialize an object with a 'short' value out of the Uint8 range, it will end up with an exception (ColumnValueIsOutOfRangeException).
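
A sketch of the range check such a column could perform (the exception class name is taken from the sentence above; the helper itself is illustrative):

static byte toUint8(short value) {
    if (value < 0 || value > 255)
        throw new ColumnValueIsOutOfRangeException("Uint8 value out of range: " + value);
    return (byte) value; // stored as a single byte
}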

Dynamic schema expansion (Live-schema)

One of the important benefits of binary objects was the ability to store objects with different sets of fields in a single cache. We can accommodate very similar behavior in the schema-first approach.

When a tuple is inserted into a table, we attempt to 'fit' the tuple fields to the schema columns. If the tuple has some extra fields which are not present in the current schema, the schema is automatically updated to store the extra fields present in the tuple.
This works the same way for language-native objects: e.g. Java objects, or objects in terms of other languages for which a mapping implementation exists.

On the other hand, if a tuple has fewer fields than the current schema, the schema is not updated automatically (such a scenario usually means that an update is executed from an outdated client which has not yet received the proper object class version). In other words, columns are never dropped during automatic schema evolution; a column can only be dropped by an explicit user command.
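
The expansion step could look like the following sketch; Tuple, SchemaDescriptor and Column are hypothetical names, and inferType stands for whatever type inference the implementation provides:

import java.util.ArrayList;
import java.util.List;

static SchemaDescriptor fit(Tuple tuple, SchemaDescriptor schema) {
    List<Column> extra = new ArrayList<>();

    for (String field : tuple.fieldNames()) {
        if (!schema.hasColumn(field))
            extra.add(new Column(field, inferType(tuple.value(field)), true /* nullable */));
    }

    // Columns are only ever added; dropping one requires an explicit user command.
    return extra.isEmpty() ? schema : schema.addColumns(extra); // bumps the schema version
}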

...