...

Given a set of user-defined columns, the set is rearranged so that fixed-size columns go first. This sorted set of columns is used to form a row. The row layout is as follows:

Field | Size
Schema version | 2 bytes
Flags | 2 bytes
Key columns hash | 4 bytes
Key columns chunk:
  Key columns full chunk size | 4 bytes
  Key columns variable-length columns offsets table size | 2 bytes
  Key columns variable-length columns offsets table | Variable (number of non-null non-default varlen columns * 4)
  Key columns null-defaults map | number of columns / 8
  Key columns fixed-size values | Variable
  Key columns variable-length values | Variable
Value columns chunk:
  Value columns full chunk size | 4 bytes
  Value columns variable-length columns offsets table size | 2 bytes
  Value columns variable-length columns offsets table | Variable (number of non-null non-default varlen columns * 4)
  Value columns null-defaults map | number of columns / 8
  Value columns fixed-size values | Variable
  Value columns variable-length values | Variable
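As an illustration, a key chunk with one fixed-size int column and one variable-length column could be laid out as follows. This is a self-contained sketch of the layout described above, not the actual Ignite implementation; the class and method names are made up for the example.

```java
import java.nio.ByteBuffer;

// Illustrative sketch: writing one key chunk with a single fixed-size
// column (int) and a single non-null variable-length column (bytes).
public class ChunkLayoutSketch {
    static byte[] writeKeyChunk(int fixedVal, byte[] varlenVal, int columnCount) {
        int offsetsTableSize = 1 * 4;            // one non-null varlen column * 4
        int nullMapSize = (columnCount + 7) / 8; // number of columns / 8, rounded up
        int fullSize = 4 + 2 + offsetsTableSize + nullMapSize + 4 + varlenVal.length;

        ByteBuffer buf = ByteBuffer.allocate(fullSize);
        buf.putInt(fullSize);                    // full chunk size, 4 bytes
        buf.putShort((short) offsetsTableSize);  // offsets table size, 2 bytes
        buf.putInt(4 + 2 + offsetsTableSize + nullMapSize + 4); // offset of the varlen value
        buf.put(new byte[nullMapSize]);          // null map: all columns present
        buf.putInt(fixedVal);                    // fixed-size values go first
        buf.put(varlenVal);                      // variable-length values follow
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] chunk = writeKeyChunk(42, "John".getBytes(), 2);
        System.out.println(chunk.length); // prints 19
    }
}
```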

The flags field is a bitmask with each bit treated as a flag, with the following flags available (from flag 0 being the LSB to flag 15 being the MSB):

  • Flag 0: no value. If the flag is set, the value chunk is omitted, i.e. the row represents a tombstone.
  • Flag 1: skip key nullmap. If the flag is set, all column values in the key chunk are non-null and non-default, so the null map for the key chunk is omitted.
  • Flag 2: skip value nullmap. If the flag is set, all column values in the value chunk are non-null and non-default, so the null map for the value chunk is omitted.
  • Flag 3: skip key varlen table. If the flag is set, all column values in the key chunk are either of a fixed-size type or null, so the varlen table for the key chunk is omitted.
  • Flag 4: skip value varlen table. If the flag is set, all column values in the value chunk are either of a fixed-size type or null, so the varlen table for the value chunk is omitted.
  • Flags 5-15: Reserved for future use.
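The flag checks above come down to plain bitmask arithmetic. The sketch below is illustrative; the constants mirror the flag list, but the names are not taken from the Ignite code base.

```java
// Illustrative decoding of the 2-byte flags field described above.
public class RowFlags {
    static final int NO_VALUE = 1 << 0;               // flag 0: row is a tombstone
    static final int SKIP_KEY_NULLMAP = 1 << 1;       // flag 1
    static final int SKIP_VALUE_NULLMAP = 1 << 2;     // flag 2
    static final int SKIP_KEY_VARLEN_TABLE = 1 << 3;  // flag 3
    static final int SKIP_VALUE_VARLEN_TABLE = 1 << 4;// flag 4

    static boolean isSet(short flags, int flag) {
        return (flags & flag) != 0;
    }

    public static void main(String[] args) {
        short flags = (short) (NO_VALUE | SKIP_KEY_NULLMAP);
        System.out.println(isSet(flags, NO_VALUE));           // prints true: tombstone
        System.out.println(isSet(flags, SKIP_VALUE_NULLMAP)); // prints false
    }
}
```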
Schema evolution

Unlike the Ignite 2.x approach, where the binary object schema ID is defined by the set of fields present in the binary object, the schema-first approach assigns a monotonically increasing identifier to each version of the cache schema. The ordering guarantees should be provided by the underlying metadata storage layer (for example, the current distributed metastorage implementation or a consensus-based metadata storage). The schema identifier should be stored together with the data rows (though not necessarily with each row individually: we can store the schema ID along with a page or a larger chunk of data). The history of schema versions must be stored long enough to allow upgrading all existing data stored in a given cache.

Given the schema evolution history, a row migration from version N-k to version N is a straightforward operation. We identify the fields that were dropped during the last k schema operations and the fields that were added (taking default field values into account) and update the row based on these field modifications. Afterward, the updated row is written in the schema version N layout format. The row upgrade may happen on read with an optional write-back, or on the next update. Additionally, a background row upgrade is possible.
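A minimal sketch of this migration step, using plain maps as stand-ins for rows and hypothetical column lists (none of these types come from Ignite):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: upgrade a row from version N-k to version N by erasing the
// columns dropped along the way and adding new columns with their defaults.
public class SchemaUpgradeSketch {
    static Map<String, Object> upgrade(Map<String, Object> row,
                                       List<String> droppedColumns,
                                       Map<String, Object> addedDefaults) {
        Map<String, Object> upgraded = new LinkedHashMap<>(row);
        droppedColumns.forEach(upgraded::remove); // erase dropped columns first
        upgraded.putAll(addedDefaults);           // added columns get default values
        return upgraded;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "John");
        row.put("lastname", "Doe");

        // lastname is dropped and then re-added, so its old value is erased.
        Map<String, Object> defaults = new LinkedHashMap<>();
        defaults.put("residence", "GB");
        defaults.put("lastname", "N/A");

        System.out.println(upgrade(row, List.of("lastname", "taxid"), defaults));
        // prints {id=1, name=John, residence=GB, lastname=N/A}
    }
}
```

Note that a column dropped and later re-added (like lastname in the example below) must go through the drop, so the re-added column carries its default rather than the stale value.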

Since the row key hash code is inlined into the row data for quick key lookups, we require that the set of key columns does not change during schema evolution. In the future, we may remove this restriction, but this will require careful hash code calculation adjustments, since the hash code value must not change after adding a new column with a default value. Removing a column from the key columns does not seem possible, since it may produce duplicates, and checking for duplicates may require a full scan.
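One possible adjustment (an assumption about how the restriction could be lifted, not the documented design) is to skip values equal to their column default when hashing, so adding a defaulted key column leaves existing hash codes intact:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Sketch of a key hash that stays stable when a key column with a default
// value is added. Hypothetical helper, not part of the Ignite code base.
public class KeyHashSketch {
    static int keyHash(List<Object> keyValues, List<Object> columnDefaults) {
        int h = 1;
        for (int i = 0; i < keyValues.size(); i++) {
            Object v = keyValues.get(i);
            // Values equal to their column default do not contribute to the
            // hash, so rows written before the column existed hash the same.
            if (!Objects.equals(v, columnDefaults.get(i)))
                h = 31 * h + Objects.hashCode(v);
        }
        return h;
    }

    public static void main(String[] args) {
        int before = keyHash(List.of(1, "John"), Arrays.asList(null, null));
        int after = keyHash(List.of(1, "John", "GB"), Arrays.asList(null, null, "GB"));
        System.out.println(before == after); // prints true
    }
}
```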

...

With this history, upgrading a row (1, "John", "Doe") of version 1 to version 4 means erasing the columns lastname and taxid and adding the column residence with default "GB" and the column lastname (reintroduced) with default "N/A", resulting in the row (1, "John", "GB", "N/A").

...

It is clear that, given a fixed schema, we can generate an infinite number of classes that match the columns of this schema. This observation can be used to simplify ORM for end users. For the APIs that return Java objects, the mapping from schema columns to object fields can be constructed dynamically, allowing a single row to be deserialized into instances of different classes.
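A reflective sketch of such dynamic mapping, with made-up classes and a made-up mapper (the real API would differ):

```java
import java.lang.reflect.Field;
import java.util.Map;

// Sketch: deserialize one row into differently shaped classes by matching
// schema column names to field names reflectively. Illustrative only.
public class RowMappingSketch {
    static class FullPerson { int id; String name; String lastname; }
    static class PersonKey { int id; }

    static <T> T mapRow(Map<String, Object> row, Class<T> cls) throws Exception {
        T obj = cls.getDeclaredConstructor().newInstance();
        for (Field f : cls.getDeclaredFields()) {
            if (row.containsKey(f.getName())) { // columns without a matching field are skipped
                f.setAccessible(true);
                f.set(obj, row.get(f.getName()));
            }
        }
        return obj;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Object> row = Map.of("id", 1, "name", "John", "lastname", "Doe");
        FullPerson p = mapRow(row, FullPerson.class);
        PersonKey k = mapRow(row, PersonKey.class); // same row, narrower class
        System.out.println(p.name + " " + k.id);    // prints John 1
    }
}
```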

For example, let's say we have a schema PERSON (id INT, name VARCHAR(32), lastname VARCHAR(32), residence VARCHAR(2), taxid INT). Each row of this schema can be deserialized into the following classes:

...

Given the set of fields in the target class, Ignite may optimize the amount of data sent over the network by skipping fields that would be ignored during deserialization.

An update operation with an object of a truncated class is also possible, but the missing fields will be treated as "not set", as if the update were done via an SQL INSERT statement with some PERSON table fields omitted. Missing field values will be implicitly set to the DEFAULT column value according to the row schema version.

Code Block
languagejava
Person p = new Person(1, "John"); // lastname, residence, taxid not set
table.insert(p);

It may be impossible to insert an object/row with a missing field if the field is declared with a NOT NULL constraint and no (non-null) DEFAULT value is specified.

Type mapping

Ignite will provide out-of-the-box mapping from standard platform types (Java, C#, C++) to the built-in primitives. A user will be able to alter this mapping using an external mechanism (e.g. annotations to map long values to Number). The standard mapping is listed in the table below:

...