...
BinaryObject API should be reworked, as it will no longer represent actual serialized objects. It should be replaced with something like BinaryRecord or DataRecord representing a record in a cache or table. Similar to the current binary objects, records will provide access to individual fields. A record can also be deserialized into a class with any subset of fields represented in the record.
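To make the record idea concrete, below is a minimal sketch of what a DataRecord-style API could look like. The interface shape, the map-backed implementation, and the `Person` class are all assumptions for illustration, not the actual Ignite API.

```java
import java.util.Map;

// Hypothetical record interface: per-field access plus partial deserialization.
interface DataRecord {
    Object value(String field);          // access to an individual field
    <T> T deserialize(Class<T> cls);     // map the record onto a user class
}

public class DataRecordSketch {
    /** User class containing a subset of the record's fields. */
    public record Person(String name) {}

    /** Map-backed record: enough to illustrate per-field access. */
    public record MapRecord(Map<String, Object> fields) implements DataRecord {
        @Override public Object value(String field) {
            return fields.get(field);
        }

        @Override public <T> T deserialize(Class<T> cls) {
            // A real implementation would map columns onto any subset of the
            // target class's fields; this sketch handles the demo class only.
            if (cls == Person.class)
                return cls.cast(new Person((String) value("name")));
            throw new UnsupportedOperationException();
        }
    }

    public static void main(String[] args) {
        DataRecord rec = new MapRecord(Map.of("name", "Ann", "age", 42));
        System.out.println(rec.value("age"));               // -> 42
        System.out.println(rec.deserialize(Person.class));  // -> Person[name=Ann]
    }
}
```

The key point of the sketch is that deserialization targets a class with any subset of the record's fields, so clients need not know the full schema.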
Introducing a versioned schema makes it possible to upgrade rows to the latest version on the fly and even to update a schema automatically in some simple cases, e.g. adding a new column.
So, a user may choose between two modes: Strict and Live, for manual schema management and dynamic schema expansion respectively.
There are several ways a schema can be defined. The initial entry point to the schema definition is the SchemaBuilder Java API:
TBD (see SchemaBuilders class for details)
The schema builder calls are transparently mapped to DDL statements so that all operations possible via a builder are also possible via DDL and vice versa.
Additionally, we may introduce an API that will infer the schema from a key-value pair using class fields and annotations. The inference happens on the calling site of the node invoking the table modification operation.
Table schema should be automatically exposed to the table configuration subtree so that simple schema changes are available via the ignite CLI and the schema can be defined during table creation via the ignite CLI.
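Since the SchemaBuilder API itself is still TBD, the following is only a hypothetical sketch of how builder calls could map one-to-one onto DDL statements; every name in it (`DdlMappingSketch`, `column`, `primaryKey`) is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder whose accumulated calls render as a single
// CREATE TABLE statement, mirroring the builder-to-DDL equivalence.
public class DdlMappingSketch {
    private final String table;
    private final List<String> cols = new ArrayList<>();
    private String keyCol;

    DdlMappingSketch(String table) { this.table = table; }

    DdlMappingSketch column(String name, String type) {
        cols.add(name + " " + type);
        return this;
    }

    DdlMappingSketch primaryKey(String name) { keyCol = name; return this; }

    // Render the accumulated builder state as DDL text.
    String toDdl() {
        return "CREATE TABLE " + table + " (" + String.join(", ", cols)
            + ", PRIMARY KEY (" + keyCol + "))";
    }

    public static void main(String[] args) {
        String ddl = new DdlMappingSketch("person")
            .column("id", "INT64").column("name", "STRING")
            .primaryKey("id").toDdl();
        System.out.println(ddl);
        // -> CREATE TABLE person (id INT64, name STRING, PRIMARY KEY (id))
    }
}
```

The design goal stated above is that this mapping works in both directions: anything expressible through the builder is expressible as DDL, and vice versa.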
The schema-first approach imposes certain natural requirements which are stricter than those of the binary object serialization format:
The suggested list of supported built-in data types is listed in the table below:
Type | Size | Description |
---|---|---|
Bitmask(n) | ⌈n/8⌉ bytes | A fixed-length bitmask of n bits |
Int8 | 1 byte | 1-byte signed integer |
Uint8 | 1 byte | 1-byte unsigned integer |
Int16 | 2 bytes | 2-byte signed integer |
Uint16 | 2 bytes | 2-byte unsigned integer |
Int32 | 4 bytes | 4-byte signed integer |
Uint32 | 4 bytes | 4-byte unsigned integer |
Int64 | 8 bytes | 8-byte signed integer |
Uint64 | 8 bytes | 8-byte unsigned integer |
Float | 4 bytes | 4-byte floating-point number |
Double | 8 bytes | 8-byte floating-point number |
Number([n]) | Variable | Variable-length number (optionally bound by n bytes in size) |
Decimal | Variable | Variable-length floating-point number |
UUID | 16 bytes | UUID |
String | Variable | A string encoded with a given Charset |
Date | 3 bytes | A timezone-free date encoded as a year (15 bits), month (4 bits), day (5 bits) |
Time | 4 bytes | A timezone-free time encoded as padding (5 bits), hour (5 bits), minute (6 bits), second (6 bits), millisecond (10 bits) |
Datetime | 7 bytes | A timezone-free datetime encoded as (date, time) |
Timestamp | 8 bytes | Number of milliseconds since Jan 1, 1970 00:00:00.000 (with no timezone) |
Binary | Variable | Variable-size byte array |
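As a worked example of the fixed-size encodings in the table, the 3-byte Date type packs a year (15 bits), month (4 bits), and day (5 bits) into 24 bits. The helper below is an illustrative sketch of that bit layout, not actual Ignite code.

```java
// Bit-packing sketch for the 3-byte Date type:
// [ year: 15 bits ][ month: 4 bits ][ day: 5 bits ] = 24 bits total.
public class DateCodec {
    static int encode(int year, int month, int day) {
        return (year << 9) | (month << 5) | day; // fits in 3 bytes
    }

    static int year(int encoded)  { return encoded >>> 9; }
    static int month(int encoded) { return (encoded >>> 5) & 0xF; }
    static int day(int encoded)   { return encoded & 0x1F; }

    public static void main(String[] args) {
        int d = encode(2024, 2, 29);
        System.out.println(year(d) + "-" + month(d) + "-" + day(d)); // -> 2024-2-29
    }
}
```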
...
Field | Size |
---|---|
Schema version | 2 bytes |
Flags | 2 bytes |
Key columns hash | 4 bytes |
Key chunk: | |
Key chunk size | 4 bytes |
Null-map | ⌈number of columns / 8⌉ bytes |
Variable-length columns offsets table size | 2 bytes |
Variable-length columns offsets table | Variable (number of non-null varlen columns * 4) |
Fix-sized columns values | Variable |
Variable-length columns values | Variable |
Value chunk: | |
Value chunk size | 4 bytes |
Null-map | ⌈number of columns / 8⌉ bytes |
Variable-length columns offsets table size | 2 bytes |
Variable-length columns offsets table | Variable (number of non-null varlen columns * 4) |
Fix-sized columns values | Variable |
Variable-length columns values | Variable |
For small rows, the metadata sizes may introduce very noticeable overhead, so it looks reasonable to write them in a more compact way using different techniques.
The flags field can be used to detect the format.
IMPORTANT: having multiple formats MUST guarantee that the key (as well as the value) chunk is always written in exactly one possible way, to allow comparing chunks of rows of the same version as plain byte arrays.
The flags field is a bitmask with each bit treated as a flag, with the following flags available (from flag 0 being the LSB to flag 7 being the MSB):
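The bitmask mechanics can be sketched as plain bit manipulation on the 2-byte flags field. The bit positions used below are placeholders only, since the concrete flag assignments are defined by the design itself, not by this sketch.

```java
// Minimal sketch of reading and writing the 2-byte flags field as a bitmask.
public class FlagsSketch {
    // Test whether a given bit is set; bit 0 is the LSB.
    static boolean isSet(short flags, int bit) {
        return (flags & (1 << bit)) != 0;
    }

    // Return a copy of the flags with the given bit set.
    static short set(short flags, int bit) {
        return (short) (flags | (1 << bit));
    }

    public static void main(String[] args) {
        short flags = 0;
        flags = set(flags, 3);               // mark a hypothetical format bit
        System.out.println(isSet(flags, 3)); // -> true
        System.out.println(isSet(flags, 0)); // -> false
    }
}
```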
A row hash can be calculated from affinity field values while marshaling to a byte array. Because the field order is defined by the schema, a key hash can be calculated consistently, regardless of the order in which column values are supplied.
Keys can be compared as byte[] for compatible schemas (those that have the same key column set); otherwise, the older row should be upgraded first.
If the same key can be serialized in more than one way (e.g. if some kind of compression is supported and compressed rows are marked with a flag), it is still possible to compare keys column by column against the schema.
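The byte-array comparison path described above relies only on standard lexicographic comparison; the sketch below shows why a single canonical serialization per schema version matters, using the JDK's `Arrays` utilities and made-up key bytes.

```java
import java.util.Arrays;

// For rows of the same schema version (and the same canonical layout),
// serialized key chunks can be compared directly as byte arrays.
public class KeyCompare {
    public static void main(String[] args) {
        byte[] key1 = {0, 1, 2, 3};
        byte[] key2 = {0, 1, 2, 3};
        byte[] key3 = {0, 1, 9, 3};

        System.out.println(Arrays.equals(key1, key2));  // -> true: same bytes
        System.out.println(Arrays.compare(key1, key3)); // negative: key1 < key3
    }
}
```

If two serializations of the same logical key could differ byte-wise (e.g. compressed vs. uncompressed), this shortcut would break, which is exactly why such rows must be flagged and compared column by column instead.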
...
Given schema evolution history, a row migration from version N-k to version N is a straightforward operation. We identify fields that were dropped during the last k schema operations and fields that were added (taking into account default field values) and update the row based on the field modifications. Afterward, the updated row is written in the schema version N layout format. The row upgrade may happen on read, with an optional writeback, or on the next update. Additionally, a row upgrade in the background is possible.
Since the row key hashcode is inlined to the row data for quick key lookups, we require that the set of key columns does not change during the schema evolution. In the future, we may remove this restriction, but this will require careful hashcode calculation adjustments since the hash code value should not change after adding a new column with a default value. Removing a column from the key columns does not seem possible since it may produce duplicates, and checking for duplicates may require a full scan.
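The migration step described above can be sketched over a map-based row representation. The column names, defaults, and helper method here are illustrative only; the real implementation would operate on the binary row layout.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch of upgrading a row from an older schema version to version N:
// columns dropped along the way are removed, columns added along the way
// are filled with their default values.
public class RowUpgrade {
    static Map<String, Object> upgrade(Map<String, Object> row,
                                       Set<String> droppedColumns,
                                       Map<String, Object> addedDefaults) {
        Map<String, Object> upgraded = new HashMap<>(row);
        upgraded.keySet().removeAll(droppedColumns);   // fields dropped since
        addedDefaults.forEach(upgraded::putIfAbsent);  // fields added since
        return upgraded;
    }

    public static void main(String[] args) {
        Map<String, Object> oldRow = Map.of("id", 1, "legacy", "x");
        Map<String, Object> newRow =
            upgrade(oldRow, Set.of("legacy"), Map.of("city", "N/A"));
        System.out.println(newRow); // contains id=1 and city=N/A
    }
}
```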
...
If one tries to serialize an object with a short value out of the Uint8 range, the attempt will end up with an exception (ColumnValueIsOutOfRangeException).
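The exception name comes from the text above; the validation helper below is an illustrative sketch of the described range check, not the actual Ignite implementation.

```java
// Sketch of rejecting a short value that does not fit the Uint8 column type.
public class RangeCheck {
    static class ColumnValueIsOutOfRangeException extends RuntimeException {
        ColumnValueIsOutOfRangeException(String msg) { super(msg); }
    }

    // Uint8 holds values 0..255; a short outside that range is rejected.
    static int toUint8(short value) {
        if (value < 0 || value > 255)
            throw new ColumnValueIsOutOfRangeException(
                value + " is out of the Uint8 range [0, 255]");
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toUint8((short) 200)); // -> 200, fits
        try {
            toUint8((short) 300);                 // does not fit
        } catch (ColumnValueIsOutOfRangeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```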
...
One of the important benefits of binary objects was the ability to store objects with different sets of fields in a single cache. We can accommodate very similar behavior in the schema-first approach.
When a tuple is inserted into a table, we attempt to 'fit' tuple fields to the schema columns. If the tuple has some extra fields which are not present in the current schema, the schema is automatically updated to store the extra fields that are present in the tuple.
This works the same way for objects that are first-class citizens: e.g. Java objects, or objects in other languages for which a mapping implementation exists.
On the other hand, if a tuple has fewer fields than the current schema, the schema is not updated automatically (such a scenario usually means that an update is executed from an outdated client which did not yet receive the proper object class version). In other words, columns are never dropped during automatic schema evolution; a column can only be dropped by an explicit user command.
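The live-schema rule above (extra fields grow the schema, missing fields never shrink it) can be sketched over column-name sets; the names here are illustrative only.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch of live schema expansion: merging a tuple's field set into the
// current schema columns only ever adds columns, never drops them.
public class LiveSchemaSketch {
    static Set<String> expand(Set<String> schemaColumns, Set<String> tupleFields) {
        Set<String> updated = new LinkedHashSet<>(schemaColumns);
        updated.addAll(tupleFields); // add extra fields; never drop columns
        return updated;
    }

    public static void main(String[] args) {
        Set<String> schema = Set.of("id", "name");
        // Tuple with an extra field: the schema grows.
        System.out.println(expand(schema, Set.of("id", "name", "city")));
        // Tuple with fewer fields: the schema is unchanged.
        System.out.println(expand(schema, Set.of("id")));
    }
}
```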
...