Comment: Add observable timestamp to TX_BEGIN operation

...

ID | IEP-76
Author
Sponsor
Created
Status | ACTIVE


Table of Contents

Motivation

...

Adapt Ignite 2.x protocol for Ignite 3.0. The main differences are:

...

  • Every Ignite node listens on a TCP port. Thin client implementations connect to any node in the cluster (possibly multiple nodes) through a TCP socket and perform Ignite operations using a well-defined binary protocol.
  • Server-side connection parameters are defined in ClientConnectorConfiguration class.
    • Default port is 10800.
    • Connector is enabled by default, no configuration changes are needed.
  • Netty is used on the server for network IO.

...

MsgPack is used for data serialization (uses big-endian byte order).

Data Types Mapping

Ignite data types defined in IEP-54 map to MsgPack data types the following way:

Ignite Type | Size | MsgPack Type | Notes
Bitmask(n) | n/8 bytes | bin8 / bin16 / bin32 | Extension 8
IntX, UintX | 1-8 bytes | fixint / intX / uintX | Integer types are interchangeable when possible. fixint (1 byte) can be passed as a value for uint64 column.
Float | 4 bytes | float32 |
Double | 8 bytes | float64 |
Number([n]) | Variable | ext16 (up to 2^16 - 1 bytes) | Extension 1
Decimal | Variable | ext16 (up to 2^16 - 1 bytes) | Extension 2
UUID | 16 bytes | fixext16 | Extension 3
String | Variable | str |
Date | 3 bytes | fixext4 | Extension 4
Time | 5 bytes | ext8 | Extension 5
Datetime | 7 bytes | fixext8 | Extension 6
Timestamp | 10 bytes | ext8 | Extension 7
IgniteUuid | 24 bytes | ext8 | Extension 9
NoValue | 1 byte | fixext1 | Extension 10
Binary | Variable | bin32 (up to 2^32 - 1 bytes) |

...

MsgPack provides multiple data types for integer values. When encoding a value, the smallest data type for that value is picked automatically.
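
The "smallest data type" rule can be sketched in plain Java without a MsgPack library. The class and method names here (`CompactInt`, `pack`) are hypothetical and not part of Ignite; the format bytes follow the MsgPack specification. Real clients would normally rely on a MsgPack library that already does this.

```java
import java.nio.ByteBuffer;

/** Illustrative only: picks the smallest MsgPack integer format for a value. */
final class CompactInt {
    private CompactInt() {}

    /** Encodes v using the shortest MsgPack integer representation (big-endian). */
    static byte[] pack(long v) {
        if (v >= -32 && v <= 127) {
            // positive fixint (0x00..0x7f) or negative fixint (0xe0..0xff):
            // the value itself is the single encoded byte
            return new byte[] {(byte) v};
        }
        if (v >= 0) {
            // unsigned formats: uint8, uint16, uint32, uint64
            if (v <= 0xFFL) return ByteBuffer.allocate(2).put((byte) 0xcc).put((byte) v).array();
            if (v <= 0xFFFFL) return ByteBuffer.allocate(3).put((byte) 0xcd).putShort((short) v).array();
            if (v <= 0xFFFFFFFFL) return ByteBuffer.allocate(5).put((byte) 0xce).putInt((int) v).array();
            return ByteBuffer.allocate(9).put((byte) 0xcf).putLong(v).array();
        }
        // signed formats: int8, int16, int32, int64
        if (v >= Byte.MIN_VALUE) return ByteBuffer.allocate(2).put((byte) 0xd0).put((byte) v).array();
        if (v >= Short.MIN_VALUE) return ByteBuffer.allocate(3).put((byte) 0xd1).putShort((short) v).array();
        if (v >= Integer.MIN_VALUE) return ByteBuffer.allocate(5).put((byte) 0xd2).putInt((int) v).array();
        return ByteBuffer.allocate(9).put((byte) 0xd3).putLong(v).array();
    }
}
```

With this sketch, a value such as 5 takes one byte regardless of whether the target column is an 8-bit or a 64-bit integer, which is why integer types are interchangeable on the wire.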

...

Message format

All messages, requests and responses, except handshake, start with a 4-byte message length (excluding the length itself).

  • Message length is not encoded with MsgPack, 4 bytes contain raw int32 value in little-endian byte order.
  • Maximum message length is 2147483647 (Integer.MAX_VALUE and maximum byte array length in Java)
4 bytes | Length of Payload
... | Payload
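
A minimal sketch of this framing, using only the JDK. The class and method names (`MessageFraming`, `frame`, `tryReadFrame`) are hypothetical; the byte order follows the statement above and should be checked against the actual server implementation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: length-prefixed framing as described in "Message format". */
final class MessageFraming {
    static final int HEADER_SIZE = 4;

    private MessageFraming() {}

    /** Prepends the raw 4-byte payload length (not MsgPack-encoded) to the payload. */
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(HEADER_SIZE + payload.length)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /**
     * Reads one complete message from the buffer.
     * Returns the payload, or null (leaving the position unchanged) when more bytes are needed.
     * Note: this switches the buffer's byte order to little-endian.
     */
    static byte[] tryReadFrame(ByteBuffer in) {
        if (in.remaining() < HEADER_SIZE) {
            return null;
        }
        in.mark();
        int length = in.order(ByteOrder.LITTLE_ENDIAN).getInt();
        if (length < 0) {
            throw new IllegalStateException("Negative message length: " + length);
        }
        if (in.remaining() < length) {
            in.reset(); // incomplete message, wait for more data
            return null;
        }
        byte[] payload = new byte[length];
        in.get(payload);
        return payload;
    }
}
```

Because the length excludes the header itself, an empty payload is framed as four zero bytes.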

Tuple serialization

  • This IEP covers Table API - Ignite#tables(), which is the only public API available at the moment.
  • Table API operates on tuples (Tuple and TupleBuilder interfaces).
  • Tuple is a set of key-value pairs - column name and value.
  • Values can be of basic types described above and in IEP-54, according to the table schema. 
  • Table schema can evolve over time, columns get added and removed. Ignite tracks the evolution with incrementing integer schema version.
  • All schema versions are stored by Ignite. It is possible to retrieve the set of columns for any version.

A simple way to serialize a tuple would be to write a map (String → ...) from column name to value.

However, thanks to schema-first approach, we can avoid sending column names with the values (serializing strings is expensive). Instead, we can write an integer schema version, and then values for every column in that schema.

values "type" below indicates a raw sequence of values (not a MsgPack array). Since we know the number of columns in the schema, we don't need an array header.

int | Schema version
values | Column values in schema order. nil when there is no value for a particular column.

To read or write a value for a column, we get the column type from the schema and use corresponding MsgPack data type according to Data Types Mapping above.

Tuple serialization (pseudocode):

for (Column col : tuple) {
    switch (col.type().spec()) {
        case BYTE:
            packer.packByte(tuple.byteValue(col.name()));
            break;
        case SHORT:
...

Key tuples

Key tuples are tuples with key columns only. Key columns always come first in the schema, so if there are 2 key columns, the first two values of the tuple form the key.
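
Since key columns come first, the key part of a full row is just a prefix of its values. A small sketch, with a hypothetical helper name (`KeyTuples.keyPart`) and rows modelled as plain lists for illustration:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: derives a key tuple from a full row in schema order. */
final class KeyTuples {
    private KeyTuples() {}

    /** Returns the first keyColumnCount values of the row, which form the key. */
    static List<Object> keyPart(List<?> rowValues, int keyColumnCount) {
        if (keyColumnCount < 0 || keyColumnCount > rowValues.size()) {
            throw new IllegalArgumentException("Invalid key column count: " + keyColumnCount);
        }
        // Copy the prefix so the result does not depend on later changes to the row.
        return new ArrayList<Object>(rowValues.subList(0, keyColumnCount));
    }
}
```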

Null vs NoValue

Ignite Table API handles "column set to null" (1) and "column not set" (2) differently.

  • Non-nullable column does not allow (1), but allows (2) as long as there is a default value.
  • Nullable column will be set to null in (1) and to default value in (2).

NoValue custom protocol type reflects this distinction.
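
The two rules above can be sketched as a small resolution function. Everything here is hypothetical and for illustration only: the `NO_VALUE` marker stands in for the NoValue protocol type, and the assumption that a nullable column without an explicit default resolves to whatever default the caller passes in (typically null) is not stated in this document.

```java
/** Illustrative only: resolves the stored value for "set to null" vs "not set". */
final class ColumnValueResolver {
    /** Marker for "column not set" (case 2), mirroring the NoValue protocol type. */
    static final Object NO_VALUE = new Object();

    private ColumnValueResolver() {}

    static Object resolve(Object sent, boolean nullable, boolean hasDefault, Object defaultValue) {
        if (sent == NO_VALUE) {
            // Case (2): column not set, fall back to the default value.
            if (!nullable && !hasDefault) {
                throw new IllegalArgumentException("Non-nullable column is not set and has no default value");
            }
            return defaultValue;
        }
        if (sent == null && !nullable) {
            // Case (1) is not allowed for a non-nullable column.
            throw new IllegalArgumentException("null is not allowed for a non-nullable column");
        }
        // Either an explicit null for a nullable column, or a regular value.
        return sent;
    }
}
```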

Handshake


Request
4 bytes | Magic number 49 47 4E 49, or "IGNI". Helps to identify misconfigured SSL or other apps probing the port.
int | Payload length
int | Version major
int | Version minor
int | Version patch
int | Client code (1 for JDBC, 2 for general-purpose client)
bin | Features (bitmask)
map (string → any) | Extensions (auth, attributes, etc). Server can skip unknown extension data (when client is newer).

...

Response
4 bytes | Magic number 49 47 4E 49, or "IGNI". Helps to identify misconfigured SSL or different server on the port.
int | Payload length
int | Server version major
int | Server version minor
int | Server version patch
int | Error code (0 for success)
string | Error message (when error code is not 0)
int | When error code is 0: Server idle timeout
string | When error code is 0: Server node id
string | When error code is 0: Server node name (consistent id)
bin | When error code is 0: Server features (bitmask)
map (string → any) | When error code is 0: Extensions (auth, attributes, etc). Client can skip unknown extension data (when server is newer).
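
Before parsing the rest of a handshake message, either side can check the magic bytes and fail fast with a descriptive error. A minimal sketch; the class name `HandshakeMagic` and the error text are hypothetical:

```java
import java.util.Arrays;

/** Illustrative only: validates the 4-byte handshake magic. */
final class HandshakeMagic {
    /** 49 47 4E 49, the ASCII bytes of "IGNI". */
    static final byte[] MAGIC = {0x49, 0x47, 0x4E, 0x49};

    private HandshakeMagic() {}

    /** Throws if the given bytes do not start with the expected magic. */
    static void check(byte[] firstBytes) {
        boolean ok = firstBytes.length >= MAGIC.length
                && Arrays.equals(Arrays.copyOf(firstBytes, MAGIC.length), MAGIC);
        if (!ok) {
            throw new IllegalStateException(
                    "Unexpected handshake magic: possibly misconfigured SSL or a different application on the port");
        }
    }
}
```

For example, a TLS ClientHello starts with byte 0x16, so a plain-text server receiving an SSL connection would reject it at this step instead of failing later with an obscure decoding error.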

...

Upon successful handshake, the client can start performing operations by sending a request with a specific op code. Each operation has its own request and response format, with a common header.

Request
int | Operation code
int | Request id, generated by client and returned as-is in response
... | Operation-specific data

Response
int | Type = 0
int | Request id
int | Flags (1 = partition assignment changed)
long | Observable timestamp (causality token)
uuid or null | Trace id (null for success)
int | Error code (when trace id is not null)
string | Error message (when trace id is not null)
string or null | Error stack trace (when trace id is not null)
map or null | Error details (when trace id is not null)
... | Operation-specific data (when trace id is null)

A request or response without operation-specific data is called a basic request or basic response below.

...

Operation codes and request ids are omitted everywhere below for brevity.

TABLE_CREATE = 1

...

TABLE_DROP = 2

...

Basic response.

TABLES_GET = 3

Basic request.

Response
int | N = table count
map (UUID -> string) | map of table ids and names

Note: tables can only be created/deleted with SQL, there are no TABLE_CREATE or TABLE_DROP operations.

TABLE_GET = 4

Request
string | table name

...

Response
UUID or nil | table ID or null when a table with the given name does not exist

...

Request
UUID | table ID
arr or nil | schema IDs, or null to get latest

Response
map (int → array (array)) | Map from schema ID to columns. Column is represented by an array of values for: name, type, isKey, isNullable. The array can contain extra data in future for additional properties.

TUPLE_UPSERT = 10

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for all columns in given schema (nil when value is missing for a column)

...

Client side is supposed to match provided columns against the latest known schema.

  • If columns don't match, request the latest schema and try again.
  • If the latest schema still does not match, and live schema is enabled, use TUPLE_UPSERT_SCHEMALESS

TUPLE_GET = 12

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for key columns (in schema order)

Response
int or nil | schema id for the current tuple, or nil when there is no matching record
arr | tuple values in schema order, when schema id is not nil

Clients should retrieve schemas with SCHEMAS_GET and cache them per table.
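
A per-table schema cache could look like the sketch below. The class `SchemaCache` and the `loader` callback are hypothetical: the loader stands in for a SCHEMAS_GET round trip, and a schema is reduced to a list of column names for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

/** Illustrative only: caches schemas per table and schema version. */
final class SchemaCache {
    private final Map<UUID, Map<Integer, List<String>>> byTable = new ConcurrentHashMap<>();

    /** Hypothetical stand-in for a SCHEMAS_GET request: (table id, schema id) -> column names. */
    private final BiFunction<UUID, Integer, List<String>> loader;

    SchemaCache(BiFunction<UUID, Integer, List<String>> loader) {
        this.loader = loader;
    }

    /** Returns cached columns for the schema version, loading them on first access. */
    List<String> columns(UUID tableId, int schemaId) {
        return byTable
                .computeIfAbsent(tableId, id -> new ConcurrentHashMap<>())
                .computeIfAbsent(schemaId, ver -> loader.apply(tableId, ver));
    }
}
```

Caching by version is safe under the assumption that a given schema version never changes once created, which matches the incrementing schema version described in "Tuple serialization".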

TUPLE_UPSERT_ALL = 13

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
int | row count
arr of arr | array of rows with values for all columns in given schema (nil when value is missing for a column)

Basic response.

TUPLE_GET_ALL = 15

Request
UUID | table ID
int or nil | transaction ID
int | schema ID, or nil when result set is empty
int | row count
arr of arr | array of rows with values for key columns (in schema order)

...

Response
int | schema ID (for all tuples in response)
int | row count
arr of arr | array of rows with values in schema order

TUPLE_GET_AND_UPSERT = 16

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for all columns in given schema (nil when value is missing for a column)

Response
int or nil | schema id for the current tuple, or nil when there is no matching record
values | values for value columns in schema order

TUPLE_INSERT = 18

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for all columns in given schema (nil when value is missing for a column)

Response
bool | Insert result

TUPLE_INSERT_ALL = 20

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
int | row count
values | rows with values for all columns in given schema (nil when value is missing for a column)

Response
int | schema id, or nil when no rows were skipped
int | skipped row count
values | skipped rows (values in schema order)

TUPLE_REPLACE = 22

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for all columns in given schema (nil when value is missing for a column)

Response
bool | Replace result

TUPLE_REPLACE_EXACT = 24

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | oldRec: values for all columns in given schema (nil when value is missing for a column)
values | newRec: values for all columns in given schema (nil when value is missing for a column)

Response
bool | Replace result

TUPLE_GET_AND_REPLACE = 26

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for all columns in given schema (nil when value is missing for a column)

Response
int or nil | schema id for the current tuple, or nil when there is no matching record
values | values for value columns in schema order

TUPLE_DELETE = 28

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for key columns in given schema

Response
bool | Delete result

TUPLE_DELETE_ALL = 29

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
int | row count
values | rows with values for key columns in a given schema

Response
int | schema id, or nil when no rows were skipped
int | skipped row count
values | skipped rows (values for key columns in schema order)


TUPLE_DELETE_EXACT = 30

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for all columns in given schema (nil when value is missing for a column)

Response
bool | Delete result

TUPLE_DELETE_ALL_EXACT = 31

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
int | row count
values | rows with values for all columns in given schema (nil when value is missing for a column)

Response
int | schema id, or nil when no rows were skipped
int | skipped row count
values | skipped rows (values for key columns in schema order)

TUPLE_GET_AND_DELETE = 32

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for all columns in given schema (nil when value is missing for a column)

Response
int or nil | schema id for the current tuple, or nil when there is no matching record
values | values for value columns in schema order, when schema id is not null

TUPLE_CONTAINS_KEY = 33

Request
UUID | table ID
int or nil | transaction ID
int | schema ID
values | values for key columns

Response
bool | whether a tuple with the given key exists

TX_BEGIN = 43

Request
bool | Read-only tx flag
long | Observable timestamp (value from server - see standard response header)

Response
int | Transaction ID
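
On the client side, the observable timestamp has to be remembered from response headers so it can be sent back in TX_BEGIN. A minimal sketch, assuming the client keeps the largest timestamp seen across all responses (this "keep the maximum" policy and the class name `ObservableTimestampTracker` are assumptions, not something this document specifies):

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: tracks the latest observable timestamp received from the server. */
final class ObservableTimestampTracker {
    private final AtomicLong latest = new AtomicLong();

    /** Call for every response header; keeps the maximum value seen so far. */
    void onResponse(long observableTimestamp) {
        latest.accumulateAndGet(observableTimestamp, Math::max);
    }

    /** Value to send in the TX_BEGIN request. */
    long forTxBegin() {
        return latest.get();
    }
}
```

An atomic maximum lets multiple connections or threads update the tracker concurrently without ever moving the value backwards.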

TX_COMMIT = 44

Request
int | Transaction ID

Basic response

TX_ROLLBACK = 45

Request
int | Transaction ID

Basic response

Risks and Assumptions

  • This IEP covers handshake and Tables API, which is the only public API available at the moment.

...

  • Invoke API is out of scope: code deployment and processor serialization should be designed separately.

Discussion Links

Reference Links

...