




Thin clients need an efficient way to stream large amounts of data into the cluster.


Add two DataStreamer operations to the Thin Client protocol: OP_DATA_STREAMER_START and OP_DATA_STREAMER_ADD_DATA.

There are multiple options for the client-side implementation with this approach, from simple to more efficient:

  • Stateless -- all data goes through a single server node; each batch is written with a single OP_DATA_STREAMER_START request that also closes the streamer, and a new streamer is started when the next batch is ready
  • Stateful -- all data goes through a single server node, and the streamer is kept open between batches
  • Partition-aware stateless -- data is grouped by node and each batch is sent to its primary node; a new streamer is used for every batch
  • Partition-aware stateful -- data is grouped by node and each batch is sent to its primary node; one streamer per node is kept open
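
The grouping step shared by both partition-aware options can be sketched as follows. This is only an illustration: PartitionAwareBatcher and the primaryNode lookup are hypothetical names, and how a client actually resolves a key to its primary node (for example, from a cached partition map) is left as an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: splits pending entries into one batch per primary node. */
final class PartitionAwareBatcher {
    /**
     * @param entries     pending key/value pairs (a null value means "remove")
     * @param primaryNode assumed lookup from a key to the id of its primary node
     * @return batches keyed by node id, in order of first appearance
     */
    static <K, V> Map<String, List<Map.Entry<K, V>>> groupByNode(
            List<Map.Entry<K, V>> entries, Function<K, String> primaryNode) {
        Map<String, List<Map.Entry<K, V>>> batches = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            String node = primaryNode.apply(e.getKey());
            batches.computeIfAbsent(node, n -> new ArrayList<>()).add(e);
        }
        return batches;
    }
}
```

Each resulting batch would then be sent to its node either through a fresh streamer (stateless) or through the streamer already open for that node (stateful).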

The streamer can be closed either with OP_RESOURCE_CLOSE or with the Close flag, depending on the use case:

  • Cancel and close - use OP_RESOURCE_CLOSE
  • Flush and close - use OP_DATA_STREAMER_ADD_DATA with Close flag (to avoid an extra OP_RESOURCE_CLOSE call)


Initial operation combines streamer options and the first batch of entries.

byte flags (allowOverwrite, skipStore, keepBinary, flush, close)
int perNodeBufferSize, -1 for server default
int perThreadBufferSize, -1 for server default
BinaryObjectStream receiver
byte receiverPlatform, when receiver is not null (1 = Java, 2 = .NET, 3 = C++)
n*(Object, Object) entries (add when value is not null, remove otherwise)

long resourceId (0 when close flag is set)


  • Close flag can be true when there is only a single batch, so an additional close request is not necessary
  • Flush flag should be true when client-side user code calls Flush method, and false otherwise
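
As an illustration of how a client might build the fixed-width part of this request, here is a minimal sketch. The bit positions of the individual flags and the little-endian byte order are assumptions made for the example; the field list above names the flags but does not assign bits to them. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical encoder for the leading fields of OP_DATA_STREAMER_START. */
final class StreamerStartFlags {
    // Assumed bit layout; not specified by the field list above.
    static final int ALLOW_OVERWRITE = 0x01;
    static final int SKIP_STORE = 0x02;
    static final int KEEP_BINARY = 0x04;
    static final int FLUSH = 0x08;
    static final int CLOSE = 0x10;

    /** Packs the five boolean options into a single flags byte. */
    static byte pack(boolean allowOverwrite, boolean skipStore, boolean keepBinary,
                     boolean flush, boolean close) {
        int flags = 0;
        if (allowOverwrite) flags |= ALLOW_OVERWRITE;
        if (skipStore) flags |= SKIP_STORE;
        if (keepBinary) flags |= KEEP_BINARY;
        if (flush) flags |= FLUSH;
        if (close) flags |= CLOSE;
        return (byte) flags;
    }

    /** Writes flags and both buffer sizes; receiver and entries would follow. */
    static byte[] fixedFields(byte flags, int perNodeBufferSize, int perThreadBufferSize) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + 4).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(flags).putInt(perNodeBufferSize).putInt(perThreadBufferSize);
        return buf.array();
    }
}
```

For a single-batch load, a client would set both flush and close here and expect a resourceId of 0 in the response.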


Add data to the existing streamer by a resource id, optionally flush and/or close the streamer.

byte flags (flush, close)
n*(Object, Object) entries (add when value is not null, remove otherwise)

long resourceId (0 when close flag is set)


  • Close flag can be true for the last batch, so an additional close request is not necessary
  • Flush flag should be true when client-side user code calls Flush method, and false otherwise
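
The interplay between the two operations in a stateful client can be sketched as a small session object: the first batch goes out with OP_DATA_STREAMER_START, later batches reuse the returned resource id with OP_DATA_STREAMER_ADD_DATA, and a close flag resets the state. The Transport interface, the class name, and the flag bit values are hypothetical and exist only for this illustration.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical client-side state for one server-side streamer. */
final class StatefulStreamerSession {
    /** Assumed abstraction over the two protocol operations. */
    interface Transport {
        /** Sends OP_DATA_STREAMER_START; returns resourceId (0 when closed). */
        long start(byte flags, List<Map.Entry<Object, Object>> batch);

        /** Sends OP_DATA_STREAMER_ADD_DATA for an existing streamer. */
        void addData(long resourceId, byte flags, List<Map.Entry<Object, Object>> batch);
    }

    static final int FLUSH = 0x08; // assumed bit value
    static final int CLOSE = 0x10; // assumed bit value

    private final Transport transport;
    private long resourceId; // 0 means no streamer is open on the server

    StatefulStreamerSession(Transport transport) {
        this.transport = transport;
    }

    /** Sends one batch, choosing the operation from the current state. */
    void send(List<Map.Entry<Object, Object>> batch, boolean flush, boolean close) {
        byte flags = (byte) ((flush ? FLUSH : 0) | (close ? CLOSE : 0));
        if (resourceId == 0) {
            // First batch: the server answers 0 if the close flag was set.
            resourceId = transport.start(flags, batch);
        } else {
            transport.addData(resourceId, flags, batch);
            if (close) {
                resourceId = 0; // streamer is gone; the next batch starts a new one
            }
        }
    }

    boolean isOpen() {
        return resourceId != 0;
    }
}
```

A stateless client is the degenerate case of the same logic: it always passes close = true, so every batch takes the start branch.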

Risks and Assumptions

  • Unlike the existing thick streamer API, options (allowOverwrite, etc.) cannot be changed after the streamer is started, because changing them mid-stream seems confusing. Every client-side implementation can decide on its own API, but it makes sense to remove setters from the DataStreamer interface, move all the options to a separate type such as DataStreamerOptions, and pass it once to igniteClient.dataStreamer(cacheName, options).
  • Buffer sizes can be matching or different on client and server sides.
    • Example 1: per-node buffer size is the same on partition-aware client and server. When client flushes the buffer, it gets flushed on the server right away.
    • Example 2: client-side buffer is small due to resource constraints, server-side buffer is bigger for better batching and performance.
  • Client API can expose both server-side and client-side buffer sizes as configuration parameters, or choose to hide them for simplicity
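
One possible shape for the options type mentioned above is an immutable value object, sketched here. Only the name DataStreamerOptions and the option names come from the text; the defaults (false for the boolean options, -1 meaning "server default" for buffer sizes) and the with... methods are assumptions for illustration, and client-side buffer sizes are omitted.

```java
/** Sketch of an immutable options holder passed once when a streamer is created. */
final class DataStreamerOptions {
    static final int SERVER_DEFAULT = -1; // matches "-1 for server default"

    final boolean allowOverwrite;
    final boolean skipStore;
    final boolean keepBinary;
    final int perNodeBufferSize;
    final int perThreadBufferSize;

    private DataStreamerOptions(boolean allowOverwrite, boolean skipStore, boolean keepBinary,
                                int perNodeBufferSize, int perThreadBufferSize) {
        this.allowOverwrite = allowOverwrite;
        this.skipStore = skipStore;
        this.keepBinary = keepBinary;
        this.perNodeBufferSize = perNodeBufferSize;
        this.perThreadBufferSize = perThreadBufferSize;
    }

    /** Assumed defaults: all flags off, server-side buffer sizes left to the server. */
    static DataStreamerOptions defaults() {
        return new DataStreamerOptions(false, false, false, SERVER_DEFAULT, SERVER_DEFAULT);
    }

    /** Returns a copy with allowOverwrite changed; the original is untouched. */
    DataStreamerOptions withAllowOverwrite(boolean value) {
        return new DataStreamerOptions(value, skipStore, keepBinary,
                perNodeBufferSize, perThreadBufferSize);
    }

    /** Returns a copy with the server-side per-node buffer size changed. */
    DataStreamerOptions withPerNodeBufferSize(int value) {
        return new DataStreamerOptions(allowOverwrite, skipStore, keepBinary,
                value, perThreadBufferSize);
    }
}
```

Because instances never change after construction, a streamer created from them cannot have its options altered after the start, which is the behavior the first bullet asks for.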

Discussion Links

Reference Links


