Moved to Icebox
A substantial portion of a New Client Server Protocol was implemented as described below. The project was tabled indefinitely soon afterward and lay dormant for quite a while. After discussing the matter on the dev email list, we decided to delete the experimental implementation from the "develop" branch. This was done in release 1.15, in GitHub SHA ae6b3ac1550bdaed75159fbbe360b6733e7e84ee.
See GEODE-8997
Introduction
Apache Geode is a data management platform that provides real-time, consistent access to data-intensive applications throughout widely distributed cloud architectures. While it currently has high-speed client interfaces for Java, C++ and .NET, there is both a need to create lighter-weight clients and a demand to access Geode from various other programming languages. Unfortunately, the existing client-server protocol is undocumented. It evolved over time and is too complex to meet either of these needs. That establishes the need for a new client-server protocol.
This proposal details the requirements, API and structure for a new client/server protocol. It does not specify the exact serialization mechanism, but the intent is for the protocol that is described to be complete in terms of the interface and message ordering. The intent is to make it pluggable so that we can experiment with different serialization formats based on varying performance and ease-of-use needs. In particular, we expect to use a widely-available IDL to serialize the protocol at first and make it accessible from many languages, and possibly implement a custom serialization later for clients needing very high performance. Choosing the IDL is an open goal.
The intent is to allow client functionality to be implemented in phases, moving from a "basic client" to a more advanced "smart client". The protocol should also be amenable to more modern APIs, such as those using asynchronous or reactive patterns.
Serialization of application keys, values, callback arguments, function parameters and so forth is a separate matter and is not necessarily tied to the serialization protocol used for client/server messaging. The initial protocol will support primitive types such as scalars, strings, and byte arrays. It will also support JSON documents as values and convert between these and Geode PDX-serialized objects in the servers.
Goals
The high-level goals for the protocol are defined here.
Protocol Requirements
In the evaluation or definition of any protocol, it is expected that the evaluated protocol or framework meets the following requirements:
Versioning: The protocol has to provide version information, in order to distinguish different protocol versions from one another.
Correlation Id: This number is a unique identifier that allows the system to correlate requests and responses.
Object Type: The serialization type of the data objects stored inside the messages.
Protocol Terms
Any binary protocol will require the following:
Version: Indicates the API version.
Request Type: Indicates which API the client wants to invoke.
Correlation Id: Relates a response to its request.
Response Type: Indicates whether a response is partial or complete.
ErrorCodes: Indicate the problem with an API invocation.
Streaming support: Supports large or continuous responses.
Request Format: The format of the API request and response.
Byte Order: Big Endian.
Request: The client's message.
Response: The server's message.
Message: Bytes in a defined format.
Connect
The new protocol will be integrated with the current Geode server. A new client driver connects to the Geode server by sending the byte "110".
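A minimal sketch of the connect step, assuming that "110" is a decimal value sent as a single raw byte immediately after the TCP connection opens. The function name, host and port are hypothetical and not part of the proposal.

```python
import socket

# Value taken from the text above; sending it as one raw byte is an assumption.
NEW_PROTOCOL_HANDSHAKE = bytes([110])


def open_connection(host: str, port: int) -> socket.socket:
    """Open a TCP connection and announce the new protocol to the server."""
    sock = socket.create_connection((host, port))
    sock.sendall(NEW_PROTOCOL_HANDSHAKE)
    return sock
```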
Request Type
The following table lists each request type and its corresponding id. The RequestTypeId identifies the API that the client wants to invoke on the server. The request format contains 2 bytes (int16) for the request type id, marked as requestTypeId in the request format.
RequestType | RequestTypeId |
---|---|
MetaDataConfigRequestType | 1 |
AuthenticationRequestType | 2 |
PutRequestType | 3 |
GetRequestType | 4 |
PutAllRequestType | 5 |
GetAllRequestType | 6 |
FunctionRequestType | 7 |
CreateRequestType | 8 |
InvalidateRequestType | 9 |
DestroyRequestType | 10 |
KeySetRequestType | 11 |
ValuesRequestType | 12 |
EntrySetRequestType | 13 |
ContainsValueForKeyRequestType | 14 |
ContainsKeyRequestType | 15 |
ContainsValueRequestType | 16 |
RemoveAllRequestType | 17 |
SizeRequestType | 18 |
PutIfAbsentRequestType | 19 |
RemoveIfValueIsSameRequestType | 20 |
ReplaceIfValueIsSameRequestType | 21 |
ReplaceIfValueExistType | 22 |
...
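As an illustration of the 2-byte request type id, the sketch below encodes and decodes a few of the ids from the table in big-endian order. The enum and function names are hypothetical, only a subset of the table is shown, and treating the int16 as signed is an assumption.

```python
import struct
from enum import IntEnum


class RequestType(IntEnum):
    """A subset of the request type ids listed in the table above."""
    PUT = 3
    GET = 4
    PUT_ALL = 5
    GET_ALL = 6


def encode_request_type(request_type: RequestType) -> bytes:
    # ">h" is a big-endian signed 16-bit integer.
    return struct.pack(">h", request_type)


def decode_request_type(data: bytes) -> RequestType:
    (type_id,) = struct.unpack(">h", data[:2])
    # Raises ValueError for ids that are not in this (partial) enum.
    return RequestType(type_id)
```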
Correlation Id
The purpose of the correlation id is to match a request with its corresponding response. The request format contains 4 bytes (int32) for the correlation id, marked as correlationId in the request format.
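One way a client might use the correlation id is to keep a table of pending requests keyed by id and look up the original request when a response arrives. This is only a sketch: the class and method names are hypothetical, and the proposal does not say how ids are generated.

```python
import itertools


class CorrelationTracker:
    """Hands out correlation ids and matches responses to pending requests."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._pending = {}

    def register(self, request):
        # Mask to keep the id within the positive int32 range (an assumption).
        correlation_id = next(self._next_id) & 0x7FFFFFFF
        self._pending[correlation_id] = request
        return correlation_id

    def resolve(self, correlation_id):
        # Returns the original request, or None if the id is unknown.
        return self._pending.pop(correlation_id, None)
```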
Object Type
We will support all the object types that Geode understands. This includes all the Java primitive types, arrays of primitive types, collections, Java serialization, DataSerializable, PDX serialization and custom user data serializers. For the purpose of the request format, we distinguish between the key type and the value type.
ObjectKeyType
Geode supports only a few object types as the region key. The region key will be marked as KeyObject in the request format.
ObjectValueType
All supported Geode types can be used as the region value. In the request format the region value is marked as ValueObject. The ValueObject is preceded by a ValueHeader, which represents the size of the serialized bytes of the ValueObject and occupies 5 bytes in the request format.
ResponseType
ResponseType indicates whether the response is partial or complete. A client can process a partial response. A response with the FullResponse type id indicates the completion of that request.
The response format contains 2 bytes (int16) for the response type, marked as FullResponse or PartialResponse in the response format.
ResponseType | ResponseTypeId |
---|---|
FullResponse | 1 |
PartialResponse | 2 |
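The sketch below shows one way a client could handle the two response types: it buffers the bodies of partial responses per correlation id and returns the combined bytes once a FullResponse arrives. The class name is hypothetical, and concatenating the bodies is an assumption. A client could equally process each partial response as soon as it arrives.

```python
FULL_RESPONSE = 1     # ResponseTypeId values from the table above
PARTIAL_RESPONSE = 2


class ResponseAssembler:
    """Collects partial responses until the matching full response arrives."""

    def __init__(self):
        self._parts = {}

    def on_response(self, correlation_id, response_type_id, body):
        if response_type_id not in (FULL_RESPONSE, PARTIAL_RESPONSE):
            raise ValueError(f"unknown response type id: {response_type_id}")
        self._parts.setdefault(correlation_id, []).append(body)
        if response_type_id == PARTIAL_RESPONSE:
            return None  # more responses are expected for this request
        # FullResponse marks completion: hand back everything collected.
        return b"".join(self._parts.pop(correlation_id))
```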
Error Codes
Error codes indicate a problem with the invocation of an API at the server. The following error codes cover the various issues at the server. The response format contains 2 bytes (int16) for the error code, marked as ErrorCode in the response format.
Exception Type | ErrorCode |
---|---|
AUTHENTICATION_REQUIRED_EXCEPTION | 1 |
AUTHORIZATION_FAILED_EXCEPTION | 2 |
AUTHENTICATION_FAILED_EXCEPTION | 3 |
BUCKET_MOVED_EXCEPTION | 4 |
SERIALIZATION_EXCEPTION | 5 |
INTERRUPTED_EXCEPTION | 6 |
ILLEGAL_ARGUMENT_EXCEPTION | 7 |
ILLEGAL_STATE_EXCEPTION | 8 |
TIMEOUT_EXCEPTION | 9 |
CACHE_WRITER_EXCEPTION | 10 |
REGION_EXIST_EXCEPTION | 11 |
REGION_NOT_EXIST_EXCEPTION | 12 |
LEASE_EXPIRED_EXCEPTION | 13 |
CACHE_LOADER_EXCEPTION | 14 |
REGION_DESTROYED_EXCEPTION | 15 |
ENTRY_DESTROYED_EXCEPTION | 16 |
ENTRY_NOT_FOUND_EXCEPTION | 17 |
FUNCTION_NOT_FOUND_EXCEPTION | 18 |
FUNCTION_ATTRIBUTE_MISMATCH_EXCEPTION | 19 |
FUNCTION_EXECUTION_EXCEPTION | 20 |
CONCURRENT_MODIFICATION_EXCEPTION | 21 |
UNKNOWN_EXCEPTION | 22 |
CLASS_CAST_EXCEPTION | 23 |
GEODE_IO_EXCEPTION | 24 |
NULL_POINTER_EXCEPTION | 25 |
ENTRY_EXIST_EXCEPTION | 26 |
DISK_ACCESS_EXCEPTION | 27 |
QUERY_EXCEPTION | 28 |
CACHE_CLOSED_EXCEPTION | 29 |
MESSAGE_FORMAT_EXCEPTION | 30 |
CACHE_LISTENER_EXCEPTION | 31 |
CQ_EXCEPTION | 32 |
CQ_CLOSED_EXCEPTION | 33 |
CQ_QUERY_EXCEPTION | 34 |
CQ_EXIST_EXCEPTION | 35 |
CQ_INVALID_EXCEPTION | 36 |
INVALID_DELTA_EXCEPTION | 37 |
TRANSACTION_EXCEPTION | 38 |
TRANSACTION_DATA_NODE_DEPARTED_EXCEPTION | 39 |
TRANSACTION_REBALANCED_EXCEPTION | 40 |
COMMIT_CONFLICT_EXCEPTION | 41 |
PUTALL_PARTIAL_RESULT_EXCEPTION | 42 |
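A client library would typically translate the numeric error code into a language-level error. The sketch below does this for a handful of codes from the table. The exception class and function are hypothetical, and falling back to UNKNOWN_EXCEPTION for unrecognized codes is an assumption rather than something the proposal specifies.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    """A subset of the error codes listed in the table above."""
    TIMEOUT_EXCEPTION = 9
    REGION_NOT_EXIST_EXCEPTION = 12
    ENTRY_NOT_FOUND_EXCEPTION = 17
    UNKNOWN_EXCEPTION = 22


class GeodeProtocolError(Exception):
    """Hypothetical client-side exception carrying the server's error code."""

    def __init__(self, code: ErrorCode):
        super().__init__(f"server returned {code.name} ({code.value})")
        self.code = code


def error_from_code(raw_code: int) -> GeodeProtocolError:
    try:
        code = ErrorCode(raw_code)
    except ValueError:
        # Assumption: codes this client does not know are reported as unknown.
        code = ErrorCode.UNKNOWN_EXCEPTION
    return GeodeProtocolError(code)
```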
Message(Framing)
A message is a series of bytes that contains a request or a response. If the message is large, it needs to be divided into smaller messages. A message can then be sent in the following way.
Message --> MessageHeader (Request | Response) |
---|
MessageHeader --> Size PartialMessage CorrelationId |
Size --> int32 (Size of request or response) |
PartialMessage --> boolean (isMessageCompleted) |
CorrelationId --> int32 (to correlate request and response) |
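The header layout above can be sketched with Python's `struct` module. Several details are assumptions here because the grammar leaves them open: the boolean is encoded as one byte, Size counts only the payload bytes carried in that fragment, and the PartialMessage flag is true on every fragment except the last. The function names are hypothetical.

```python
import struct

# Size (int32), PartialMessage (1-byte boolean), CorrelationId (int32), big-endian.
HEADER = struct.Struct(">i?i")


def frame_message(payload: bytes, correlation_id: int, max_fragment: int):
    """Split a payload into framed fragments of at most max_fragment bytes."""
    fragments = []
    # max(..., 1) ensures an empty payload still yields one (empty) fragment.
    for start in range(0, max(len(payload), 1), max_fragment):
        chunk = payload[start:start + max_fragment]
        is_last = start + max_fragment >= len(payload)
        # Assumption: the flag is True while more fragments follow.
        header = HEADER.pack(len(chunk), not is_last, correlation_id)
        fragments.append(header + chunk)
    return fragments


def parse_header(data: bytes):
    """Return (size, partial, correlation_id) from the start of a fragment."""
    return HEADER.unpack_from(data)
```

All fragments of one message carry the same correlation id, so a receiver can group them before decoding the request or response.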
Request Format
Request --> RequestType Version hasMetaData [MetaData] RequestAPI |
---|
RequestType --> RequestTypeId |
Version --> int16 (api version) |
hasMetaData --> boolean (if there is any meta data associated with this request) |
MetaData --> optional |
RequestAPI --> (PutRequest | GetRequest | PutAllRequest | GetAllRequest) |
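To make the field order concrete, the sketch below packs the fixed part of a request: RequestTypeId (int16), Version (int16) and hasMetaData (boolean), followed by the optional metadata and the API-specific body. The encodings of MetaData and RequestAPI are not defined in this section, so both are treated as opaque bytes here. The function name and the 1-byte boolean are assumptions.

```python
import struct


def encode_request(request_type_id: int, version: int, body: bytes,
                   metadata: bytes = b"") -> bytes:
    """Build Request = RequestType Version hasMetaData [MetaData] RequestAPI."""
    has_metadata = len(metadata) > 0
    # ">hh?" = big-endian int16, int16, 1-byte boolean.
    prefix = struct.pack(">hh?", request_type_id, version, has_metadata)
    # MetaData is only present when the flag is set.
    return prefix + metadata + body
```

For example, `encode_request(4, 1, key_bytes)` would produce a GetRequest prefix (request type id 4 from the table) for API version 1, where `key_bytes` stands in for an already-serialized request body.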
Response Format
Response --> (ResponseTypeId | ErrorCode) hasMetaData [MetaData] APIResponse |
---|
ResponseTypeId --> int16(codes defined above) |
ErrorCode --> int16 (codes defined above) |
hasMetaData --> boolean (if there is any meta data associated with this response) |
MetaData --> Optional |
APIResponse --> (PutResponse | GetResponse | PutAllResponse | GetAllResponse) |
Chunk Response: The ability to send a large response in multiple smaller, more manageable chunks.
Continuous Response: A client can register for events (Observer pattern), and the server then notifies the client when those events occur.
Request: The request message to be sent
Response: The response message received in relation to a request message
Request Format: Format of request API and its parameters, which client wants to invoke.
Response Format: Format for API return value, which client invoked.
Message: The generic construct that represents a message that is to be sent which contains a Message Header and Request/Response.
Serialized Byte Order: Big Endian
RPC and Message serialization Frameworks
During the investigation into frameworks to help "lower the barrier of entry," it became evident that there are two types of external frameworks:
- Message Serialization Frameworks - These frameworks allow for the definition of a message in a generic IDL, the generation of language-specific classes from the IDL, and the encoding/decoding of those messages to be sent over a transport.
- RPC Frameworks - These frameworks provide greater coverage of node-to-node communication: the transport layer (HTTP, TCP, UDP), the message definition in IDL with a corresponding serialization mechanism, the definition of methods in the IDL, and the generation of corresponding service stubs.
- Message serialization frameworks define the encoding/decoding of defined messages but not the transport or connectivity.
- RPC frameworks concern themselves with connectivity and transport, remote method invocations and the encoding/decoding of defined messages.
Because this protocol needs to be tunable for very high performance, because RPC frameworks lack some required functionality, and because they hide their network and threading internals, it was decided that option 2 (RPC frameworks) was not viable. See a comparison at RPC framework evaluation.
From a higher-level architectural perspective we can identify two layers:
- A Transport Layer (TCP, UDP, HTTP, etc.)
- Message encoding/decoding Layer
This proposal defines the message structure and protocol so that they are agnostic of the transport used.
Message Structure Definition
All details relating to the Message structure definition can be found on the page Message Structure and Definition.
Proposed Implementation Phases
Introducing a new protocol into Geode has the potential to be highly disruptive. In order to minimize the disruption and maximize the feedback cycles, we suggest implementing the changes in a phased approach. To view the milestones for each phase, please see the page Phases and Milestones.
Example messages
To better visualize the protocol messages, a few sample messages have been provided on the page Protocol Message Examples.