Background

Protocol

We've been writing a new client-server protocol to serve as a public API for building new Geode clients. We settled on Protobuf as the encoding for the protocol because it frees client authors from most of the low-level encoding details, which makes writing clients significantly easier.

Encoding

One of the big challenges in designing a new protocol has been how to encode values. Like the old binary client protocol, the PDX encoding is complicated, underdocumented, and stateful. However, we need a way to send values.

The first approach was to use the JSON-PDX conversion that is already used for the REST API. Many languages have libraries to encode objects as JSON, but this approach has downsides, chief among them that it is slow.

Proposal

Regardless of which option is chosen, users should be able to plug in their own object encoding by registering a handler on the server. The handler will receive a byte array and return an Object. This allows users to do custom serialization if desired.
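A minimal sketch of what such a pluggable handler could look like in Java. The names ValueDecoder and DecoderRegistry and the format name "utf8-string" are hypothetical and not part of Geode; the only thing taken from the proposal is that a handler receives a byte array and returns an Object.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical handler interface: receives the raw bytes, returns the decoded Object.
interface ValueDecoder {
  Object decode(byte[] encoded);
}

// Hypothetical server-side registry, keyed by a format name chosen by the user.
final class DecoderRegistry {
  private final Map<String, ValueDecoder> decoders = new ConcurrentHashMap<>();

  void register(String formatName, ValueDecoder decoder) {
    decoders.put(formatName, decoder);
  }

  Object decode(String formatName, byte[] encoded) {
    ValueDecoder decoder = decoders.get(formatName);
    if (decoder == null) {
      throw new IllegalArgumentException("No decoder registered for format: " + formatName);
    }
    return decoder.decode(encoded);
  }
}

final class DecoderRegistryExample {
  public static void main(String[] args) {
    DecoderRegistry registry = new DecoderRegistry();
    // Illustrative format only: the payload is a UTF-8 encoded string.
    registry.register("utf8-string", bytes -> new String(bytes, StandardCharsets.UTF_8));
    Object value = registry.decode("utf8-string", "Amy".getBytes(StandardCharsets.UTF_8));
    System.out.println(value);
  }
}
```

Keying the registry by a format name is one possible design; it would let the format requested during the handshake select the handler.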

Option 1: Struct encoding

Protobuf ships with a message type, Struct, defined in struct.proto, that can be recursively nested to encode JSON.

By extending the same sort of structure, we can encode almost any value whose type is supported, and we can add support for more complex types such as dates or UUIDs. Client authors can write their own encoders, and driver developers can write auto-serializers that use reflection to serialize objects to Protobuf, much as objects are currently serialized to JSON:

Protobuf "struct"
syntax = "proto3";

// Needed for google.protobuf.NullValue.
import "google/protobuf/struct.proto";

message Struct {
  string typeName = 1;
  repeated StructEntry entries = 2;
}
message StructEntry {
  string fieldName = 1;
  oneof value {
    int32 intField = 2;
    int64 longField = 3;
    // Protobuf has no 16-bit integer type, so shorts are carried as int32.
    int32 shortField = 4;
    // Protobuf has no byte type either, so single bytes are carried as int32.
    int32 byteField = 5;
    bool booleanField = 6;
    double doubleField = 7;
    float floatField = 8;
    bytes binaryField = 9;
    string stringField = 10;
    google.protobuf.NullValue nullField = 11;
    // Field serialized using a custom serialization format. This can only be used if
    // a HandshakeRequest is sent with valueFormat set to a valid format.
    //
    // See HandshakeRequest.valueFormat.
    bytes customObjectField = 12;
  }
}

 

The typeName field lets other clients recognize the same type. Internally, it will be stored in the PDXInstance that the Struct is converted to, but that detail shouldn't need to be exposed to the user.

So for example, given the following class and value:

class User {
  String name;
  int age;

  User(String name, int age) {
    this.name = name;
    this.age = age;
  }
}

User value = new User("Amy", 44);

would encode as (using a pseudo-static initializer syntax):

Struct{
  typeName: "<packagename?>User",
  entries: [
    StructEntry{
      fieldName: "name",
      value: stringField{"Amy"}
    },
    StructEntry{
      fieldName: "age",
      value: intField{44}
    }
  ]
}

Protobuf serializes this structure to a compact binary form, which makes the encoding more efficient than JSON.

Ideally, a driver developer would provide annotations or a registration mechanism so that client developers can register their types. In languages that use setters and getters by convention, it would probably be more idiomatic to use the setters and getters for reflection rather than the member variables of the object.
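As a rough illustration of the reflection-based auto-serializer idea, the sketch below walks the instance fields of an object and copies them into a Struct-like value. StructSketch, StructEntrySketch, and ReflectiveStructEncoder are hypothetical plain-Java stand-ins; a real driver would presumably build the classes generated by protoc from the Struct definition, and would also need to handle nested objects, inherited fields, and unsupported types, none of which is attempted here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the StructEntry message.
final class StructEntrySketch {
  final String fieldName;
  final Object value;

  StructEntrySketch(String fieldName, Object value) {
    this.fieldName = fieldName;
    this.value = value;
  }
}

// Hypothetical stand-in for the Struct message.
final class StructSketch {
  final String typeName;
  final List<StructEntrySketch> entries = new ArrayList<>();

  StructSketch(String typeName) {
    this.typeName = typeName;
  }
}

final class ReflectiveStructEncoder {
  // Copies each declared instance field of obj into one entry.
  // Field order follows getDeclaredFields(), which the JDK does not guarantee.
  static StructSketch encode(Object obj) {
    Class<?> type = obj.getClass();
    StructSketch struct = new StructSketch(type.getName());
    for (Field field : type.getDeclaredFields()) {
      if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
        continue; // only instance state belongs in the encoded value
      }
      field.setAccessible(true);
      try {
        struct.entries.add(new StructEntrySketch(field.getName(), field.get(obj)));
      } catch (IllegalAccessException e) {
        throw new IllegalStateException("Cannot read field " + field.getName(), e);
      }
    }
    return struct;
  }
}

// The example class from the text above.
final class User {
  String name;
  int age;

  User(String name, int age) {
    this.name = name;
    this.age = age;
  }
}

final class ReflectiveStructEncoderExample {
  public static void main(String[] args) {
    StructSketch struct = ReflectiveStructEncoder.encode(new User("Amy", 44));
    for (StructEntrySketch entry : struct.entries) {
      System.out.println(entry.fieldName + " = " + entry.value);
    }
  }
}
```

A getter-based variant would iterate over public methods instead of fields; which one is more idiomatic depends on the client language, as noted above.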

Option 2: Type registration

As an optimization to avoid sending field names with every message, allow clients to register types to communicate the metadata for data they are about to send. The server will give back an ID for that datatype, and the ID can be used in future messages to refer to the metadata without retransmitting that metadata.

Type registration will be per-connection (meaning IDs cannot be cached between connections). This eliminates the need for synchronization on the server and decouples client registrations from the internal details of PDX. It also means that clients only have to keep track of a relatively small amount of data.

The outline of type registration for the client is this:

  1. Send a type registration request containing the type definition
  2. Get back a type ID that references the type description
  3. Use that type ID when encoding values of that type
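The three steps above can be sketched with an in-memory stand-in for the server side of a single connection. ConnectionTypeRegistry and EncodedValue are hypothetical names used only for illustration, and returning the existing ID when a type name is registered twice is an assumption, not something the proposal specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Step 3: a value carries only the type ID and the field values, in definition order.
final class EncodedValue {
  final int typeId;
  final List<Object> fields;

  EncodedValue(int typeId, List<Object> fields) {
    this.typeId = typeId;
    this.fields = fields;
  }
}

// Hypothetical registry scoped to one connection, so IDs mean nothing elsewhere.
final class ConnectionTypeRegistry {
  private final Map<String, Integer> idsByTypeName = new HashMap<>();
  private final List<List<String>> fieldNamesById = new ArrayList<>();

  // Steps 1 and 2: accept a type description and hand back its ID.
  int register(String typeName, List<String> fieldNames) {
    Integer existing = idsByTypeName.get(typeName);
    if (existing != null) {
      return existing; // assumption: re-registering a name returns the same ID
    }
    int id = fieldNamesById.size();
    fieldNamesById.add(new ArrayList<>(fieldNames));
    idsByTypeName.put(typeName, id);
    return id;
  }

  // Pairs the positional field values with the registered field names.
  Map<String, Object> decode(EncodedValue value) {
    List<String> names = fieldNamesById.get(value.typeId);
    Map<String, Object> result = new LinkedHashMap<>();
    for (int i = 0; i < names.size(); i++) {
      result.put(names.get(i), value.fields.get(i));
    }
    return result;
  }
}

final class TypeRegistrationExample {
  public static void main(String[] args) {
    ConnectionTypeRegistry registry = new ConnectionTypeRegistry();
    int userTypeId = registry.register("User", Arrays.asList("name", "age"));
    EncodedValue amy = new EncodedValue(userTypeId, Arrays.<Object>asList("Amy", 44));
    System.out.println(registry.decode(amy));
  }
}
```

Because field names travel only once per connection, each subsequent value costs one integer plus the field values themselves.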

The message for sending a type definition will look like this:

Protobuf Type registration
message TypeRegistrationRequest {
  ValueTypeDefinition typeDefinition = 1;
}
message TypeRegistrationResponse {
  int32 typeID = 1;
}

message ValueTypeDefinition {
  string typeName = 1;
  repeated ValueTypeFieldDefinition definition = 2;
}
message ValueTypeFieldDefinition {
  string fieldName = 1;
  FieldType fieldType = 2;
  // Enum values need explicit numbers, and the first value must be 0 in proto3.
  enum FieldType {
    intField = 0;
    longField = 1;
    shortField = 2;
    byteField = 3;
    booleanField = 4;
    doubleField = 5;
    floatField = 6;
    binaryField = 7;
    stringField = 8;
    // no JSON?  string jsonObjectField;
    // Field serialized using a custom serialization format. This can only be used if
    // a HandshakeRequest is sent with valueFormat set to a valid format.
    //
    // See HandshakeRequest.valueFormat.
    customObjectField = 9;
  }
}

and for sending values:

Protobuf Values
message Value {
  int32 typeID = 1;
  repeated ValueField field = 2;
}
message ValueField {
  oneof value {
    int32 intField = 1;
    int64 longField = 2;
    // Protobuf has no 16-bit integer type, so shorts are carried as int32.
    int32 shortField = 3;
    // Protobuf has no byte type either, so single bytes are carried as int32.
    int32 byteField = 4;
    bool booleanField = 5;
    double doubleField = 6;
    float floatField = 7;
    bytes binaryField = 8;
    string stringField = 9;
    // Requires importing google/protobuf/struct.proto.
    google.protobuf.NullValue nullField = 11;
    // Field serialized using a custom serialization format. This can only be used if
    // a HandshakeRequest is sent with valueFormat set to a valid format.
    //
    // See HandshakeRequest.valueFormat.
    bytes customObjectField = 12;
  }
}

The client sends a registration request, and the server assigns the typeID and returns it in the response.

If a server sends back a value of a type a client has not registered, the client can send a TypeDefinitionLookupRequest:

message TypeDefinitionLookupRequest {
  int32 typeId = 1;
}
message TypeDefinitionLookupResponse {
  int32 typeId = 1;
  string fieldName = 2;
  ValueTypeDefinition typeDefinition = 3;
}

This way a client can implement logic to find the correct type and deserialize the value.
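One way a client might structure that logic is a small per-connection cache that falls back to a lookup when it meets an unknown type ID. In this sketch ClientTypeCache is a hypothetical name, and the IntFunction passed to it stands in for the TypeDefinitionLookupRequest round trip; only the field names of the definition are modelled.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical client-side cache of type definitions for one connection.
final class ClientTypeCache {
  private final Map<Integer, List<String>> fieldNamesById = new HashMap<>();
  private final IntFunction<List<String>> lookup; // stands in for the lookup request
  private int lookupCount;

  ClientTypeCache(IntFunction<List<String>> lookup) {
    this.lookup = lookup;
  }

  // Returns the field names for typeId, asking the server only on a cache miss.
  List<String> fieldNamesFor(int typeId) {
    List<String> known = fieldNamesById.get(typeId);
    if (known == null) {
      known = lookup.apply(typeId);
      lookupCount++;
      fieldNamesById.put(typeId, known);
    }
    return known;
  }

  int lookupCount() {
    return lookupCount;
  }
}

final class ClientTypeCacheExample {
  public static void main(String[] args) {
    // Fake server: every ID resolves to the User definition from the earlier example.
    ClientTypeCache cache = new ClientTypeCache(id -> Arrays.asList("name", "age"));
    System.out.println(cache.fieldNamesFor(7));
    System.out.println(cache.fieldNamesFor(7)); // served from the cache
    System.out.println("lookups sent: " + cache.lookupCount());
  }
}
```

Since type IDs are per-connection, such a cache would have to be discarded whenever the connection is re-established.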
