Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

 

Purpose 

The purpose of this document to describe the native serialization of a various data type types which Geode understandsunderstand

Data Types

Geode supports all the Java primitive data types and Java arrays and collections.. Custom Java object objects can be serialized through Geode PdxSerializable and DataSerializable interfaces. The Application can attach its own data serializer through Geode the Geode DataSerializer and Geode and PdxSerializer interface. Using Geode, the application can also serialize arrays of primitive java types and java collections. It also understands the Java Serializable interfaces. Objects implementing java.io.Serializable can be serialized with Geode as well.

In Geode, every supported data type is associated with a single-byte type Id. Type Id is represented by one byte. To serialize any date type Geode first writes the typeId. Then For variable sized objects, it then writes the length of serialized byte for variable typesthe serialized object followed by the serialized bytes. For fixed data types it just write those writes the serialized bytes in the Big Indian Endian byte order.

 For native serialization, strings are serialized using the Java Modified UTF-8 Format.

     
Data TypeGeode Region Key TypesType IdValueSerialized BytesDescription
Null 41 = 0x29null
typeIDtypeId0x29
 
Boolean YES53 = 0x35true
typeId0x35
bytes0x01
 
Character YES54 = 0x36'a'
typeId0x36
 bytes 

0x00 0x61

 
Byte YES55 = 0x37 1
typeId0x37
 bytes 0x01
 
Short YES56 = 0x38 1000
typeId0x38
 bytes 0x03 0xE8
 
Integer YES57 = 0x39 1000
typeId0x39
bytes0x00 0x00 0x03 0xE8  
 
Long YES58 = 0x3A 1000
typeId0x3A
bytes0x00 0x00 0x00 0x00 0x00 0x00 0x03 0xE8  
 
Float YES59 = 0x3B 1000f
typeId0x3B
bytes

0x44 0x7A 0x00 0x00

  

 
Double YES60 = 0x3C 1000d



typeId0x3C
bytes

0x40 0xF1 0x40 0x00 0x00 0x00 0x00 0x00

 

  
ASCII_STRINGStringYES87 = 0x57"hello"
typeid0x57
len0x00 0x05
bytes0x68 0x65 0x40 0x40 0x6F
 
This represents ASCII string with maximum length 0xFFFF. Code snippet to serialize and deserialize the string.
UTF_STRING YES42 = 0x2A   This represents UTF string with maximum length 0xFFFF. Code snippet to serialize and deserialize the string.
HUGE_ASCII_STRING YES88 = 0x58   This represents ASCII string with length greater than 0xFFFF. Code snippet to serialize and deserialize the string.
HUGE_UTF_STRING YES89 = 0x59   This represents UTF string with length greater than 0xFFFF. Code snippet to serialize and deserialize the string.
byte[] This we plan to support46 = 0x2E byte[] {1,2}
 
typeId0x2E 
len0x02
bytes0x01 0x02
 
short[] 47 = 0x2F short[] {1,2}
typeId0x2F
len  0x02
bytes

0x00 0x01 0x00 0x02

 
int[] 48 = 0x30 int[] {1,2}
typeId0x30
len  0x02
bytes

0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x02

 
long[] 49 = 0x31 long[] {1}
typeId0x31
len  0x01
bytes

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01

 
float[] 50 = 0x32 float[] {2.0f}
typeId0x32
len  0x01
bytes

0x40 0x00 0x00 0x00

 
double[] 51 = 0x33 double[] {2.0d}
 
typeId0x33
len0x01 
bytes

0x40 0x00 0x00 0x00 0x00 0x00 0x00 0x00

 
string[] 64 = 0x40  String[]{"hello", "world"}
typeId0x40
len  0x02

"hello"

bytes

0x57 - ASCII_STRING

0x00 0x05

0x68 0x65 0x6c 0x6c 0x6f

 

Map67 = 0x43 
0x43

"world"

bytes

0x57 -  ASCII_STRING

0x00 0x05 - len

0x77 0x6f 0x72 0x6c 0x64

typeId


  
Map Set66 = 0x4267 = 0x43

Map s = new HashMap<>();

 s.put("hello", "world")

 

typeId0x420x43
len  0x01
 

"hello"

List  

bytes

0x57 - ASCII_STRING

0x00 0x05

0x68 0x65 0x6c 0x6c 0x6f

 

"world"

bytes

0x57 -  

ASCII_STRING

0x00 0x05 - len

0x77 0x6f 0x72 0x6c 0x64

ArrayList

 
Set 
  
 
PDX_SERIALIZATION    
PDX_SERIALIZER    
DATA_SERIALIZATION     
USER_SERIALIZATION     

 

 

  1. Null (41 = 0x29): Null object will be represented by single byte.

 

 

...

0x41 (typeid)

 

  1. Boolean (53 = 0x35) : Boolean object will be represented by typeId and value byte. Following demonstrate the serialize representation boolean value true.

 

 

...

0x52 (typeid)

...

0x01

 

  1. Character (54 = 0x36) : Character object will be represented by typeId and two bytes. Following demonstrate the serialize representation of character ‘a’.

 

 

...

0x54 (typeid)

...

0x00

...

0x61

 

  1. Byte (55 = 0x37) : Byte object will represted by typeId and  value byte. Following demonstrate the serialize representation of 1’.

 

 

...

0x55 (typeid)

...

0x01

 

  1. Short (56 = 0x38): Following will be serialized representation of 1000(0x3E8)

 

...

0x38 (typeid)

...

0x03

...

0xE8

 

 

  1. Integer (57 = 0x39): Following will be serialized representation of 1000(0x3E8)

 

 

...

0x39

...

0x00

...

0x00

...

0x03

...

0xE8

 

  1. Long (58 = 0x3A): Following will be serialized representation of 1000(0x3E8)

 

 

...

0x58

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

...

0x03

...

0xE8

 

  1. Float (59 = 0x3B)

 float s = 1000f;

[59, 68, 122, 0, 0]

 

 

...

0x3B (typeid)

...

0x44

...

0x7A

...

0x00

...

0x00

 

  1. Double (60 = 0x3C)

 double s = 1000d;

 [60, 64, -113, 64, 0, 0, 0, 0, 0]

 

 

...

0x3C

...

0x40

...

0xF1

...

0x40

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

 

  1. String (42 = 0x2A)

    1. STRING (87 = 0x57) ASCII; Length is 2 bytes

String s = "hello";

 [87, 0, 5, 104, 101, 108, 108, 111]

 

 

...

0x57

...

0x00(len)

...

0x05(len)

...

0x68

...

0x65

...

0x40

...

0x40

...

0x6F

 

66 = 0x42

Set s = new HashSet();

 s.add("hello");

s.add("world");


typeId0x42
len 0x02

"hello"

bytes

0x57 - ASCII_STRING

0x00 0x05

0x68 0x65 0x6c 0x6c 0x6f

"world"

bytes

0x57 -  ASCII_STRING

0x00 0x05 - len

0x77 0x6f 0x72 0x6c 0x64

 
List 10 = 0x0a

List s = new LinkedList();
s.add("hello");
s.add("world");

typeId0x0a
len 0x02

"hello"

bytes

0x57 - ASCII_STRING

0x00 0x05

0x68 0x65 0x6c 0x6c 0x6f

"world"

bytes

0x57 -  ASCII_STRING

0x00 0x05 - len

0x77 0x6f 0x72 0x6c 0x64

 
ArrayList 65=0x41

List s = new ArrayList();
s.add("hello");
s.add("world");

typeId0x41
len 0x02

"hello"

bytes

0x57 - ASCII_STRING

0x00 0x05

0x68 0x65 0x6c 0x6c 0x6f

"world"

bytes

0x57 -  ASCII_STRING

0x00 0x05 - len

0x77 0x6f 0x72 0x6c 0x64

 
PDX_SERIALIZATION 93=0x5D  Java Object can implement PdxSerializable interface to serialize data in pdx format
PDX_SERIALIZER    

The application can implement PdxSerializer interface and then install with geode cache

to serialize data in pdx format.

DATA_SERIALIZATION     Java Object can implement DataSerializable interface to serialize data.
USER_SERIALIZATION     

The application can implement DataSerializer interface and then install with geode cache

to serialize data.

JAVA_SERIALIZATIOn44= 0x2C    


Calculate Collection/Array Size

Geode calculates the size of collection or Array in following way

Write Array Length functionRead Array Length Function
Code Block
 public static final byte NULL_ARRAY = -1; // array is null
  /**
   * @since GemFire 5.7
   */
  private static final byte SHORT_ARRAY_LEN = -2; // array len encoded as unsigned short in next 2
                                                  // bytes
  /**
   * @since GemFire 5.7
   */
  public static final byte INT_ARRAY_LEN = -3; // array len encoded as int in next 4 bytes
  private static final int MAX_BYTE_ARRAY_LEN = ((byte) -4) & 0xFF;
  public static void writeArrayLength(int len, DataOutput out) throws IOException {
    if (len == -1) {
      out.writeByte(NULL_ARRAY);
    } else if (len <= MAX_BYTE_ARRAY_LEN) {
      out.writeByte(len);
    } else if (len <= 0xFFFF) {
      out.writeByte(SHORT_ARRAY_LEN);
      out.writeShort(len);
    } else {
      out.writeByte(INT_ARRAY_LEN);
      out.writeInt(len);
    }
  }
Code Block
  public static int readArrayLength(DataInput in) throws IOException {
    byte code = in.readByte();
    if (code == NULL_ARRAY) {
      return -1;
    } else {
      int result = ubyteToInt(code);
      if (result > MAX_BYTE_ARRAY_LEN) {
        if (code == SHORT_ARRAY_LEN) {
          result = in.readUnsignedShort();
        } else if (code == INT_ARRAY_LEN) {
          result = in.readInt();
        } else {
          throw new IllegalStateException("unexpected array length code=" + code);
        }
      }
      return result;
    }
  }

 

Write And Read String

The String is serialized in following way. It distinguishes ASCII string and its size.

Write StringRead String
Code Block
public static void writeString(String value, DataOutput out) throws IOException {
    InternalDataSerializer.checkOut(out);
    final boolean isDebugEnabled = logger.isTraceEnabled(LogMarker.SERIALIZER);
    if (isDebugEnabled) {
      logger.trace(LogMarker.SERIALIZER, "Writing String \"{}\"", value);
    }
    if (value == null) {
      if (isDebugEnabled) {
        logger.trace(LogMarker.SERIALIZER, "Writing NULL_STRING");
      }
      out.writeByte(DSCODE.NULL_STRING);
    } else {
      // [bruce] writeUTF is expensive - it creates a char[] to fetch
      // the string's contents, iterates over the array to compute the
      // encoded length, creates a byte[] to hold the encoded bytes,
      // iterates over the char[] again to create the encode bytes,
      // then writes the bytes. Since we usually deal with ISO-8859-1
      // strings, we can accelerate this by accessing chars directly
      // with charAt and fill a single-byte buffer. If we run into
      // a multibyte char, we revert to using writeUTF()
      int len = value.length();
      int utfLen = len; // added for bug 40932
      for (int i = 0; i < len; i++) {
        char c = value.charAt(i);
        if ((c <= 0x007F) && (c >= 0x0001)) {
          // nothing needed
        } else if (c > 0x07FF) {
          utfLen += 2;
        } else {
          utfLen += 1;
        }
        // Note we no longer have an early out when we detect the first
        // non-ascii char because we need to compute the utfLen for bug 40932.
        // This is not a performance problem because most strings are ascii
        // and they never did the early out.
      }
      boolean writeUTF = utfLen > len;
      if (writeUTF) {
        if (utfLen > 0xFFFF) {
          if (isDebugEnabled) {
            logger.trace(LogMarker.SERIALIZER, "Writing utf HUGE_STRING of len={}", len);
          }
          out.writeByte(DSCODE.HUGE_STRING);
          out.writeInt(len);
          out.writeChars(value);
        } else {
          if (isDebugEnabled) {
            logger.trace(LogMarker.SERIALIZER, "Writing utf STRING of len={}", len);
          }
          out.writeByte(DSCODE.STRING);
          out.writeUTF(value);
        }
      } else {
        if (len > 0xFFFF) {
          if (isDebugEnabled) {
            logger.trace(LogMarker.SERIALIZER, "Writing HUGE_STRING_BYTES of len={}", len);
          }
          out.writeByte(DSCODE.HUGE_STRING_BYTES);
          out.writeInt(len);
          out.writeBytes(value);
        } else {
          if (isDebugEnabled) {
            logger.trace(LogMarker.SERIALIZER, "Writing STRING_BYTES of len={}", len);
          }
          out.writeByte(DSCODE.STRING_BYTES);
          out.writeShort(len);
          out.writeBytes(value);
        }
      }
    }
  }
Code Block
  public static String readString(DataInput in, byte header) throws IOException {
    if (header == DSCODE.STRING_BYTES) {
      int len = in.readUnsignedShort();
      if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
        logger.trace(LogMarker.SERIALIZER, "Reading STRING_BYTES of len={}", len);
      }
      byte[] buf = new byte[len];
      in.readFully(buf, 0, len);
      return new String(buf, 0); // intentionally using deprecated constructor
    } else if (header == DSCODE.STRING) {
      if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
        logger.trace(LogMarker.SERIALIZER, "Reading utf STRING");
      }
      return in.readUTF();
    } else if (header == DSCODE.NULL_STRING) {
      if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
        logger.trace(LogMarker.SERIALIZER, "Reading NULL_STRING");
      }
      return null;
    } else if (header == DSCODE.HUGE_STRING_BYTES) {
      int len = in.readInt();
      if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
        logger.trace(LogMarker.SERIALIZER, "Reading HUGE_STRING_BYTES of len={}", len);
      }
      byte[] buf = new byte[len];
      in.readFully(buf, 0, len);
      return new String(buf, 0); // intentionally using deprecated constructor
    } else if (header == DSCODE.HUGE_STRING) {
      int len = in.readInt();
      if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
        logger.trace(LogMarker.SERIALIZER, "Reading HUGE_STRING of len={}", len);
      }
      char[] buf = new char[len];
      for (int i = 0; i < len; i++) {
        buf[i] = in.readChar();
      }
      return new String(buf);
    } else {
      String s = "Unknown String header " + header;
      throw new IOException(s);
    }
  }
    1. HUGE_STRING (88 = 0x58) ASCII Length is 4 bytes

    2. UTF_STRING (42 = 0x2A) UTF; Length is 2 bytes

    3. HUGE_UTF_STRING (89 = 0x59) UTF; Length is 4 bytes

  1. Array (52 = 0x34) ??

  2. BYTE_ARRAY (46 = 0x2E)

 byte[] {1,2};

 

 

...

0x2E

...

0x02(len)

...

0x01

...

0x02

 

  1. SHORT_ARRAY (47 = 0x2F)

 short[] {1,2};

 

 

...

0x2F

...

0x02(len)

...

0x00

...

0x01

...

0x00

...

0x02

 

  1. INTEGER_ARRAY (48 = 0x30)

 int[] {1,2};

 

 

...

0x30

...

0x02(len)

...

0x00

...

0x00

...

0x00

...

0x01

...

0x00

...

0x00

...

0x00

...

0x02

 

  1. LONG_ARRAY (49 = 0x31)

 long[] {1};

 

 

...

0x31

...

0x01(len)

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

...

0x01

 

  1. FLOAT_ARRAY (50 = 0x32)

 float[] {2.0f};

 

 

...

0x32

...

0x01(len)

...

0x40

...

0x00

...

0x00

...

0x00

 

  1. DOUBLE_ARRAY (51 = 0x33)

 double[] {2.0d}

 

 

...

0x33

...

0x01(len)

...

0x40

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

...

0x00

 

  1. STRING_ARRAY (64 = 0x40)

 String[] s = new String[]{"hello", "world"};

 [64, 2, 87, 0, 5, 104, 101, 108, 108, 111, 87, 0, 5, 119, 111, 114, 108, 100]

  1. Map (67 = 0x43)

 Map s = new HashMap<>();

    s.put("hello", "world");

 [67, 1, 87, 0, 5, 104, 101, 108, 108, 111, 87, 0, 5, 119, 111, 114, 108, 100]

  1. Set (66 = 0x42)

 Set s = new HashSet();

    s.add("hello");

    s.add("world");

 [66, 2, 87, 0, 5, 119, 111, 114, 108, 100, 87, 0, 5, 104, 101, 108, 108, 111]

...

List

...

ArrayList

...

JSON_STRING ??

...

JSON_BYTE_ARRAY ??

...

PDX_SERIALIZATION (93 = 0x5D)

...

DATA_SERIALIZATION (37 = 0x25)

...