Purpose 

The purpose of this document is to describe the native serialization of the various data types that Geode understands.

Data Types

Geode supports all the Java primitive data types as well as Java arrays and collections. Custom Java objects can be serialized through the Geode PdxSerializable and DataSerializable interfaces. The application can also attach its own serializer by registering a Geode DataSerializer or PdxSerializer. Geode also understands the Java Serializable interface: objects implementing java.io.Serializable can be serialized with Geode as well.

In Geode, every supported data type is associated with a type ID, which is represented by a single byte. To serialize a value of any data type, Geode first writes the type ID. For variable-sized objects, it then writes the length of the serialized object followed by the serialized bytes. For fixed-size data types, it just writes the serialized bytes in big-endian byte order.
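The layout described above can be illustrated with plain java.io classes. This is a minimal sketch, not Geode's implementation: the class and method names are hypothetical, the type IDs are the ones listed in this document, and the byte-array case assumes a length small enough to fit in a single length byte.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TypeIdLayoutSketch {
  // Type IDs as listed in this document.
  static final int INTEGER = 0x39;
  static final int BYTE_ARRAY = 0x2E;

  /** Fixed-size value: the type ID, then the four value bytes in big-endian order. */
  static byte[] encodeInteger(int value) {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(buf);
      out.writeByte(INTEGER);
      out.writeInt(value); // DataOutputStream writes multi-byte values big-endian
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e); // not expected for an in-memory stream
    }
  }

  /** Variable-size value: the type ID, then the length, then the payload bytes. */
  static byte[] encodeByteArray(byte[] value) {
    if (value.length > 252) {
      // Assumption of this sketch: only lengths that fit in one length byte.
      throw new IllegalArgumentException("sketch only handles single-byte lengths");
    }
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(buf);
      out.writeByte(BYTE_ARRAY);
      out.writeByte(value.length);
      out.write(value);
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Formats bytes as space-separated 0xNN values. */
  static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      if (sb.length() > 0) {
        sb.append(' ');
      }
      sb.append(String.format("0x%02X", b & 0xFF));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(toHex(encodeInteger(1000)));                // 0x39 0x00 0x00 0x03 0xE8
    System.out.println(toHex(encodeByteArray(new byte[] {1, 2}))); // 0x2E 0x02 0x01 0x02
  }
}
```

The integer carries no length field because its size is implied by the type ID, while the byte array needs one so that a reader knows where the payload ends.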

 For native serialization, strings are serialized using the Java Modified UTF-8 Format.
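To see what the Java modified UTF-8 format looks like on the wire, the following sketch uses the standard DataOutputStream.writeUTF method, which writes a 2-byte length followed by the encoded characters. The class and method names are hypothetical, and the type ID byte that Geode writes in front of a string is deliberately left out here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class ModifiedUtf8Sketch {
  /** Returns the bytes produced by writeUTF: a 2-byte length, then modified UTF-8. */
  static byte[] writeUtf(String s) {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      new DataOutputStream(buf).writeUTF(s);
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e); // not expected for short in-memory strings
    }
  }

  public static void main(String[] args) {
    // ASCII characters take one byte each.
    System.out.println(Arrays.toString(writeUtf("hello")));  // [0, 5, 104, 101, 108, 108, 111]
    // U+00E9 takes two bytes: 0xC3 0xA9.
    System.out.println(Arrays.toString(writeUtf("\u00e9"))); // [0, 2, -61, -87]
    // Unlike standard UTF-8, U+0000 is written as the two bytes 0xC0 0x80.
    System.out.println(Arrays.toString(writeUtf("\u0000"))); // [0, 2, -64, -128]
  }
}
```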

| Data Type | Geode Region Key Type | Type Id | Value | Serialized Bytes | Description |
|---|---|---|---|---|---|
| Null | | 41 = 0x29 | null | typeId: 0x29 | |
| Boolean | YES | 53 = 0x35 | true | typeId: 0x35; bytes: 0x01 | |
| Character | YES | 54 = 0x36 | 'a' | typeId: 0x36; bytes: 0x00 0x61 | |
| Byte | YES | 55 = 0x37 | 1 | typeId: 0x37; bytes: 0x01 | |
| Short | YES | 56 = 0x38 | 1000 | typeId: 0x38; bytes: 0x03 0xE8 | |
| Integer | YES | 57 = 0x39 | 1000 | typeId: 0x39; bytes: 0x00 0x00 0x03 0xE8 | |
| Long | YES | 58 = 0x3A | 1000 | typeId: 0x3A; bytes: 0x00 0x00 0x00 0x00 0x00 0x00 0x03 0xE8 | |
| Float | YES | 59 = 0x3B | 1000f | typeId: 0x3B; bytes: 0x44 0x7A 0x00 0x00 | |
| Double | YES | 60 = 0x3C | 1000d | typeId: 0x3C; bytes: 0x40 0x8F 0x40 0x00 0x00 0x00 0x00 0x00 | |
| ASCII_STRING | YES | 87 = 0x57 | String s = "hello" | typeId: 0x57; len: 0x00 0x05; bytes: 0x68 0x65 0x6C 0x6C 0x6F | This represents an ASCII string with maximum length 0xFFFF. See Write And Read String for the code that serializes and deserializes it. |
| UTF_STRING | YES | 42 = 0x2A | | | This represents a UTF string with maximum length 0xFFFF. See Write And Read String for the code that serializes and deserializes it. |
| HUGE_ASCII_STRING | YES | 88 = 0x58 | | | This represents an ASCII string with length greater than 0xFFFF. See Write And Read String for the code that serializes and deserializes it. |
| HUGE_UTF_STRING | YES | 89 = 0x59 | | | This represents a UTF string with length greater than 0xFFFF. See Write And Read String for the code that serializes and deserializes it. |
| byte[] | This we plan to support | 46 = 0x2E | byte[] {1,2} | typeId: 0x2E; len: 0x02; bytes: 0x01 0x02 | |
| short[] | | 47 = 0x2F | short[] {1,2} | typeId: 0x2F; len: 0x02; bytes: 0x00 0x01 0x00 0x02 | |
| int[] | | 48 = 0x30 | int[] {1,2} | typeId: 0x30; len: 0x02; bytes: 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x02 | |
| long[] | | 49 = 0x31 | long[] {1} | typeId: 0x31; len: 0x01; bytes: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01 | |
| float[] | | 50 = 0x32 | float[] {2.0f} | typeId: 0x32; len: 0x01; bytes: 0x40 0x00 0x00 0x00 | |
| double[] | | 51 = 0x33 | double[] {2.0d} | typeId: 0x33; len: 0x01; bytes: 0x40 0x00 0x00 0x00 0x00 0x00 0x00 0x00 | |
| string[] | | 64 = 0x40 | String[]{"hello", "world"} | typeId: 0x40; len: 0x02; "hello": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x68 0x65 0x6C 0x6C 0x6F; "world": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x77 0x6F 0x72 0x6C 0x64 | |
| Map | | 67 = 0x43 | Map s = new HashMap<>(); s.put("hello", "world") | typeId: 0x43; len: 0x01; key "hello": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x68 0x65 0x6C 0x6C 0x6F; value "world": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x77 0x6F 0x72 0x6C 0x64 | |
| Set | | 66 = 0x42 | Set s = new HashSet(); s.add("hello"); s.add("world"); | typeId: 0x42; len: 0x02; "world": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x77 0x6F 0x72 0x6C 0x64; "hello": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x68 0x65 0x6C 0x6C 0x6F | Elements appear in the set's iteration order, which for a HashSet need not match insertion order. |
| List | | 10 = 0x0A | List s = new LinkedList(); s.add("hello"); s.add("world"); | typeId: 0x0A; len: 0x02; "hello": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x68 0x65 0x6C 0x6C 0x6F; "world": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x77 0x6F 0x72 0x6C 0x64 | |
| ArrayList | | 65 = 0x41 | List s = new ArrayList(); s.add("hello"); s.add("world"); | typeId: 0x41; len: 0x02; "hello": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x68 0x65 0x6C 0x6C 0x6F; "world": 0x57 (ASCII_STRING) 0x00 0x05 (len) 0x77 0x6F 0x72 0x6C 0x64 | |
| PDX_SERIALIZATION | | 93 = 0x5D | | | A Java object can implement the PdxSerializable interface to serialize data in PDX format. |
| PDX_SERIALIZER | | | | | The application can implement the PdxSerializer interface and install it with the Geode cache to serialize data in PDX format. |
| DATA_SERIALIZATION | | | | | A Java object can implement the DataSerializable interface to serialize data. |
| USER_SERIALIZATION | | | | | The application can implement a DataSerializer and install it with the Geode cache to serialize data. |
| JAVA_SERIALIZATION | | 44 = 0x2C | | | |

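The type IDs and example bytes listed for the fixed-size types and for ASCII_STRING can be checked with a small decoder. The sketch below is an illustration only: the class and method names are hypothetical, it covers just a subset of the type IDs from this document, and it is not Geode's deserialization code. The Double example uses the IEEE 754 encoding of 1000d, which is 0x40 0x8F 0x40 followed by five zero bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class TypeIdDecoderSketch {
  /** Reads the leading type ID and decodes the value that follows it. */
  static Object decode(byte[] serialized) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(serialized))) {
      int typeId = in.readUnsignedByte();
      switch (typeId) {
        case 0x29: return null;             // Null: type ID only
        case 0x35: return in.readBoolean(); // Boolean: 1 byte
        case 0x36: return in.readChar();    // Character: 2 bytes
        case 0x37: return in.readByte();    // Byte: 1 byte
        case 0x38: return in.readShort();   // Short: 2 bytes
        case 0x39: return in.readInt();     // Integer: 4 bytes
        case 0x3A: return in.readLong();    // Long: 8 bytes
        case 0x3B: return in.readFloat();   // Float: 4 bytes
        case 0x3C: return in.readDouble();  // Double: 8 bytes
        case 0x57: {                        // ASCII_STRING: 2-byte length, then bytes
          int len = in.readUnsignedShort();
          byte[] chars = new byte[len];
          in.readFully(chars);
          return new String(chars, StandardCharsets.US_ASCII);
        }
        default:
          throw new IllegalArgumentException(
              "type ID not covered by this sketch: 0x" + Integer.toHexString(typeId));
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Convenience helper so examples can be written with int literals. */
  static byte[] bytes(int... values) {
    byte[] out = new byte[values.length];
    for (int i = 0; i < values.length; i++) {
      out[i] = (byte) values[i];
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(decode(bytes(0x38, 0x03, 0xE8)));                         // 1000
    System.out.println(decode(bytes(0x3B, 0x44, 0x7A, 0x00, 0x00)));             // 1000.0
    System.out.println(decode(bytes(0x3C, 0x40, 0x8F, 0x40, 0, 0, 0, 0, 0)));    // 1000.0
    System.out.println(decode(bytes(0x57, 0x00, 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F))); // hello
  }
}
```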


    Calculate Collection/Array Size

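    Geode writes the size of a collection or array in the following way. A length of up to 252 takes a single byte; a larger length is written as a marker byte followed by an unsigned short or an int; a null array is written as the single byte -1.

    Write Array Length function:
      public static final byte NULL_ARRAY = -1; // array is null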
      /**
       * @since GemFire 5.7
       */
      private static final byte SHORT_ARRAY_LEN = -2; // array len encoded as unsigned short in next 2
                                                      // bytes
      /**
       * @since GemFire 5.7
       */
      public static final byte INT_ARRAY_LEN = -3; // array len encoded as int in next 4 bytes
      private static final int MAX_BYTE_ARRAY_LEN = ((byte) -4) & 0xFF;
      public static void writeArrayLength(int len, DataOutput out) throws IOException {
        if (len == -1) {
          out.writeByte(NULL_ARRAY);
        } else if (len <= MAX_BYTE_ARRAY_LEN) {
          out.writeByte(len);
        } else if (len <= 0xFFFF) {
          out.writeByte(SHORT_ARRAY_LEN);
          out.writeShort(len);
        } else {
          out.writeByte(INT_ARRAY_LEN);
          out.writeInt(len);
        }
      }

    Read Array Length function:
      public static int readArrayLength(DataInput in) throws IOException {
        byte code = in.readByte();
        if (code == NULL_ARRAY) {
          return -1;
        } else {
          int result = ubyteToInt(code);
          if (result > MAX_BYTE_ARRAY_LEN) {
            if (code == SHORT_ARRAY_LEN) {
              result = in.readUnsignedShort();
            } else if (code == INT_ARRAY_LEN) {
              result = in.readInt();
            } else {
              throw new IllegalStateException("unexpected array length code=" + code);
            }
          }
          return result;
        }
      }
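
The length encoding above can be exercised in isolation. The sketch below copies the logic of the two functions into a hypothetical standalone class (the class name and the encode, decode, and toHex helpers are not part of Geode, and ubyteToInt is assumed to be an unsigned conversion, written here as code & 0xFF) and prints the bytes produced for a few representative lengths.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ArrayLengthSketch {
  static final byte NULL_ARRAY = -1;      // array is null
  static final byte SHORT_ARRAY_LEN = -2; // length follows as unsigned short
  static final byte INT_ARRAY_LEN = -3;   // length follows as int
  static final int MAX_BYTE_ARRAY_LEN = ((byte) -4) & 0xFF; // 252

  static void writeArrayLength(int len, DataOutput out) throws IOException {
    if (len == -1) {
      out.writeByte(NULL_ARRAY);
    } else if (len <= MAX_BYTE_ARRAY_LEN) {
      out.writeByte(len);
    } else if (len <= 0xFFFF) {
      out.writeByte(SHORT_ARRAY_LEN);
      out.writeShort(len);
    } else {
      out.writeByte(INT_ARRAY_LEN);
      out.writeInt(len);
    }
  }

  static int readArrayLength(DataInput in) throws IOException {
    byte code = in.readByte();
    if (code == NULL_ARRAY) {
      return -1;
    }
    int result = code & 0xFF; // treat the byte as unsigned
    if (result > MAX_BYTE_ARRAY_LEN) {
      if (code == SHORT_ARRAY_LEN) {
        result = in.readUnsignedShort();
      } else if (code == INT_ARRAY_LEN) {
        result = in.readInt();
      } else {
        throw new IllegalStateException("unexpected array length code=" + code);
      }
    }
    return result;
  }

  static byte[] encode(int len) {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      writeArrayLength(len, new DataOutputStream(buf));
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static int decode(byte[] bytes) {
    try {
      return readArrayLength(new DataInputStream(new ByteArrayInputStream(bytes)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      if (sb.length() > 0) {
        sb.append(' ');
      }
      sb.append(String.format("0x%02X", b & 0xFF));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(toHex(encode(-1)));    // 0xFF
    System.out.println(toHex(encode(2)));     // 0x02
    System.out.println(toHex(encode(252)));   // 0xFC
    System.out.println(toHex(encode(253)));   // 0xFE 0x00 0xFD
    System.out.println(toHex(encode(70000))); // 0xFD 0x00 0x01 0x11 0x70
  }
}
```

The values 253, 254, and 255 are reserved as marker codes, which is why 252 is the largest length that fits in the single-byte form.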

     

    Write And Read String

    A string is serialized in the following way. The writer distinguishes ASCII strings from strings that need multi-byte UTF encoding, and strings whose length fits in 2 bytes (up to 0xFFFF) from huge ones.

    Write String:
    public static void writeString(String value, DataOutput out) throws IOException {
        InternalDataSerializer.checkOut(out);
        final boolean isDebugEnabled = logger.isTraceEnabled(LogMarker.SERIALIZER);
        if (isDebugEnabled) {
          logger.trace(LogMarker.SERIALIZER, "Writing String \"{}\"", value);
        }
        if (value == null) {
          if (isDebugEnabled) {
            logger.trace(LogMarker.SERIALIZER, "Writing NULL_STRING");
          }
          out.writeByte(DSCODE.NULL_STRING);
        } else {
          // [bruce] writeUTF is expensive - it creates a char[] to fetch
          // the string's contents, iterates over the array to compute the
          // encoded length, creates a byte[] to hold the encoded bytes,
          // iterates over the char[] again to create the encode bytes,
          // then writes the bytes. Since we usually deal with ISO-8859-1
          // strings, we can accelerate this by accessing chars directly
          // with charAt and fill a single-byte buffer. If we run into
          // a multibyte char, we revert to using writeUTF()
          int len = value.length();
          int utfLen = len; // added for bug 40932
          for (int i = 0; i < len; i++) {
            char c = value.charAt(i);
            if ((c <= 0x007F) && (c >= 0x0001)) {
              // nothing needed
            } else if (c > 0x07FF) {
              utfLen += 2;
            } else {
              utfLen += 1;
            }
            // Note we no longer have an early out when we detect the first
            // non-ascii char because we need to compute the utfLen for bug 40932.
            // This is not a performance problem because most strings are ascii
            // and they never did the early out.
          }
          boolean writeUTF = utfLen > len;
          if (writeUTF) {
            if (utfLen > 0xFFFF) {
              if (isDebugEnabled) {
                logger.trace(LogMarker.SERIALIZER, "Writing utf HUGE_STRING of len={}", len);
              }
              out.writeByte(DSCODE.HUGE_STRING);
              out.writeInt(len);
              out.writeChars(value);
            } else {
              if (isDebugEnabled) {
                logger.trace(LogMarker.SERIALIZER, "Writing utf STRING of len={}", len);
              }
              out.writeByte(DSCODE.STRING);
              out.writeUTF(value);
            }
          } else {
            if (len > 0xFFFF) {
              if (isDebugEnabled) {
                logger.trace(LogMarker.SERIALIZER, "Writing HUGE_STRING_BYTES of len={}", len);
              }
              out.writeByte(DSCODE.HUGE_STRING_BYTES);
              out.writeInt(len);
              out.writeBytes(value);
            } else {
              if (isDebugEnabled) {
                logger.trace(LogMarker.SERIALIZER, "Writing STRING_BYTES of len={}", len);
              }
              out.writeByte(DSCODE.STRING_BYTES);
              out.writeShort(len);
              out.writeBytes(value);
            }
          }
        }
      }

    Read String:
      public static String readString(DataInput in, byte header) throws IOException {
        if (header == DSCODE.STRING_BYTES) {
          int len = in.readUnsignedShort();
          if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
            logger.trace(LogMarker.SERIALIZER, "Reading STRING_BYTES of len={}", len);
          }
          byte[] buf = new byte[len];
          in.readFully(buf, 0, len);
          return new String(buf, 0); // intentionally using deprecated constructor
        } else if (header == DSCODE.STRING) {
          if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
            logger.trace(LogMarker.SERIALIZER, "Reading utf STRING");
          }
          return in.readUTF();
        } else if (header == DSCODE.NULL_STRING) {
          if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
            logger.trace(LogMarker.SERIALIZER, "Reading NULL_STRING");
          }
          return null;
        } else if (header == DSCODE.HUGE_STRING_BYTES) {
          int len = in.readInt();
          if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
            logger.trace(LogMarker.SERIALIZER, "Reading HUGE_STRING_BYTES of len={}", len);
          }
          byte[] buf = new byte[len];
          in.readFully(buf, 0, len);
          return new String(buf, 0); // intentionally using deprecated constructor
        } else if (header == DSCODE.HUGE_STRING) {
          int len = in.readInt();
          if (logger.isTraceEnabled(LogMarker.SERIALIZER)) {
            logger.trace(LogMarker.SERIALIZER, "Reading HUGE_STRING of len={}", len);
          }
          char[] buf = new char[len];
          for (int i = 0; i < len; i++) {
            buf[i] = in.readChar();
          }
          return new String(buf);
        } else {
          String s = "Unknown String header " + header;
          throw new IOException(s);
        }
      }
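
The way writeString picks one of the four string type codes can be reduced to a few lines. The following sketch is a simplified, hypothetical restatement of the listing above (logging and null handling are omitted, and the class and helper names are not Geode's); the numeric codes are the ones given in this document.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class StringHeaderSketch {
  static final byte STRING = 42;            // UTF, 2-byte length
  static final byte STRING_BYTES = 87;      // ASCII, 2-byte length
  static final byte HUGE_STRING_BYTES = 88; // ASCII, 4-byte length
  static final byte HUGE_STRING = 89;       // UTF-16 chars, 4-byte length

  /** Chooses the type code the same way the writeString listing does. */
  static byte chooseHeader(String value) {
    int len = value.length();
    int utfLen = len;
    for (int i = 0; i < len; i++) {
      char c = value.charAt(i);
      if (c >= 0x0001 && c <= 0x007F) {
        // one byte, already counted
      } else if (c > 0x07FF) {
        utfLen += 2;
      } else {
        utfLen += 1;
      }
    }
    if (utfLen > len) {
      return utfLen > 0xFFFF ? HUGE_STRING : STRING;
    }
    return len > 0xFFFF ? HUGE_STRING_BYTES : STRING_BYTES;
  }

  /** Serializes a non-null string: type code, then the matching length and payload. */
  static byte[] serialize(String value) {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(buf);
      byte header = chooseHeader(value);
      out.writeByte(header);
      switch (header) {
        case STRING_BYTES:
          out.writeShort(value.length());
          out.writeBytes(value);
          break;
        case STRING:
          out.writeUTF(value); // writes its own 2-byte length
          break;
        case HUGE_STRING_BYTES:
          out.writeInt(value.length());
          out.writeBytes(value);
          break;
        default: // HUGE_STRING
          out.writeInt(value.length());
          out.writeChars(value);
          break;
      }
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Builds a string of n copies of c (helper for the examples). */
  static String repeat(char c, int n) {
    char[] chars = new char[n];
    Arrays.fill(chars, c);
    return new String(chars);
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(serialize("hello")));      // [87, 0, 5, 104, 101, 108, 108, 111]
    System.out.println(Arrays.toString(serialize("h\u00e9llo"))); // [42, 0, 6, 104, -61, -87, 108, 108, 111]
    System.out.println(chooseHeader(repeat('a', 70000)));         // 88
    System.out.println(chooseHeader(repeat('\u00e9', 40000)));    // 89
  }
}
```

Note that the last example has only 40000 characters: the decision between the short and huge UTF forms is based on the encoded length (80000 bytes here), not on the character count.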

    Serialized forms of the remaining string, array, and collection types:

      • HUGE_ASCII_STRING (88 = 0x58): ASCII; length is 4 bytes
      • UTF_STRING (42 = 0x2A): UTF; length is 2 bytes
      • HUGE_UTF_STRING (89 = 0x59): UTF; length is 4 bytes
      • Array (52 = 0x34) ??
      • BYTE_ARRAY (46 = 0x2E): byte[] {1,2}; is serialized as 0x2E 0x02(len) 0x01 0x02
      • SHORT_ARRAY (47 = 0x2F): short[] {1,2}; is serialized as 0x2F 0x02(len) 0x00 0x01 0x00 0x02
      • INTEGER_ARRAY (48 = 0x30): int[] {1,2}; is serialized as 0x30 0x02(len) 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x02
      • LONG_ARRAY (49 = 0x31): long[] {1}; is serialized as 0x31 0x01(len) 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x01
      • FLOAT_ARRAY (50 = 0x32): float[] {2.0f}; is serialized as 0x32 0x01(len) 0x40 0x00 0x00 0x00
      • DOUBLE_ARRAY (51 = 0x33): double[] {2.0d} is serialized as 0x33 0x01(len) 0x40 0x00 0x00 0x00 0x00 0x00 0x00 0x00
      • STRING_ARRAY (64 = 0x40): String[] s = new String[]{"hello", "world"}; is serialized as [64, 2, 87, 0, 5, 104, 101, 108, 108, 111, 87, 0, 5, 119, 111, 114, 108, 100]
      • Map (67 = 0x43): Map s = new HashMap<>(); s.put("hello", "world"); is serialized as [67, 1, 87, 0, 5, 104, 101, 108, 108, 111, 87, 0, 5, 119, 111, 114, 108, 100]
      • Set (66 = 0x42): Set s = new HashSet(); s.add("hello"); s.add("world"); is serialized as [66, 2, 87, 0, 5, 119, 111, 114, 108, 100, 87, 0, 5, 104, 101, 108, 108, 111]
      • List
      • ArrayList
      • JSON_STRING ??
      • JSON_BYTE_ARRAY ??
      • PDX_SERIALIZATION (93 = 0x5D)
      • DATA_SERIALIZATION (37 = 0x25)
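
The STRING_ARRAY layout given in this document (type ID, element count, then each element as a complete ASCII_STRING with its own type ID and 2-byte length) can be reproduced with a short encoder. This is a sketch under stated assumptions, not Geode's code: the class and method names are hypothetical, the array is assumed to have at most 252 elements so that its length fits in one byte, and every element is assumed to be a non-null ASCII string of at most 0xFFFF characters.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class StringArraySketch {
  static final int STRING_ARRAY = 64; // 0x40
  static final int ASCII_STRING = 87; // 0x57

  static byte[] encode(String[] values) {
    if (values.length > 252) {
      throw new IllegalArgumentException("sketch only handles single-byte array lengths");
    }
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(buf);
      out.writeByte(STRING_ARRAY);
      out.writeByte(values.length);
      for (String s : values) {
        requireShortAscii(s);
        out.writeByte(ASCII_STRING); // each element repeats the string type ID
        out.writeShort(s.length());  // 2-byte length
        out.writeBytes(s);           // one byte per character
      }
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Rejects input outside the assumptions of this sketch. */
  private static void requireShortAscii(String s) {
    if (s.length() > 0xFFFF) {
      throw new IllegalArgumentException("string too long for the 2-byte length form");
    }
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c < 0x0001 || c > 0x007F) {
        throw new IllegalArgumentException("non-ASCII character at index " + i);
      }
    }
  }

  public static void main(String[] args) {
    byte[] bytes = encode(new String[] {"hello", "world"});
    // [64, 2, 87, 0, 5, 104, 101, 108, 108, 111, 87, 0, 5, 119, 111, 114, 108, 100]
    System.out.println(Arrays.toString(bytes));
  }
}
```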