This page is meant as a template for writing a CEP. To create a CEP choose Tools->Copy on this page and modify with your content and replace the heading with the next CEP number and a description of your issue. Replace anything in italics with your own description.



Status

Current state:  "Withdrawn" – CEP is not the proper form for this change and change can not be made within the confines of a non-breaking change.

Discussion threadhere

JIRACASSANDRA-8959

Released: <Cassandra Version>

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).



Scope

Provide an algorithm for implementations of sparse data serialization.

Goals

  • To provide a more space efficient method of serialising sparse collections.

  • To support current serialisation strategy without modification.

Non Goals

  • This change does not modify the encoding types that are not collections. They continue to be encoded as they are now.

Approach

  • Provide multiple encodings for collections based on the number of null values in the collection.

Timeline

todo

Mailing list / Slack channels

Mailing list: 

Slack channel: 

Discussion threads: 

Related JIRA tickets

JIRA(s): 




Motivation

The current serialization format for UDTs and List collections has a fixed overhead of 4 bytes per defined field (encoding the size of the field). This is inefficient for sparse UDTs and Lists (hereafter just called collections) - ones with many defined fields, but few of them present.

The motivation for this change is to reduce the space necessary to store specific classes of sparse collections.

Audience

From the user perspective this change will reduce the storage requirements for some collections. This change is otherwise transparent to the user.

From the operator perspective this is transparent.

From the developer perspective this change provides an algorithm to efficiently serialize Cassandra data.

Proposed Changes

Current State

The current implementation writes the number of items in the collection (integer) and then serializes each item in the collection. Nulls in the collection are serialized with a length of -1. Upon deserialization any length less than 0 is considered a null.

Proposed Changes

Add 2 sparse encodings

Bitmap encoding

The bitmap encoding works best for sparse collections where the ratio of populated items to unpopulated items is fairly high. This implementation comprises an array of bitmap values. The bitmaps comprise 64bit unsigned long values that are interpreted a 64 bit flags where an enabled bit represents the presence of the index in the resulting collection. All disabled bits are interpreted as null values in equivalent position on the final deserialized collection.

To serialize this structure a count of the number of 64 bit bitmaps is first written followed by the array of bitmaps.

The BitMap class in org.apache.commons.collections4.bloomfilter package contains functions for manipulating this structure. [https://github.com/apache/commons-collections/blob/master/src/main/java/org/apache/commons/collections4/bloomfilter/BitMap.java]

Array Encoding

The array encoding works best for very sparse collections. The encoding works by creating an array of indices to non-null values in the list. To serialize this structure the number of items in the list (integer) is written followed by the indices in the list.

Add a flag integer to determine the type of encoding

Currently collection serializations start with an integer indicating the number of items in the collection followed by each item encoded in a similar way. As noted above nulls are encoded as an object of length -1. To signal that an encoding is in use, the first object in the collection will be specified with a length < 0, the value will determine which encoding is in use as specified in the Encoding Detection section below.

Encoding Detection

Encoding only applies to the contents of a collection. If the length of the first item in the collection is less than -1 then custom encoding is in use.

Current length special meanings

Type

Special Value

Meaning

Collection

-1

Null

UserType

-1

Null

Tuple

-1

Null



Future length special meanings

Type

Special value

Meaning

Collection

-1

Null

Collection

-2

Bitmap Encoding

Collection

-3

Array Encoding

Collection

< -3

Error

UserType

-1

Null

UserType

-2

Bitmap Encoding

UserType

-3

Array Encoding

UserType

< -3

Error

Tuple

-1

Null

Tuple

-2

Bitmap Encoding

Tuple

-3

Array Encoding

Tuple

< -3

Error



When serializing the data null references will be counted and if the bitmap or array encoding of nulls is more efficient the proper negative code value from the above table will be written as the length of the first item. Following the encoding flag will be the encoding data structure and then the actual data.

When deserializeing the data the buffer will be unencoded and the null values reinserted as -1 integers so that there is no change from the existing model. A ByteBuffer is created and populated from the encoding structure originalData interspersed with nulls as directed by the encoding. The resulting buffer will be no different than the original unserialized buffer.

Bitmap encoding structure

int length = -2

int numberOfItemsInCollection;

long[] bitmaps // = new long[ (numberOfItemsInCollection % 64)+1];

struct {

int length;

byte[] data;

} originalData []; // original data structure populated with non-null data.



Array encoding structure

int length = -3

int numberOfItemsInCollection;

int nonNullCount;

int nonNullIndices[];

struct {

int length;

byte[] data;

} originalData []; // original data structure populated with non-null data.



Add a method to apply encodings

The method will determine if applying an encoding will yield a smaller serialized form and set the flags accordingly.

The calculations are:



Define:

Ni = Number of items, the size of the array

Nn = Number of nulls in the array

Ne = Number of populated entries = Ni-Nn

Eb = Encoding space for bitmap encoding in bytes

Ea = Encoding space for array encoding in bytes

Ds = Data space = Ne * size_of_item



The Bitmap encoding space calculation:

The number of 64 bit bitmaps required is = (Ni % 64) + 1

Eb = (2 * int_size) + (long_size * bitmap_count) = 8 + 8 * ((Ni % 64) + 1)

bitmap encoding is more efficient than standard if:

Nn * 4 > Eb

Nn > 2 + 2*((Ni % 64) + 1)



The Array encoding space calculation.

Ea = (2* int_size) + (int_size * Ne) = 8 + 4*Ne

array encoding more efficient than standard if:

Nn * 4 > Ea

Nn > 2 + Ne



Array encoding is more efficient than Bitmap encoding if

Ea < Eb

8 + 4*Ne < 8 +8 * ((Ni % 64)+1)

4*Ne < 8 * ((Ni % 64)+1)

Ne < 2*((Ni % 64)+1)



Total space for encoded data

Ea + Ne*item_size

or

Eb + Ne*item_size



Total space for unencoded data = (Ni * 4) + Ne*item_size

New or Changed Public Interfaces

No changes to public interfaces.

Compatibility, Deprecation, and Migration Plan

  • There should be no impact to existing users.

  • New system will read older serialized data properly

  • Older systems will not read newly serialized sparse data properly.

  • Older behaviour will continue for non-sparse collections.

Test Plan

  • Unit tests will be implemented to verify that the system works both with big endian and little endian byte buffers.
  • Test should also verify that the encodings are only applied when they will result in preserving space.

Rejected Alternatives

  • Using the top 3 high order bits of the length integer for the encoding flags. This would still yield a reasonable upper limit on the size but the use of a the entire byte provides the ability to implement more encodings without impacting the limit.

  • Using the high order byte as a flag byte. This would require different processes for big-endian vs little-endian systems.