Status
Current state: "Withdrawn" – a CEP is not the proper vehicle for this change, and the change cannot be made within the confines of a non-breaking change.
Discussion thread: here
JIRA: CASSANDRA-8959
Released: <Cassandra Version>
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Scope
Provide an algorithm for implementations of sparse data serialization.
Goals
To provide a more space-efficient method of serializing sparse collections.
To support the current serialization strategy without modification.
Non Goals
This change does not modify the encoding of non-collection types; they continue to be encoded as they are now.
Approach
Provide multiple encodings for collections based on the number of null values in the collection.
Timeline
todo
Mailing list / Slack channels
Mailing list:
Slack channel:
Discussion threads:
Related JIRA tickets
JIRA(s):
Motivation
The current serialization format for UDTs and List collections has a fixed overhead of 4 bytes per defined field (encoding the size of the field). This is inefficient for sparse UDTs and Lists (hereafter just called collections) - ones with many defined fields, but few of them present.
The motivation for this change is to reduce the space necessary to store specific classes of sparse collections.
Audience
From the user perspective this change will reduce the storage requirements for some collections. This change is otherwise transparent to the user.
From the operator perspective this is transparent.
From the developer perspective this change provides an algorithm to efficiently serialize Cassandra data.
Proposed Changes
Current State
The current implementation writes the number of items in the collection (integer) and then serializes each item in the collection. Nulls in the collection are serialized with a length of -1. Upon deserialization any length less than 0 is considered a null.
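The current layout can be sketched as follows; the class and method names here are illustrative, not Cassandra's internal API:

```java
import java.nio.ByteBuffer;
import java.util.List;

// Sketch of the current collection layout: an item count followed by
// (length, bytes) pairs, with null items written as length -1.
public class CurrentFormat {
    public static ByteBuffer serialize(List<byte[]> items) {
        int size = Integer.BYTES;                      // item count
        for (byte[] item : items) {
            size += Integer.BYTES;                     // per-item length
            if (item != null) size += item.length;     // item payload
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(items.size());
        for (byte[] item : items) {
            if (item == null) {
                buf.putInt(-1);                        // null marker
            } else {
                buf.putInt(item.length);
                buf.put(item);
            }
        }
        buf.flip();
        return buf;
    }
}
```

Note that every null still costs 4 bytes of length; this is the overhead the proposal targets.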
Proposed Changes
Add 2 sparse encodings
Bitmap encoding
The bitmap encoding works best for sparse collections where the ratio of populated to unpopulated items is fairly high. This implementation comprises an array of bitmap values. The bitmaps are 64-bit unsigned long values interpreted as 64 single-bit flags, where an enabled bit indicates that the item at the corresponding index is present in the resulting collection. All disabled bits are interpreted as null values at the equivalent positions in the final deserialized collection.
To serialize this structure, a count of the number of 64-bit bitmaps is written first, followed by the array of bitmaps.
The BitMap class in org.apache.commons.collections4.bloomfilter package contains functions for manipulating this structure. [https://github.com/apache/commons-collections/blob/master/src/main/java/org/apache/commons/collections4/bloomfilter/BitMap.java]
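Building the bitmap array can be sketched as follows; the names are illustrative and this does not use the commons-collections BitMap API:

```java
// Sketch of the bitmap structure described above: one bit per collection
// index, set when the item at that index is non-null.
public class BitmapEncoding {
    public static long[] buildBitmaps(Object[] items) {
        // one long covers 64 indices; integer division gives the array size
        long[] bitmaps = new long[(items.length / 64) + 1];
        for (int i = 0; i < items.length; i++) {
            if (items[i] != null) {
                bitmaps[i / 64] |= 1L << (i % 64);  // enable bit for present index
            }
        }
        return bitmaps;
    }

    public static boolean isPresent(long[] bitmaps, int index) {
        return (bitmaps[index / 64] & (1L << (index % 64))) != 0;
    }
}
```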
Array Encoding
The array encoding works best for very sparse collections. The encoding works by creating an array of indices to non-null values in the list. To serialize this structure the number of items in the list (integer) is written followed by the indices in the list.
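The index-collection step can be sketched as follows (illustrative names):

```java
public class ArrayEncoding {
    // Collect the indices of non-null items; these indices are what gets
    // serialized, so the decoder can reinsert nulls at every other position.
    public static int[] nonNullIndices(Object[] items) {
        int count = 0;
        for (Object item : items) {
            if (item != null) count++;
        }
        int[] indices = new int[count];
        int pos = 0;
        for (int i = 0; i < items.length; i++) {
            if (items[i] != null) indices[pos++] = i;
        }
        return indices;
    }
}
```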
Add a flag integer to determine the type of encoding
Currently, collection serializations start with an integer indicating the number of items in the collection, followed by each item encoded in a similar way. As noted above, nulls are encoded as an object of length -1. To signal that an encoding is in use, the first object in the collection will be specified with a length < 0; the value determines which encoding is in use, as specified in the Encoding Detection section below.
Encoding Detection
Encoding only applies to the contents of a collection. If the length of the first item in the collection is less than -1, then a custom encoding is in use.
Current length special meanings
| Type | Special Value | Meaning |
|------|---------------|---------|
| Collection | -1 | Null |
| UserType | -1 | Null |
| Tuple | -1 | Null |
Future length special meanings
| Type | Special Value | Meaning |
|------|---------------|---------|
| Collection | -1 | Null |
| Collection | -2 | Bitmap Encoding |
| Collection | -3 | Array Encoding |
| Collection | < -3 | Error |
| UserType | -1 | Null |
| UserType | -2 | Bitmap Encoding |
| UserType | -3 | Array Encoding |
| UserType | < -3 | Error |
| Tuple | -1 | Null |
| Tuple | -2 | Bitmap Encoding |
| Tuple | -3 | Array Encoding |
| Tuple | < -3 | Error |
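The dispatch implied by this table could be sketched as follows (hypothetical names, not an existing Cassandra class):

```java
// Map the first length value of a collection to the encoding it signals.
public class EncodingDetector {
    public enum Encoding { STANDARD, NULL_VALUE, BITMAP, ARRAY }

    public static Encoding detect(int firstLength) {
        if (firstLength >= 0) return Encoding.STANDARD; // ordinary item length
        switch (firstLength) {
            case -1: return Encoding.NULL_VALUE;        // existing null marker
            case -2: return Encoding.BITMAP;
            case -3: return Encoding.ARRAY;
            default: throw new IllegalArgumentException(
                    "unknown encoding flag: " + firstLength);
        }
    }
}
```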
When serializing the data, null references will be counted, and if the bitmap or array encoding of nulls is more efficient, the proper negative code value from the table above will be written as the length of the first item. The encoding flag is followed by the encoding data structure and then the actual data.
When deserializing the data, the buffer will be decoded and the null values reinserted as -1 lengths so that there is no change from the existing model. A ByteBuffer is created and populated from the originalData in the encoding structure, interspersed with nulls as directed by the encoding. The resulting buffer will be no different from the original unencoded buffer.
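The reinsertion step can be sketched for the bitmap case as follows (illustrative names; the sketch rebuilds only the items region, omitting the leading collection count):

```java
import java.nio.ByteBuffer;
import java.util.List;

public class SparseDecoder {
    // Rebuild the standard (length, data) layout: present indices take the
    // next non-null item, absent indices are written back as length -1.
    public static ByteBuffer decode(int itemCount, long[] bitmaps, List<byte[]> nonNull) {
        int size = Integer.BYTES * itemCount;          // one length per item
        for (byte[] item : nonNull) size += item.length;
        ByteBuffer out = ByteBuffer.allocate(size);
        int next = 0;
        for (int i = 0; i < itemCount; i++) {
            boolean present = (bitmaps[i / 64] & (1L << (i % 64))) != 0;
            if (present) {
                byte[] item = nonNull.get(next++);
                out.putInt(item.length);
                out.put(item);
            } else {
                out.putInt(-1);                        // reinserted null marker
            }
        }
        out.flip();
        return out;
    }
}
```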
Bitmap encoding structure
int length = -2
int numberOfItemsInCollection;
long[] bitmaps; // = new long[(numberOfItemsInCollection / 64) + 1];
struct {
int length;
byte[] data;
} originalData []; // original data structure populated with non-null data.
Array encoding structure
int length = -3
int numberOfItemsInCollection;
int nonNullCount;
int nonNullIndices[];
struct {
int length;
byte[] data;
} originalData []; // original data structure populated with non-null data.
Add a method to apply encodings
The method will determine if applying an encoding will yield a smaller serialized form and set the flags accordingly.
The calculations are:
Define:
Ni = Number of items, the size of the array
Nn = Number of nulls in the array
Ne = Number of populated entries = Ni-Nn
Eb = Encoding space for bitmap encoding in bytes
Ea = Encoding space for array encoding in bytes
Ds = Data space = Ne * size_of_item
The Bitmap encoding space calculation:
The number of 64-bit bitmaps required is (Ni / 64) + 1 (integer division)
Eb = (2 * int_size) + (long_size * bitmap_count) = 8 + 8 * ((Ni / 64) + 1)
Bitmap encoding is more efficient than standard if:
Nn * 4 > Eb
Nn > 2 + 2 * ((Ni / 64) + 1)
The Array encoding space calculation:
Ea = (2 * int_size) + (int_size * Ne) = 8 + 4*Ne
Array encoding is more efficient than standard if:
Nn * 4 > Ea
Nn > 2 + Ne
Array encoding is more efficient than Bitmap encoding if:
Ea < Eb
8 + 4*Ne < 8 + 8*((Ni / 64) + 1)
4*Ne < 8*((Ni / 64) + 1)
Ne < 2*((Ni / 64) + 1)
Total space for encoded data
Ea + Ne*item_size
or
Eb + Ne*item_size
Total space for unencoded data = (Ni * 4) + Ne*item_size
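The selection logic above can be sketched as follows; the names are illustrative, and the bitmap count uses integer division, (Ni / 64) + 1:

```java
// Compare the overhead of each encoding against the 4 bytes per null
// saved, and return the flag value to write (-2 bitmap, -3 array, 0 keep
// the standard encoding).
public class EncodingSelector {
    public static int choose(int ni, int nn) {
        int ne = ni - nn;                                        // populated entries
        int eb = 2 * Integer.BYTES + Long.BYTES * ((ni / 64) + 1);   // Eb
        int ea = 2 * Integer.BYTES + Integer.BYTES * ne;             // Ea
        int saved = Integer.BYTES * nn;          // null lengths dropped: Nn * 4
        boolean bitmapWins = saved > eb;
        boolean arrayWins = saved > ea;
        if (!bitmapWins && !arrayWins) return 0; // standard encoding is smallest
        if (bitmapWins && arrayWins) return ea < eb ? -3 : -2;
        return bitmapWins ? -2 : -3;
    }
}
```

For example, 90 nulls out of 100 items favors the bitmap encoding (Eb = 24 bytes vs Ea = 48), while 98 nulls out of 100 favors the array encoding (Ea = 16 bytes).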
New or Changed Public Interfaces
No changes to public interfaces.
Compatibility, Deprecation, and Migration Plan
There should be no impact to existing users.
The new system will read older serialized data properly.
Older systems will not read newly serialized sparse data properly.
Older behaviour will continue for non-sparse collections.
Test Plan
- Unit tests will be implemented to verify that the system works with both big-endian and little-endian byte buffers.
- Tests should also verify that the encodings are only applied when they will result in saving space.
Rejected Alternatives
Using the top 3 high-order bits of the length integer for the encoding flags. This would still yield a reasonable upper limit on the size, but using the entire byte provides the ability to implement more encodings without impacting the limit.
Using the high-order byte as a flag byte. This would require different processing for big-endian and little-endian systems.