Status

Current state: Adopted

Discussion thread: Link

JIRA: KAFKA-8326

...

I believe there are many use cases where a List Serde could be useful.

...

For instance, one could aggregate grouped (by key) values together in a list in order to perform subsequent operations on the collection.

...

First, we need to specify that we are going to use a list serde:

default.key/value.serde = org.apache.kafka.common.serialization.Serdes$ListSerde

Then, we need to introduce new configurations. I'm proposing these four extra properties:

CommonClientConfigs.class: DEFAULT_LIST_KEY/VALUE_SERDE_TYPE_CLASS = "default.list.key/value.serde.type"

Ex. default.list.key/value.serde.type = java.util.ArrayList

CommonClientConfigs.class: DEFAULT_LIST_KEY/VALUE_SERDE_INNER_CLASS = "default.list.key/value.serde.inner"

Ex. default.list.key/value.serde.inner = org.apache.kafka.common.serialization.Serdes$IntegerSerde
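To make the key-side configuration concrete, here is a minimal sketch that assembles the three properties into a plain java.util.Properties object. The class and method names (ListSerdeConfigSketch, listKeySerdeProps) are hypothetical, and the property keys are written out as literal strings taken from the examples above rather than read from any Kafka constant.

```java
import java.util.Properties;

final class ListSerdeConfigSketch {
    // Builds the proposed properties for a list-typed key (hypothetical helper).
    static Properties listKeySerdeProps() {
        Properties props = new Properties();
        // Select the list serde as the default key serde.
        props.put("default.key.serde",
                "org.apache.kafka.common.serialization.Serdes$ListSerde");
        // Concrete List implementation to instantiate on deserialization.
        props.put("default.list.key.serde.type", "java.util.ArrayList");
        // Serde used for each element of the list.
        props.put("default.list.key.serde.inner",
                "org.apache.kafka.common.serialization.Serdes$IntegerSerde");
        return props;
    }

    public static void main(String[] args) {
        listKeySerdeProps().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The value-side configuration would presumably follow the same pattern with "value" in place of "key".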

Other proposed properties:

ConsumerConfig.class: LIST_KEY_DESERIALIZER_TYPE_CLASS_CONFIG = "list.key.deserializer.type"

ConsumerConfig.class: LIST_KEY_DESERIALIZER_INNER_CLASS_CONFIG = "list.key.deserializer.inner"

ProducerConfig.class: LIST_KEY_SERIALIZER_INNER_CLASS_CONFIG = "list.key.serializer.inner"

P.S. We do not need a type class config for the serializer since we do not care about the type of the list class during serialization.

P.P.S. Properties default.list.key/value.* will be ignored as long as default.key/value.serde is not set to org.apache.kafka.common.serialization.Serdes$ListSerde

Serialization Strategy

For performance purposes, the following serialization strategies were put in place:

enum SerializationStrategy {
    CONSTANT_SIZE,
    VARIABLE_SIZE;
}

Depending on the inner serde (the list's element type), serialization is performed in one of the following ways:

  1. SerializationStrategy.CONSTANT_SIZE: if the inner serde uses one of the following serializers (ShortSerializer.class, IntegerSerializer.class, FloatSerializer.class, LongSerializer.class, DoubleSerializer.class, UUIDSerializer.class), the final payload does not encode each element's size, since the sizes of these types are fixed (2 bytes, 4 bytes, 8 bytes, etc.).
  2. SerializationStrategy.VARIABLE_SIZE: if the inner serde does not use one of the serializers listed above, the size of each element is encoded in the final payload (see below).
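The selection rule in the list above can be sketched as a lookup against a table of fixed-width types. This is only an illustration: it uses Java element types (Short, Integer, and so on) as stand-ins for the serializer classes so that it runs without Kafka on the classpath, the names StrategySketch, FIXED_SIZES and strategyFor are hypothetical, and the width of 36 for UUID assumes the canonical 36-character string form.

```java
import java.util.Map;
import java.util.UUID;

final class StrategySketch {
    enum SerializationStrategy { CONSTANT_SIZE, VARIABLE_SIZE }

    // Encoded width in bytes for each fixed-size element type.
    // Element types stand in for the corresponding serializer classes.
    static final Map<Class<?>, Integer> FIXED_SIZES = Map.of(
            Short.class, 2,
            Integer.class, 4,
            Float.class, 4,
            Long.class, 8,
            Double.class, 8,
            UUID.class, 36); // assumption: UUID written as its 36-character string

    // CONSTANT_SIZE when the element width is known up front, VARIABLE_SIZE otherwise.
    static SerializationStrategy strategyFor(Class<?> elementType) {
        return FIXED_SIZES.containsKey(elementType)
                ? SerializationStrategy.CONSTANT_SIZE
                : SerializationStrategy.VARIABLE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(strategyFor(Integer.class));
        System.out.println(strategyFor(String.class));
    }
}
```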

Additionally, there are two different ways of serializing NULL values within the payload:

  1. For SerializationStrategy.CONSTANT_SIZE, the list serializer will generate a null index list that contains indexes of all null entries within the payload
  2. For SerializationStrategy.VARIABLE_SIZE, the list serializer will instead write Serdes.ListSerde.NULL_ENTRY_VALUE (-1 by default) as the size of a null entry
                                                                                                            
        CONSTANT_SIZE                                    VARIABLE_SIZE

+----------------------------+                   +----------------------------+
|    SerializationStrategy   |                   |    SerializationStrategy   |
|            Flag            |                   |            Flag            |
|----------------------------|                   |----------------------------|
|    NullIndexList.size()    |                   |     PayloadList.size()     |
|----------------------------|                   |----------------------------|
|        Null index 1        |                   |      Size of entry 1       |
|----------------------------|                   |----------------------------|
|        Null index 2        |                   |                            |
|----------------------------|                   |          Entry 1           |
|            ...             |                   |                            |
|----------------------------|                   |----------------------------|
|     PayloadList.size()     |                   |      Size of entry 2       |
|----------------------------|                   |----------------------------|
|                            |                   |                            |
|          Entry 1           |                   |          Entry 2           |
|                            |                   |                            |
|----------------------------|                   |----------------------------|
|                            |                   |                            |
|          Entry 2           |                   |            ...             |
|                            |                   |                            |
|----------------------------|                   |                            |
|            ...             |                   |                            |
+----------------------------+                   +----------------------------+
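As a rough illustration of the two layouts, the sketch below writes a list of Integer values in the CONSTANT_SIZE layout and a list of String values in the VARIABLE_SIZE layout using java.nio.ByteBuffer. It is not the proposed implementation. The following details are assumptions made for the example: the flag is a single byte (0 for CONSTANT_SIZE, 1 for VARIABLE_SIZE), every count, index and size is a 4-byte big-endian int, PayloadList.size() counts null entries, and null entries occupy no payload bytes in the CONSTANT_SIZE layout. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListLayoutSketch {
    static final int NULL_ENTRY_VALUE = -1; // mirrors Serdes.ListSerde.NULL_ENTRY_VALUE

    // CONSTANT_SIZE: flag, null-index count, null indexes, list size, non-null entries.
    static byte[] encodeConstantSize(List<Integer> list) {
        List<Integer> nullIndexes = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i) == null) nullIndexes.add(i);
        }
        int nonNull = list.size() - nullIndexes.size();
        ByteBuffer out = ByteBuffer.allocate(1 + 4 + 4 * nullIndexes.size() + 4 + 4 * nonNull);
        out.put((byte) 0);                 // assumed flag value for CONSTANT_SIZE
        out.putInt(nullIndexes.size());
        for (int index : nullIndexes) out.putInt(index);
        out.putInt(list.size());
        for (Integer entry : list) {
            if (entry != null) out.putInt(entry); // no per-entry size prefix
        }
        return out.array();
    }

    // VARIABLE_SIZE: flag, list size, then (size, bytes) per entry; null is size -1.
    static byte[] encodeVariableSize(List<String> list) {
        List<byte[]> payloads = new ArrayList<>();
        int capacity = 1 + 4;
        for (String entry : list) {
            byte[] payload = entry == null ? null : entry.getBytes(StandardCharsets.UTF_8);
            payloads.add(payload);
            capacity += 4 + (payload == null ? 0 : payload.length);
        }
        ByteBuffer out = ByteBuffer.allocate(capacity);
        out.put((byte) 1);                 // assumed flag value for VARIABLE_SIZE
        out.putInt(list.size());
        for (byte[] payload : payloads) {
            if (payload == null) {
                out.putInt(NULL_ENTRY_VALUE);
            } else {
                out.putInt(payload.length);
                out.put(payload);
            }
        }
        return out.array();
    }

    public static void main(String[] args) {
        System.out.println(encodeConstantSize(Arrays.asList(1, null, 3)).length); // 21
        System.out.println(encodeVariableSize(Arrays.asList("ab", null)).length); // 15
    }
}
```

Under these assumptions, the three-element Integer list with one null takes 21 bytes (1 flag + 4 null count + 4 null index + 4 list size + 2 × 4 entries), while the same data in the VARIABLE_SIZE layout would need a 4-byte size prefix per entry, which is the overhead the CONSTANT_SIZE strategy avoids.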

Compatibility, Deprecation, and Migration Plan

...