...

Code Block
{
  "asHex": "D3J5",
  "asString": "10.12345",
  "asNumber": 10.12345
}
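
As a rough illustration of how the three fields in the example relate, the sketch below derives each representation from a single BigDecimal. It assumes that the binary form is the base64 encoding of the two's-complement bytes of the unscaled value; the class and method names are hypothetical and are not part of JsonConverter.

```java
import java.math.BigDecimal;
import java.util.Base64;

// Hypothetical helper, not part of JsonConverter: shows how one decimal
// maps onto the three representations in the example above.
final class DecimalRepresentations {

    // BINARY: base64 of the unscaled value's two's-complement bytes (assumed encoding).
    static String asBinary(BigDecimal value) {
        return Base64.getEncoder().encodeToString(value.unscaledValue().toByteArray());
    }

    // TEXT: plain decimal string, without scientific notation.
    static String asText(BigDecimal value) {
        return value.toPlainString();
    }

    // NUMERIC: the value itself, to be written as a bare JSON number.
    static BigDecimal asNumeric(BigDecimal value) {
        return value;
    }

    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("10.12345"); // unscaled 1012345, scale 5
        System.out.println(asBinary(value));
        System.out.println(asText(value));
        System.out.println(asNumeric(value));
    }
}
```

Under that assumption, 10.12345 encodes to "D3J5". The binary payload carries only the unscaled value, so a reader needs the scale (5 here) from somewhere else, such as the schema.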

Public Interfaces

Two new configurations will be added to the JsonConverterConfig:

json.decimal.serialization.format

This configuration will be supported in the JsonConverter and will be used to determine the serialization format of decimals.

...

Both of these will be string values that are managed by a new enumeration:

Code Block
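    // Requires: import java.util.Locale;
    public enum SerializationFormat {
        BINARY,   // base64-encoded binary representation
        TEXT,     // decimal written as a JSON string
        NUMERIC;  // decimal written as a bare JSON number

        // Case-insensitive lookup; throws IllegalArgumentException for unknown names.
        public static SerializationFormat forName(String name) {
            return SerializationFormat.valueOf(name.toUpperCase(Locale.ROOT));
        }
    }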

As of this change, only the BINARY, TEXT and NUMERIC values will be supported. Both configurations will default to BINARY to maintain backwards compatibility.

json.decimal.deserialization.text.format

This configuration will be supported in the JsonConverter and will be used to disambiguate between base64-encoded binary and textual representations of decimal values. As of this change, BINARY and TEXT will both be supported (numeric values will be deserialized automatically and are not affected by this configuration).
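
A minimal sketch of how the two settings might be resolved from converter properties follows. It assumes the first key is spelled json.decimal.serialization.format, copies the proposed enumeration so that it compiles on its own, and uses a hypothetical resolve helper; none of this is the actual JsonConverterConfig code.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of resolving the proposed settings; not the real JsonConverterConfig.
final class DecimalFormatConfigSketch {

    // Copy of the proposed enumeration so the sketch is self-contained.
    enum SerializationFormat {
        BINARY, TEXT, NUMERIC;

        static SerializationFormat forName(String name) {
            return valueOf(name.toUpperCase(Locale.ROOT));
        }
    }

    // Key names as proposed in this section (the first spelling is an assumption).
    static final String SERIALIZATION_FORMAT = "json.decimal.serialization.format";
    static final String DESERIALIZATION_TEXT_FORMAT = "json.decimal.deserialization.text.format";

    // Resolves a setting, falling back to BINARY when the key is absent.
    static SerializationFormat resolve(Map<String, String> props, String key) {
        String raw = props.get(key);
        return raw == null ? SerializationFormat.BINARY : SerializationFormat.forName(raw);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(SERIALIZATION_FORMAT, "numeric");
        System.out.println(resolve(props, SERIALIZATION_FORMAT));
        System.out.println(resolve(props, DESERIALIZATION_TEXT_FORMAT));
    }
}
```

Because forName delegates to valueOf, an unrecognized name such as "hex" raises IllegalArgumentException rather than silently falling back to the default.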

Proposed Changes

JsonConverter will be configurable with the new values. When they are set, it will serialize decimals according to the serialization format and deserialize them according to the deserialization text format described above.
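
The deserialization side could be sketched roughly as below. This is an illustration under stated assumptions rather than the converter's implementation: a plain Object stands in for a parsed JSON node, the binary form is assumed to be base64 of the unscaled value's bytes, and the class, enum, and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Base64;

// Hypothetical sketch of format-aware decimal decoding; not the real JsonConverter logic.
final class DecimalDecodeSketch {

    // The two ways a JSON string can be interpreted.
    enum TextFormat { BINARY, TEXT }

    // jsonValue stands in for a parsed JSON node: a Number for bare numerics, a String otherwise.
    static BigDecimal decode(Object jsonValue, TextFormat textFormat, int scale) {
        if (jsonValue instanceof Number) {
            // Numeric JSON values are unambiguous, so the configuration is not consulted.
            return new BigDecimal(jsonValue.toString());
        }
        String text = (String) jsonValue;
        if (textFormat == TextFormat.TEXT) {
            // TEXT: the string is the decimal itself.
            return new BigDecimal(text);
        }
        // BINARY: the string is base64 of the unscaled value; the scale comes from elsewhere.
        byte[] unscaled = Base64.getDecoder().decode(text);
        return new BigDecimal(new BigInteger(unscaled), scale);
    }

    public static void main(String[] args) {
        System.out.println(decode("D3J5", TextFormat.BINARY, 5));
        System.out.println(decode("10.12345", TextFormat.TEXT, 5));
        System.out.println(decode(10.12345, TextFormat.BINARY, 5));
    }
}
```

The numeric branch ignores the configured format, which is why a reader left on BINARY can still consume values written as NUMERIC.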

...

This change is backwards compatible, and no functionality will be deprecated. Users must be careful when enabling the new serialization functionality to ensure that all downstream data consumers can read data serialized in the new format. A rolling upgrade from BINARY to TEXT will require five steps, and will be impossible in some scenarios (e.g. topics with infinite retention, where the retention period in step 3 never passes):

  1. Upgrade all consumers to the new code, keeping the BINARY deserialization option
  2. Upgrade all producers to the new code, and use NUMERIC as the serialization option (consumers will be able to automatically deserialize numeric values)
  3. Wait for retention period on the topic to pass
  4. Change the consumer to use TEXT to deserialize strings
  5. Change the producer to use TEXT to serialize strings

Rejected Alternatives

  • Encoding the serialization in the schema for the Decimal LogicalType. The advantage is that the deserializer could decode based on the schema, so one converter could handle topics encoded differently as long as the schema matches. The problem is that the serialization format is specific to JSON, so the LogicalType is not the right place to record it.
  • Automatically detecting the serialization format. While it is possible to automatically differentiate NUMERIC from TEXT and BINARY, it is not always possible to differentiate between TEXT and BINARY. Take, for example, the string "12" - this is both a valid decimal (12) and a valid hex string which, read as the unscaled value 0x12 = 18 with a scale of 1, represents a decimal (1.8).
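
The ambiguity in the last point can be reproduced with a few lines. The sketch follows the example's framing of a hex string holding the unscaled value and assumes a scale of 1; the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical demonstration of why TEXT and BINARY cannot always be told apart.
final class AmbiguousDecimalSketch {

    // Reads the string as a plain decimal (TEXT).
    static BigDecimal asText(String s) {
        return new BigDecimal(s);
    }

    // Reads the string as hex digits of the unscaled value, then applies the scale.
    static BigDecimal asHex(String s, int scale) {
        return new BigDecimal(new BigInteger(s, 16), scale);
    }

    public static void main(String[] args) {
        System.out.println(asText("12"));   // prints 12
        System.out.println(asHex("12", 1)); // prints 1.8
    }
}
```

Both interpretations succeed without error on the same input, so the converter has no way to pick one without being told which format to expect.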