
Status

Current state: "Under Discussion"

Discussion thread: Google Doc



JIRA: TBD

Released: TBD

Motivation

Apache Thrift (along with Protocol Buffers) is widely adopted as a de facto standard for high-throughput network traffic protocols. Historically, companies like Pinterest have used Thrift to encode strongly typed Kafka messages and persist them to object storage as sequence files in the data warehouse.

The major benefits of this approach are:

  • Versioned Thrift schema files serve as a schema registry, allowing producers and consumers across languages to encode and decode with the latest schema.
  • Minimal overhead of maintaining translation ETL jobs that flatten the schema or add accessory fields during ingestion.
  • Lower storage footprint.


Beyond missing the optimizations that come with storage format conversion, running Flink jobs against this unsupported Thrift format also poses maintenance and upgrade challenges, given the:

  • lack of backward-compatible Thrift encoding/decoding support in Flink (see the sketch after this list)
  • lack of Table schema DDL inference support
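
To illustrate the first gap, below is a minimal sketch of what a Thrift-aware DeserializationSchema could look like when written by hand today. The class name ThriftDeserializationSchema and the choice of TBinaryProtocol are assumptions for illustration only; nothing like this ships with Flink, and the actual design would be specified under Proposed Changes.

import java.io.IOException;

import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.thrift.TBase;
import org.apache.thrift.TDeserializer;
import org.apache.thrift.protocol.TBinaryProtocol;

// Illustrative sketch only: a user-defined DeserializationSchema that decodes
// Thrift-encoded Kafka records into generated Thrift structs.
public class ThriftDeserializationSchema<T extends TBase<?, ?>> implements DeserializationSchema<T> {

    private final Class<T> thriftClass;
    private transient TDeserializer deserializer; // TDeserializer is not serializable, create lazily

    public ThriftDeserializationSchema(Class<T> thriftClass) {
        this.thriftClass = thriftClass;
    }

    @Override
    public T deserialize(byte[] message) throws IOException {
        try {
            if (deserializer == null) {
                deserializer = new TDeserializer(new TBinaryProtocol.Factory());
            }
            // Instantiate the generated Thrift struct and populate it from the raw bytes.
            T record = thriftClass.getDeclaredConstructor().newInstance();
            deserializer.deserialize(record, message);
            return record;
        } catch (Exception e) {
            throw new IOException("Failed to deserialize Thrift record", e);
        }
    }

    @Override
    public boolean isEndOfStream(T nextElement) {
        return false;
    }

    @Override
    public TypeInformation<T> getProducedType() {
        // Without dedicated Thrift type information, Flink falls back to generic serialization,
        // which is one reason schema evolution is not backward compatible out of the box.
        return TypeInformation.of(thriftClass);
    }
}

A Kafka source could then be constructed with, e.g., new ThriftDeserializationSchema<>(Event.class), where Event stands in for any generated Thrift class. Such hand-rolled schemas still leave Thrift-specific TypeInformation and Table DDL schema inference unsolved, which is what this FLIP aims to address.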

Proposed Changes

Compatibility, Deprecation, and Migration Plan

Test Plan

Describe in a few sentences how the FLIP will be tested. We are mostly interested in system tests (since unit tests are specific to implementation details). How will we know that the implementation works as expected? How will we know nothing broke?

Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.
