Page properties

Discussion thread: https://lists.apache.org/thread/9mt07mnbwf1rwftzsbxz3jkcrp8dvkl5
Vote thread:
JIRA: FLINK-33873
Release:


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

HyperLogLog is a probabilistic data structure for estimating the cardinality of a dataset, that is, the number of unique elements in it. In production, HyperLogLog is used in a wide range of scenarios, such as counting unique visitors. More information on this data structure and its algorithm is available at: https://redis.io/docs/data-types/probabilistic/hyperloglogs/
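
To make the idea concrete, the following is a minimal, self-contained sketch of the HyperLogLog estimation technique in Java. It is an illustration only: it is not the Redis implementation (which Redis exposes through the PFADD and PFCOUNT commands) and not the proposed connector. The class name HllSketch, the register count (4096), and the hash function are assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative HyperLogLog sketch (hypothetical class, not Redis or connector code). */
class HllSketch {
    private static final int P = 12;        // index bits; an assumption for this example
    private static final int M = 1 << P;    // number of registers (4096)
    private final byte[] registers = new byte[M];

    // 64-bit FNV-1a followed by a bit-mixing finalizer. Any well-mixed 64-bit hash would do.
    private static long hash(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33; h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33; h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Records an element. Only the register maximum is kept, so the element itself is lost. */
    public void add(String element) {
        long h = hash(element);
        int idx = (int) (h >>> (64 - P));   // top P bits select a register
        long rest = h << P;                 // remaining bits, shifted to the top
        // Rank = position of the first 1-bit in the remaining bits, capped at 64 - P + 1.
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - P + 1);
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    /** Returns the estimated number of distinct elements added so far. */
    public long estimate() {
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1 + 1.079 / M);   // bias correction constant for large M
        double raw = alpha * M * M / sum;          // harmonic-mean based estimate
        if (raw <= 2.5 * M && zeros > 0) {
            // Small-range correction: fall back to linear counting.
            return Math.round(M * Math.log((double) M / zeros));
        }
        return Math.round(raw);
    }
}
```

Adding the same element twice leaves the registers unchanged, which is why duplicates do not affect the estimate. With 4096 registers the expected standard error is roughly 1.04 / sqrt(4096), or about 1.6%.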

A connector for Redis Streams (another Redis data structure) is already in progress; see FLIP-254. I think a sink connector for HyperLogLog is feasible as well.

Each Redis data structure is used in a different way. Rather than maintaining a single multi-purpose connector, it is more appropriate to develop a dedicated connector for each data structure where one is feasible.

Public Interfaces

HyperLogLog is an irreversible data structure: the original elements cannot be read back from it. The Redis HyperLogLog connector will therefore consist only of a sink. The following interface will be used:

...

This is a new feature, so no compatibility, deprecation, or migration plan is needed.

Test Plan

We will add the following tests:

  • Unit tests
  • Integration tests that run end-to-end against a Redis test container, exercising HyperLogLog

Rejected Alternatives

N/A