Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

StatusStatus

...

Page properties

...


Discussion thread

Discussion threadhere (<- link to https://mail-archives.apache.org/mod_mbox/flink-dev/)

...


Vote thread
JIRA

Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-5734

Pull-request : https://github.com/apache/flink/pull/3511

...

Release


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

Execution environment is also important aspect that we can leverage. Because Flink serializes the key of a record to MemorySegment in big-endian format regardless endianness of the worker, when retrieving it back for comparison it need to reverse the bytes first. In this case, we compensate this at serialization process in which we will need to do byte-reversing only once for each record instead of every comparison. This means that the number of byte-reversing will become O(n) instead of O(n log n).

...

These optimizations are studied in FLINK-4867 and the results are promising.


 

Sorter \ No. records

10,000

100,000

1,000,000

Original NormalizedKeySorter

4.70 ms

61.53 ms

609.00 ms

SwapViaPutGetLong

3.34 ms

49.88 ms

471.30 ms

UseLittleEndian

4.20 ms

56.15 ms

578.14 ms

CompareUnrollLoop

4.29 ms

53.59 ms

605.32 ms


Table 1 : Comparison between optimization ideas from 
FLINK-4867 and original NormalizedKeySorter

 

Another potential improvement could be eliminating expensive mathematical operators such as division and modulo, for example, when computing memory segment index.  One proposed solution that trying to improvement this problem is  Greg Hogan ’s FLINK-3722. Another possible ones could be using bitwise operators or replacing divisors with a constant that we know at compile-time [1]. From our experiment, we have found that Greg's solution is the best one, so we will base our implementation on his pull-request.


 

Sorter \ No. records

10,000

100,000

1,000,000

Original NormalizedKeySorter

4.34 ms

61.79 ms

582.88 ms

Flink-3722 NormalizedKeySorter

3.00 ms

35.90 ms

386.95 ms

DividedByConstant

3.49 ms

45.14 ms

436.22 ms

UsingBitwiseOperators

3.33 ms

42.22 ms

419.98 ms


Table 2 : Comparison between 3 approaches of optimization for division and modulo
 


Because these optimization techniques depend on knowledge of each execution context, for example, data-type of sorting key and endianness of machine, we would need to apply applicable optimizations on the fly. In other words, we will dynamically generate the most efficient sorter for each context on each worker. 

External Libraries 

In this proposal, we plan to use 2 external libraries that are the same as FLINK-3599 : Code Generation in Serializers is using, namely

  • Janino: A minimal Java compiler that we will use to compile the generated code.
  • FreeMarker : a Java template engine.


...