Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

StatusStatus

...

Page properties

...


Discussion thread

Discussion threadhttp://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-18-Code-Generation-for-improving-sorting-performance-tc16486.html

...


Vote thread
JIRA

Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-5734

Pull-request : https://github.com/apache/flink/pull/3511

Released: TBD

Release


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

These optimizations are studied in FLINK-4867 and the results are promising.

...


Sorter \ No. records

10,000

100,000

1,000,000

Original NormalizedKeySorter

4.70 ms

61.53 ms

609.00 ms

SwapViaPutGetLong

3.34 ms

49.88 ms

471.30 ms

UseLittleEndian

4.20 ms

56.15 ms

578.14 ms

CompareUnrollLoop

4.29 ms

53.59 ms

605.32 ms


Table 1 : Comparison between optimization ideas from 
FLINK-4867 and original NormalizedKeySorter

 

Another potential improvement could be eliminating expensive mathematical operators such as division and modulo, for example, when computing memory segment index.  One proposed solution that trying to improvement this problem is  Greg Hogan ’s FLINK-3722. Another possible ones could be using bitwise operators or replacing divisors with a constant that we know at compile-time [1]. From our experiment, we have found that Greg's solution is the best one, so we will base our implementation on his pull-request.


 

Sorter \ No. records

10,000

100,000

1,000,000

Original NormalizedKeySorter

4.34 ms

61.79 ms

582.88 ms

Flink-3722 NormalizedKeySorter

3.00 ms

35.90 ms

386.95 ms

DividedByConstant

3.49 ms

45.14 ms

436.22 ms

UsingBitwiseOperators

3.33 ms

42.22 ms

419.98 ms


Table 2 : Comparison between 3 approaches of optimization for division and modulo
 


Because these optimization techniques depend on knowledge of each execution context, for example, data-type of sorting key and endianness of machine, we would need to apply applicable optimizations on the fly. In other words, we will dynamically generate the most efficient sorter for each context on each worker. 

External Libraries 

In this proposal, we plan to use 2 external libraries that are the same as FLINK-3599 : Code Generation in Serializers is using, namely

  • Janino: A minimal Java compiler that we will use to compile the generated code.
  • FreeMarker : a Java template engine.


...