Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Status

...

Page properties


Discussion thread

...

...

...

Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-23687

...

Release


Motivation

Look up join is commonly used feature in Flink SQL. We have received many optimization requirements on look up join. For example:
1. Suggests left side of lookup join do a hash partitioner to raise cache hint ratio

...

Code Block
languagesql
SELECT /*+ SHUFFLE_HASH('Orders', 'Customers') */ o.order_id, o.total, c.country, c.zip
FROM Orders AS o
JOIN Customers FOR SYSTEM_TIME AS OF o.proc_time AS c
ON o.customer_id = c.id;


Note:

  1. Table names name in SHUFFLE_HASH hint is required to avoid mistake hint propagation if there are multiple lookup joinsbuild table name. Lookup join only supports dimension table to be build table, does not support left side to be build table.
  2. The hint only provides a suggestion to the optimization, it is not an enforcer.

...

  1. the correlate is a look up join, other correlate would ignore the hint
  2. the correlate has "Orders" and has  "Customers" as the input table names. In theory, it's better to support both table names and alias names, but the alias name of subquery or table would not be lost by the sql converter and sql optimizer. So here we only support table names.dimension table name. 

The code below shows how we define hint strategy for hash lookupJoin.

Code Block
languagejava
titleSHUFFLE_HASH hint strategy
      builder
        .hintStrategy("SHUFFLE_HASH",
            HintStrategy.builder(
                HintPredicates.and(HintPredicates.CORRELATE, isLookupJoin(), lookupJoinWithFixedTableNamewithBuildTableName())))
        .build();

Note,

it has a blocker on Calcite version upgrade.

...

Code Block
languagesql
SELECT /*+ USE_HASH('Orders', 'Customers') */ o.order_id, o.total, c.country, c.zip
FROM Orders AS o
JOIN Customers FOR SYSTEM_TIME AS OF o.proc_time AS c
ON o.customer_id = c.id;

...