Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
languagesql
titleAfter
linenumberstrue
SELECT dept.name
FROM dept JOIN (SELECT emp.dept_id FROM emp GROUP BY emp.dept_id HAVING COUNT(*) > 10) emp_subquery ON dept.id=emp_subquery.dept_id

Correlation

Uncorrelated subquery may need to scan huge result set. If underlying cache is PARTITIONED, then multiple nodes need to be scanned. If underlying cache is REPLICATED, then scan is local. If previous optimizations are not applicable, we can try to push-down condition from outer table to subquery. If condition is equality and both outer and inner columns are co-located, then potentially distributed query is reduced to co-located query. The downside is that subquery must be evaluated multiple times, once for every tuple from outer relation. This might be less than optimal comparing to subquery materialization explained below.  MySQL refer to this as IN-to-EXISTS optimization [2].

Example 5: Converting non-correlated subquery to correlated via condition push-down

Code Block
languagesql
titleBefore
linenumberstrue
SELECT emp.name
FROM emp
WHERE emp.dept_id IN (SELECT id FROM dept WHERE state='CA')


Code Block
languagesql
titleAfter
linenumberstrue
SELECT emp.name
FROM emp
WHERE EXISTS (SELECT id FROM dept WHERE dept.id=emp.dept_id AND state='CA')

Materialization

If subquery is not correlated, then it can be executed only once. Result is cached in temporary table and then re-used. If expected number of returned records is small, then it makes sense to execute the request before other query parts, and send obtained result to other map nodes. If result is scalar, then entire query is replaced with a single result.

This way some queries which were too complex to extract target partitions from may be reduced to simpler form, where extraction is possible.

[1] https://mariadb.com/kb/en/library/table-pullout-optimization/

Correlation

TODO

Materialization

TODO[2] https://dev.mysql.com/doc/refman/8.0/en/subquery-optimization-with-exists.html

Partition Extraction

TOODTODO

Proposed Changes

TODO

Tickets

...