...

Current SQL engine has a number of critical limitations:

Query execution is hard-coded to a primitive map-reduce flow (SQL query is split into a 'map query' and 'reduce query'). This flow is fundamentally incomplete as it is impossible to execute the following queries:
- SELECT t1.a, t2.b FROM t1, t2 WHERE t1.id = t2.id - silently returns wrong results in case of t1 and t2 are not co-located
- SELECT * FROM person p WHERE p.salary > (SELECT avg(salary) from person) - doesn’t work at all because it cannot be executed in two steps
No data co-location control, i.e. arbitrary data can be returned silently
Low control on how query executes internally, as a result we have limited possibility to implement improvements/fixes.
Limited execution modes: either two-phase execution (default) or "distributed joins" which adds one more phase. This leads to the impossibility "by design" of executing some queries, see
- Jira
  server ASF JIRA
  columns key,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
  serverId 5aa69414-a9e9-3523-82ec-879b028fb15b
  key IGNITE-11448
Query execution is done using H2 query engine which itself has several issues
- Low control on how query executes internally, as a result, we have limited possibility to implement improvements/fixes (H2 is outside of ASF)
- H2 is a local database and has no notion of distributed query execution
- H2 lacks the
Lack of
- proper planner which will take in count both data distribution and data statistics
- H2 optimizer is very primitive. It can do only predicates push down, join order choosing and also some minor optimizations. It lacks
of
- many useful optimizations like
this one
- the following
  Jira
  server ASF JIRA
  columns key,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
  serverId 5aa69414-a9e9-3523-82ec-879b028fb15b
  key IGNITE-6085

Classical database query execution is done using roughly the following steps [1], [2]

SQL text is parsed and validated, which produces an Abstract Syntax Tree (AST) representing the query
Query rewrite (output: rewritten SQL AST) - need to (for example) turn DML queries into selects, returning set of rows to change (queries like 'INSERT INTO table1 SELECT * FROM table 2').
AST is transformed into Query Plan (output: query execution graph, which is identical to the original query but operates in terms of relational algebra)
Query execution graph is optimized (output: optimized query execution graph with is semantically identical to the original query)
Query execution (output: resulting cursor)

The key point in the aforementioned plan is to have a relational algebra execution plan which can undergo arbitrary equivalence transformations by the rules of relational algebra. There is a well-studied optimization approach used in many production systems [3] for optimizing query execution plans.

Description

The approach to solve the limitations implies more complex execution flow that brings a new abstraction: idea of this IEP is to introduce all missing intermediate steps into the query execution flow and operate on query execution graph.

This graph consists of all execution steps (relational operators) like join, filter, sort, etc. The execution graph may be transformed into another one saving query semantic during query optimisation process using relational algebra and transformation rules. After transformation the graph is split into a set of dependent subgraphs where an output of dependent subgraph is an aggregated input of depending one (in other words we have more than one map-reduce steps).

Optimized query execution graph (or query plan) used to build final execution tasks.

So, the query flow gets a number of additional steps and will consist of next steps:

...

.

...

Example:

Initial query:


SELECT t1.name, t2.name as projectName FROM Persons t1, Projects t2 where t1.id == t2.responsiblePersonId

...

http://apache-ignite-developers.2346864.n4.nabble.com/New-SQL-execution-engine-td43724.html

Reference Links

[1]

[2] https://www.youtube.com/playlist?list=PLSE8ODhjZXja7K1hjZ01UTVDnGQdx5v5U

[3] https://cs.uwaterloo.ca/~david/cs848/volcano.pdf

https://calcite.apache.org/

...

Page tree

Versions Compared

Old Version 8

New Version 9

Key

Description

Example:

Initial query:

Reference Links

Page tree

Page History

Versions Compared

Old Version 8

New Version 9

Key

Description

Example:

Initial query:

Reference Links