...
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
SELECT * FROM emp WHERE emp.id IN (100, :1) => ({name=100}, {name=:1}) => (P1, :1) |
Range conditions could be converted to IN statements if column is of integer type.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
SELECT * FROM emp WHERE emp.id BETWEEN 100 AND 102
=> {id BETWEEN 100 AND 102} => {100, 101, 102} => (P1, P2, P3) |
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
SELECT * FROM emp WHERE emp.id > 100 AND emp.id <= 102
=> {id > 100 AND id <= 102} => {101, 102} => (P1, P2) |
...
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
(P1) OR (P2) => (P1, P2) (P1) OR (ALL) => (ALL) (P1) OR () => (P1) (P1, P2) OR (P2, P3) => (P1, P2, P3) (:1) OR (:2) => (:1, :2) (P1, :1) OR (P2, :2) => (P1, P2, :1, :2) |
TODO
TODO
...
TODO
TODO
TODO
TODO
...
TODO
This technique is used excessively by both distributed databases and classical RDBMS vendors (as a part of their partitioning/sharding solutions). First, query is analyzed and optionally re-written to allow for better partition pruning. Second, information of interesting partitions are extracted from query for every table (cache) and subquery (for complex multi-stage distributed queries).
WHERE conditions and JOIN-s are relatively simple targets for partition extraction. The main difficulty is subqueries, as they potentially introduce unlimited nesting that complicates analysis a lot. So the main goal of query rewrite is to minimize number of subqueries. Main techniques are JOIN conversion, correlation and materialization. Interestingly, all these techniques not only simplify partition extraction, but also allow optimizer to choose better execution plan for local execution, as JOINs are much easier to be analyzed and reordered than subqueries. As we shall see below, efforts are usually focused on non-correlated subqueries as they are the most costly for straightforward execution.
...