
The AsterixDB code base currently has 98 rules in total. 83 of them are used by Asterix and are grouped into collections.

There are 12 collections, and the role of each one is described below. Additionally, there are 6 rules for Hivesterix, 16 rules for VXQuery, 5 abstract rules or extensions, and 4 rules that are currently not used.

These are our collections (1-9 logical rules, 10-12 physical rules):

  1. TypeInference (3) - validation of input types and filter conditions + unnesting.

  2. Normalization (17) - extractions, simplification, function and operator transformations

  3. CondPushDownAndJoinInference (25) - rearrangement of operator order and transformations

  4. LoadFields (9) - functions and field access improvements.

  5. FuzzyJoin (2) - InferTypes + just one rule related to FuzzyJoin (currently disabled)

  6. Consolidation (9) - Minimization of the plan

  7. AccessMethod (6) - Access operators introduce indexes, join removed

  8. PlanCleanup (6) - most of its rules were already fired earlier; it also adds early projects

  9. DataExchange (1) - prior to physical plan, conversions from local to unpartitioned?

  10. PhysicalRewritesAllLevel (13) - physical transformations

  11. PhysicalRewritesTopLevel (6) - optimization of the physical plan

  12. PrepareForJobGen (4) - Adds one to one exchange and rearranges.


We invoke the rule collections in a slightly different order from the one listed above, and some of them run more than once:

  1. TypeInference

  2. Normalization

  3. CondPushDownAndJoinInference

  4. LoadFields

  5. FuzzyJoin

  6. Normalization - repeated

  7. CondPushDownAndJoinInference - repeated

  8. LoadFields - repeated

  9. DataExchange

  10. PhysicalRewritesAllLevel

  11. PhysicalRewritesTopLevel

  12. PrepareForJobGen

  13. PhysicalRewritesAllLevel

  14. PhysicalRewritesTopLevel

  15. PrepareForJobGen
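
The invocation sequence above can be written down as plain data, which makes the repetitions easy to check. This is only a sketch: the class name `CollectionOrderSketch` and its members are hypothetical and are not part of the AsterixDB code base, and the two lists are transcribed from this page rather than read from the real optimizer configuration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not AsterixDB code: models the lists on this page as data.
final class CollectionOrderSketch {

    // The 12 collections, in the order they are described on this page.
    static final List<String> COLLECTIONS = List.of(
        "TypeInference", "Normalization", "CondPushDownAndJoinInference", "LoadFields",
        "FuzzyJoin", "Consolidation", "AccessMethod", "PlanCleanup", "DataExchange",
        "PhysicalRewritesAllLevel", "PhysicalRewritesTopLevel", "PrepareForJobGen");

    // The 15 invocation steps, transcribed from the list on this page.
    static final List<String> INVOCATION_ORDER = List.of(
        "TypeInference", "Normalization", "CondPushDownAndJoinInference", "LoadFields",
        "FuzzyJoin",
        "Normalization", "CondPushDownAndJoinInference", "LoadFields",
        "DataExchange",
        "PhysicalRewritesAllLevel", "PhysicalRewritesTopLevel", "PrepareForJobGen",
        "PhysicalRewritesAllLevel", "PhysicalRewritesTopLevel", "PrepareForJobGen");

    /** Counts how many times each collection appears in the invocation order. */
    static Map<String, Integer> runCounts() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String collection : COLLECTIONS) {
            counts.put(collection, 0);
        }
        for (String step : INVOCATION_ORDER) {
            counts.merge(step, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Print one line per collection with its number of invocations.
        runCounts().forEach((name, n) -> System.out.println(name + ": " + n));
    }
}
```

With this transcription, a collection that is listed twice in the sequence gets a count of 2, and a collection that does not appear in the sequence gets a count of 0.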

Abstract rules:

AbstractDecorrelationRule implemented by IntroJoinInsideSubplanRule

AbstractExtractExprRule implemented by ExtractDistinctByExpressionsRule, ExtractGbyExpressionsRule, ExtractOrderExpressionsRule

AbstractIntroduceAccessMethodRule implemented by IntroduceJoinAccessMethodRule, IntroduceSelectAccessMethodRule

AbstractIntroduceCombinerRule implemented by IntroduceAggregateCombinerRule, IntroduceGroupByCombinerRule

InlineVariablesRule - AsterixInlineVariablesRule actually extends this rule


Rules used by Hive or Piglet:

InsertProjectBeforeWriteRule - used by Hive

IntroduceEarlyProjectRule - used by Hive

LocalGroupByRule - used by Hive

PushProjectIntoDataSourceScanRule - used by Hive and Piglet

RemoveRedundantProjectionRule - used by Hive

RemoveRedundantSelectRule - used by Hive



Types of rule controllers:

SequentialFixpointRuleController(false) = no DFS: each rule is applied to the operator only, but the whole group of rules is re-applied until no rule produces a change

SequentialFixpointRuleController(true) = the same fixpoint iteration, but with a DFS

SequentialOnceRuleController(true) = each rule is applied once, with no fixpoint iteration
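
The difference between the "once" and "fixpoint" strategies can be illustrated with a toy model. The sketch below does not use the real controller classes: a plan is just a string of operator names joined by `>`, a rule is a function that returns the plan unchanged when it does not fire, and the names `RuleControllerSketch`, `applyOnce`, `applyToFixpoint` and `demoRules` are all hypothetical. It also ignores the DFS flag and only models how often the group of rules is applied.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical toy model, not AsterixDB code.
final class RuleControllerSketch {

    /** "Once" strategy: apply each rule a single time, in order. */
    static String applyOnce(String plan, List<UnaryOperator<String>> rules) {
        for (UnaryOperator<String> rule : rules) {
            plan = rule.apply(plan);
        }
        return plan;
    }

    /** "Fixpoint" strategy: repeat full passes until a pass changes nothing. */
    static String applyToFixpoint(String plan, List<UnaryOperator<String>> rules) {
        boolean changed = true;
        while (changed) {
            String before = plan;
            plan = applyOnce(plan, rules);
            changed = !plan.equals(before);
        }
        return plan;
    }

    /** Two made-up rules; each merges one pair of adjacent identical operators per call. */
    static List<UnaryOperator<String>> demoRules() {
        return List.of(
            plan -> plan.replaceFirst("select>select", "select"),
            plan -> plan.replaceFirst("project>project", "project"));
    }

    public static void main(String[] args) {
        String plan = "project>project>select>select>select>scan";
        System.out.println(applyOnce(plan, demoRules()));        // one pass only
        System.out.println(applyToFixpoint(plan, demoRules()));  // passes until stable
    }
}
```

In this model a single pass leaves one redundant `select` in place, while the fixpoint loop keeps going until no rule fires. The loop only terminates if the rules eventually stop changing the plan, which is an assumption this sketch makes about its own rules.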

There are 60 Beyond Compare sessions ready to show the plan before and after each rule was applied. Note that not all rules produce a visible change in the plan.

I am still working out a clean way to expose them on this wiki (WIP).