Currently, the AsterixDB code base has 98 rules in total. 83 rules are in use by Asterix and are grouped into collections.
We have 12 collections; the role of each collection is detailed below. Additionally, there are 6 rules for Hivesterix and 16 rules for VXQuery, 5 rules are abstract rules or extensions, and 4 rules are currently not used.
These are our collections, with the number of rules in each given in parentheses (the first 9 contain logical rules, the last 3 contain physical rules):
TypeInference (3) - validation of input types and filter conditions + unnesting.
Normalization (17) - extractions, simplification, functions and operators transformations
CondPushDownAndJoinInference (25) - rearrangement of operator order and transformations
LoadFields (9) - functions and field access improvements.
FuzzyJoin (2) - InferTypes + just one rule related to FuzzyJoin (currently disabled)
Consolidation (9) - Minimization of the plan
AccessMethod (6) - Access operators introduce indexes, join removed
PlanCleanup (6) - most of these rules were already fired earlier + adding early projects
DataExchange (1) - prior to physical plan, conversions from local to unpartitioned?
PhysicalRewritesAllLevel (13) - physical transformations
PhysicalRewritesTopLevel (6) - optimization of the physical plan
PrepareForJobGen (4) - adds one-to-one exchange and rearranges.
We invoke the rule collections in a slightly different order than the one listed above:
TypeInference
Normalization
CondPushDownAndJoinInference
LoadFields
FuzzyJoin
Normalization - repeated
CondPushDownAndJoinInference - repeated
LoadFields - repeated
DataExchange
PhysicalRewritesAllLevel
PhysicalRewritesTopLevel
PrepareForJobGen
PhysicalRewritesAllLevel
PhysicalRewritesTopLevel
PrepareForJobGen
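The invocation order above can be written down as data and compared against the list of defined collections. The following is a minimal sketch, not AsterixDB code: the class and method names are hypothetical, the order is copied from the listing, and "Fuzzy" is assumed to mean the FuzzyJoin collection. It reports how often each collection appears in the listed order and which defined collections do not appear in it at all (which may simply mean the listing omits them).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of AsterixDB: it only encodes the two lists from the text.
final class RewritePipelineSketch {
    // The 12 collections in the order they are described.
    static final List<String> DEFINED = List.of(
            "TypeInference", "Normalization", "CondPushDownAndJoinInference", "LoadFields",
            "FuzzyJoin", "Consolidation", "AccessMethod", "PlanCleanup", "DataExchange",
            "PhysicalRewritesAllLevel", "PhysicalRewritesTopLevel", "PrepareForJobGen");

    // The order in which the collections are listed as being invoked.
    // Assumption: the entry "Fuzzy" refers to the FuzzyJoin collection.
    static final List<String> CALL_ORDER = List.of(
            "TypeInference", "Normalization", "CondPushDownAndJoinInference", "LoadFields",
            "FuzzyJoin", "Normalization", "CondPushDownAndJoinInference", "LoadFields",
            "DataExchange", "PhysicalRewritesAllLevel", "PhysicalRewritesTopLevel",
            "PrepareForJobGen", "PhysicalRewritesAllLevel", "PhysicalRewritesTopLevel",
            "PrepareForJobGen");

    // Number of times each collection appears in the invocation order.
    static Map<String, Integer> runCounts() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : CALL_ORDER) {
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }

    // Defined collections that do not appear in the listed invocation order.
    static List<String> notInCallOrder() {
        List<String> missing = new ArrayList<>(DEFINED);
        missing.removeAll(CALL_ORDER);
        return missing;
    }

    public static void main(String[] args) {
        System.out.println(runCounts());
        System.out.println(notInCallOrder()); // [Consolidation, AccessMethod, PlanCleanup]
    }
}
```

On the listing as given, Consolidation, AccessMethod and PlanCleanup are the three defined collections that do not show up in the invocation order.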
Abstract rules:
AbstractDecorrelationRule implemented by IntroJoinInsideSubplanRule
AbstractExtractExprRule implemented by ExtractDistinctByExpressionsRule, ExtractGbyExpressionsRule, ExtractOrderExpressionsRule
AbstractIntroduceAccessMethodRule implemented by IntroduceJoinAccessMethodRule, IntroduceSelectAccessMethodRule
AbstractIntroduceCombinerRule implemented by IntroduceAggregateCombinerRule, IntroduceGroupByCombinerRule
InlineVariablesRule - AsterixInlineVariablesRule actually extends this rule
Rules used by Hive or Piglet:
InsertProjectBeforeWriteRule - used by Hive
IntroduceEarlyProjectRule - used by Hive
LocalGroupByRule - used by Hive
PushProjectIntoDataSourceScanRule - used by Hive and Piglet
RemoveRedundantProjectionRule - used by Hive
RemoveRedundantSelectRule - used by Hive
Types of rule controllers:
SequentialFixpointRuleController(false) = you don't do a DFS and apply each rule to the operator only; however, you reiterate over the group of rules until no rule produces a change
SequentialFixpointRuleController(true) = you do a DFS
SequentialOnceRuleController(true) = you apply the rule once
There are 60 Beyond Compare sessions ready to show the plan before and after each rule was applied. Note that not all rules produce an apparent change in the plan.
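The difference between the "fixpoint" and "once" controllers can be illustrated with a toy model. This is a minimal sketch that follows the descriptions above and does not use the real Algebricks classes: `Op`, `Rule`, `COLLAPSE_SELECT`, `runFixpoint` and `runOnce` are hypothetical names invented for the example, and the boolean flag is modelled simply as "apply to the given operator only" versus "also visit all inputs depth-first".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the controller behaviours described above (not Algebricks code).
final class RuleControllerSketch {
    // Hypothetical operator node: a name plus its input operators.
    static final class Op {
        final String name;
        final List<Op> inputs;
        Op(String name, Op... inputs) {
            this.name = name;
            this.inputs = new ArrayList<>(Arrays.asList(inputs));
        }
        @Override public String toString() {
            return inputs.isEmpty() ? name : name + inputs;
        }
    }

    // Hypothetical rule: returns true if it changed the plan at this operator.
    interface Rule { boolean apply(Op op); }

    // Example rule: a select directly on top of another select absorbs it.
    static final Rule COLLAPSE_SELECT = op -> {
        if (op.name.equals("select") && op.inputs.size() == 1) {
            Op child = op.inputs.get(0);
            if (child.name.equals("select") && child.inputs.size() == 1) {
                op.inputs.set(0, child.inputs.get(0));
                return true;
            }
        }
        return false;
    };

    // Apply one rule to this operator and, if fullDfs is set, to all inputs depth-first.
    static boolean applyRule(Op op, Rule rule, boolean fullDfs) {
        boolean changed = rule.apply(op);
        if (fullDfs) {
            for (Op input : op.inputs) {
                changed |= applyRule(input, rule, true);
            }
        }
        return changed;
    }

    // "Fixpoint": repeat the whole group of rules until a pass changes nothing.
    static int runFixpoint(Op root, List<Rule> rules, boolean fullDfs) {
        int passes = 0;
        boolean changed;
        do {
            changed = false;
            for (Rule rule : rules) {
                changed |= applyRule(root, rule, fullDfs);
            }
            passes++;
        } while (changed);
        return passes;
    }

    // "Once": a single pass over the group of rules.
    static void runOnce(Op root, List<Rule> rules, boolean fullDfs) {
        for (Rule rule : rules) {
            applyRule(root, rule, fullDfs);
        }
    }

    // select -> select -> select -> scan
    static Op sampleChain() {
        return new Op("select", new Op("select", new Op("select", new Op("scan"))));
    }

    public static void main(String[] args) {
        Op once = sampleChain();
        runOnce(once, List.of(COLLAPSE_SELECT), true);
        System.out.println(once); // select[select[scan]]

        Op fixpoint = sampleChain();
        int passes = runFixpoint(fixpoint, List.of(COLLAPSE_SELECT), false);
        System.out.println(fixpoint + " after " + passes + " passes"); // select[scan] after 3 passes
    }
}
```

In this toy plan a single pass removes only one of the redundant selects, while the fixpoint loop keeps going until a pass produces no change, which is why it needs one extra, change-free pass to terminate.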