Indexed Edit-distance Join Improvement

Currently, when the optimizer sees an edit-distance-check() function and an applicable n-gram index on the predicate, it creates a plan that combines an index-nested-loop join with a non-indexed-nested-loop join. This is because of the panic case: when the T calculated for the T-occurrence problem is less than or equal to zero, the index cannot be used and a scan is needed instead. So there have to be two paths (1: T > 0, index-nl; 2: T <= 0, non-indexed-nl). The "before" section in the attachment shows this plan.

The issue with the current approach is that, regardless of the T value, all incoming frames from the sub-tree are propagated to both branches. Therefore, T is calculated twice for the same tuple. This can be solved by using the SPLIT operator and letting it send each tuple to only one branch based on its T value. The proposed solution is shown in the "after" part of the attachment: T is calculated only once, and each tuple is propagated to exactly one branch.
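
The routing idea can be illustrated with a minimal sketch. It is not the optimizer's actual code. It assumes the common unpadded q-gram bound T = (len - q + 1) - k * q, where q is the gram length and k the edit-distance threshold; the formula the system really uses may differ (for example, if grams are padded). The names occurrence_threshold and split_by_threshold are hypothetical.

```python
from typing import Iterable, List, Tuple


def occurrence_threshold(length: int, q: int, k: int) -> int:
    """Lower bound T on shared q-grams for strings within edit distance k.

    Assumption: unpadded q-grams, so a string of `length` characters has
    length - q + 1 grams, and each edit can destroy at most q of them.
    """
    return (length - q + 1) - k * q


def split_by_threshold(
    tuples: Iterable[str], q: int, k: int
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Send each tuple to exactly one branch, computing T once per tuple."""
    indexed_branch: List[Tuple[str, int]] = []  # T > 0: index nested-loop join
    scan_branch: List[str] = []                 # T <= 0: non-indexed nested-loop join
    for s in tuples:
        t = occurrence_threshold(len(s), q, k)  # the only T computation
        if t > 0:
            indexed_branch.append((s, t))       # T travels with the tuple
        else:
            scan_branch.append(s)               # panic case: scan instead
    return indexed_branch, scan_branch


indexed, scanned = split_by_threshold(["database", "db", "join"], q=3, k=1)
```

With q = 3 and k = 1, "database" gets T = 3 and goes to the indexed branch, while "db" (T = -3) and "join" (T = -1) are panic cases and go to the scan branch. No tuple reaches both branches, and T is never recomputed downstream.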
