This optimization intersects multiple secondary indexes when the select conditions involve them. Previously, in the IntroduceSelectAccessMethodRule, we would pick the first index to contribute to the access path when multiple indexes were available. Due to the lack of statistical information, the first one may not be the best choice. Moreover, even if we chose the index with the lowest selectivity, it still might not be the best solution, because we can further reduce the selectivity by intersecting it with the other secondary indexes. Having the intersection in the plan avoids the worst path. Furthermore, if we obtain statistical information later, we can keep improving the decision of whether to introduce the intersection or not.


Optimization Rule

The logical changes are in the IntroduceSelectAccessMethodRule. After analyzing the interesting functions and indexes, we pair them up as a one-to-one mapping (e.g., BTreeAccessMethod -> BTree index on salary).

If the primary index appears in the mapping, we simply use it as the access path. A primary-index lookup is rare; when it does happen, it has a very high chance of being a highly selective path. Plus, as a clustered index, a primary-index search is the fastest one.

If multiple secondary indexes are selected, we let each of them contribute a secondary-index-to-primary-index path. A new interface method, createSecondaryToPrimaryPlan(), is added to IAccessMethod for this purpose. (The implementation is required to be functional; otherwise, different access methods may introduce conflicting states.) Then we use an Intersect logical operator as a join point to intersect the primary keys coming from the different secondary-index paths.
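
The selection logic described above can be sketched roughly as follows. This is a hedged illustration, not AsterixDB's actual code: the Index/Plan types and helper names are hypothetical; only createSecondaryToPrimaryPlan() and the intersection step come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the rule's access-path choice.
class AccessPathChooser {
    record Index(String name, boolean isPrimary) {}
    record Plan(String description) {}

    static Plan choose(List<Index> matchedIndexes) {
        // 1. If the primary index matches, use it directly: a rare but
        //    highly selective, clustered (fastest) path.
        for (Index idx : matchedIndexes) {
            if (idx.isPrimary()) {
                return new Plan("primary-index-search(" + idx.name() + ")");
            }
        }
        // 2. Otherwise, let every matched secondary index contribute its
        //    own secondary-to-primary path (cf. createSecondaryToPrimaryPlan()),
        //    then intersect the primary keys before the primary-index lookup.
        List<String> branches = new ArrayList<>();
        for (Index idx : matchedIndexes) {
            branches.add("secondaryToPrimaryPlan(" + idx.name() + ")");
        }
        return new Plan("primary-lookup(intersect(" + String.join(", ", branches) + "))");
    }
}
```

The point of the sketch is only the shape of the decision: primary index wins outright; otherwise every matched secondary index contributes a branch to the intersection rather than picking just the first one.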

The following example shows how we intersect the BTree, RTree, and NgramInvertedIndex paths on the primary key before going to the primary-index lookup.

...

Physically, each ETS node introduces a thread. Thus, the intersection operator must synchronize the upstream input threads in order to generate the correct result. In order to operate in a pipeline, the intersection is implemented in a sort-merge manner; therefore, each input is required to be sorted. The synchronization is handled by the thread of input 0, which means thread 0 calls the writer.open()/nextFrame()/close() functions. If we authorized arbitrary threads to push results forward, the downstream operator would be confused, especially in synchronizing their locks. The core logical intersection function is as below:

  1. do
    1. find the id of the input holding the maximum record (max)
    2. for each input i
      1. if its record < max, keep popping
      2. if its record == max, then match++ and continue with the next input
      3. if its record > max, break
    3. if match == inputArity
      1. output the max record
  2. while no input is closed

If any of the inputs is fully consumed, the operator is closed.
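
Under the stated requirements (every input sorted, output produced only while every input still has records), the pseudocode above amounts to a k-way sort-merge intersection. The list-based sketch below is a simplification of the frame-based operator, with illustrative names:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the k-way sort-merge intersection, mirroring the
// pseudocode above. Each input list stands in for one sorted stream of
// primary keys coming from a secondary-index path.
class SortMergeIntersect {
    static List<Integer> intersect(List<List<Integer>> inputs) {
        List<Integer> out = new ArrayList<>();
        int arity = inputs.size();
        if (arity == 0) return out;
        int[] pos = new int[arity]; // cursor per input
        while (true) {
            // find the maximum record among the current heads
            int max = Integer.MIN_VALUE;
            for (int i = 0; i < arity; i++) {
                if (pos[i] >= inputs.get(i).size()) return out; // an input is fully consumed: close
                max = Math.max(max, inputs.get(i).get(pos[i]));
            }
            int match = 0;
            for (int i = 0; i < arity; i++) {
                List<Integer> in = inputs.get(i);
                // pop records smaller than max
                while (pos[i] < in.size() && in.get(pos[i]) < max) pos[i]++;
                if (pos[i] < in.size() && in.get(pos[i]) == max) match++;
            }
            if (match == arity) {
                out.add(max); // present in every input: emit
                for (int i = 0; i < arity; i++) pos[i]++; // advance past the match
            }
        }
    }
}
```

For example, intersecting the sorted key streams [1,3,5,7], [3,4,5,8], and [2,3,5,9] yields [3,5], and the operator stops as soon as one input runs out.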

...

Code Block
use dataverse twitter; 
let $ts_start := datetime("2015-11-23T17:$min_start:00.000Z") 
let $ts_end := datetime("2015-11-23T17:$min_end:03.000Z") 
let $ms_start := date("2010-$month_start-01") 
let $ms_end := date("2010-$month_end-28") 
let $result := for $t in dataset ds_tweets 
               where $t.user.create_at >= $ms_start and $t.user.create_at < $ms_end 
               and $t.create_at >= $ts_start and $t.create_at < $ts_end 
               and $t.place = "Unite State" 
               return $t
return count($result)

...

Each test runs three times:

  • 1st: with the BTree index on User.create_at only,
  • 2nd: with the BTree index on Tweet.create_at only,
  • 3rd: with both indexes present; consequently, the intersection is introduced.

The entire result is shared in a Google sheet.

In-memory case:

Each query runs ten times. We record the average time of the last five runs. The time unit is milliseconds.

Table 1. Fix User.create_at ($month_start = 01, $month_end = 02), increasing the Tweet.create_at selectivity

(time columns: avg of the last 5 runs, in ms)

| result | month  | minutes | Scan | User.create_at index | Tweet.create_at index | Intersection | Reduction | SpeedUp |
| ------ | ------ | ------- | ---- | -------------------- | --------------------- | ------------ | --------- | ------- |
| 431    | 01--02 | 00-09   | 929  | 142                  | 132                   | 52           | 60.61%    | 2.54    |
| 928    | 01--02 | 00-19   | 929  | 143                  | 226                   | 56           | 60.84%    | 2.55    |
| 1458   | 01--02 | 00-29   | 934  | 144                  | 328                   | 73           | 49.31%    | 1.97    |
| 1997   | 01--02 | 00-39   | 932  | 143                  | 427                   | 80           | 44.06%    | 1.79    |
| 2504   | 01--02 | 00-49   | 930  | 142                  | 528                   | 92           | 35.21%    | 1.54    |
| 2989   | 01--02 | 00-59   | 933  | 141                  | 631                   | 109          | 22.70%    | 1.29    |
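
The Reduction and SpeedUp figures in these tables can be reproduced from the timing columns. The formulas below are inferred from the numbers themselves (the page does not define them), assuming both compare the intersection time against the faster of the two single-index paths:

```java
// Inferred relationship between the timing columns and the derived
// Reduction/SpeedUp columns; not stated explicitly in the page.
class TableColumns {
    // percentage of time saved by the intersection vs. the fastest single-index path
    static double reductionPct(double fastestIndexMs, double intersectionMs) {
        return 100.0 * (1.0 - intersectionMs / fastestIndexMs);
    }

    // how many times faster the intersection is than the fastest single-index path
    static double speedUp(double fastestIndexMs, double intersectionMs) {
        return fastestIndexMs / intersectionMs;
    }
}
```

For the first row (Tweet.create_at index 132 ms, intersection 52 ms) this yields a reduction of about 60.6% and a speedup of about 2.54, matching the table.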

Table 2. Fix Tweet.create_at ($min_start = 00, $min_end = 09), increasing the User.create_at selectivity

(time columns: avg of the last 5 runs, in ms)

| result | month  | minutes | Scan | User.create_at index | Tweet.create_at index | Intersection | Reduction | SpeedUp |
| ------ | ------ | ------- | ---- | -------------------- | --------------------- | ------------ | --------- | ------- |
| 431    | 01--02 | 00-09   | 929  | 142                  | 132                   | 52           | 60.61%    | 2.54    |
| 670    | 01--03 | 00-09   | 932  | 189                  | 129                   | 48           | 62.79%    | 2.69    |
| 929    | 01--04 | 00-09   | 928  | 240                  | 124                   | 61           | 50.81%    | 2.03    |
| 1140   | 01--05 | 00-09   | 931  | 291                  | 128                   | 69           | 46.09%    | 1.86    |
| 1471   | 01--06 | 00-09   | 933  | 367                  | 126                   | 65           | 48.41%    | 1.94    |
| 1859   | 01--07 | 00-09   | 932  | 449                  | 125                   | 84           | 32.80%    | 1.49    |
| 2166   | 01--08 | 00-09   | 931  | 525                  | 126                   | 87           | 30.95%    | 1.45    |
| 2438   | 01--09 | 00-09   | 932  | 580                  | 127                   | 94           | 25.98%    | 1.35    |
| 2682   | 01--10 | 00-09   | 939  | 648                  | 127                   | 104          | 18.11%    | 1.22    |
| 3011   | 01--11 | 00-09   | 934  | 710                  | 125                   | 110          | 12.00%    | 1.14    |
| 3346   | 01--12 | 00-09   | 933  | 781                  | 127                   | 120          | 5.51%     | 1.06    |

Table 3. Increasing the selectivity of both Tweet.create_at and User.create_at

(time columns: avg of the last 5 runs, in ms)

| result | month  | minutes | Scan | User.create_at index | Tweet.create_at index | Intersection | Reduction | SpeedUp |
| ------ | ------ | ------- | ---- | -------------------- | --------------------- | ------------ | --------- | ------- |
| 431    | 01--02 | 00-09   | 929  | 142                  | 132                   | 52           | 60.61%    | 2.54    |
| 1429   | 01--03 | 00-19   | 933  | 190                  | 228                   | 67           | 64.74%    | 2.84    |
| 1945   | 01--04 | 00-19   | 933  | 239                  | 226                   | 67           | 70.35%    | 3.37    |
| 3686   | 01--05 | 00-29   | 934  | 294                  | 322                   | 83           | 71.77%    | 3.54    |
| 4816   | 01--06 | 00-29   | 936  | 370                  | 324                   | 102          | 68.52%    | 3.18    |
| 8320   | 01--07 | 00-39   | 931  | 453                  | 425                   | 123          | 71.06%    | 3.46    |
| 12202  | 01--08 | 00-49   | 932  | 522                  | 529                   | 146          | 72.03%    | 3.58    |
| 13791  | 01--09 | 00-49   | 934  | 582                  | 527                   | 157          | 70.21%    | 3.36    |
| 18489  | 01--10 | 00-59   | 937  | 644                  | 630                   | 191          | 69.68%    | 3.30    |

We can see that intersection is the best choice under the above settings. The speedup over the fastest single-index path is up to 3.5 times. If the two indexes vary a lot in selectivity, the benefit of intersection may not be that much; if they are of similar selectivity, the intersection can be two to three times faster.

On-disk case

The test dataset is changed to an 8.2G dataset. In order to flush the cache, we load the same dataset into another dataset, ds_copy. Every time before we run the selection, we scan this 8.2G ds_copy once to invalidate the cached pages. Due to the slowness of the on-disk case, we warm up the query only once and record the average time of the next three runs.

Table 4. Fix the User.create_at condition to one month and increase the Tweet.create_at range.

(time columns in ms; the scan was measured once)

| result | month  | hour   | Scan   | User.create_at index | Tweet.create_at index | Intersection | Overhead |
| ------ | ------ | ------ | ------ | -------------------- | --------------------- | ------------ | -------- |
| 1166   | 01--02 | 00--01 | 113423 | 112469               | 1397                  | 1615         | 15.60%   |
| 1953   | 01--02 | 00--02 |        | 111843               | 1628                  | 1915         | 17.63%   |
| 2438   | 01--02 | 00--03 |        | 112307               | 1802                  | 2143         | 18.92%   |
| 2809   | 01--02 | 00--04 |        | 110893               | 2105                  | 2453         | 16.53%   |
| 3347   | 01--02 | 00--05 |        | 119531               | 3184                  | 2503         | -21.39%  |
| 4381   | 01--02 | 00--06 |        | 110912               | 2552                  | 3060         | 19.91%   |
| 5802   | 01--02 | 00--07 |        | 111488               | 3126                  | 3935         | 25.88%   |
| 7594   | 01--02 | 00--08 |        | 111484               | 4148                  | 4992         | 20.35%   |
| 9846   | 01--02 | 00--09 |        | 111499               | 5299                  | 6243         | 17.81%   |

Table 5. Fix the Tweet.create_at condition to one hour and increase the User.create_at range.

(time columns in ms; the scan was measured once)

| result | month  | hour   | Scan   | User.create_at index | Tweet.create_at index | Intersection | Overhead |
| ------ | ------ | ------ | ------ | -------------------- | --------------------- | ------------ | -------- |
| 1166   | 01--02 | 00--01 | 113423 | 112469               | 1397                  | 1615         | 15.60%   |
| 1751   | 01--03 | 00--01 |        | 110207               | 1375                  | 1808         | 31.49%   |
| 2262   | 01--04 | 00--01 |        | 110965               | 1462                  | 1738         | 18.88%   |
| 2850   | 01--05 | 00--01 |        | 112291               | 1371                  | 1835         | 33.84%   |
| 3753   | 01--06 | 00--01 |        | 111587               | 1289                  | 1890         | 46.63%   |
| 4679   | 01--07 | 00--01 |        | 111769               | 1320                  | 2029         | 53.71%   |
| 5544   | 01--08 | 00--01 |        | 112209               | 1340                  | 2076         | 54.93%   |
| 6250   | 01--09 | 00--01 |        | 113029               | 1370                  | 2270         | 65.69%   |
| 6955   | 01--10 | 00--01 |        | 112535               | 1310                  | 2329         | 77.79%   |

Though the two access methods have very different execution times, the intersection tends to catch up with the fastest one. The overhead of the intersection compared to the fastest path ranges from 15% to 78%, while its speedup compared to the slowest one is about 5~10 times.

Why is the Tweet.create_at access path so fast?

The answer is that the order of the primary key (Tweet.id) is consistent with the order of Tweet.create_at. We speculate that the Tweet.id was generated from the Tweet.create_at. Thus, this secondary index is effectively clustered, like the primary index, and as a consequence the IO to fetch each record is clustered as well. In general, a secondary-index lookup would otherwise be as slow as the User.create_at access path.

Why is the intersection slower than the Tweet.create_at index access path?

Because we only have one disk. First, the Tweet.create_at path has to wait for the User.create_at path to finish a frame before the intersection can proceed, so the two index searches are battling for disk reads. Second, although the intersection itself can finish as soon as one of its inputs is done, we cannot stop the other index scan under our push model. Hence, the primary-index search also competes for the disk with the two index searches.

Intersect Unclustered Secondary Indexes

As shown in the previous results, the index on Tweet.create_at is a clustered secondary index, which is a special case for a secondary index. To test a more general case, we create an RTree index on Tweet.place.bounding_box, which is a rectangular area. We create a circular region around LA county; by increasing the radius, we can increase the selectivity of that RTree. The query is as below:

Code Block
use dataverse twitter; 
let $ms_start := date("2010-$month_start-01") 
let $ms_end := date("2010-$month_end-28") 
let $region := create-circle(create-point(-118.125,33.939), $radius)
let $result := for $t in dataset ds_tweets 
               where $t.user.create_at >= $ms_start and $t.user.create_at < $ms_end 
               and spatial-intersect($t.place.bounding_box, $region)
               and $t.place = "Unite State" 
               return $t
return count($result)

Table 6. Fix the User.create_at condition to one month and increase the $radius.

(time columns in ms)

| result | month  | radius | Scan | User.create_at index | RTree index | Intersection | SpeedUp     |
| ------ | ------ | ------ | ---- | -------------------- | ----------- | ------------ | ----------- |
| 1390   | 01--02 | 0.01   |      | 111087               | 106159      | 9293         | 11.4235446  |
| 1551   | 01--02 | 0.02   |      | 111306               | 107127      | 10012        | 10.69986017 |
| 1575   | 01--02 | 0.03   |      | 112024               | 108143      | 10278        | 10.52179412 |
| 6171   | 01--02 | 0.04   |      |                      | 111264      | 31850        | 3.493375196 |
| 6193   | 01--02 | 0.05   |      |                      | 112916      | 32001        | 3.528514734 |
| 6689   | 01--02 | 0.06   |      |                      | 111673      | 33952        | 3.289143497 |
| 6900   | 01--02 | 0.07   |      |                      | 111012      | 34946        | 3.176672581 |
| 6900   | 01--02 | 0.08   |      |                      | 111570      | 34937        | 3.193462518 |

The experiment is slow. Stay tuned. 

Review Patch: 

https://asterix-gerrit.ics.uci.edu/#/c/577  and  https://asterix-gerrit.ics.uci.edu/#/c/578

...