...
Physically, each ETS node introduces its own thread, so the intersection operator must synchronize its upstream input threads to produce the correct result. To keep the operation pipelined, the intersection is implemented in a sort-merge manner, which requires every input to be sorted. Synchronization is handled by the thread of input No. 0: only thread 0 calls the writer.open/nextFrame/close functions. If arbitrary threads were allowed to push frames forward, the downstream operator would be confused, especially when synchronizing its locks. The core intersection logic is as follows:
- do
  - find the max input: the id of the input whose current record is the maximum
  - for each input i
    - if record < max, keep popping until it matches or exceeds max
    - if record == max, match++; continue
    - if record > max, break
  - if match == inputArity
    - output the max record
- while no input is closed.
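The loop above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the operator's actual implementation: the name `intersect_sorted` and the `_DONE` sentinel are hypothetical, inputs are modeled as in-memory sorted iterables rather than frames, and a "closed" input is modeled as an exhausted iterator. Threading and the writer calls are left out.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

_DONE = object()  # hypothetical sentinel marking an exhausted ("closed") input


def intersect_sorted(inputs: List[Iterable[T]]) -> Iterator[T]:
    """Yield records present in every input; each input must be sorted ascending."""
    iters = [iter(inp) for inp in inputs]
    if not iters:
        return
    # Current head record of each input.
    heads = [next(it, _DONE) for it in iters]
    # Keep going while no input is closed.
    while all(h is not _DONE for h in heads):
        # Find the maximum among the current head records.
        max_rec = max(heads)
        match = 0
        for i, it in enumerate(iters):
            # Pop records smaller than max until this input catches up.
            while heads[i] is not _DONE and heads[i] < max_rec:
                heads[i] = next(it, _DONE)
            if heads[i] is _DONE or heads[i] > max_rec:
                break  # input closed, or a larger record becomes the next max
            match += 1
        if match == len(iters):
            yield max_rec
            # Advance every input past the emitted record.
            heads = [next(it, _DONE) for it in iters]


if __name__ == "__main__":
    a = [1, 3, 5, 7, 9]
    b = [3, 4, 5, 9]
    c = [0, 3, 9, 10]
    print(list(intersect_sorted([a, b, c])))
```

Because each input is consumed strictly front to back, the sketch never buffers more than one record per input, which is the property that makes the sort-merge approach pipeline-friendly.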
...
Scan | user time Index | Rtree Index | intersection | speedup | |||
result | month | radius | Time (Avg last 5) | ||||
1390 | 01--02 | 0.01 | 111087 | 106159 | 9293 | 11.4235446 | |
1551 | 01--02 | 0.02 | 111306 | 107127 | 10012 | 10.69986017 | |
1575 | 01--02 | 0.03 | 112024 | 108143 | 10278 | 10.52179412 | |
6171 | 01--02 | 0.04 | 111264 | 31850 | 3.493375196 | ||
6193 | 01--02 | 0.05 | 112916 | 32001 | 3.528514734 | ||
6689 | 01--02 | 0.06 | 111673 | 33952 | 3.289143497 | ||
6900 | 01--02 | 0.07 | 111012 | 34946 | 3.176672581 | ||
6900 | 01--02 | 0.08 | 111570 | 34937 | 3.193462518 |
The experiment is slow. Stay tuned.
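The speedup figures in the table appear to be the ratio of the time listed immediately before the intersection time to the intersection time itself. The snippet below is a small check of that reading for two rows; the function name `speedup` and the label "baseline" are assumptions introduced here for illustration, not names from the experiment.

```python
def speedup(baseline_time: float, intersection_time: float) -> float:
    """Ratio of the baseline time to the intersection time."""
    return baseline_time / intersection_time


# (radius, baseline time, intersection time), copied from the table rows
rows = [
    (0.01, 106159, 9293),
    (0.04, 111264, 31850),
]

for radius, baseline, inter in rows:
    print(f"radius={radius}: speedup={speedup(baseline, inter):.4f}")
```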
...