(I) Experiment on the necessity of TimeseriesMetadata
...
- Although an index area structure without TimeseriesMetadata slightly speeds up raw data queries, it slows down aggregation queries considerably. => We should keep TimeseriesMetadata.
- The time cost in the data area of the TsFile does not change.
(II) Experiment on combining Chunk and Page
...
- One-level index in one TsFile.
- Suitable for the small-Chunk (massive time series) scenario, in which one Chunk has only 1~2 Pages.
(Note: Since 0.12, if a Chunk has only one Page, its PageStatistics are removed and statistics are stored only in ChunkMetadata.)
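The rule in the note can be sketched as follows. This is a minimal illustration, not IoTDB's writer code: `Statistics`, `summarize`, and `plan_statistics` are hypothetical names, pages are modelled as plain lists of values, and the statistics are reduced to count/min/max for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Statistics:
    """Hypothetical, simplified statistics record (count/min/max only)."""
    count: int
    min_value: float
    max_value: float


def summarize(values: List[float]) -> Statistics:
    """Compute the simplified statistics for a non-empty list of values."""
    return Statistics(len(values), min(values), max(values))


def plan_statistics(
    pages: List[List[float]],
) -> Tuple[Statistics, List[Optional[Statistics]]]:
    """Decide which statistics to keep for one Chunk.

    Chunk-level statistics are always kept (they stand in for ChunkMetadata).
    Per-page statistics are kept only when the Chunk has more than one Page;
    for a single-Page Chunk they would duplicate the chunk-level statistics,
    so the page slot is None.
    """
    all_values = [v for page in pages for v in page]
    chunk_stats = summarize(all_values)
    if len(pages) == 1:
        page_stats: List[Optional[Statistics]] = [None]
    else:
        page_stats = [summarize(page) for page in pages]
    return chunk_stats, page_stats
```

For a single-Page Chunk, `plan_statistics([[1.0, 2.0, 3.0]])` keeps only the chunk-level record, which is the space saving that matters when there are very many small Chunks.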
(III) Experiment on how to store PageHeader
...
(b) combine PageHeader with ChunkHeader
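The current read path loads the entire Chunk into memory and then reads from it.
With that approach, (a) and (b) take the same amount of time.
If the Chunk is instead read piece by piece in a fine-grained way, the analysis is as follows:
For a raw data query in a Chunk: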
...
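This requires reading (n - m) PageHeaders and m PageData, with 1 seek. The time cost is n * th + m * (td - th) + ts
The former takes longer than the latter by Δt = (n - 1) * ts + m * (th - ts). Since n >= 1 and th > ts (reading a PageHeader also requires a seek, so th > ts),
Δt >= 0, and Δt > 0 whenever n > 1 or m >= 1, so the latter never takes more time than the former.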
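As a quick arithmetic check of this comparison, the sketch below plugs illustrative numbers into the two formulas quoted in the text. The function names and the timing values are assumptions made for illustration only; th, td, and ts are read here as the time to read one PageHeader, to read one PageData, and to perform one seek, in arbitrary units.

```python
def latter_cost(n: int, m: int, th: float, td: float, ts: float) -> float:
    """Cost of the latter layout as quoted: n * th + m * (td - th) + ts."""
    return n * th + m * (td - th) + ts


def delta_t(n: int, m: int, th: float, ts: float) -> float:
    """Extra cost of the former as quoted: (n - 1) * ts + m * (th - ts)."""
    return (n - 1) * ts + m * (th - ts)


def former_cost(n: int, m: int, th: float, td: float, ts: float) -> float:
    """Cost of the former, taken as the latter cost plus the quoted difference."""
    return latter_cost(n, m, th, td, ts) + delta_t(n, m, th, ts)


# Illustrative values (assumed, not measured): 6 pages, 4 of them read,
# with th > ts as the text requires.
n, m = 6, 4
th, td, ts = 3.0, 10.0, 1.0

print(latter_cost(n, m, th, td, ts))  # 47.0
print(delta_t(n, m, th, ts))          # 13.0
print(former_cost(n, m, th, td, ts))  # 60.0
```

Adding the two quoted expressions and simplifying gives n * th + m * td + (n - m) * ts for the former, so the gap grows with the number of pages. It is zero only in the degenerate case n = 1, m = 0.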
Example:
Suppose a Chunk contains 6 Pages, of which the first two do not satisfy the time filter.
...