(I) Experiment on the necessity of TimeseriesMetadata
Now that TimeseriesMetadata is stored together with ChunkMetadata, we need to reconsider whether TimeseriesMetadata is still necessary. The experiments below are meant to inform that decision.
We compare aggregation queries and raw data queries, with and without TimeseriesMetadata, for different chunk numbers, using one timeseries per TsFile.
Each chunk holds 100 points, and each query reads 500 TsFiles.
(1) with TimeseriesMetadata: the original TimeseriesMetadata
(2) without TimeseriesMetadata: the TimeseriesMetadata carries no statistics
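To make the difference between the two configurations concrete, here is a minimal sketch of how a count could be answered in each case. It assumes that, when the series-level statistics are absent, the count has to be summed from chunk-level statistics; that is one plausible reading of configuration (2), not something this page states. `ChunkStats`, `SeriesMeta` and `CountSketch` are hypothetical stand-ins, not TsFile classes.

```java
import java.util.List;

// Hypothetical stand-in for per-chunk statistics (NOT the TsFile ChunkMetadata class).
final class ChunkStats {
    final long count;

    ChunkStats(long count) {
        this.count = count;
    }
}

// Hypothetical stand-in for series-level metadata (NOT the TsFile TimeseriesMetadata class).
final class SeriesMeta {
    final Long seriesCount;          // null models "TimeseriesMetadata has no statistics"
    final List<ChunkStats> chunks;   // chunk-level statistics stored alongside

    SeriesMeta(Long seriesCount, List<ChunkStats> chunks) {
        this.seriesCount = seriesCount;
        this.chunks = chunks;
    }
}

final class CountSketch {
    // Answers count(*) from series-level statistics when present,
    // otherwise falls back to summing the chunk-level counts (assumed behavior).
    static long count(SeriesMeta meta) {
        if (meta.seriesCount != null) {
            return meta.seriesCount;         // one lookup, independent of chunk number
        }
        long total = 0;
        for (ChunkStats chunk : meta.chunks) {
            total += chunk.count;            // work grows with the chunk number
        }
        return total;
    }

    public static void main(String[] args) {
        List<ChunkStats> chunks =
            List.of(new ChunkStats(100), new ChunkStats(100), new ChunkStats(100));
        System.out.println(count(new SeriesMeta(300L, chunks))); // configuration (1)
        System.out.println(count(new SeriesMeta(null, chunks))); // configuration (2)
    }
}
```

Both calls return the same count; under the stated assumption, only the amount of metadata that has to be visited differs.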
Writing:
String path =
    "/home/fit/szs/data/data/sequence/root.sg/0/" + chunkNum + "/test" + fileIndex + ".tsfile";
File f = FSFactoryProducer.getFSFactory().getFile(path);
if (f.exists()) {
  f.delete();
}
try (TsFileWriter tsFileWriter = new TsFileWriter(f)) {
  // only one timeseries
  tsFileWriter.registerTimeseries(
      new Path(Constant.DEVICE_PREFIX, Constant.SENSOR_1),
      new UnaryMeasurementSchema(Constant.SENSOR_1, TSDataType.INT64, TSEncoding.RLE));
  // construct TSRecord
  for (int i = 1; i <= chunkNum * 100; i++) {
    TSRecord tsRecord = new TSRecord(i, Constant.DEVICE_PREFIX);
    DataPoint dPoint1 = new LongDataPoint(Constant.SENSOR_1, i);
    tsRecord.addTuple(dPoint1);
    // write TSRecord
    tsFileWriter.write(tsRecord);
    if (i % 100 == 0) {
      tsFileWriter.flushAllChunkGroups();
    }
  }
}
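The flush cadence in the writing loop determines how many chunks each file ends up with. The sketch below replays only the loop counter, without touching any TsFile API, assuming that each flush closes exactly one chunk for the single series. `FlushCadenceSketch` and its method names are hypothetical; the 100 points per chunk and 500 files per query come from the setup described earlier.

```java
final class FlushCadenceSketch {
    static final int POINTS_PER_CHUNK = 100;

    // Replays the writer loop and counts how often the flush condition fires.
    static int countFlushes(int chunkNum) {
        int flushes = 0;
        for (int i = 1; i <= chunkNum * POINTS_PER_CHUNK; i++) {
            if (i % POINTS_PER_CHUNK == 0) {
                flushes++;                   // stands in for flushAllChunkGroups()
            }
        }
        return flushes;
    }

    // Total number of points one query touches across all files.
    static long pointsPerQuery(int chunkNum, int fileNum) {
        return (long) chunkNum * POINTS_PER_CHUNK * fileNum;
    }

    public static void main(String[] args) {
        int[] chunkNums = {1, 2, 3, 5, 8, 10, 15, 20, 25};
        for (int chunkNum : chunkNums) {
            System.out.println(
                "chunkNum=" + chunkNum
                    + " flushes=" + countFlushes(chunkNum)
                    + " pointsPerQuery=" + pointsPerQuery(chunkNum, 500));
        }
    }
}
```

With `chunkNum = 25`, the loop flushes 25 times, so a 500-file query covers 1,250,000 points.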
Raw data query:
for (int fileIndex = 0; fileIndex < fileNum; fileIndex++) {
  // file path
  String path =
      "/home/fit/szs/data/data/sequence/root.sg/0/" + chunkNum + "/test" + fileIndex + ".tsfile";
  // raw data query
  try (TsFileSequenceReader reader = new TsFileSequenceReader(path);
      ReadOnlyTsFile readTsFile = new ReadOnlyTsFile(reader)) {
    ArrayList<Path> paths = new ArrayList<>();
    paths.add(new Path(DEVICE1, "sensor_1"));
    QueryExpression queryExpression = QueryExpression.create(paths, null);
    long startTime = System.nanoTime();
    QueryDataSet queryDataSet = readTsFile.query(queryExpression);
    while (queryDataSet.hasNext()) {
      queryDataSet.next();
    }
    costTime += (System.nanoTime() - startTime);
  }
}
Aggregation query:
long totalStartTime = System.nanoTime();
for (int fileIndex = 0; fileIndex < fileNum; fileIndex++) {
  // file path
  String path =
      "/home/fit/szs/data/data/sequence/root.sg/0/" + chunkNum + "/test" + fileIndex + ".tsfile";
  // aggregation query
  try (TsFileSequenceReader reader = new TsFileSequenceReader(path)) {
    Path seriesPath = new Path(DEVICE1, "sensor_1");
    long startTime = System.nanoTime();
    TimeseriesMetadata timeseriesMetadata = reader.readTimeseriesMetadata(seriesPath, false);
    long count = timeseriesMetadata.getStatistics().getCount();
    costTime += (System.nanoTime() - startTime);
  }
}
System.out.println(
    "Total raw read cost time: " + (System.nanoTime() - totalStartTime) / 1000_000 + "ms");
System.out.println("Index area cost time: " + costTime / 1000_000 + "ms");
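The aggregation benchmark reports two durations: the wall time of the whole loop, and the time accumulated only inside the measured region, which excludes opening and closing the reader. The sketch below isolates that timing pattern with placeholder work. `TimingSketch`, `openFile` and `readMetadata` are hypothetical names that stand in for the reader construction and the metadata read.

```java
final class TimingSketch {
    // Placeholder for work outside the measured region (e.g. opening a reader).
    static long openFile(int fileIndex) {
        long acc = 0;
        for (int i = 0; i < 10_000; i++) {
            acc += (long) i * fileIndex;
        }
        return acc;
    }

    // Placeholder for the work being measured (e.g. reading metadata).
    static long readMetadata(int fileIndex) {
        long acc = 0;
        for (int i = 0; i < 1_000; i++) {
            acc += i ^ fileIndex;
        }
        return acc;
    }

    // Returns {totalNanos, measuredNanos, checksum}; the checksum keeps the work observable.
    static long[] run(int fileNum) {
        long costTime = 0;
        long checksum = 0;
        long totalStartTime = System.nanoTime();
        for (int fileIndex = 0; fileIndex < fileNum; fileIndex++) {
            checksum += openFile(fileIndex);           // counted in the total only
            long startTime = System.nanoTime();
            checksum += readMetadata(fileIndex);       // counted in both
            costTime += System.nanoTime() - startTime;
        }
        long total = System.nanoTime() - totalStartTime;
        return new long[] {total, costTime, checksum};
    }

    public static void main(String[] args) {
        long[] result = run(500);
        System.out.println("Total cost time: " + result[0] / 1_000_000 + "ms");
        System.out.println("Measured region cost time: " + result[1] / 1_000_000 + "ms");
    }
}
```

Because the measured intervals are disjoint slices of the outer interval, the accumulated time can never exceed the total; the gap between the two is the per-file overhead outside the measured region.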
chunk number | | 1 | 2 | 3 | 5 | 8 | 10 | 15 | 20 | 25 |
raw | with TimeseriesMetadata | 210 | 230 | 237 | 250 | 276 | 297 | 309 | 344 | 374 |
 | | 116 | 131 | 142 | 156 | 185 | 197 | 220 | 255 | 282 |
 | without TimeseriesMetadata | 219 | 223 | 242 | 267 | 287 | 302 | 334 | 357 | |
 | | 131 | 136 | 155 | 182 | 200 | 219 | 251 | 274 | |
count(*) | with TimeseriesMetadata | 89 | 90 | 91 | 93 | 93 | 93 | 94 | 97 | 97 |
 | | 15 | 16 | 16 | 16 | 16 | 16 | 16 | 17 | 17 |
 | without TimeseriesMetadata | 122 | 123 | 127 | 127 | 127 | 127 | 128 | 130 | |
 | | 50 | 50 | 50 | 50 | 51 | 52 | 52 | 53 | |
(II) Experiment on combining Chunk and Page
Do we need both Chunk and Page, or is keeping only one of them enough?