Experiments on the TsFile index area

(I) Experiment on the necessity of TimeseriesMetadata

Now that TimeseriesMetadata is stored together with ChunkMetadata, the necessity of TimeseriesMetadata needs to be reconsidered. The experiments below are meant to inform that decision.

The experiments compare aggregation queries and raw data queries, with and without TimeseriesMetadata statistics, under different chunk numbers. Each TsFile contains one timeseries.

Each chunk has 100 points. Each query reads 500 TsFiles.

(1) with TimeseriesMetadata: the original TimeseriesMetadata, statistics included

(2) without TimeseriesMetadata: the TimeseriesMetadata carries no statistics
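
To make the difference between the two configurations concrete, here is a minimal sketch of how count(*) could be answered in each case. It uses hypothetical stand-in classes (CountSketch, ChunkMeta, SeriesMeta), not the real TsFile classes, and it assumes that without series-level statistics the reader falls back to summing the per-chunk statistics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the TsFile classes.
class CountSketch {

  /** Chunk-level index entry that carries its own point count. */
  static final class ChunkMeta {
    final long count;

    ChunkMeta(long count) {
      this.count = count;
    }
  }

  /** Series-level index entry; seriesCount is null when statistics are omitted. */
  static final class SeriesMeta {
    final Long seriesCount;
    final List<ChunkMeta> chunks;

    SeriesMeta(Long seriesCount, List<ChunkMeta> chunks) {
      this.seriesCount = seriesCount;
      this.chunks = chunks;
    }
  }

  /** Answers count(*) from the index only, without touching chunk data. */
  static long count(SeriesMeta meta) {
    if (meta.seriesCount != null) {
      // Configuration (1): one lookup in the series-level statistics.
      return meta.seriesCount;
    }
    // Configuration (2), assumed fallback: visit every chunk-level entry.
    long total = 0;
    for (ChunkMeta chunk : meta.chunks) {
      total += chunk.count;
    }
    return total;
  }

  /** Builds an index for chunkNum chunks of 100 points each, as in the experiment. */
  static SeriesMeta build(int chunkNum, boolean withStatistics) {
    List<ChunkMeta> chunks = new ArrayList<>();
    for (int i = 0; i < chunkNum; i++) {
      chunks.add(new ChunkMeta(100));
    }
    Long seriesCount = withStatistics ? Long.valueOf(chunkNum * 100L) : null;
    return new SeriesMeta(seriesCount, chunks);
  }

  public static void main(String[] args) {
    System.out.println("with statistics:    " + count(build(5, true)));
    System.out.println("without statistics: " + count(build(5, false)));
  }
}
```

Both paths return the same count; the sketch only shows that the second path has to visit every chunk-level entry, which is the extra index work the experiment measures.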

Writing:

        String path =
            "/home/fit/szs/data/data/sequence/root.sg/0/"
                + chunkNum
                + "/test"
                + fileIndex
                + ".tsfile";
        File f = FSFactoryProducer.getFSFactory().getFile(path);
        if (f.exists()) {
          f.delete();
        }

        try (TsFileWriter tsFileWriter = new TsFileWriter(f)) {
          // only one timeseries
          tsFileWriter.registerTimeseries(
              new Path(Constant.DEVICE_PREFIX, Constant.SENSOR_1),
              new UnaryMeasurementSchema(Constant.SENSOR_1, TSDataType.INT64, TSEncoding.RLE));

          // construct TSRecord
          for (int i = 1; i <= chunkNum * 100; i++) {
            TSRecord tsRecord = new TSRecord(i, Constant.DEVICE_PREFIX);
            DataPoint dPoint1 = new LongDataPoint(Constant.SENSOR_1, i);
            tsRecord.addTuple(dPoint1);
            // write TSRecord
            tsFileWriter.write(tsRecord);
            if (i % 100 == 0) {
              tsFileWriter.flushAllChunkGroups();
            }
          }
        }

Raw data query:

    for (int fileIndex = 0; fileIndex < fileNum; fileIndex++) {
      // file path
      String path =
          "/home/fit/szs/data/data/sequence/root.sg/0/"
              + chunkNum
              + "/test"
              + fileIndex
              + ".tsfile";

      // raw data query
      try (TsFileSequenceReader reader = new TsFileSequenceReader(path);
          ReadOnlyTsFile readTsFile = new ReadOnlyTsFile(reader)) {

        ArrayList<Path> paths = new ArrayList<>();
        paths.add(new Path(DEVICE1, "sensor_1"));

        QueryExpression queryExpression = QueryExpression.create(paths, null);

        long startTime = System.nanoTime();
        QueryDataSet queryDataSet = readTsFile.query(queryExpression);
        while (queryDataSet.hasNext()) {
          queryDataSet.next();
        }

        costTime += (System.nanoTime() - startTime);
      }
    }

Aggregation query:

    long totalStartTime = System.nanoTime();
    for (int fileIndex = 0; fileIndex < fileNum; fileIndex++) {
      // file path
      String path =
          "/home/fit/szs/data/data/sequence/root.sg/0/"
              + chunkNum
              + "/test"
              + fileIndex
              + ".tsfile";

      // aggregation query
      try (TsFileSequenceReader reader = new TsFileSequenceReader(path)) {
        Path seriesPath = new Path(DEVICE1, "sensor_1");
        long startTime = System.nanoTime();
        TimeseriesMetadata timeseriesMetadata = reader.readTimeseriesMetadata(seriesPath, false);
        long count = timeseriesMetadata.getStatistics().getCount();
        costTime += (System.nanoTime() - startTime);
      }
    }
    System.out.println(
        "Total raw read cost time: " + (System.nanoTime() - totalStartTime) / 1000_000 + "ms");
    System.out.println("Index area cost time: " + costTime / 1000_000 + "ms");

chunk number                              1    2    3    5    8   10   15   20   25
raw       with TimeseriesMetadata       210  230  237  250  276  297  309  344  374
          with TimeseriesMetadata       116  131  142  156  185  197  220  255  282
          without TimeseriesMetadata         219  223  242  267  287  302  334  357
          without TimeseriesMetadata         131  136  155  182  200  219  251  274
count(*)  with TimeseriesMetadata        89   90   91   93   93   93   94   97   97
          with TimeseriesMetadata        15   16   16   16   16   16   16   17   17
          without TimeseriesMetadata         122  123  127  127  127  127  128  130
          without TimeseriesMetadata          50   50   50   50   51   52   52   53
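
The gap between the two configurations can be read off the table with a little arithmetic. The sketch below takes the first row of each configuration for count(*) and for raw queries and computes the per-chunk-number difference. It assumes the "without" rows have no value for chunk number 1, so only chunk numbers 2 to 25 are compared. The class and array names are illustrative.

```java
// Illustrative arithmetic on the table values; names are hypothetical.
class TableDiffSketch {
  // Chunk numbers for which both configurations have a value (assumed alignment).
  static final int[] CHUNK_NUM = {2, 3, 5, 8, 10, 15, 20, 25};

  // First row of each configuration, chunk numbers 2..25.
  static final int[] COUNT_WITH = {90, 91, 93, 93, 93, 94, 97, 97};
  static final int[] COUNT_WITHOUT = {122, 123, 127, 127, 127, 127, 128, 130};
  static final int[] RAW_WITH = {230, 237, 250, 276, 297, 309, 344, 374};
  static final int[] RAW_WITHOUT = {219, 223, 242, 267, 287, 302, 334, 357};

  /** Element-wise a[i] - b[i]; both arrays must have the same length. */
  static int[] diff(int[] a, int[] b) {
    int[] result = new int[a.length];
    for (int i = 0; i < a.length; i++) {
      result[i] = a[i] - b[i];
    }
    return result;
  }

  public static void main(String[] args) {
    int[] countExtra = diff(COUNT_WITHOUT, COUNT_WITH); // extra cost of count(*) without statistics
    int[] rawExtra = diff(RAW_WITH, RAW_WITHOUT); // extra cost of raw queries with statistics
    for (int i = 0; i < CHUNK_NUM.length; i++) {
      System.out.println(
          "chunks=" + CHUNK_NUM[i] + " count(*) extra=" + countExtra[i] + " raw extra=" + rawExtra[i]);
    }
  }
}
```

Under this reading, count(*) is consistently slower without the statistics, while raw queries are slightly faster without them.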

(II) Experiment on combining Chunk and Page

Do we need both Chunk and Page, or is keeping only one of them enough?
