...

Do we need both Chunk and Page, or is keeping only one of them enough?


How many points can a chunk have when chunk size = 64K, 1M, 2M, 3M, and 4M?

(1) Write one timeseries to one TsFile, using the long (INT64) data type and random values.

(2) Adjust the number of points to reach the target chunk size.

Code Block
      try (TsFileWriter tsFileWriter = new TsFileWriter(f)) {
        // only one timeseries
        tsFileWriter.registerTimeseries(
            new Path(Constant.DEVICE_PREFIX, Constant.SENSOR_1),
            new UnaryMeasurementSchema(Constant.SENSOR_1, TSDataType.INT64, TSEncoding.RLE));

        // construct TSRecord
        for (int i = 1; i <= 7977; i++) { // number of points; change this to reach the target chunk size
          TSRecord tsRecord = new TSRecord(i, Constant.DEVICE_PREFIX);
          DataPoint dPoint1 = new LongDataPoint(Constant.SENSOR_1, random.nextLong());
          tsRecord.addTuple(dPoint1);
          // write TSRecord
          tsFileWriter.write(tsRecord);
        }
      }


Here are the results:

| chunk size               | ~64K           | ~1M            | ~2M            | ~3M            | ~4M            |
| number of points         | 7,977          | 125,000        | 260,000        | 390,000        | 520,000        |
| number of pages          | 1              | 16             | 32             | 49             | 66             |
| page size (uncompressed) | 65398 (63.86K) | 65398 (63.86K) | 65398 (63.86K) | 65398 (63.86K) | 65398 (63.86K) |
| page size (compressed)   | 64275 (62.77K) | 64275 (62.77K) | 64275 (62.77K) | 64275 (62.77K) | 64275 (62.77K) |
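
As a rough cross-check of the results, the page size in the single-page (~64K) case can be divided by the number of points to get an approximate storage cost per point. The sketch below does only that arithmetic; the class and method names are illustrative and are not part of the TsFile API.

```java
public class PerPointCost {
    // Average number of bytes one data point occupies in a page.
    static double bytesPerPoint(long pageBytes, long points) {
        return (double) pageBytes / points;
    }

    public static void main(String[] args) {
        long points = 7_977; // points in the single ~64K page measured above
        System.out.printf("uncompressed: %.2f bytes/point%n", bytesPerPoint(65_398, points));
        System.out.printf("compressed:   %.2f bytes/point%n", bytesPerPoint(64_275, points));
    }
}
```

Both values come out slightly above 8 bytes per point, which is what one would expect for random 64-bit values that leave little room for encoding or compression to help. Real, less random data would likely fit more points into a page.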


Discuss the scenarios below: (only one timeseries)

1. For a scenario that generates 5 data points per second. (high frequency)

One day generates 432,000 points (about 54 pages), so one chunk has about 54 pages (about 3.4M). In a scenario like this, both chunk and page are necessary.

2. For a scenario that generates one data point per second. (second frequency)

One day generates 86,400 points (about 11 pages), so one chunk has about 11 pages (about 693K). In this scenario, both chunk and page are necessary.

3. For a scenario that generates 5 data points per minute. (one chunk per day) (low frequency)

One day generates 7,200 points (about 1 page), so one chunk has 1 page (about 56.6K). In this scenario, only one of chunk and page needs to be kept.

4. For a scenario that generates one data point per minute. (one chunk per week) (minute frequency)

One week generates 10,080 points (about 1.3 pages), so one chunk has 1~2 pages (about 79.3K). In this scenario, only one of chunk and page needs to be kept.
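
The arithmetic behind the four scenarios can be sketched as follows. This is only an estimate under stated assumptions: it reuses the measured 7,977 points per page and the compressed page size of 64275 bytes from the experiment above, and the class and method names are illustrative rather than TsFile API. Because the sizes quoted in the scenarios are rounded, the computed values may differ slightly from them.

```java
public class ScenarioEstimate {
    // Assumption: one page holds 7,977 random INT64 points, as measured above.
    static final double POINTS_PER_PAGE = 7977.0;
    // Assumption: every page takes the measured compressed size, in bytes.
    static final double PAGE_BYTES = 64275.0;

    // Points written into one chunk that covers daysPerChunk days.
    static long pointsInChunk(double pointsPerSecond, int daysPerChunk) {
        return Math.round(pointsPerSecond * 86_400L * daysPerChunk);
    }

    // Fractional number of pages needed for the given number of points.
    static double pagesInChunk(long points) {
        return points / POINTS_PER_PAGE;
    }

    // Estimated chunk payload size in KiB (page data only).
    static double chunkSizeKiB(long points) {
        return pagesInChunk(points) * PAGE_BYTES / 1024.0;
    }

    public static void main(String[] args) {
        String[] names = {"5 points/s", "1 point/s", "5 points/min", "1 point/min"};
        double[] ratesPerSecond = {5.0, 1.0, 5.0 / 60, 1.0 / 60};
        int[] daysPerChunk = {1, 1, 1, 7};

        for (int i = 0; i < names.length; i++) {
            long points = pointsInChunk(ratesPerSecond[i], daysPerChunk[i]);
            System.out.printf("%-13s %7d points, %5.1f pages, %7.1f KiB%n",
                names[i], points, pagesInChunk(points), chunkSizeKiB(points));
        }
    }
}
```

The first two scenarios land well above ten pages per chunk, while the last two stay close to a single page, which is the basis for the conclusion that the two levels only pay off for high-frequency series.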



(III) Experiment about how to store PageHeader

...