March 21, 2021, Sunday.

Time: 2021-03-21 9:00 PM Beijing Time

WeMeet url: https://meeting.tencent.com/s/xoV6xnkBw5oT
Host: Yuyuan Kang

Status Updates

1. Announce

Apache IoTDB 0.11.13 is still in progress

2. Bug Fixes

  • ISSUE-2505 ignore PathNotExistException in recover and change recover error to warn
  • IOTDB-1119 Fix C++ SessionDataSet bug when reading value buffer
  • Fix SessionPool does not recycle session and can not offer new Session due to RunTimeException
  • ISSUE-2588 Fix dead lock between deleting data and querying in parallel
  • ISSUE-2546 Fix first chunkmetadata should be consumed first
  • IOTDB-1126 Fix unseq tsfile is deleted due to compaction
  • IOTDB-1137 MNode.getLeafCount error when existing sub-device
  • ISSUE-2624 ISSUE-2625 Avoid OOM if user don't close Statement and Session manually
  • ISSUE-2639 Fix possible NPE during end query process
  • Alter IT for An error is reported and the system is suspended occasionally
  • IOTDB-1149 print error for -e param when set maxPRC<=0
  • IOTDB-2648 Last query not right when having multiple devices
  • Delete mods files after compaction
  • ISSUE-2687 fix insert NaN bug
  • ISSUE-2598 Throw explicit exception when time series is unknown in where clause
  • Fix timeseriesMetadata cache is not cleared after the TsFile is deleted by a compaction
  • ISSUE-2611 An unsequence file that covers too many sequence file causes OOM query
  • IOTDB-1135 Fix count timeseries bug when the paths are nested
  • ISSUE-2709 IOTDB-1178 Fix cache is not cleared after compaction
  • ISSUE-2746 Fix data overlapped bug after the elimination unseq compaction process
  • Fix getObject method in JDBC should return an Object
  • IOTDB-1188 Fix IoTDB 0.11 unable to delete data bug
  • Fix when covering a tsfile resource with HistoricalVersion = null, it'll throw a NPE
  • fix the elimination unseq compaction may loss data bug after a delete operation is executed

3. New Features

  • Add explain sql support

4. Improvements

  • IOTDB-1140 optimize regular data encoding
  • Add more log for better tracing
  • Add background exec for cli -e function
  • Add max direct memory size parameter to env.sh

5. Closed development

Aligned Timeseries & Device Template

Goal: Eliminate duplicate timestamps, eliminate duplicate definitions of leaf nodes, thereby reducing metadata memory usage and metadata storage

Solutions:

1) By aligning the time series, duplicate timestamps can be eliminated.

2) Add device templates to eliminate duplicate definitions of leaf nodes

Typical scenario: managing 5 billion devices * 29 measurements, where the measurements on each device are aligned by timestamp
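
To give a feel for the scale of this scenario, the following is a back-of-the-envelope sketch in Python. It assumes (this is not stated in the notes) that every device uses the same 29 measurements and can therefore share a single device template; the variable names are illustrative only.

```python
# Rough counts for the scenario above.
# Assumption (not from the meeting notes): all devices share one template
# that defines the same 29 measurements.
DEVICES = 5_000_000_000
MEASUREMENTS_PER_DEVICE = 29

# Without templates, every device carries its own leaf-node definitions.
leaf_definitions_without_template = DEVICES * MEASUREMENTS_PER_DEVICE

# With one shared template, the leaf definitions are stored once.
leaf_definitions_with_template = MEASUREMENTS_PER_DEVICE

# Without alignment, each measurement keeps its own timestamp column;
# with alignment, a device keeps a single shared timestamp column.
time_columns_per_device_unaligned = MEASUREMENTS_PER_DEVICE
time_columns_per_device_aligned = 1

print(f"leaf definitions without template: {leaf_definitions_without_template:,}")
print(f"leaf definitions with template:    {leaf_definitions_with_template:,}")
print(f"timestamp columns per device: {time_columns_per_device_unaligned} -> "
      f"{time_columns_per_device_aligned}")
```

Under these assumptions the number of leaf definitions drops from 145 billion to 29, and the timestamp columns per device drop from 29 to 1. Real savings depend on how many distinct templates are needed.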

1) SQL statement for aligned timeseries:

create aligned timeseries root.sg.d1.(s1 FLOAT, s2 INT32)

create aligned timeseries root.sg.d1.(s3 FLOAT, s4 INT32) with encoding=(RLE, Gorilla), compression=SNAPPY

insert into root.sg.d1(time, s1, s2, s3, s4) values(1, 1, 2.0, 1, 2.0)

insert into root.sg.d1(time, (s1, s2), s5) values(2, 1, 2.0, 3)

select s1, s2 from root.sg.d1

select s1, s3 from root.sg.d1

2) SQL statement for device template:

set storage group root.beijing

create device template temp1(
(s1 INT32 with encoding=Gorilla, compression=SNAPPY),
(s2 FLOAT with encoding=RLE, compression=SNAPPY)
)

set device template temp1 to root.beijing

create device template temp2(
((s1 FLOAT, s2 INT32) aligned with encoding=(Gorilla, RLE), compression=SNAPPY),
(s3 FLOAT with encoding=RLE, compression=SNAPPY)
)


show device template
+----------+-------------+----------+----------+-------------+
| template | measurement | datatype | encoding | compression |
+----------+-------------+----------+----------+-------------+
| temp1    | s1          | INT32    | Gorilla  | SNAPPY      |
| temp1    | s2          | FLOAT    | RLE      | SNAPPY      |
| temp2    | s1          | FLOAT    | Gorilla  | SNAPPY      |
| temp2    | s2          | INT32    | RLE      | SNAPPY      |
| temp2    | s3          | FLOAT    | RLE      | SNAPPY      |
+----------+-------------+----------+----------+-------------+


Data Management

1) Metadata:

  • Implement device template (a group of measurements)
  • Eliminate redundant definition of the leaf node
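
The idea of sharing leaf definitions through a template can be illustrated with a minimal Python sketch. This is not IoTDB's actual metadata implementation; the class names `Measurement`, `DeviceTemplate` and `Device` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One leaf definition: name plus type, encoding and compression."""
    name: str
    datatype: str
    encoding: str
    compression: str


@dataclass(frozen=True)
class DeviceTemplate:
    """A named group of measurements, defined once."""
    name: str
    measurements: tuple


@dataclass
class Device:
    """A device holds a reference to its template, not a copy of the leaves."""
    path: str
    template: DeviceTemplate


# Mirrors temp1 from the SQL examples in these notes.
temp1 = DeviceTemplate(
    "temp1",
    (
        Measurement("s1", "INT32", "Gorilla", "SNAPPY"),
        Measurement("s2", "FLOAT", "RLE", "SNAPPY"),
    ),
)

# Hypothetical device paths under root.beijing, all pointing at one template.
devices = [Device(f"root.beijing.d{i}", temp1) for i in range(1000)]

# Every device shares the same template object, so the two leaf
# definitions exist once in memory instead of 1000 times.
shared = all(d.template is temp1 for d in devices)
print(shared, [m.name for m in devices[0].template.measurements])
```

The design point is that per-device metadata shrinks to a path plus a reference, while the measurement definitions live in one place.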

2) Data in Memory:

  • New data structure for multiple columns sharing the same timestamp
  • Eliminate redundant timestamp
  • Reduce memory consumption
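
As a rough illustration of why a shared timestamp column reduces memory, here is a minimal Python sketch. It does not reproduce IoTDB's real in-memory structure; `AlignedBuffer` and `unaligned_timestamps` are hypothetical names, and the sample rows are made up.

```python
class AlignedBuffer:
    """Hypothetical aligned layout: one time column shared by all value columns."""

    def __init__(self, measurements):
        self.times = []
        self.columns = {name: [] for name in measurements}

    def insert(self, time, values):
        # One timestamp is stored per row, regardless of the column count.
        self.times.append(time)
        for name, column in self.columns.items():
            # None marks a measurement that has no value in this row.
            column.append(values.get(name))

    def stored_timestamps(self):
        return len(self.times)


def unaligned_timestamps(rows, measurements):
    """Per-series layout: every stored value carries its own timestamp."""
    return sum(1 for _, values in rows for name in measurements if name in values)


# Made-up sample rows: (time, {measurement: value}).
rows = [
    (1, {"s1": 1.0, "s2": 2}),
    (2, {"s1": 1.5, "s2": 3}),
    (3, {"s1": 2.0, "s2": 4}),
]

buf = AlignedBuffer(["s1", "s2"])
for time, values in rows:
    buf.insert(time, values)

aligned_count = buf.stored_timestamps()
unaligned_count = unaligned_timestamps(rows, ["s1", "s2"])
print(aligned_count, unaligned_count)
```

With two measurements per row, the aligned layout stores half as many timestamps; with 29 measurements per device the ratio would be correspondingly larger.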

3) Data on disk:

  • Designed new TsFile format
  • Support data of multiple columns sharing the same timestamp
  • Reduce disk consumption

Result:

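One of the scenarios:

+----------------------------------------------+----------------------------+
| Configuration                                | Write efficiency (row/sec) |
+----------------------------------------------+----------------------------+
| Single-node, old strategy                    | 190,624                    |
| Single-node, new strategy                    | 487,656                    |
| Distributed, 3 nodes 1 replica, new strategy | 1,095,331                  |
+----------------------------------------------+----------------------------+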
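
The relative gains implied by the write-efficiency figures above can be worked out with a few lines of Python. The three numbers are taken from the notes; the variable names are illustrative, and the ratios only describe this one scenario.

```python
# Write efficiency in rows per second, as reported in the notes.
old_strategy = 190_624          # non-distributed run, old strategy
new_strategy = 487_656          # non-distributed run, new strategy
distributed_new = 1_095_331     # 3 nodes, 1 replica, new strategy

# Speedup of the new strategy over the old one on the non-distributed runs.
strategy_speedup = new_strategy / old_strategy

# Scaling from the non-distributed run to 3 nodes, both with the new strategy.
cluster_scaling = distributed_new / new_strategy

# Combined gain relative to the old non-distributed baseline.
overall_speedup = distributed_new / old_strategy

print(f"new vs old strategy:       {strategy_speedup:.2f}x")
print(f"3 nodes vs 1 (new):        {cluster_scaling:.2f}x")
print(f"3 nodes new vs 1 node old: {overall_speedup:.2f}x")
```

In this scenario the new strategy writes roughly 2.6 times as many rows per second as the old one, and the 3-node cluster adds a further factor of a little over 2.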


6. Open Floor
new ideas, feedback, suggestions.

1) [Discussion] Monthly Contributor Award

February's "Contributors of the Month"

  • sunjincheng121(Jincheng Sun), with 132571 lines changed and 8 mails sent;
  • HTHou(Haonan Hou), with 2116 lines changed and 4 mails sent;
  • neuyilan(Houliang Qi), with 2692 lines changed and 3 mails sent.

There are also 22 other contributors who contributed to IoTDB this month.

7 new contributors submitted their first PR to IoTDB:

  • wuzhaojie(Zhaojie Wu),
  • WilliamSong11(Yuxiang Song),
  • GLBB,
  • THUMarkLau(Xuxin Liu),
  • chenjun40,
  • 543202718(Haoyu Wang),
  • jxlgzwh(Wenhao Zhong).

2) [Proposal] Hackathon (April)
