1. Requirement

1.1. Background

An automotive equipment provider supplies its in-vehicle sensor assemblies to automotive assemblers. Each assembly consists of about 30 individual sensors, and all assemblies are of the same model, i.e., every assembly contains the same set of sensors.

About 1 million such assemblies (components) need to be supported, and the equipment provider needs to store, query and analyze the data collected by these sensors to realize its business value.

1.2. Challenge

Given this background and the way IoTDB currently stores metadata, the schema manager needs to maintain 30 * 1 million = 30 million working nodes and their related metadata. According to previous experience, each working node occupies about 300 bytes of memory, so the overall memory occupation will reach 9 GB.

Next, consider the test and deployment environment, where the total machine memory is 32 GB. Because the database performs a lot of I/O, a certain amount of off-heap memory and system memory must be reserved, so a JVM heap of 20 GB is a reasonable setting. Combined with the analysis above, metadata would account for 45% of the heap, which puts heavy memory pressure on writes and queries. When read and write performance is good, IoTDB tends to use only about 10% of the total heap for metadata, so a 45% share in this scenario is bound to affect the read and write performance of the database.
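
The arithmetic behind these figures can be checked with a short Python sketch. The constants are the values quoted above; decimal gigabytes (10^9 bytes) are assumed, since that is the unit under which 30 million nodes at 300 bytes each come to 9 GB. The function name is illustrative only.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
SENSORS_PER_ASSEMBLY = 30
ASSEMBLIES = 1_000_000
BYTES_PER_NODE = 300        # empirical per-node cost quoted in the text
HEAP_BYTES = 20 * 10**9     # 20 GB JVM heap (assumption: decimal gigabytes)


def metadata_estimate(sensors, assemblies, bytes_per_node, heap_bytes):
    """Return (node count, total metadata bytes, share of the heap)."""
    nodes = sensors * assemblies
    total_bytes = nodes * bytes_per_node
    return nodes, total_bytes, total_bytes / heap_bytes


nodes, total_bytes, heap_share = metadata_estimate(
    SENSORS_PER_ASSEMBLY, ASSEMBLIES, BYTES_PER_NODE, HEAP_BYTES
)
print(f"{nodes:,} nodes, {total_bytes / 10**9:.1f} GB, {heap_share:.0%} of heap")
```

Changing SENSORS_PER_ASSEMBLY or ASSEMBLIES shows how quickly the per-sensor cost grows with fleet size, which is the motivation for sharing the sensor definitions.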

1.3. Target

Eliminate duplicate definitions of leaf nodes in the metadata tree, thus reducing metadata memory usage and storage and improving the overall read and write performance of IoTDB.

1.4. Analysis

The key point is that these components are all the same model, which means that the sensor types in each component are exactly the same. However, IoTDB records the metadata of every sensor in every component, which is redundant in memory. Only one copy of this identical metadata should be stored, to save precious memory.

Today's industrial manufacturers rely on large-scale batch production to control cost and risk, so deployments with many identical sensor components are common. The solution above therefore applies to a broad range of industrial scenarios.

2. Solution

2.1 Example

Provide users with a physical quantity template function that saves all sensor metadata for a class of devices in one template. The template is mounted to a storage group or device group node, indicating that all devices under that storage group or device group are of the same model with the same sensor types, thus eliminating memory redundancy.

An example of the user interface follows.

create schema template car_template(
 (s1 INT32 encoding=Gorilla compression=SNAPPY),
 (s2 FLOAT encoding=RLE compression=SNAPPY)
 )

set schema template car_template to root.beijing

The first statement creates a physical quantity template named car_template with two sensors, s1 and s2. s1 is a 32-bit integer with Gorilla encoding and Snappy compression, and s2 is a 32-bit floating point number with RLE encoding and Snappy compression. The second statement mounts the physical quantity template to the storage group root.beijing.

After these statements are executed, the user can write data as usual, and the system determines the metadata of the written data from the physical quantity template. Writing data that belongs to a template can still automatically create time series, consistent with the default behavior of the system.
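
To illustrate how a mounted template removes per-device leaf nodes, here is a minimal Python sketch. It is not IoTDB's implementation: the classes Measurement, Template and Node, the function lookup, and the device names car_0001 and car_0002 are all hypothetical. It only shows the idea that devices under root.beijing resolve s1 and s2 through one shared template object instead of holding their own copies.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Measurement:
    """Metadata of one sensor, stored once in the template."""
    data_type: str
    encoding: str
    compression: str


@dataclass
class Template:
    name: str
    measurements: Dict[str, Measurement]


class Node:
    """Hypothetical metadata tree node; not IoTDB's MNode class."""

    def __init__(self, name: str, parent: Optional["Node"] = None):
        self.name = name
        self.parent = parent
        self.children: Dict[str, "Node"] = {}
        self.template: Optional[Template] = None  # set where a template is mounted

    def child(self, name: str) -> "Node":
        """Return the child with this name, creating it if needed."""
        if name not in self.children:
            self.children[name] = Node(name, parent=self)
        return self.children[name]

    def resolve_template(self) -> Optional[Template]:
        """Walk towards the root until a mounted template is found."""
        node: Optional[Node] = self
        while node is not None:
            if node.template is not None:
                return node.template
            node = node.parent
        return None


def lookup(device: Node, sensor: str) -> Optional[Measurement]:
    """Find a sensor's metadata via the template mounted at or above the device."""
    template = device.resolve_template()
    return template.measurements.get(sensor) if template else None


car_template = Template("car_template", {
    "s1": Measurement("INT32", "Gorilla", "SNAPPY"),
    "s2": Measurement("FLOAT", "RLE", "SNAPPY"),
})

root = Node("root")
beijing = root.child("beijing")
beijing.template = car_template        # "set schema template car_template to root.beijing"

d1 = beijing.child("car_0001")         # hypothetical device names
d2 = beijing.child("car_0002")

# Both devices see the very same metadata object and own no leaf nodes.
print(lookup(d1, "s1") is lookup(d2, "s1"), len(d1.children))
```

In this sketch the per-device cost is one node without children, and the sensor definitions exist once regardless of how many devices are created under the mount point.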

2.2 Effect on MTree

Two fields are added to each MNode: a pointer to a physical quantity template (8 bytes) and a boolean value that indicates whether the template is used (1 byte).
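
As a rough check on what these two fields cost, the Python sketch below multiplies the 9 extra bytes by a node count. The count of 1 million device nodes is an assumption taken from the number of assemblies in section 1.1; intermediate nodes and any JVM object alignment or padding are ignored, so the real figure will differ.

```python
POINTER_BYTES = 8           # template pointer, as stated in the text
FLAG_BYTES = 1              # "template is used" boolean, as stated in the text
DEVICE_NODES = 1_000_000    # assumption: one MNode per assembly, other nodes ignored


def template_field_overhead(node_count: int) -> int:
    """Extra bytes added by the two new fields across node_count nodes."""
    return node_count * (POINTER_BYTES + FLAG_BYTES)


overhead_bytes = template_field_overhead(DEVICE_NODES)
print(f"{overhead_bytes / 10**6:.0f} MB of extra fields")
```

Under these assumptions the added fields cost on the order of megabytes, which is small next to the 9 GB estimated for 30 million separate leaf nodes.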

3. Test and Evaluation

In the actual test, the device metadata, calculated as above, occupies only about 20 MB. Because IoTDB metadata management also uses HashMap and other data structures, and because reading and writing the Mlog consumes I/O memory, the upper limit of metadata memory is set to 1 GB in the actual scenario (any part that is not used up remains available to other parts of the system). Metadata thus accounts for 5% of the 20 GB heap, which is in line with the normal read and write memory load of IoTDB. No OOM occurred in either the test or the long-running test, which shows that the physical quantity template does reduce memory redundancy.
