org.apache.iotdb.tsfile.write.*
The writing process of TsFile is shown in the following figure:
Among them, each device corresponds to a ChunkGroupWriter, and each sensor corresponds to a ChunkWriter.
File writing is mainly divided into three operations, marked 1, 2 and 3 in the figure:
1. Writing memory buffer
2. Persistent ChunkGroup
3. Close file
1. Writing memory buffer
The TsFile file layer has two write interfaces:
TsFileWriter.write(TSRecord record)
Writes one timestamp with multiple measurement points for one device.
TsFileWriter.write(Tablet tablet)
Writes multiple timestamps with multiple measurement points for one device.
When a write interface is called, the data of the device is delivered to the corresponding ChunkGroupWriter, and each measurement point is delivered to the corresponding ChunkWriter for writing. The ChunkWriter performs the encoding and packaging (generating a page).
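The routing described above can be illustrated with a small, self-contained model. This is only a sketch: the classes `WriteRoutingSketch`, `SketchChunkGroupWriter` and `SketchChunkWriter` are hypothetical stand-ins rather than the IoTDB classes, values are restricted to `long`, and encoding and page generation are omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-ins for illustration; these are NOT the IoTDB classes. */
class WriteRoutingSketch {
    /** Buffers the points of one measurement (sensor). */
    static class SketchChunkWriter {
        final List<long[]> points = new ArrayList<>(); // each element is {timestamp, value}

        void write(long time, long value) {
            points.add(new long[] {time, value});
        }
    }

    /** Holds one SketchChunkWriter per measurement of a single device. */
    static class SketchChunkGroupWriter {
        final Map<String, SketchChunkWriter> chunkWriters = new LinkedHashMap<>();

        SketchChunkWriter writerFor(String measurement) {
            return chunkWriters.computeIfAbsent(measurement, k -> new SketchChunkWriter());
        }
    }

    /** One SketchChunkGroupWriter per device. */
    final Map<String, SketchChunkGroupWriter> groupWriters = new LinkedHashMap<>();

    SketchChunkGroupWriter groupFor(String device) {
        return groupWriters.computeIfAbsent(device, k -> new SketchChunkGroupWriter());
    }

    /** Analogue of write(TSRecord): one device, one timestamp, several measurements. */
    void writeRecord(String device, long time, Map<String, Long> measurements) {
        SketchChunkGroupWriter group = groupFor(device);
        for (Map.Entry<String, Long> e : measurements.entrySet()) {
            group.writerFor(e.getKey()).write(time, e.getValue());
        }
    }

    /** Analogue of write(Tablet): one device, several timestamps, several measurements. */
    void writeTablet(String device, long[] times, String[] names, long[][] columns) {
        SketchChunkGroupWriter group = groupFor(device);
        for (int col = 0; col < names.length; col++) {
            SketchChunkWriter writer = group.writerFor(names[col]);
            for (int row = 0; row < times.length; row++) {
                writer.write(times[row], columns[col][row]);
            }
        }
    }
}
```

In this model, both entry points end up in the same per-device, per-measurement buffers; they differ only in how many timestamps one call carries.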
2. Persistent ChunkGroup
TsFileWriter.flushAllChunkGroups()
When the data in memory reaches a certain threshold, the persistence operation is triggered. Each persistence operation writes all the data of all devices currently in memory to the TsFile on disk. Each device corresponds to a ChunkGroup and each measurement point corresponds to a Chunk.
After the persistence is complete, the corresponding metadata information is cached in memory for querying and generating the metadata at the end of the file.
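As a rough illustration of this step, the sketch below models a threshold-triggered flush. The class `FlushSketch`, its fields, and the idea of tracking sizes as plain byte counts are assumptions made for the example; the real writer persists encoded ChunkGroups and caches richer metadata.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a threshold-triggered flush; not the IoTDB implementation. */
class FlushSketch {
    final long thresholdBytes;   // assumed flush threshold
    long bufferedBytes = 0;      // size of the data currently held in memory
    long filePosition = 0;       // stand-in for the current end of the file on disk
    // Cached metadata of each persisted batch: {start offset in file, size in bytes}.
    final List<long[]> cachedMetadata = new ArrayList<>();

    FlushSketch(long thresholdBytes) {
        this.thresholdBytes = thresholdBytes;
    }

    /** Buffers a write and triggers persistence once the threshold is reached. */
    void write(long sizeInBytes) {
        bufferedBytes += sizeInBytes;
        if (bufferedBytes >= thresholdBytes) {
            flushAllChunkGroups();
        }
    }

    /** Persists everything in memory and keeps its metadata for the end of the file. */
    void flushAllChunkGroups() {
        if (bufferedBytes == 0) {
            return; // nothing to persist
        }
        cachedMetadata.add(new long[] {filePosition, bufferedBytes});
        filePosition += bufferedBytes;
        bufferedBytes = 0;
    }
}
```

The cached list plays the role of the metadata kept in memory after each flush: it is what a later close step would turn into the metadata at the end of the file.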
3. Close file
TsFileWriter.close()
Based on the metadata cached in memory, TsFileMetadata is generated and appended to the end of the file (TsFileWriter.flushMetadataIndex()), and the file is finally closed.
One of the most important steps in constructing TsFileMetadata is constructing the MetadataIndex tree. As mentioned before, the MetadataIndex is designed as a tree structure so that not all the TimeseriesMetadata need to be read when the number of devices or measurements is large. Reading only the MetadataIndex nodes a query requires reduces I/O and speeds up the query. The whole process of constructing the MetadataIndex tree is as below:
org.apache.iotdb.tsfile.file.metadata.MetadataIndexConstructor
MetadataIndexConstructor.constructMetadataIndex()
The inputs of the method include:
Map<String, List<TimeseriesMetadata>> deviceTimeseriesMetadataMap, which maps each device to its TimeseriesMetadata
TsFileOutput out
The whole method contains three parts:
1. At the measurement index level, each device and its TimeseriesMetadata in deviceTimeseriesMetadataMap are converted into deviceMetadataIndexMap. Specifically, for each device:
  - Initialize a queue for the MetadataIndex nodes of this device
  - Initialize a leaf node of the measurement index level, which is of LEAF_MEASUREMENT type
  - For each TimeseriesMetadata:
    - Serialize
    - Add an entry into currentIndexNode every MAX_DEGREE_OF_INDEX_NODE entries
    - After storing MAX_DEGREE_OF_INDEX_NODE entries, add currentIndexNode into queue, and point currentIndexNode to a new MetadataIndexNode
  - Generate the upper-level nodes of the measurement index level from the leaf nodes in queue, up to the final root node (this method will be described later), and put the "device-root node" mapping into deviceMetadataIndexMap
2. Next, determine whether the number of devices exceeds MAX_DEGREE_OF_INDEX_NODE. If not, the root node of the MetadataIndex tree can be generated and returned:
  - Initialize the root node of the MetadataIndex tree, which is of INTERNAL_MEASUREMENT type
  - For each entry in deviceMetadataIndexMap:
    - Serialize
    - Convert it into an entry and add the entry into metadataIndexNode
  - Set the endOffset of the root node and return it
3. If the number of devices exceeds MAX_DEGREE_OF_INDEX_NODE, the device index level of the MetadataIndex tree is generated:
  - Initialize a queue for the MetadataIndex nodes of the device index level
  - Initialize a leaf node of the device index level, which is of LEAF_DEVICE type
  - For each entry in deviceMetadataIndexMap:
    - Serialize
    - Convert it into an entry and add the entry into metadataIndexNode
    - After storing MAX_DEGREE_OF_INDEX_NODE entries, add currentIndexNode into queue, and point currentIndexNode to a new MetadataIndexNode
  - Generate the upper-level nodes of the device index level from the leaf nodes in queue, up to the final root node (this method will be described later)
  - Set the endOffset of the root node and return it
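The leaf-building loop of the measurement index level can be sketched as follows. This is an interpretation under stated assumptions, not the library code: `LeafLevelSketch` and its `Node` class are hypothetical, `maxDegree` stands in for MAX_DEGREE_OF_INDEX_NODE, serialization is left out, and "an entry every MAX_DEGREE_OF_INDEX_NODE entries" is read as one index entry per `maxDegree` consecutive TimeseriesMetadata.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical model of leaf construction for one device; not the IoTDB code. */
class LeafLevelSketch {
    /** Stand-in for a MetadataIndexNode: only the names of its entries are kept. */
    static class Node {
        final List<String> entries = new ArrayList<>();
    }

    /** Builds the queue of leaf nodes for the measurements of one device. */
    static Queue<Node> buildLeaves(List<String> measurementIds, int maxDegree) {
        Queue<Node> queue = new ArrayDeque<>();
        Node current = new Node();
        for (int i = 0; i < measurementIds.size(); i++) {
            if (i % maxDegree == 0) {                      // assumed: one entry per maxDegree items
                if (current.entries.size() == maxDegree) { // current node is full
                    queue.add(current);
                    current = new Node();
                }
                current.entries.add(measurementIds.get(i));
            }
            // The real method would serialize the TimeseriesMetadata at this point.
        }
        queue.add(current); // the last, possibly partially filled, node
        return queue;
    }
}
```

With ten measurements and `maxDegree` set to 2, this produces five entries spread over three leaf nodes, which is the queue that the upper-level node generation then consumes.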
MetadataIndexConstructor.generateRootNode
The inputs of the method include:
Queue<MetadataIndexNode> metadataIndexNodeQueue
TsFileOutput out
MetadataIndexNodeType type, which indicates the type of the internal nodes of the generated tree. There are two types: when the method is called for the measurement index level, it is INTERNAL_MEASUREMENT; when the method is called for the device index level, it is INTERNAL_DEVICE
The method needs to generate a tree structure of nodes in metadataIndexNodeQueue, and return the root node:
- Create a new currentIndexNode of the specified type
- While there is more than one node in the queue, keep processing the queue. For each node in the queue:
  - Serialize
  - Convert it into an entry and add the entry into currentIndexNode
  - After storing MAX_DEGREE_OF_INDEX_NODE entries, add currentIndexNode into queue, and point currentIndexNode to a new MetadataIndexNode
- Return the root node in the queue when the queue has only one node
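A minimal sketch of this level-by-level reduction is shown below. It is a simplified model under assumptions: `RootNodeSketch` and `Node` are hypothetical names, child references replace the serialized offsets, and the output stream and node type parameters are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical model of building upper-level nodes from a queue; not the IoTDB code. */
class RootNodeSketch {
    /** Stand-in for a MetadataIndexNode; children replace serialized entries. */
    static class Node {
        final List<Node> children = new ArrayList<>();
    }

    /** Collapses the queued nodes one level at a time until a single root remains. */
    static Node generateRoot(Queue<Node> queue, int maxDegree) {
        int levelSize = queue.size();
        Node current = new Node();
        while (levelSize > 1) {
            for (int i = 0; i < levelSize; i++) {
                Node child = queue.poll();
                if (current.children.size() == maxDegree) { // parent is full
                    queue.add(current);
                    current = new Node();
                }
                current.children.add(child);
            }
            queue.add(current);        // last parent of this level
            current = new Node();
            levelSize = queue.size();  // the queue now holds only the new level
        }
        return queue.poll();           // the single remaining node is the root
    }

    /** Height of the tree rooted at the given node, counting that node as level 1. */
    static int height(Node node) {
        int h = 0;
        for (Node c : node.children) {
            h = Math.max(h, height(c));
        }
        return h + 1;
    }
}
```

Because parents are appended to the tail of the same queue while exactly `levelSize` children are polled from its head, one queue is enough to hold both the level being consumed and the level being built.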
MetadataIndexConstructor.addCurrentIndexNodeToQueue
The inputs of the method include:
MetadataIndexNode currentIndexNode
Queue<MetadataIndexNode> metadataIndexNodeQueue
TsFileOutput out