Sync Tool
Overview
This document describes the requirements and module design of the synchronization tool.
Scenarios
The synchronization tool mainly has to meet the following requirements:
In a production environment, Apache IoTDB collects data generated by data sources (industrial equipment, mobile terminals, etc.) and stores it locally. Since the data sources may be distributed in different places, multiple Apache IoTDB instances may be collecting data at the same time. Each IoTDB instance needs to synchronize its local data to the data center, which is responsible for collecting and managing data from multiple Apache IoTDB instances.
With the widespread application of the Apache IoTDB system, users need to load and apply the tsfile files generated by some Apache IoTDB instances to the data directory of another Apache IoTDB instance to achieve data synchronization according to the target business needs.
The synchronization module exists in the form of an independent process on the sending end, and is located in the same process as the Apache IoTDB on the receiving end.
One sender can synchronize data to multiple receivers, and one receiver can receive data from multiple senders at the same time. However, the data synchronized by multiple senders must not conflict (that is, each device can have only one data source); otherwise, a conflict must be reported.
Goals
The synchronization tool can be used to transfer and load data files between two Apache IoTDB instances. When the network is unstable or a machine goes down, the tool must still ensure that files are transferred to the data center completely and correctly.
Directory Structure
For ease of explanation, suppose that node 192.168.130.15 synchronizes data to node 192.168.130.16:5555, and that node 192.168.130.15 also receives data synchronized from node 192.168.130.14. Since node 192.168.130.15 serves as both a sending end and a receiving end, the directory structure is described below using node 192.168.130.15.
Directory structure design
Directory structure description
The sync-sender folder contains temporary files, status logs, etc. during the data synchronization when this node is used as the sender.
The sync-receiver folder contains temporary files, status logs, etc. produced while this node receives and loads data as a receiver.
The schema/sync folder holds the synchronization information that needs to be persisted.
Sender
data/sync-sender is the sender's folder. Each folder name in this directory represents the IP and RPC port of a receiver. In this example, there is one receiver, 192.168.130.16:5555. Each folder contains the following files:
last_local_files.txt: Records the list of all local tsfile files that have been synchronized. It is updated after each synchronization task ends.
snapshot: During data synchronization, this folder contains hard links to all tsfile files to be synchronized.
sync.log: Records the task progress of the synchronization module and is used for recovery after system downtime. The structure of this file is explained in detail later.
Receiving end
sync-receiver is the receiver's folder. Each folder name in this directory represents the IP and UUID of a sender, and the folder holds the data files and file loading logs received from that sender. In this example, there is one sender, 192.168.130.14, and its UUID is a45b6e63eb434aad891264b5c08d448e. Each folder contains the following files:
load.log: Records the progress of tsfile loading tasks and is used when the system recovers from downtime.
data: Contains the tsfile files that have been received from the sender.
Others
The schema/sync folder contains the following information:
As a sender, the file lock sync.lock of the sender instance: it ensures that the same sender can start only one sender instance for a given receiver, that is, only one process synchronizes data to that receiver. The directory 192.168.130.16_5555/sync_lock in the figure is the instance lock for synchronizing to the receiver 192.168.130.16_5555. Each time a sender instance starts, it first checks whether the file is locked. If it is, another sender instance is already synchronizing data to that receiver, and this instance stops.
As a sender, the unique identifier (UUID) of the sender, uuid.txt: each sender has a unique identifier so that the receiver can distinguish between different senders.
As a sender, the schema synchronization progress for each receiver, sync_schema_pos: because the schema log mlog.txt is append-only and records every change to the metadata, the current position is recorded after each schema synchronization. The next synchronization can then be incremental, which avoids retransmitting schema that has already been sent.
As a receiver, the owner information of each device on the receiver, device_owner.log: one receiver can receive data from multiple senders at the same time, but no conflict may occur; otherwise the receiver cannot guarantee the correctness of the data. It is therefore necessary to record which sender synchronizes each device, following the first-come-first-served principle.
This information is placed separately under the schema folder because an Apache IoTDB instance can have multiple data file directories but only one schema folder, and this information is shared by a sender instance. The information in a data folder, by contrast, describes the synchronization status of that file directory and belongs to the subtask (each data file directory is a subtask).
Sync tool sender
Statement of needs
At regular intervals, the latest data collected by the sender is sent to the receiver. Updates and deletions of historical data are synchronized to the receiver as well.
The synchronized data must be complete. If a data file is incomplete or damaged during transmission because of network instability, machine failure, or similar factors, it must be repaired during the next transmission.
Module design
File management module
package
org.apache.iotdb.db.sync.sender.manage
File selection
File selection compares the list of closed tsfile files in the current Apache IoTDB instance (files that have a corresponding .resource file and no .modification or .merge file) with the tsfile list recorded after the last synchronization task ended. The difference has two parts: the list of deleted tsfile files and the list of newly added tsfile files. All newly added files are hard linked to protect them from operations, such as file deletion, performed by the system during synchronization.
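The selection described above can be sketched in Python as a set difference followed by hard linking. This is only an illustration, not the IoTDB implementation: the function names select_files and snapshot are hypothetical, and the sketch assumes that the companion files are named by appending .resource, .modification, or .merge to the tsfile name.

```python
import os
from pathlib import Path


def select_files(data_dir, last_synced):
    """Return (deleted, added) tsfile names relative to the last sync.

    Assumption: companion files are named "<tsfile name>.resource", etc.
    """
    data_dir = Path(data_dir)
    closed = set()
    for tsfile in data_dir.glob("*.tsfile"):
        has_resource = Path(str(tsfile) + ".resource").exists()
        has_mods = Path(str(tsfile) + ".modification").exists()
        has_merge = Path(str(tsfile) + ".merge").exists()
        # A file counts as closed only if it has a .resource file
        # and neither a .modification nor a .merge file.
        if has_resource and not has_mods and not has_merge:
            closed.add(tsfile.name)
    last_synced = set(last_synced)
    return last_synced - closed, closed - last_synced


def snapshot(data_dir, snapshot_dir, added):
    """Hard link every newly added file into the snapshot folder."""
    snapshot_dir = Path(snapshot_dir)
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    for name in added:
        link = snapshot_dir / name
        if not link.exists():
            os.link(Path(data_dir) / name, link)
```

Because a hard link keeps the file content alive even if the original directory entry is removed, the snapshot stays readable for the whole synchronization task.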
File cleanup
When notified by the file transfer module that the task has ended, perform the following steps:
1. Load the list of file names in last_local_files.txt into memory as a set, then parse sync.log line by line and apply the recorded deletions and additions to the set.
2. Write the list of file names in memory to current_local_files.txt.
3. Delete last_local_files.txt.
4. Rename current_local_files.txt to last_local_files.txt.
5. Delete the sequence folder and the sync.log file.
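The cleanup steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names apply_sync_log and clean_up are hypothetical, and sync.log is assumed to contain only the marker lines named in this document with one file name per line between them.

```python
import shutil
from pathlib import Path

MARKERS = {
    "sync start", "sync end",
    "sync deleted file names start", "sync deleted file names end",
    "sync tsfile start", "sync tsfile end",
}


def apply_sync_log(synced, log_lines):
    """Apply the deletions and additions recorded in sync.log to the set."""
    mode = None
    for line in (raw.strip() for raw in log_lines):
        if line == "sync deleted file names start":
            mode = "delete"
        elif line == "sync tsfile start":
            mode = "add"
        elif line in MARKERS or not line:
            continue
        elif mode == "delete":
            synced.discard(line)
        elif mode == "add":
            synced.add(line)
    return synced


def clean_up(folder):
    folder = Path(folder)
    last = folder / "last_local_files.txt"
    current = folder / "current_local_files.txt"
    log = folder / "sync.log"
    # Step 1: load the previous list and replay sync.log on it.
    names = last.read_text().splitlines() if last.exists() else []
    synced = {n.strip() for n in names if n.strip()}
    apply_sync_log(synced, log.read_text().splitlines())
    # Step 2: write the new list to current_local_files.txt.
    current.write_text("\n".join(sorted(synced)))
    # Steps 3 and 4: replace the old list with the new one.
    last.unlink(missing_ok=True)
    current.rename(last)
    # Step 5: remove the sequence folder and sync.log.
    shutil.rmtree(folder / "sequence", ignore_errors=True)
    log.unlink()
```

Writing the new list to a separate file before deleting the old one means that a crash at any point leaves at least one of the two list files intact, which is what the recovery module relies on.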
File transfer module
package
org.apache.iotdb.db.sync.sender.transfer
Synchronization schema
Before synchronizing the data files, first synchronize the newly added schema information and update sync_schema_pos.
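Incremental schema synchronization can be sketched as reading mlog.txt from the last recorded position. This is an assumption-laden illustration: the document does not state how the position is encoded, so the sketch assumes a byte offset stored as decimal text in sync_schema_pos, and the function names are hypothetical.

```python
from pathlib import Path


def read_new_schema(mlog_path, pos_path):
    """Return (new schema bytes, new position) since the last sync.

    Assumption: sync_schema_pos stores a byte offset as decimal text.
    """
    pos_path = Path(pos_path)
    text = pos_path.read_text().strip() if pos_path.exists() else ""
    pos = int(text) if text else 0
    with open(mlog_path, "rb") as mlog:
        mlog.seek(pos)
        data = mlog.read()
        new_pos = mlog.tell()
    return data, new_pos


def commit_schema_pos(pos_path, new_pos):
    """Record the position only after the receiver has accepted the data."""
    Path(pos_path).write_text(str(new_pos))
```

Keeping the read and the commit separate lets the caller update sync_schema_pos only after the transfer has succeeded, so a failed transfer is simply retried from the old position.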
Sync data file
For each file path, call the file management module to obtain the list of deleted files and the list of newly added files, and then perform the following process:
1. Start the synchronization task and record sync start in sync.log.
2. Start synchronizing the list of deleted files and record sync deleted file names start in sync.log.
3. Notify the receiver that synchronization of the deleted file name list is starting.
4. For each file name in the deleted list:
   4.1. Transfer the file name to the receiver (example 1581324718762-101-1.tsfile).
   4.2. On successful transfer, record 1581324718762-101-1.tsfile in sync.log.
5. Start synchronizing the list of newly added tsfile files and record sync deleted file names end and sync tsfile start in sync.log.
6. Notify the receiver to start synchronizing files.
7. For each tsfile in the new list:
   7.1. Transfer the file to the receiver in blocks (example 1581324718762-101-1.tsfile).
   7.2. If the file transfer fails, retry. If the number of attempts exceeds a limit (configurable by the user, 5 by default), abandon the transfer of this file; if the transfer succeeds, record 1581324718762-101-1.tsfile in sync.log.
8. Notify the receiver that the synchronization task has ended, and record sync tsfile end and sync end in sync.log.
9. Invoke the file management module to clean up files.
10. End the synchronization task.
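Step 7 of the process above, transfer with bounded retries, can be sketched as follows. This is an illustration only: transfer_new_files and the send callback are hypothetical names, a failed transfer is assumed to surface as an OSError, and the limit is interpreted here as a maximum number of attempts per file.

```python
def transfer_new_files(names, send, log, max_attempts=5):
    """Send each file, retrying on failure, and log every success.

    `send` is a hypothetical callback that raises OSError on failure.
    `log` is a writable text stream standing in for sync.log.
    Returns the names that were abandoned after max_attempts attempts.
    """
    abandoned = []
    for name in names:
        for _ in range(max_attempts):
            try:
                send(name)
            except OSError:
                continue  # try this file again
            # Record the file name only after a successful transfer.
            log.write(name + "\n")
            break
        else:
            # The loop ended without a successful transfer.
            abandoned.append(name)
    return abandoned
```

Because a file name is written to the log only after its transfer succeeds, an abandoned file never appears in sync.log and is picked up again as a new file by the next synchronization task.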
Recovery module
package
org.apache.iotdb.db.sync.sender.recover
Process
Each time the sender of the synchronization tool starts a synchronization task, it first checks whether a folder for the corresponding receiver exists under the sender folder. If not, no synchronization task has been performed with that receiver, and the recovery module is skipped; otherwise, the recovery algorithm is run on the files in that folder:
1. If current_local_files.txt exists, go to step 2; if not, go to step 3.
2. If last_local_files.txt exists, delete current_local_files.txt and go to step 3; if not, go to step 7.
3. If sync.log exists, go to step 4; if not, go to step 8.
4. Load the list of file names in last_local_files.txt into memory as a set, then parse sync.log line by line and apply the recorded deletions and additions to the set.
5. Write the list of file names in memory to current_local_files.txt.
6. Delete last_local_files.txt.
7. Rename current_local_files.txt to last_local_files.txt.
8. Delete the sequence folder and the sync.log file.
9. The algorithm ends.
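The recovery algorithm above can be sketched as straight-line code in which each branch corresponds to a numbered step. This is a minimal illustration, not the IoTDB implementation: the function names are hypothetical, and sync.log is assumed to contain only the marker lines named in this document with one file name per line between them.

```python
import shutil
from pathlib import Path

MARKERS = {
    "sync start", "sync end",
    "sync deleted file names start", "sync deleted file names end",
    "sync tsfile start", "sync tsfile end",
}


def replay(synced, log_lines):
    """Apply the deletions and additions recorded in sync.log."""
    mode = None
    for line in (raw.strip() for raw in log_lines):
        if line == "sync deleted file names start":
            mode = "delete"
        elif line == "sync tsfile start":
            mode = "add"
        elif line in MARKERS or not line:
            continue
        elif mode == "delete":
            synced.discard(line)
        elif mode == "add":
            synced.add(line)
    return synced


def recover_sender(folder):
    folder = Path(folder)
    current = folder / "current_local_files.txt"
    last = folder / "last_local_files.txt"
    log = folder / "sync.log"
    replay_needed = log.exists()                 # step 3
    if current.exists():                         # step 1
        if last.exists():                        # step 2
            current.unlink()                     # possibly half written
        else:
            current.rename(last)                 # step 7
            replay_needed = False                # already applied
    if replay_needed:
        names = last.read_text().splitlines() if last.exists() else []
        synced = replay({n for n in names if n.strip()},
                        log.read_text().splitlines())   # step 4
        current.write_text("\n".join(sorted(synced)))   # step 5
        last.unlink(missing_ok=True)                    # step 6
        current.rename(last)                            # step 7
    shutil.rmtree(folder / "sequence", ignore_errors=True)  # step 8
    log.unlink(missing_ok=True)
```

The case where only current_local_files.txt exists means the crash happened between deleting the old list and renaming the new one, so the new list is already complete and only the rename has to be finished.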
Sync tool receiver
Statement of needs
Because the receiver needs to receive files from multiple senders at the same time, it is necessary to distinguish files from different senders and manage these files in a unified manner.
The receiver receives each file from the sender together with its file name, file data, and MD5 value. After a file has been received, it is stored locally on the receiver, and the received tsfile is verified by checking its MD5 value and the end of the file. If the check fails, the file is retransmitted.
For the data file sent by the sender (which may include operations such as updating the old data and inserting new data), this part of data needs to be merged into the local file of the receiver.
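The MD5 check mentioned in the requirements above can be sketched with Python's standard hashlib module. The function names md5_of and verify are hypothetical, and the sketch assumes the sender transmits the digest as a hexadecimal string; the end-of-file check is not shown.

```python
import hashlib


def md5_of(path, block_size=64 * 1024):
    """Compute the MD5 digest of a file, reading it block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the received file matches the sender's MD5 value."""
    return md5_of(path) == expected_md5.lower()
```

Reading in blocks keeps memory use constant regardless of the tsfile size, which matches the block-wise transfer used by the sender.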
Module design
File transfer module
package
org.apache.iotdb.db.sync.receiver.transfer
Process
The file transfer module is responsible for receiving the file name and file transmitted from the sender. The process is as follows:
1. Receive the synchronization start instruction from the sender and check whether a sync.log file exists. If it exists, the data of the last synchronization has not been fully loaded, and the synchronization task is rejected; otherwise, record sync start in sync.log.
2. Receive the sender's instruction to start synchronizing the deleted file name list, and record sync deleted file names start in sync.log.
3. Receive the deleted file names transmitted by the sender in turn:
   3.1. Receive a file name transmitted by the sender (example 1581324718762-101-1.tsfile).
   3.2. On successful receipt, record 1581324718762-101-1.tsfile in sync.log and submit it to the data load module for processing.
4. Receive the instruction to start transferring files, and record sync deleted file names end and sync tsfile start in sync.log.
5. Receive the tsfile files transmitted by the sender in turn:
   5.1. Receive a file transmitted by the sender in blocks (example 1581324718762-101-2.tsfile).
   5.2. Verify the file. If the verification fails, delete the file and notify the sender of the failure; otherwise, record 1581324718762-101-2.tsfile in sync.log and submit it to the data load module for processing.
6. Receive the synchronization task end command from the sender, and record sync tsfile end and sync end in sync.log.
7. Create the empty file sync.end.
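The first and last steps of this process, rejecting a new task while an old sync.log is still present and marking completion with sync.end, can be sketched as follows. The function names are hypothetical and the sketch covers only steps 1, 6, and 7.

```python
from pathlib import Path


def on_sync_start(sender_dir):
    """Step 1: accept the task only if no sync.log is left over."""
    log = Path(sender_dir) / "sync.log"
    if log.exists():
        return False  # data of the last synchronization is not loaded yet
    log.write_text("sync start\n")
    return True


def on_sync_end(sender_dir):
    """Steps 6 and 7: record the end markers, then create sync.end."""
    sender_dir = Path(sender_dir)
    with open(sender_dir / "sync.log", "a") as log:
        log.write("sync tsfile end\nsync end\n")
    (sender_dir / "sync.end").touch()
```

The presence of sync.log therefore acts as a simple busy flag for one sender: it exists from the start of a task until the loading module has finished and removed it.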
File loading module
package
org.apache.iotdb.db.sync.receiver.load
File deletion
For a file that needs to be deleted (example 1581324718762-101-1.tsfile), search the in-memory sequence tsfile list to see whether the file exists. If so, remove the file from the list maintained in memory and delete the file on disk. After successful execution, record delete 1581324718762-101-1.tsfile in load.log.
Load new file
For a file that needs to be loaded (example 1581324718762-101-1.tsfile), first use device_owner.log to check whether the file fits the application scenario, that is, whether data of the same device is also being transmitted by another sender, which would cause a conflict. If there is a conflict, reject the load and send an error message to the sender; otherwise, update the information in device_owner.log.
Once the file meets the requirements of the application scenario, insert it at the appropriate position in the sequence tsfile list and move the file to the data/sequence directory. After successful execution, record load 1581324718762-101-1.tsfile in load.log. After each file is loaded, check whether the sync.end file is present in the synchronized directory. If it is present and the sequence folder is empty, delete the sync.log file, and then delete the load.log and sync.end files.
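The device ownership check can be sketched with an in-memory mapping from device to sender. This is only an illustration of the first-come-first-served rule: the function name check_and_claim is hypothetical, the on-disk format of device_owner.log is not described here and is not modeled, and the device and sender identifiers in the usage are made up.

```python
def check_and_claim(owners, sender_id, devices):
    """Check a file's devices against the owner table, then claim them.

    `owners` maps device name -> sender id (a stand-in for the content
    of device_owner.log). Returns (ok, conflicting devices).
    """
    conflicts = sorted(
        d for d in devices if owners.get(d, sender_id) != sender_id
    )
    if conflicts:
        # Reject the whole file; nothing is claimed.
        return False, conflicts
    for d in devices:
        # First come, first served: keep an existing owner.
        owners.setdefault(d, sender_id)
    return True, []
```

For example, once sender A has loaded a file containing device root.sg1.d1, a later file from sender B that also contains root.sg1.d1 is rejected, while further files from sender A for that device are accepted.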
Recovery module
package
org.apache.iotdb.db.sync.receiver.recover
Process
When the Apache IoTDB system starts, each subfolder under the sync folder is checked in turn. Each subfolder represents the synchronization task of the sender identified by the folder name. The recovery algorithm is run on the files in each subfolder:
1. If the sync.log file does not exist, go to step 4; if it does, go to step 2.
2. Scan sync.log line by line and perform the corresponding delete file and load file operations. If an operation has already been recorded in the load.log file, it has been completed and is skipped. Go to step 3.
3. Delete the file sync.log.
4. Delete the file load.log.
5. Delete the file sync.end.
6. The algorithm ends.
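The receiver-side recovery above can be sketched as a replay of sync.log that skips operations already present in load.log. This is an illustration under assumptions: recover_receiver and the two callbacks are hypothetical names, sync.log is assumed to contain only the marker lines named in this document, and load.log entries are assumed to have the form delete NAME or load NAME as described in the file loading module.

```python
from pathlib import Path

MARKERS = {
    "sync start", "sync end",
    "sync deleted file names start", "sync deleted file names end",
    "sync tsfile start", "sync tsfile end",
}


def recover_receiver(folder, delete_file, load_file):
    """Replay unfinished operations, then remove the task files.

    `delete_file` and `load_file` are hypothetical callbacks standing in
    for the file loading module.
    """
    folder = Path(folder)
    sync_log = folder / "sync.log"
    load_log = folder / "load.log"
    if sync_log.exists():                                   # step 1
        done = set()
        if load_log.exists():
            done = set(load_log.read_text().splitlines())
        mode = None
        for line in sync_log.read_text().splitlines():      # step 2
            line = line.strip()
            if line == "sync deleted file names start":
                mode = "delete"
            elif line == "sync tsfile start":
                mode = "load"
            elif line in MARKERS or not line or mode is None:
                continue
            elif f"{mode} {line}" not in done:
                (delete_file if mode == "delete" else load_file)(line)
        sync_log.unlink()                                   # step 3
    load_log.unlink(missing_ok=True)                        # step 4
    (folder / "sync.end").unlink(missing_ok=True)           # step 5
```

Checking load.log before each operation is what makes the replay safe to run after a crash at any point: completed operations are never applied twice.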