Data Archiving Overview
Large volumes of business data that are accessed infrequently can be archived. The Data Archiving Service archives data and synchronizes the archived data to the target database and specified directories, providing a data backup.
The major components and architecture of the Data Archiving service are shown in the figure below.
Features
Archiving Real-time Data and Offline Data
The Data Archiving Service supports archiving both real-time data (ingested from devices or generated by stream processing jobs of specified models) and data integrated through the offline message channel.
Archiving Real-time Alert Records
The Data Archiving Service supports archiving specified fields of real-time alert records (both historical and active alert records) of specified models.
The Data Archiving Service also supports archiving historical data that is stored in TSDB. A data archiving job for TSDB data is an offline job and must be triggered manually.
Setting the Properties of Archived Files
Properties of archived files, including file type, encoding, column delimiter, compression, and size limit, can be set based on your data usage scenarios. In this way, the archived files will be ready for future development and analysis.
Customized Archiving Cycles
You can set a customized data archiving cycle (1 hour, 12 hours, or 24 hours) based on your data volume and your business requirements for archiving efficiency. The longer the archiving cycle, the more data is processed in each cycle, which significantly reduces the number of small files caused by data latency.
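As a rough illustration of this trade-off, the sketch below counts how many archiving cycles fit into one day for each supported cycle length. Fewer cycles per day means fewer opportunities for small files to be produced. The function name `cycles_per_day` is hypothetical and is not part of the Data Archiving service; how many files a cycle actually produces also depends on the configured size limit and on whether data arrives at all.

```python
# Supported archiving cycle lengths in hours, as listed in the text above.
SUPPORTED_CYCLES_HOURS = (1, 12, 24)


def cycles_per_day(cycle_hours: int) -> int:
    """Return how many archiving cycles fit into one 24-hour day.

    Hypothetical helper for illustration only.
    """
    if cycle_hours not in SUPPORTED_CYCLES_HOURS:
        raise ValueError(f"unsupported archiving cycle: {cycle_hours} hours")
    return 24 // cycle_hours


if __name__ == "__main__":
    for hours in SUPPORTED_CYCLES_HOURS:
        print(f"{hours}-hour cycle: {cycles_per_day(hours)} cycle(s) per day")
```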
Setting the Storage System and Path
The archived files will be synchronized to the specified storage system (BLOB or HDFS) and stored in the configured path, thus achieving data backup.
Resource Preparation
Data Archiving Resource
Before configuring data archiving jobs, ensure that your OU has requested the Data Archiving resource through the EnOS Management Console > Resource Management page. The resource specification determines the total number of data records that all running data archiving jobs can archive per second. For more information about requesting Data Archiving resources, see Data Archiving Resource Specification.
When you do not need to archive data with the Data Archiving service, you can delete and release the requested Data Archiving Resource through the Resource Management page to save costs.
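Because the resource specification is shared by all running jobs, it can help to estimate the combined record rate before requesting a resource. The following is a minimal sketch of that check. The function name, the per-job rates, and the specification value are all hypothetical examples, not values or APIs defined by EnOS.

```python
from typing import Iterable


def fits_specification(job_rates: Iterable[int], spec_records_per_second: int) -> bool:
    """Return True if the combined rate of all running jobs is within the specification.

    job_rates: estimated records per second for each running archiving job.
    spec_records_per_second: capacity of the requested resource (assumed value).
    """
    return sum(job_rates) <= spec_records_per_second


if __name__ == "__main__":
    # Hypothetical estimates for three running jobs, in records per second.
    estimated_rates = [1200, 800, 500]
    print(fits_specification(estimated_rates, spec_records_per_second=3000))
    print(fits_specification(estimated_rates, spec_records_per_second=2000))
```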
Limitations
Note the following limitations when using Data Archiving.
Number of Supported Archiving Jobs
The maximum number of data archiving jobs that can be created for an organization is 10.
Archive File Generation
When a data archiving job is submitted, the system starts reading data from the specified message channel. However, if no data is cached in the archiving cycle when the archiving job is submitted, no archive file will be generated.
Data Retention in Case of Job Failure
The data retention time of the current message channel is 3 days by default. If a job fails, troubleshoot and restart the data archiving job promptly to avoid data loss.
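To make the retention window concrete, the sketch below computes the latest time by which a failed job should be running again. It assumes that the oldest unarchived data entered the message channel at the moment of failure and that the default 3-day retention applies. The function name `restart_deadline` is hypothetical and is not part of the service.

```python
from datetime import datetime, timedelta, timezone

# Default retention time of the message channel, as stated in the text above.
DEFAULT_RETENTION = timedelta(days=3)


def restart_deadline(failed_at: datetime,
                     retention: timedelta = DEFAULT_RETENTION) -> datetime:
    """Return the time after which data buffered since the failure starts to expire.

    Assumes the oldest unarchived record arrived at `failed_at`.
    """
    return failed_at + retention


if __name__ == "__main__":
    failure_time = datetime(2024, 1, 1, 8, 30, tzinfo=timezone.utc)
    print("Restart the job before:", restart_deadline(failure_time).isoformat())
```

In practice the job also needs time to catch up on the backlog, so restarting well before this deadline is safer.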