Data Archiving Overview


Large volumes of business data with low access frequency can be archived and stored. The Data Archiving Service supports the archiving of data and the synchronizing of the archived data to the target database and specified directories, achieving data backup.


The major components and architecture of the Data Archiving Service are shown in the figure below.


../../_images/archiving_arch.png

Features


Archiving Real-time Data and Offline Data

The Data Archiving Service supports the archiving of both real-time data ingested from devices or generated by stream processing jobs of specified models and data integrated from the offline message channel.


Archiving Real-time Alert Records

The Data Archiving Service supports the archiving of specified fields of real-time alert records (both history and active alert records) of specified models.


Archiving Data Stored in TSDB

The Data Archiving Service also supports archiving history data that is stored in TSDB. A data archiving job for TSDB data is an offline job and must be triggered manually.


Setting the Properties of Archived Files

Properties of archived files, including file type, encoding, column delimiter, compression, and size limit, can be set based on your data usage scenarios. In this way, the archived files will be ready for future development and analysis.
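As an illustration, the properties listed above can be thought of as one small configuration object. The following is a minimal sketch only: the class name, field names, default values, and validation rules are assumptions made for this example and do not reflect the actual configuration schema of the service.

```python
from dataclasses import dataclass


# Hypothetical container for the archived-file properties named in the text.
# Field names, defaults, and checks are illustrative assumptions.
@dataclass(frozen=True)
class ArchiveFileProperties:
    file_type: str = "csv"
    encoding: str = "utf-8"
    column_delimiter: str = ","
    compression: str = "gzip"
    size_limit_mb: int = 128

    def __post_init__(self):
        # Basic sanity checks so that an invalid setting fails early.
        if len(self.column_delimiter) != 1:
            raise ValueError("column delimiter must be a single character")
        if self.size_limit_mb <= 0:
            raise ValueError("size limit must be a positive number of MB")


# Example: pipe-delimited, uncompressed files for a downstream analysis tool.
props = ArchiveFileProperties(column_delimiter="|", compression="none")
print(props)
```

Keeping the settings in one immutable object makes it easier to review them against the expected input format of the tools that will later read the archived files.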


Customized Archiving Cycles

You can set the data archiving cycle to 1 hour, 12 hours, or 24 hours, based on the data volume and your business requirements for archiving efficiency. The longer the archiving cycle, the more data is processed in each cycle, which significantly reduces the number of small files caused by data latency.
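The effect of the cycle length can be sketched by grouping record timestamps into cycles. The example below assumes that cycles are aligned to midnight UTC and uses the number of distinct cycles as a rough proxy for the number of output files. Both are assumptions made for illustration, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_CYCLE_HOURS = (1, 12, 24)  # cycle lengths named in the text


def cycle_start(ts: datetime, cycle_hours: int) -> datetime:
    """Floor a timestamp to the start of its cycle (assumed UTC-midnight aligned)."""
    if cycle_hours not in ALLOWED_CYCLE_HOURS:
        raise ValueError(f"cycle must be one of {ALLOWED_CYCLE_HOURS} hours")
    ts = ts.astimezone(timezone.utc)
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(hours=(ts.hour // cycle_hours) * cycle_hours)


def count_cycles(timestamps, cycle_hours: int) -> int:
    """Count the distinct cycles that a set of record timestamps falls into."""
    return len({cycle_start(ts, cycle_hours) for ts in timestamps})


# One day of records, one every 30 minutes (00:00 through 23:30 UTC).
base = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [base + timedelta(minutes=30 * i) for i in range(48)]

for hours in ALLOWED_CYCLE_HOURS:
    print(hours, "hour cycle:", count_cycles(records, hours), "cycle(s)")
```

With these sample records, the 1-hour cycle spreads the data over 24 cycles, the 12-hour cycle over 2, and the 24-hour cycle over 1.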


Setting the Storage System and Path

The archived files will be synchronized to the specified storage system (BLOB or HDFS) and stored in the configured path, thus achieving data backup.

Resource Preparation


Data Archiving Resource

Before configuring data archiving jobs, ensure that your OU has requested the Data Archiving resource through the EnOS Management Console > Resource Management page. The resource specification determines the number of data records that all running data archiving jobs combined can archive per second. For more information about requesting the Data Archiving resource, see Data Archiving Resource Specification.


When you no longer need to archive data with the Data Archiving Service, you can delete and release the requested Data Archiving resource through the Resource Management page to save costs.
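Because the specification limits the combined rate of all running jobs, it can help to add up the expected per-job rates before requesting the resource. The sketch below shows that arithmetic. The job names, rates, and specification values are made-up sample numbers, and the helper functions are hypothetical.

```python
def total_records_per_second(job_rates: dict) -> float:
    """Sum the expected archiving rate (records per second) of all running jobs."""
    return sum(job_rates.values())


def fits_specification(job_rates: dict, spec_records_per_second: float) -> bool:
    """Check whether the combined rate stays within the requested specification."""
    return total_records_per_second(job_rates) <= spec_records_per_second


# Sample workload: three running jobs with assumed average rates.
jobs = {
    "realtime-device-data": 1200.0,
    "alert-records": 50.0,
    "offline-channel-data": 300.0,
}

print(total_records_per_second(jobs))   # 1550.0
print(fits_specification(jobs, 2000))   # True
print(fits_specification(jobs, 1500))   # False
```

In practice, peak rates rather than averages may be the safer input, since the limit applies per second.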

Limitations

Note the following limitations when using Data Archiving.


Number of Supported Archiving Jobs

The maximum number of data archiving jobs that can be created for an organization is 10.


Archive File Generation

When a data archiving job is submitted, the system starts reading data from the specified message channel. However, if no data is cached in the archiving cycle when the archiving job is submitted, no archive file will be generated.


Data Retention in Case of Job Failure

The data retention time of the current message channel is 3 days by default. If a data archiving job fails, troubleshoot and restart it promptly to avoid data loss.
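The retention window implies a restart deadline for a failed job. The following sketch estimates that deadline, assuming that a record expires from the message channel exactly one retention period after it was written and that the time of the last archived record is known. The function names are hypothetical, and the actual expiry behavior of the channel may differ.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION = timedelta(days=3)  # default retention stated in the text


def restart_deadline(last_archived_record_time: datetime,
                     retention: timedelta = DEFAULT_RETENTION) -> datetime:
    """Latest time the job can resume before the oldest unarchived data expires."""
    return last_archived_record_time + retention


def data_at_risk(last_archived_record_time: datetime, now: datetime,
                 retention: timedelta = DEFAULT_RETENTION) -> bool:
    """True if unarchived data may already have expired from the channel."""
    return now > restart_deadline(last_archived_record_time, retention)


# Example: the job stopped archiving at 08:00 UTC on 1 January 2024.
stopped = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
print(restart_deadline(stopped))  # 2024-01-04 08:00:00+00:00
print(data_at_risk(stopped, stopped + timedelta(days=2)))  # False
print(data_at_risk(stopped, stopped + timedelta(days=4)))  # True
```

A check like this could feed an alert that fires well before the deadline, leaving time for troubleshooting.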