Stream Processing Resource¶
The stream processing service handles real-time data from assets and devices as well as historical data that is integrated through offline message channels.
Built on Apache Spark™ Streaming and customized and optimized by EnOS, the EnOS stream processing service offers high scalability, high throughput, and high fault tolerance. EnOS is also committed to adopting common stream processing algorithms from the IoT field, so that developers can build stream processing tasks through simple template configuration.
In addition, the stream processing service includes multiple sets of calculation templates and general operators for the energy field. These help developers build data processing solutions quickly without writing code, which improves data development efficiency and lowers the barrier to entry.
For more information, see Stream Processing.
Resource Application Scenario¶
Before installing the stream processing templates and StreamSets calculator libraries, or before configuring stream processing jobs, you need to apply for the Stream Processing resource. There are three types of Stream Processing resources: Pipeline Design, Cluster Processing, and Standalone Processing.
Pipeline Design: Pipeline Design resources are used to create and design stream processing pipelines. You need this resource when designing and debugging stream tasks in the native drag-and-drop editor, because doing so temporarily runs the related tasks or installs the related library packages.
Cluster Processing: Use the Cluster Processing resource when the volume of stream data is large and high processing performance is required.
Standalone Processing: Use the Standalone Processing resource when the volume of stream data is small and cost control is strict.
Stream Processing resources are requested in computing units (CU). Different specifications under different resource types correspond to different data processing capabilities. Within the same resource type, the higher the specification, the higher the processing efficiency and the larger the amount of data processed per unit of time. The resource specifications and corresponding data processing capabilities are listed below.
Note
Each OU can apply for at most 1 resource instance of each resource type.
Pipeline Design¶
| Specification | Description |
|---|---|
| CU | 1 CU = 1 Core CPU + 2 GB Memory. 1 CU of Pipeline Design resource supports the installation of 3 library packages. Available options are 1 to 100 CU by default. |
Cluster Processing¶
| Resource Type | Specification | Description |
|---|---|---|
| Container Resource | CU | 1 CU = 1 Core CPU + 2 GB Memory. A stream task requires about 1 CU of container resource, which is mainly used for task submission, metric collection, etc. |
| Cluster Resource | CU | 1 CU = 1 Core CPU + 2 GB Memory. For simple operations such as single-stream filtering and string conversion, 1 CU can process 2,700 to 3,700 records per second. For complex operations such as WINDOW, UDF, and HTTP requests, 1 CU can process 900 to 3,300 records per second. |
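The Cluster Processing figures above can be turned into a rough sizing estimate. The following is a minimal sketch, not an official EnOS tool: the function names are hypothetical, and it assumes that throughput scales linearly with the number of CUs, which the table does not state.

```python
import math

# Per-CU throughput ranges (records per second) quoted in the table above.
CLUSTER_SIMPLE_RPS = (2_700, 3_700)
CLUSTER_COMPLEX_RPS = (900, 3_300)


def estimate_cluster_cu(records_per_second, complex_ops=False):
    """Return (best_case_cu, worst_case_cu) of cluster resource for a target rate.

    Assumption (not from the source): throughput scales linearly with CU count.
    """
    low, high = CLUSTER_COMPLEX_RPS if complex_ops else CLUSTER_SIMPLE_RPS
    best_case = max(1, math.ceil(records_per_second / high))
    worst_case = max(1, math.ceil(records_per_second / low))
    return best_case, worst_case


def estimate_container_cu(task_count):
    """Container resource: about 1 CU per stream task, per the table above."""
    return task_count


if __name__ == "__main__":
    print(estimate_cluster_cu(10_000))                    # simple operations
    print(estimate_cluster_cu(10_000, complex_ops=True))  # WINDOW, UDF, HTTP requests
    print(estimate_container_cu(3))
```

Sizing against the worst-case value is the more cautious choice, since the actual per-CU throughput depends on the operations in the pipeline.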
Standalone Processing¶
| Specification | Description |
|---|---|
| CU | 1 CU = 1 Core CPU + 2 GB Memory. Available options are 1 to 2000 CU by default. |
Note
- For simple data processing jobs such as single-pipeline filtering and string conversion, 1 CU can process 6,000 to 16,000 records per second.
- For complex data processing jobs such as WINDOW, UDF, and HTTP requests, 1 CU can process 2,000 to 10,000 records per second.
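As an illustration of how the Standalone Processing figures might be used, the sketch below checks whether a given CU specification could sustain a target record rate. The function names are hypothetical, and linear scaling with the number of CUs is an assumption that the source does not confirm.

```python
# Per-CU throughput ranges (records per second) quoted in the note above.
STANDALONE_SIMPLE_RPS = (6_000, 16_000)
STANDALONE_COMPLEX_RPS = (2_000, 10_000)
MAX_STANDALONE_CU = 2_000  # default upper bound from the specification table


def standalone_capacity(cu, complex_ops=False):
    """Return (conservative_rps, optimistic_rps) for a CU specification.

    Assumption (not from the source): throughput scales linearly with CU count.
    """
    if not 1 <= cu <= MAX_STANDALONE_CU:
        raise ValueError(f"cu must be between 1 and {MAX_STANDALONE_CU}")
    low, high = STANDALONE_COMPLEX_RPS if complex_ops else STANDALONE_SIMPLE_RPS
    return cu * low, cu * high


def can_sustain(cu, records_per_second, complex_ops=False):
    """True only if the conservative bound already covers the target rate."""
    conservative, _ = standalone_capacity(cu, complex_ops)
    return conservative >= records_per_second


if __name__ == "__main__":
    print(standalone_capacity(4))
    print(can_sustain(4, 20_000))
    print(can_sustain(4, 20_000, complex_ops=True))
```

Checking against the conservative bound leaves headroom for jobs whose per-record cost sits at the slow end of the quoted range.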