Normalizer


The Normalizer is a data processing stage whose main functions are data format conversion and data filtering. There are two types of Normalizer: Origin Normalizer and Cal Normalizer.


The Origin Normalizer converts data in the MEASURE_POINT_ORIGIN_{OUID} Kafka Topic. Its functions include measurement point splitting, input point filtering, and output data format conversion. Details are as follows:

  • Measurement point splitting: Split a data record whose payload contains multiple measurement points into multiple records.

  • Input point filtering: Keep only the input data records that the user needs, as determined by the lineage of the user pipeline at the time it was published.

  • Output data format conversion: Convert the output data format to meet the requirement of the MEASURE_POINT_INTERNAL_{OUID} Kafka Topic, so that the data can be processed correctly by downstream operators. For details, see the data format requirement of Kafka Topics in the EDH Kafka Consumer Documentation.
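The three Origin Normalizer steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout (`payload`, `measurepoints`, `assetId`, `modelId`, `timestamp`), the flat output layout, and the representation of the lineage as a set of `(modelId, pointId)` pairs are all hypothetical. The actual topic formats are defined in the EDH Kafka Consumer Documentation.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical record layout, used only for illustration.
Record = Dict[str, Any]


def split_points(record: Record) -> List[Record]:
    """Measurement point splitting: one output record per point in the payload."""
    points = record.get("payload", {}).get("measurepoints", {})
    base = {k: v for k, v in record.items() if k != "payload"}
    return [dict(base, pointId=pid, value=val) for pid, val in points.items()]


def filter_points(records: List[Record], needed: Set[Tuple[str, str]]) -> List[Record]:
    """Input point filtering: keep only points the published pipeline needs."""
    return [r for r in records if (r.get("modelId"), r.get("pointId")) in needed]


def to_internal(record: Record) -> Record:
    """Output format conversion into an assumed flat layout."""
    return {
        "assetId": record["assetId"],
        "modelId": record["modelId"],
        "pointId": record["pointId"],
        "time": record["timestamp"],
        "value": record["value"],
    }


def normalize_origin(record: Record, needed: Set[Tuple[str, str]]) -> List[Record]:
    """Apply splitting, filtering, and conversion in that order."""
    return [to_internal(r) for r in filter_points(split_points(record), needed)]


# Example: a record with two points, of which the pipeline needs only "temp".
raw = {
    "assetId": "a1",
    "modelId": "m1",
    "timestamp": 1700000000000,
    "payload": {"measurepoints": {"temp": 21.5, "speed": 3.0}},
}
normalized = normalize_origin(raw, needed={("m1", "temp")})
```

In this sketch, splitting happens before filtering so that unneeded points can be dropped individually even when they arrive in the same payload as needed ones.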


The Cal Normalizer converts data in the MEASURE_POINT_CAL_{OUID} Kafka Topic. Its functions include output point filtering and output data format conversion. Details are as follows:

  • Output point filtering: Keep only the output data records that the user needs, as determined by the lineage of the user pipeline at the time it was published.

  • Output data format conversion: Convert the output data format to meet the requirement of the MEASURE_POINT_CAL_{OUID} Kafka Topic, so that the data can be processed correctly by downstream operators. For details, see the data format requirement of Kafka Topics in the EDH Kafka Consumer Documentation.

Configuration

The configuration tabs for this stage are General and Basic.

General

Name | Required? | Description

Name | Yes | The name of the stage.

Description | No | The description of the stage.

Stage Library | Yes | The streaming operator library to which the stage belongs.

Required Fields | No | The fields that the data records must contain. If the specified fields are not included, the record will be filtered out.

Preconditions | No | The conditions that must be satisfied by the data records. Records that do not meet the conditions will be filtered out. For example, ${record:value('/value') > 0}. For the syntax of EL expressions, see `Expression Language <https://docs.streamsets.com/portal/datacollector/latest/help/datacollector/UserGuide/Expression_Language/ExpressionLanguage_overview.html>`__.

On Record Error | Yes | The processing method for error data: Discard (error data is discarded and ignored), Send to Error (error messages are reported), or Stop Pipeline (the pipeline is stopped).
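To make the interaction of Required Fields, Preconditions, and On Record Error more concrete, the following Python sketch mimics the behavior described above. It is not the stage's actual implementation: `run_stage`, `passes_gate`, and `PipelineStopped` are hypothetical names, records are assumed to be flat dictionaries, and a precondition is modeled as a plain Python function rather than an EL expression.

```python
from typing import Any, Callable, Dict, Iterable, List, Sequence, Tuple

Record = Dict[str, Any]
Check = Callable[[Record], bool]

POLICIES = ("Discard", "Send to Error", "Stop Pipeline")


class PipelineStopped(Exception):
    """Hypothetical signal for the 'Stop Pipeline' policy."""


def passes_gate(record: Record, required: Sequence[str], checks: Sequence[Check]) -> bool:
    """True if the record has every required field and meets every precondition."""
    if any(field not in record for field in required):
        return False
    return all(check(record) for check in checks)


def run_stage(
    records: Iterable[Record],
    process: Callable[[Record], Record],
    required: Sequence[str] = (),
    checks: Sequence[Check] = (),
    on_error: str = "Discard",
) -> Tuple[List[Record], List[Tuple[Record, str]]]:
    """Return (processed records, reported errors) under the chosen error policy."""
    if on_error not in POLICIES:
        raise ValueError(f"unknown error policy: {on_error}")
    output: List[Record] = []
    errors: List[Tuple[Record, str]] = []
    for record in records:
        if not passes_gate(record, required, checks):
            continue  # filtered out before processing
        try:
            output.append(process(record))
        except Exception as exc:
            if on_error == "Send to Error":
                errors.append((record, str(exc)))
            elif on_error == "Stop Pipeline":
                raise PipelineStopped(str(exc)) from exc
            # "Discard": drop the failing record silently
    return output, errors


# Rough Python counterpart of the precondition ${record:value('/value') > 0}.
def positive_value(record: Record) -> bool:
    return record["value"] > 0
```

For example, `run_stage(records, process, required=["value"], checks=[positive_value], on_error="Send to Error")` would skip records without a `value` field or with a non-positive value, and would collect any record that `process` fails on together with its error message.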

Basic

Name | Required? | Description

Normalizer Type | Yes | The type of Normalizer: Origin Normalizer or Cal Normalizer.


Sample configuration is as follows:

[Figure: sample Normalizer configuration (../../../_images/normalizer.png)]