Last Record Appender


This stage appends the last record of the same device and the same point to the attr field of the current record. Only a record that matches the specified condition can be appended as the last record. The procedure consists of the following steps.

  1. Append the last record to the attr field of the current record. If the current record is the first to arrive for its point, there is no last record and nothing is appended.

  2. Evaluate the current record against the specified condition. If the record matches the condition, it replaces the stored last record. If the condition is set to *, the current record always replaces the stored last record.

Note: This stage cannot guarantee idempotent calculation results when records are reprocessed after a failure retry, whatever the cause (for example, a cluster node exception).
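
The two steps above can be sketched in Python as a toy model. This is only an illustration under stated assumptions, not the platform's implementation: the record shape (device, point, value, attr), the attr key name "last", and the class name LastRecordAppender are all hypothetical, and a Python callable stands in for the EL condition.

```python
from typing import Callable, Dict, Optional, Tuple

Record = dict  # hypothetical shape: {"device", "point", "value", "attr"}


class LastRecordAppender:
    """Toy model of the stage's two steps (assumed behavior)."""

    def __init__(self, condition: Optional[Callable[[Record], bool]] = None):
        # condition=None plays the role of "*": every record matches.
        self.condition = condition
        # One stored last record per (device, point) pair.
        self.cache: Dict[Tuple[str, str], Record] = {}

    def process(self, record: Record) -> Record:
        key = (record["device"], record["point"])
        out = dict(record)
        out["attr"] = dict(record.get("attr", {}))

        # Step 1: append the stored last record, if there is one.
        last = self.cache.get(key)
        if last is not None:
            out["attr"]["last"] = last  # "last" is a made-up key name

        # Step 2: a matching record becomes the new stored last record.
        if self.condition is None or self.condition(record):
            self.cache[key] = dict(record)
        return out


stage = LastRecordAppender(condition=lambda r: r["value"] > 0)
first = stage.process({"device": "d1", "point": "p1", "value": 5})
second = stage.process({"device": "d1", "point": "p1", "value": -1})
third = stage.process({"device": "d1", "point": "p1", "value": 7})
```

In this sketch the first record carries no last record, and the third record still receives the first one, because the second record did not match the condition and so never replaced the stored last record.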

Configuration

The configuration tabs for this stage are General, Basic, Input/Output, and CacheConfig.

General

Name

Required?

Description

Name

Yes

The name of the stage.

Description

No

The description of the stage.

Stage Library

Yes

The streaming operator library to which the stage belongs.

Required Fields

No

The fields that the data records must contain. Records that do not contain all of the specified fields are filtered out.

Preconditions

No

The conditions that must be satisfied by the data records. Records that do not meet the conditions will be filtered out. For example, ${record:value('/value') > 0}. For the syntax of EL expressions, see `Expression Language <https://docs.streamsets.com/portal/datacollector/latest/help/datacollector/UserGuide/Expression_Language/ExpressionLanguage_overview.html>`__.

On Record Error

Yes

The processing method for error data.

  • Discard: Error data will be discarded and ignored

  • Send to Error: Error messages will be reported

  • Stop Pipeline: The pipeline will be stopped
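
The effect of Required Fields and Preconditions can be pictured with a small Python sketch. It is only an approximation: the stage evaluates EL expressions against field paths such as /value, whereas here a plain dictionary key and a Python callable stand in for them, and the function name passes_filters is hypothetical.

```python
from typing import Callable, Iterable


def passes_filters(
    record: dict,
    required_fields: Iterable[str],
    preconditions: Iterable[Callable[[dict], bool]],
) -> bool:
    """Return True if the record has every required field and meets every precondition."""
    # A record missing any required field is filtered out first.
    if any(field not in record for field in required_fields):
        return False
    # Every precondition must hold for the record to be processed.
    return all(check(record) for check in preconditions)


records = [{"value": 3}, {"value": -2}, {"other": 1}]
# The lambda mirrors the example precondition ${record:value('/value') > 0}.
kept = [
    r for r in records
    if passes_filters(r, ["value"], [lambda rec: rec["value"] > 0])
]
```

Here only the record with a positive value field is kept. The record without a value field is dropped by the required-field check before any precondition is evaluated.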

Basic

Name

Required?

Description

Quality Filter

No

Filter the data according to the data quality. Only records that meet the quality conditions will be processed by this stage.

Input/Output

Name

Required?

Description

Input Point

Yes

Specify the input point of the records in the format {modelId}::{pointId}. Within the same configuration row, the input point and the output point must use the same modelId and must be different points.

Conditions

Yes

Specify the condition that a record must match to be stored as the last record.

  • *: Every record matches, so each record becomes the last record for the record that follows it.

  • EL expression: A conditional expression that evaluates to a Boolean value.

Output Point

Yes

Specify the output point of the records in the format {modelId}::{pointId}. Within the same configuration row, the input point and the output point must use the same modelId and must be different points.
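
The constraints on a configuration row can be checked with a short Python sketch like the one below. The function names parse_point and validate_row, and the model and point names in the usage lines, are hypothetical and used only for illustration.

```python
from typing import Tuple


def parse_point(point: str) -> Tuple[str, str]:
    """Split a '{modelId}::{pointId}' string into (model_id, point_id)."""
    model_id, sep, point_id = point.partition("::")
    if not sep or not model_id or not point_id:
        raise ValueError(f"expected '{{modelId}}::{{pointId}}', got {point!r}")
    return model_id, point_id


def validate_row(input_point: str, output_point: str) -> None:
    """Raise ValueError if the row breaks the documented constraints."""
    in_model, in_point = parse_point(input_point)
    out_model, out_point = parse_point(output_point)
    if in_model != out_model:
        raise ValueError("input and output points must use the same modelId")
    if in_point == out_point:
        raise ValueError("input and output points must be different")


# A valid row: same model, different points (names are made up).
validate_row("Inverter::power", "Inverter::power_last")
```

A row such as Inverter::power with Meter::power_last, or a row whose input and output points are identical, would raise ValueError in this sketch.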

CacheConfig

Name

Required?

Description

Cache Type

Yes

Select the storage type for cached data. The options are Redis and Local.

  • Redis: Cached data is not lost when the stream processing pipeline is paused, restarted, or retried. However, data processing is slower and is sensitive to network performance. A network latency of less than 1 ms is recommended; higher latency degrades processing performance.

  • Local: Data processing is fast. However, cached data is lost when the stream processing pipeline is paused, restarted, or retried.
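
The durability trade-off between the two cache types can be illustrated with a small simulation. This is a sketch under assumptions, not the stage's cache implementation: LocalCache and ExternalCache are hypothetical classes, and a plain dictionary that outlives the stage objects stands in for a Redis server. No real Redis client is used, and network latency is not modeled.

```python
from typing import Dict, Optional


class LocalCache:
    """In-process cache: fast, but its contents vanish with the stage instance."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}

    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)

    def put(self, key: str, value: dict) -> None:
        self._data[key] = value


class ExternalCache:
    """Stand-in for a Redis-style store that lives outside the pipeline."""

    def __init__(self, store: Dict[str, dict]) -> None:
        self._store = store  # owned by something that outlives the stage

    def get(self, key: str) -> Optional[dict]:
        return self._store.get(key)

    def put(self, key: str, value: dict) -> None:
        self._store[key] = value


external_store: Dict[str, dict] = {}  # simulates the external service

local, remote = LocalCache(), ExternalCache(external_store)
local.put("d1/p1", {"value": 5})
remote.put("d1/p1", {"value": 5})

# Simulate a pipeline restart: the stage's cache objects are rebuilt.
local, remote = LocalCache(), ExternalCache(external_store)
local_after = local.get("d1/p1")    # nothing survives the restart
remote_after = remote.get("d1/p1")  # the stored last record is still there
```

After the simulated restart, the local cache has no last record for the point, so the next record would be treated as the first one, while the external cache still returns the stored record.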

Output Results

Whether the last record is appended to the attr field of the current record depends on the data attributes and the specified conditions.

Output Example

Without last record

../../../_images/last_record_result_1.png

With last record

../../../_images/last_record_result_2.png