Unit 4. Calculating Power Loss


In this tutorial, you will calculate the power loss of offshore wind turbines according to the following operation logic.


../_images/logic.png


This unit describes how to calculate the power loss of offshore wind turbines in the following ways:

  • In AI Lab, use the sample code to calculate power loss in the Notebook instance you created in Unit 3.
  • In AI Pipelines, import a sample pipeline to calculate power loss.
  • In AI Pipelines, design a low-code pipeline from scratch to calculate power loss.

Option 1. Calculating in a Notebook Instance


You can use the sample code in a Notebook instance to calculate the power loss of offshore wind turbines and upload the calculation results to internal storage.

Step 1. Running Codes to Calculate


In Unit 3, you uploaded the sample code file (power-loss.ipynb) that calculates power loss. Run the uploaded code to calculate the power loss of offshore wind turbines with the following steps:

  1. Log in to EnOS Management Console and select AI Studio > AI Lab on the left navigation pane.
  2. Select power-loss on the Notebook Instance tab to open the instance.
  3. Open the power-loss.ipynb in the power-loss folder. You can refer to the comments in the file to get familiar with the operation logic.
  4. Select the double-arrow icon i_rerun on the toolbar to restart the kernel and re-run the whole notebook.


The result file named power-loss.csv appears in the directory after the whole notebook is successfully re-run.
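
The exact algorithm is documented in the comments of power-loss.ipynb. As a rough illustration of the operation logic only, the sketch below estimates the loss for each downtime record from the generation of the two surrounding available turbines during the same time window; all column names used here are assumptions, not the actual schema of the sample datasets.

    # Rough sketch of the operation logic only -- the authoritative implementation is power-loss.ipynb.
    # All column names below are assumptions, not the actual schema of the sample datasets.
    import pandas as pd

    def estimate_power_loss(downtime: pd.DataFrame, power_10m: pd.DataFrame) -> pd.DataFrame:
        """Estimate the energy lost during each downtime record from the
        10-minute generation of the two surrounding available turbines."""
        rows = []
        for rec in downtime.itertuples():
            # Generation records that fall inside the downtime window
            window = power_10m[(power_10m["timestamp"] >= rec.start_time) &
                               (power_10m["timestamp"] < rec.end_time)]
            loss1 = window.loc[window["device_id"] == rec.sur_avb_wtg1, "power_10m"].sum()
            loss2 = window.loc[window["device_id"] == rec.sur_avb_wtg2, "power_10m"].sum()
            rows.append({"device_id": rec.device_id,
                         "start_time": rec.start_time, "end_time": rec.end_time,
                         "power-loss1": loss1, "power-loss2": loss2,
                         "power-loss-avg": (loss1 + loss2) / 2})
        return pd.DataFrame(rows)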

Step 2. Uploading Results to Internal Storage


  1. On the power-loss instance page, open the terminal you created in Step 1. Running Codes to Calculate.

  2. Enter and run the following code in the terminal to upload the result file to internal storage.

    hdfs dfs -put -f power-loss.csv /user/data_{ouid}/{directory}
    

    Where:

    • {ouid} refers to the current OU ID. You can get the current OU ID by hovering over the OU name on the top toolbar.
    • {directory} refers to the target folder in HDFS to store the file. If the folder you designated does not exist, a new folder will be created with the name you entered.
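
For illustration, the snippet below fills in the two placeholders and runs the same command from Python; both the OU ID and the folder name are hypothetical values, so replace them with your own.

    # Hypothetical example of substituting {ouid} and {directory} -- replace both values with your own.
    import subprocess

    ouid = "o15012345678901"   # hypothetical OU ID
    directory = "power-loss"   # hypothetical target folder in HDFS

    subprocess.run(
        ["hdfs", "dfs", "-put", "-f", "power-loss.csv", f"/user/data_{ouid}/{directory}"],
        check=True,
    )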

Option 2. Calculating Based on a Sample Pipeline


If you want a quick start on how to process data with operators in AI Pipelines, you can design a pipeline by importing the built-in sample pipeline to calculate the power loss of offshore wind turbines.

Step 1. Exporting Sample Pipeline


To design your own pipeline based on a sample pipeline, first export the sample pipeline with the following steps:

  1. Log in to EnOS Management Console and select AI Studio > AI Pipelines on the left navigation pane.
  2. On the Sample Pipeline tab, select power-loss-calculation > Pipeline View.
  3. Select Export to export the sample pipeline configuration in a JSON file.

Step 2. Importing Sample Pipeline


To reuse the information of the sample pipeline, such as the structure, operators, and global parameters, import the exported JSON file with the following steps:

  1. Select AI Studio > AI Pipelines on the left navigation pane.
  2. Select New Experiment on the Custom Pipeline tab.
  3. Enter power-loss as the name of the experiment in the popup window.
  4. Select OK to create the experiment, and you can see the canvas for designing a pipeline.
  5. Select Import import_icon to import the exported sample pipeline configuration.

Step 3. Checking Global Parameters


For a more efficient and simpler pipeline design process, you can set globally applicable parameters as global parameters to avoid repetitive configuration. The global parameters used in this tutorial are included in the sample pipeline configuration file. Select Workflow Setting i_setting to check the global parameters.


Name | Type | Value | Description
wtg_10m_dts | string | sample-10min-run-demo | Specify the sample data of generated power in 10 minutes.
dwntm_recds_dts | string | sample-downtime-records-demo | Specify the sample data of shutdown time periods.
sur_wtg_dts | string | sample-surround-info-demo | Specify the sample data of the distances between the shutdown turbines and surrounding turbines.
hadoop_resource | boolean | true or false | Determine whether to export the result file to Hive and HDFS.
hive_source | hive_source | Select the Data Warehouse Storage resource of your OU from the dropdown list. | Specify the target Hive source where you upload the result file.
hdfs_source | hdfs_source | Select the File Storage HDFS resource of your OU from the dropdown list. | Specify the target HDFS source where you upload the result file.
ouid | string | Enter your OU ID, which you can get by hovering over the OU name on the top navigation bar. | Specify your OU ID.

Step 4. Checking Pipeline Operators


In this sample pipeline, you can see the following operators.


Operator Name | Description
Calculate Power Loss | This PythonEx operator is used to calculate the power loss of offshore wind turbines based on the specified sample datasets and generate the result file named power-loss.
Export to Hive? | This Condition operator is used to export the result file to the specified Hive source if the operator expression is “Reference | hive_source | != | Reference | <null>”. [1]
Upload to HDFS? | This Condition operator in the “Export to Hive?” sub-canvas is used to upload the result file to the specified HDFS source if the operator expression is “Reference | hdfs_source | != | Reference | <null>”. [1]
PythonCode | This PythonCode operator in the “Upload to HDFS?” sub-canvas is used to create a csv table containing the calculation results.
Upload csv to HDFS | This HDFS Uploader operator in the “Upload to HDFS?” sub-canvas is used to upload the result csv file to the specified HDFS.
Create Hive Table | This Hive operator in the “Upload to HDFS?” sub-canvas is used to create a Hive table containing the calculation results.


[1] <null> refers to leaving the value blank.
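
Conceptually, a Condition operator runs the operators on its sub-canvas only when its expression holds. The Python analogy below is only an illustration of that check, not how the platform evaluates expressions; the value of hive_source shown is hypothetical.

    # Illustration only: a Condition operator behaves roughly like this if-statement.
    def run_sub_canvas():
        # Placeholder for the operators nested on the "Export to Hive?" sub-canvas
        print("Exporting results to Hive ...")

    hive_source = "my_hive_source"   # hypothetical value of the referenced global parameter

    # Equivalent of the expression: Reference | hive_source | != | Reference | <null>
    if hive_source not in (None, ""):
        run_sub_canvas()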

Step 5. Running the Pipeline


  1. On the power-loss pipeline canvas, select Run i_run on the top toolbar.
  2. Select OK to run the pipeline in the pop-up window.

Option 3. Calculating by Designing a Pipeline from Scratch


You can also design a low-code pipeline with operators from scratch to calculate the power loss of offshore wind turbines and upload the calculation results to Hive. For more information on the operators you will use, see Operator Reference.

Step 1. Creating an Experiment


  1. Select AI Studio > AI Pipelines on the left navigation pane.
  2. Select New Experiment on the Custom Pipeline tab.
  3. Enter power-loss as the name of the experiment in the popup window.
  4. Select OK to create the experiment, and you can see the canvas for designing a pipeline.

Step 2. Adding Global Parameters


For a more efficient and simpler pipeline design process, you can set the parameters that apply globally in this pipeline as global parameters to avoid repetitive configuration with the following steps:

  1. On the power-loss pipeline canvas, select Workflow Setting i_setting to open the Workflow Setting panel.
  2. Select Add Parameter to add the following global parameters in the Configuration Parameters section.


Name | Type | Value | Description
wtg_10m_dts | string | sample-10min-run-demo | Specify the sample data of generated power in 10 minutes.
dwntm_recds_dts | string | sample-downtime-records-demo | Specify the sample data of shutdown time periods.
sur_wtg_dts | string | sample-surround-info-demo | Specify the sample data of the distances between the shutdown turbines and surrounding turbines.
hadoop_resource | boolean | true or false | Determine whether to export the result file to Hive and HDFS.
hive_source | hive_source | Select the Data Warehouse Storage resource of your OU from the dropdown list. | Specify the target Hive source where you upload the result file.
hdfs_source | hdfs_source | Select the File Storage HDFS resource of your OU from the dropdown list. | Specify the target HDFS source where you upload the result file.
ouid | string | Enter your OU ID, which you can get by hovering over the OU name on the top navigation bar. | Specify your OU ID.

Step 3. Configuring Operators


To calculate the power loss of offshore wind turbines, you need to add the following operators.


Operator | Description
PythonEx | Calculate the power loss of offshore wind turbines.
Condition 1 | Decide whether to upload results to Hive.
Condition 2 | Decide whether to upload results to HDFS.
PythonCode | Upload results to Hive.
Hive | Create a Hive table to store the results.
HDFS Uploader | Upload results to HDFS in .csv format.


Configuring a PythonEx Operator to Calculate Power Loss


  1. On the power-loss pipeline canvas, drag a PythonEx operator from the left operator list.

  2. Select the operator and configure the following operator parameters in the Input Parameter section on the right panel.


    Parameter | Reference/Declaration | Description
    workspace | Declaration | Select the Notebook instance folder you created in Unit 3. Preparing Codes.
    entrypoint | Declaration | Select power-loss.py.
    requirements | Declaration | Select requirement.txt.


  3. Select Add Parameter in the Input Parameter section three times to add three input parameters and configure the following information.


    Name | Type | Reference/Declaration | Value | Description
    sur_wtg_dts | string | Reference | sur_wtg_dts | Specify the sample dataset for surrounding turbine information.
    dwntm_recds_dts | string | Reference | dwntm_recds_dts | Specify the sample dataset for turbine shutdown records.
    wtg_10m_dts | string | Reference | wtg_10m_dts | Specify the sample dataset for turbine generated power.


  4. Select Add Parameter in the Output Parameter section to add a new output parameter with result_file as the parameter name and file as the type.
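
At run time, the entrypoint script power-loss.py receives these parameters. Assuming they are passed as command-line flags, which mirrors the argparse convention used by the PythonCode operator later in this option, its skeleton might look roughly like the sketch below; the flag names simply follow the parameters configured above.

    # Sketch only -- assumes the input/output parameters are passed to power-loss.py as CLI flags,
    # following the argparse convention used by the PythonCode operator later in this option.
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--sur_wtg_dts", type=str, required=True)      # surrounding-turbine dataset
    parser.add_argument("--dwntm_recds_dts", type=str, required=True)  # downtime-records dataset
    parser.add_argument("--wtg_10m_dts", type=str, required=True)      # 10-minute power dataset
    parser.add_argument("--result_file", type=str, required=True)      # path of the result_file output
    args = parser.parse_args()

    # ... load the datasets and calculate power loss (see Option 1), then write the result
    # to args.result_file, for example: result_df.to_csv(args.result_file, index=False)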

Configuring a Condition Operator to Store Results to Hive


  1. On the power-loss pipeline canvas, drag a Condition operator from the left operator list.
  2. Connect the output port of the PythonEx operator to the input port of this Condition operator.
  3. Select the operator and configure the following expression for the Condition operator on the right panel.
Reference | hive_source | != | Declaration | <null>


After configuring this Condition operator as above, you need to double-click the operator to open its sub-canvas and add another Condition operator to upload results to HDFS.

Configuring a Condition Operator to Upload Results to HDFS


  1. On the sub-canvas of the Condition operator you created above, drag a Condition operator from the left operator list.
  2. Select the operator and configure the following expression for the Condition operator on the right panel.
Reference | hdfs_source | != | Declaration | <null>


After configuring this Condition operator as above, you need to double-click the operator to open its sub-canvas and add the following operators:

  • A PythonCode operator to upload results to Hive
  • A Hive operator to create a Hive table
  • An HDFS Uploader operator to upload the results to HDFS in a .csv file

Configuring a PythonCode Operator to Upload Results to Hive


  1. On the sub-canvas of the second Condition operator you created above, drag a PythonCode operator from the left operator list.

  2. Enter the following code for the code parameter on the right panel.

    import json
    import argparse
    from pathlib import Path
    
    # Define an ArgumentParser
    parser = argparse.ArgumentParser()
    parser.add_argument("--ouid", type=str, required=True)
    parser.add_argument("--sql_statements", type=str, required=True)
    parser.add_argument("--hdfs_dest", type=str, required=True)
    
    args = parser.parse_args()
    
    target_sqls = [f"""create external table if not exists `power_loss`( `site_id` string comment 'Site', `device_id` string comment 'Turbine', `iec_group_id` string comment 'Status ID', `iec_level4_description` string comment 'Status Description', `start_time` timestamp comment 'Start Time', `end_time` timestamp comment 'End Time', `sur_avb_wtg1` string comment 'Surrounding Turbine 1', `sur_avb_wtg2` string comment 'Surrounding Turbine 2', `power-loss1` string comment 'Power Loss 1 (kWh)', `power-loss2` string comment 'Power Loss 2(kWh)', `power-loss-avg` string comment 'Average Power Loss (kWh)' ) COMMENT 'Power Loss Table'  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ( 'field.delim'=',', 'serialization.format'=',') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '/user/data_{args.ouid}/hbjtdemo/power-loss' TBLPROPERTIES ( 'skip.header.line.count'='1', 'timestamp.formats' = 'yyyy-MM-dd HH:mm:ss')"""]
    
    # Write the SQL statement list as JSON so that the downstream Hive operator can read it
    Path(args.sql_statements).parent.mkdir(parents=True, exist_ok=True)
    with open(args.sql_statements, 'w') as f:
        json.dump(target_sqls, f)

    # Write the HDFS destination path so that the HDFS Uploader operator can read it
    Path(args.hdfs_dest).parent.mkdir(parents=True, exist_ok=True)
    with open(args.hdfs_dest, 'w') as f:
        f.write(f"/user/data_{args.ouid}/hbjtdemo/power-loss")
    
  3. Select Add Parameter in the Input Parameter section to add an input parameter and configure the following information.


    Name | Type | Reference/Declaration | Value | Description
    ouid | string | Reference | ouid | Specify the ID of the current OU.


  4. Select Add Parameter in the Output Parameter section twice to add two output parameters and configure the following information.


    Name | Type
    sql_statement | list
    hdfs_dest | string
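
When this operator runs, its two output parameters are materialized as files written to the paths supplied through --sql_statements and --hdfs_dest, and the downstream Hive and HDFS Uploader operators read their values from there. The snippet below only illustrates what those files contain; the local file names are placeholders.

    # Illustration only: reading back the two output files written by the code above.
    # The local file names are placeholders for the paths supplied via --sql_statements and --hdfs_dest.
    import json

    with open("sql_statements") as f:
        sqls = json.load(f)    # a list with one CREATE EXTERNAL TABLE statement for `power_loss`
    print(sqls[0][:50])

    with open("hdfs_dest") as f:
        print(f.read())        # "/user/data_{your OU ID}/hbjtdemo/power-loss"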

Configuring a Hive Operator to Create a Hive Table


  1. On the sub-canvas of the second Condition operator you created above, drag a Hive operator from the left operator list.

  2. Connect the output port of the PythonCode operator to the input port of this Hive operator.

  3. Select this Hive operator and configure the following operator parameters in the Input Parameter section on the right panel.


    Parameter | Reference/Declaration | Description
    data_source_name | Reference | Select hive_source.
    sqls | Reference | Select PythonCode.sql_statement.


Configuring an HDFS Uploader Operator to Upload Results


  1. On the sub-canvas of the second Condition operator you created above, drag an HDFS Uploader operator from the left operator list.

  2. Connect the output ports of both the PythonCode operator and the Hive operator to the input port of this HDFS Uploader operator.

  3. Select this HDFS Uploader operator and configure the following information in the Input Parameter section on the right panel.


    Parameter | Reference/Declaration | Description
    data_source_name | Reference | Select hdfs_source.
    file | Reference | Select PythonEx.result_file.
    dest | Declaration | Select PythonCode.hdfs_dest.


  4. Select Add Parameter in the Input Parameter section to add an input parameter and configure the following information.


    Name | Type | Reference/Declaration | Value | Description
    overwrite | boolean | Declaration | true | Overwrite if a file with the same name exists.

Step 4. Running the Pipeline


  1. On the power-loss pipeline canvas, select Run i_run on the top toolbar.
  2. Select OK to run the pipeline in the pop-up window.


Besides running the power-loss pipeline manually with Run i_run, you can also schedule the pipeline to run automatically. For more information, see Configure Schedules for Pipelines.

(Optional) Step 5. Viewing Pipeline Running Status


When the pipeline is running, you can view its running status with the following steps:

  1. On the AI Pipelines > Custom Pipeline tab, select the Run Instance of the power-loss pipeline.
  2. Select the instance name on the instance list.
  3. View the running status of the power-loss pipeline on the following tabs:
    • On the DAG Graph tab, you can view the running results of the run instance, including the running progress, input and output parameters, current running logs, and Pod information of each operator.
    • On the Detail tab, you can view the details, running parameters, and advanced configuration of the running instance.
    • On the Gantt tab, you can view the running status distribution diagram of each operator and the running progress.