Using Pipeline Operators

In Data Integration, pipeline operators represent different tasks and activities that can be used in a pipeline.

The types of task operators you can use in a pipeline are:

  • Integration and data loader task operators that let you run data integration solutions within a pipeline. You can configure inputs for the operators. Task operator inputs are similar to parameters that are defined at a task or data flow level.
  • SQL task operators that let you run SQL stored procedures within a pipeline. You can configure values for the parameters in the stored procedures.
  • OCI Data Flow task operators that let you run OCI Data Flow applications within a pipeline.
  • REST task operators that let you run REST API endpoints within a pipeline. You can reconfigure the values of any of the parameters that are used in the REST task.
  • Pipeline task operators that let you run another pipeline within a pipeline.

For all task operators, you can select design-time tasks from projects in the current workspace, and published tasks from any Application in the current workspace. With published REST tasks and OCI Data Flow tasks, you can also select a task from any Application in another workspace in the same compartment or another compartment.

For tasks that run in parallel, you can use a merge operator and specify a condition to handle subsequent downstream operations. To take output from any operator and pass it to the next operator, you can use an expression operator.

You use a designer that's similar to the data flow designer to create a pipeline. The designer opens with a start operator and an end operator already placed on the canvas for you. There can only be one start operator and one end operator in a pipeline. A pipeline must include at least one task operator to be valid. You can add any number of tasks, then connect them in a sequence between the start operator and end operator. From the Operators panel, drag operators onto the canvas to design the pipeline. Then use the Properties panel to configure the properties for each operator.

Tasks that are connected directly to the start operator always run. Subsequent tasks in the sequence can be configured to run based on the status of the previous operator. For example, consider a pipeline that has the sequence Start > Task A > Task B > End. Task A always runs. For Task B, you can use the Incoming link condition property on the Properties panel to configure the task to always run or to run only when the status of Task A meets a specific run condition.
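
The run decision for Task B can be pictured with a minimal sketch. The constant names and the status strings "success" and "failure" are assumptions made for illustration; they are not Data Integration identifiers.

```python
# Illustrative model only: the condition and status names below are
# assumptions for this sketch, not Data Integration identifiers.
ALWAYS = "always run"
ON_SUCCESS = "run on success of previous operator"
ON_FAILURE = "run on failure of previous operator"


def should_run(link_condition, upstream_status):
    """Decide whether an operator runs, given the upstream operator's status."""
    if link_condition == ALWAYS:
        return True
    if link_condition == ON_SUCCESS:
        return upstream_status == "success"
    if link_condition == ON_FAILURE:
        return upstream_status == "failure"
    raise ValueError(f"unknown link condition: {link_condition!r}")


# Start > Task A > Task B > End, where Task A has failed.
task_a_status = "failure"
for condition in (ALWAYS, ON_SUCCESS, ON_FAILURE):
    print(f"{condition}: Task B runs = {should_run(condition, task_a_status)}")
```

With Task A in a failed state, Task B runs under the first and third conditions and is skipped under the second.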

To connect operators, hover over an operator until you see the connector (small circle) on the right side of the operator. Then drag the connector to the next operator you want to connect to. A connection is valid when a line connects the operators after you drop the connector.

In general, an operator has only one inbound port, and one or more outbound ports for processes to flow through the pipeline. For example, you can connect the same SQL task operator outbound port to inbound ports on two separate expression operators. Only the end operator and merge operator can have multiple inbound ports.
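
As a rough illustration of the port rule, the following sketch counts inbound links per operator and reports any operator, other than an end or merge operator, that has more than one. The operator names and kind labels are hypothetical, and the designer performs its own validation; this only models the rule as stated.

```python
from collections import Counter

# Operator kinds that may have more than one inbound link, per the rule above.
MULTI_INBOUND_KINDS = {"end", "merge"}


def inbound_violations(kinds, links):
    """Return the names of operators that have too many inbound links.

    kinds: dict mapping operator name -> kind (hypothetical labels).
    links: list of (source, target) operator-name pairs.
    """
    inbound = Counter(target for _source, target in links)
    return sorted(
        name
        for name, count in inbound.items()
        if count > 1 and kinds[name] not in MULTI_INBOUND_KINDS
    )


kinds = {
    "START": "start", "SQL_1": "sql", "EXPR_1": "expression",
    "EXPR_2": "expression", "MERGE_1": "merge", "END": "end",
}
links = [
    ("START", "SQL_1"),
    ("SQL_1", "EXPR_1"),    # one outbound port feeding two operators is fine
    ("SQL_1", "EXPR_2"),
    ("EXPR_1", "MERGE_1"),  # a merge operator may take several inbound links
    ("EXPR_2", "MERGE_1"),
    ("MERGE_1", "END"),
]
print(inbound_violations(kinds, links))
```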

You can quickly duplicate a task or expression operator that has been added to a pipeline. To duplicate the operator, right-click the operator icon on the canvas and select Duplicate from the menu that appears. Then rename the identifier of the duplicated operator in the Properties panel. If the original operator is connected to other operators, the connections and any references to a previous operator's outputs are not copied to the duplicated operator.

Start Operator and End Operator

When you begin to create a pipeline, the designer opens with a start operator and an end operator already placed on the canvas for you. There can only be one start operator and one end operator in a pipeline.

The start operator does not have any properties that you can configure.

With the end operator, you can configure the Incoming link condition property to specify one of the following rules for the status of a pipeline task run:

  • All completed: The pipeline task status displays as Success when all tasks in the pipeline complete, even if one or more of the tasks fail.
  • All success: The pipeline task status displays as Success when all tasks in the pipeline complete successfully.
  • All failed: The pipeline task status displays as Success when all tasks in the pipeline fail.
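
The three rules can be compared side by side with a small sketch. The status strings are assumptions for illustration, and the function only reports whether the status would display as Success, because the text above does not say what is displayed otherwise.

```python
def displays_success(rule, task_statuses):
    """Return True when the pipeline task status would display as Success.

    The status strings "success" and "failure" are assumptions for this sketch.
    """
    statuses = list(task_statuses)
    if rule == "All completed":
        return all(status in ("success", "failure") for status in statuses)
    if rule == "All success":
        return all(status == "success" for status in statuses)
    if rule == "All failed":
        return all(status == "failure" for status in statuses)
    raise ValueError(f"unknown rule: {rule!r}")


# A run in which one of two tasks failed.
run = ["success", "failure"]
for rule in ("All completed", "All success", "All failed"):
    print(f"{rule}: Success = {displays_success(rule, run)}")
```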

Merge Operator

For tasks that run in parallel, you can use the merge operator and specify a condition to decide how to handle subsequent downstream operations.

Adding and configuring a merge operator

A merge operator can have multiple input links (upstream) and multiple output links (downstream).

  1. From the Operators panel, drag the Merge operator onto the canvas.
  2. Under the Details tab of the Properties panel, enter a name and optional description for the merge operator.
  3. For Merge condition, you can select from the following options:
    • All success: All parallel operations that are linked upstream must complete and succeed before the next downstream operation can proceed. This option is the default.
    • All failed: All parallel operations that are linked upstream must complete and fail before the next downstream operation can proceed.
    • All completed: All parallel operations that are linked upstream must complete before the next downstream operation can proceed.
    • At least one success: At least one operation that is linked upstream must complete and succeed before the next downstream operation can proceed.
    • At least one failed: At least one operation that is linked upstream must complete and fail before the next downstream operation can proceed.
  4. Under the Output tab of the Properties panel, you can view the outputs that can be used as inputs for the next operator in the pipeline.
    The outputs available are a combination of output parameters from the system, and the outputs for each task operator that's connected to the merge operator.
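
The five merge conditions can be sketched as predicates over the statuses of the upstream operations. The status strings, including "running" for an operation that has not yet completed, are assumptions for illustration only.

```python
from enum import Enum


class MergeCondition(Enum):
    ALL_SUCCESS = "All success"
    ALL_FAILED = "All failed"
    ALL_COMPLETED = "All completed"
    AT_LEAST_ONE_SUCCESS = "At least one success"
    AT_LEAST_ONE_FAILED = "At least one failed"


def merge_satisfied(condition, upstream_statuses):
    """Return True when the downstream operation can proceed."""
    statuses = list(upstream_statuses)
    succeeded = [status == "success" for status in statuses]
    failed = [status == "failure" for status in statuses]
    # An operation counts as completed once it has either succeeded or failed.
    completed = [ok or bad for ok, bad in zip(succeeded, failed)]
    checks = {
        MergeCondition.ALL_SUCCESS: all(succeeded),
        MergeCondition.ALL_FAILED: all(failed),
        MergeCondition.ALL_COMPLETED: all(completed),
        MergeCondition.AT_LEAST_ONE_SUCCESS: any(succeeded),
        MergeCondition.AT_LEAST_ONE_FAILED: any(failed),
    }
    return checks[condition]


# Three parallel upstream operations, one of which is still running.
upstream = ["success", "failure", "running"]
for condition in MergeCondition:
    print(f"{condition.value}: proceed = {merge_satisfied(condition, upstream)}")
```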

Expression Operator

A pipeline expression operator lets you create new, derivative fields in a pipeline, similar to an expression operator in a data flow.
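
Conceptually, each expression derives one named value from upstream outputs and pipeline parameters, and the value is exposed as an output with the same name as the expression. The sketch below models that idea in plain Python; the output and parameter names are hypothetical, and the lambdas stand in for expressions that you would build in the designer.

```python
def evaluate_expressions(expressions, incoming, parameters):
    """Compute one output per expression; each output takes the expression's name."""
    return {
        name: compute(incoming, parameters)
        for name, compute in expressions.items()
    }


# All names below are hypothetical.
incoming = {"LOAD_TASK.ROWS_WRITTEN": 1200}  # output of the previous operator
parameters = {"P_MIN_ROWS": 1000}            # pipeline parameter

expressions = {
    "ROW_SURPLUS": lambda inc, par: inc["LOAD_TASK.ROWS_WRITTEN"] - par["P_MIN_ROWS"],
    "ENOUGH_ROWS": lambda inc, par: inc["LOAD_TASK.ROWS_WRITTEN"] >= par["P_MIN_ROWS"],
}

outputs = evaluate_expressions(expressions, incoming, parameters)
print(outputs)
```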

Adding and configuring an expression operator

Unlike a data flow expression operator, a pipeline expression operator does not operate on data. A pipeline expression operator lets you operate on output from the previous operator, pipeline parameters, and output generated by the system.

  1. From the Operators panel, drag the Expression operator onto the canvas.
  2. With the expression operator in focus, under the Details tab of the Properties panel, enter a name in the Identifier field, or leave the name as-is.
  3. (Optional) For Incoming link condition, you can select from the following run conditions. When the operators are connected, a colored line indicates the selected condition.
    • Always run: Runs the selected node regardless of the status of the upstream node.

      A gray line indicates the Always run condition.

      You can't change the link condition for nodes that are connected directly to the Start operator.

    • Run on success of previous operator: Runs the selected node only when the upstream node status is 'success.'

      A green line with a check mark indicates a successful condition.

    • Run on failure of previous operator: Runs the selected node only when the upstream node status is 'failure.'

      A red line with an 'x' indicates a failed condition.

  4. Under Expressions, click Add.
  5. In the Add expression panel, enter a name in the Identifier field, or leave the name as-is.
  6. Select the Data type for the expression.
  7. Depending on the type you selected, you might have to specify other values such as a Length or Scale.
  8. Create an expression in the Expression builder and click Add.
    The derived value from an expression can be used in the next operator that's connected to this pipeline expression operator.
  9. Repeat the steps to add more expressions.
  10. Under the Output tab of the Properties panel, you can view the expression outputs that can be used as inputs for the next operator in the pipeline.
    Each expression that you add has an output. An output has the same name as the expression.
  11. Under the Validation tab of the Properties panel, you can validate the operator to check for errors and warnings.

Adding an expression

Use the Expression builder to visually select elements to create an expression for an expression operator in a pipeline. You can also enter an expression manually in the editor.

The Expression builder is a section in the Add expression panel.

  1. In the Expression builder, double-click to select or drag an incoming value, a parameter, or a function to build the expression.
    An expression can be a combination of system outputs from a previous operator, system parameters, user-defined parameters, and functions.
  2. For Incoming values, you can choose from system outputs of the previous operator. If the previous operator is merge, the system outputs are the combined outputs of the task operators connected to the merge operator.
  3. For Parameters, you can choose from system parameters and user-defined parameters that are created in the pipeline.
    • User defined: A user-defined parameter is a custom parameter that you create.

      You can select an existing parameter from the list of available user-defined parameters, or you can click Add to create one.

      1. In the Add parameter panel, enter a name for the parameter, and an optional description.
      2. Select the Data type for this parameter.
      3. Define the rest of the fields for this parameter based on the selected data type.
      4. Set a Default value for this parameter.
      5. Click Add.
    • System defined: At runtime, Data Integration generates certain system parameters for a pipeline. Generated system parameter values can be used in expressions but the values cannot be modified.
  4. For Functions, you can choose from available Data Integration functions for pipeline operators.
  5. In the Add expression panel, click Add to create or update the expression.

Deleting an expression

You can delete expressions when you no longer need them.

  1. On the canvas of a pipeline, select an expression operator.
  2. With the expression operator in focus, under the Details tab of the Properties panel, select the expression you want to delete, then click Delete.
  3. In the Delete Expression dialog box, verify that you want to delete this expression, and then click Delete.

Decision Operator

Use the decision operator to write a Boolean condition that determines the branching flow in the pipeline. The branching is based on three possible outcomes, namely TRUE, FALSE, and ERROR.

Adding and configuring a decision operator

A decision operator has one input link (upstream) and three output links (downstream).

  1. From the Operators panel, drag the Decision operator onto the canvas.

    By default, the decision operator icon is displayed as expanded, showing three output ports, namely TRUE, FALSE, and ERROR.

  2. Under the Details tab of the Properties panel, enter a name and optional description for the decision operator.
  3. For Decision condition, do the following:
    1. Click Add.
    2. In the Add decision condition panel, write a condition using incoming values, parameters, or functions, such that the condition expression evaluates to a Boolean value.

      For example, a condition might evaluate the run status of the previous task: PREVIOUS_TASK_1.SYS.STATUS = 'SUCCESS'

      • For Incoming values, you can choose from system outputs of the previous operator.

      • For Parameters, you can choose from system parameters and user-defined parameters that are created in the pipeline.

        • User defined: A user-defined parameter is a custom parameter that you create.

          You can select an existing parameter from the list of available user-defined parameters, or you can click Add to create one.

        • System defined: At runtime, Data Integration generates certain system parameters for a pipeline. Generated system parameter values can be used in expressions but the values cannot be modified.

      • For Functions, you can choose from available Data Integration functions for pipeline operators.

  4. (Optional) For Incoming link condition, you can select from the following run conditions. When the operators are connected, a colored line indicates the selected condition.
    • Always run: Runs the selected node regardless of the status of the upstream node.

      A gray line indicates the Always run condition.

      You can't change the link condition for nodes that are connected directly to the Start operator.

    • Run on success of previous operator: Runs the selected node only when the upstream node status is 'success.'

      A green line with a check mark indicates a successful condition.

    • Run on failure of previous operator: Runs the selected node only when the upstream node status is 'failure.'

      A red line with an 'x' indicates a failed condition.

  5. On the canvas, connect the appropriate output ports on the decision operator to a task operator or expression operator downstream.

    A decision operator output port can't be linked directly to a merge operator or an end operator.
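
The three-way branching can be sketched as follows. The sketch assumes that the ERROR port is taken when the condition cannot be evaluated or does not yield a Boolean value; the text above does not spell out what triggers ERROR, so treat that as an assumption. The key PREVIOUS_TASK_1.SYS.STATUS mirrors the example condition in step 3.

```python
def decide(condition, values):
    """Return the output port name ("TRUE", "FALSE", or "ERROR") for a condition."""
    try:
        result = condition(values)
    except Exception:
        # Assumption: a condition that cannot be evaluated takes the ERROR port.
        return "ERROR"
    if not isinstance(result, bool):
        # Assumption: a non-Boolean result is also treated as an error.
        return "ERROR"
    return "TRUE" if result else "FALSE"


def previous_task_succeeded(values):
    """Python stand-in for: PREVIOUS_TASK_1.SYS.STATUS = 'SUCCESS'."""
    return values["PREVIOUS_TASK_1.SYS.STATUS"] == "SUCCESS"


print(decide(previous_task_succeeded, {"PREVIOUS_TASK_1.SYS.STATUS": "SUCCESS"}))
print(decide(previous_task_succeeded, {"PREVIOUS_TASK_1.SYS.STATUS": "ERROR"}))
print(decide(previous_task_succeeded, {}))  # the status is missing
```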

Data Loader Task Operator

A data loader task operator lets you run a data loader task within a pipeline.

Adding and configuring a data loader task operator

A data loader task operator that's connected directly to the start operator always runs.

For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.

  1. From the Operators panel, drag a Data loader operator onto the canvas.
    The operator icon on the canvas shows a default name, indicating that the operator is not yet bound to a task in the workspace.
  2. Under the Details tab of the Properties panel for the unbound task operator, click Select. In the panel that appears, use the menu to choose the type of task to add, and then select the task.

    Design tasks: To use a task in a project, first select a project from the Projects list menu. You can also start typing a project name to filter the list and select from a filtered list of matching names. Only projects in the current workspace are available for selection.

    Published tasks: Published tasks from the latest, updated Application in the current workspace and compartment are listed for selection. You can use the menus to switch to another Application in the current workspace. You can also start typing a name in the Workspaces or Applications box and select from a filtered list of matching names.

  3. In the Select a data loader task panel, select a task and click Select.
    The name on the operator icon on the canvas changes to the name of the selected task.
  4. Under the Details tab of the Properties panel for the selected task operator:
    1. Rename the operator if you want.
    2. For Incoming link condition, you can select from the following run conditions. When the operators are connected, a colored line indicates the selected condition.
      • Always run: Runs the selected task regardless of the status of the upstream operator.

        A gray line indicates the Always run condition.

        You can't change the link condition for tasks that are connected directly to the Start operator.

      • Run on success of previous operator: Runs the selected task only when the upstream operator status is 'success.'

        A green line with a check mark indicates a successful condition.

      • Run on failure of previous operator: Runs the selected task only when the upstream operator status is 'failure.'

        A red line with an 'x' indicates a failed condition.

  5. Under the Configuration tab of the Properties panel, you can:
    1. Configure run options to specify how to handle task runs that fail.
    2. Configure incoming parameter values.
  6. Under the Output tab of the Properties panel, you can view the outputs that can be used as inputs for the next operator in the pipeline.
  7. Under the Validation tab of the Properties panel, you can validate the task to check for errors and warnings in configured parameter values, if applicable.

Integration Task Operator

An integration task operator lets you run a data flow that's configured for a specific context. The data flow must be wrapped in an integration task.

Adding and configuring an integration task operator

An integration task operator that's connected directly to the start operator always runs.

For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.

  1. From the Operators panel, drag an Integration operator onto the canvas.
    The operator icon on the canvas shows a default name, indicating that the operator is not yet bound to a task in the workspace.
  2. Under the Details tab of the Properties panel for the unbound task operator, click Select. In the panel that appears, use the menu to choose the type of task to add, and then select the task.

    Design tasks: To use a task in a project, first select a project from the Projects list menu. You can also start typing a project name to filter the list and select from a filtered list of matching names. Only projects in the current workspace are available for selection.

    Published tasks: Published tasks from the latest, updated Application in the current workspace and compartment are listed for selection. You can use the menus to switch to another Application in the current workspace. You can also start typing a name in the Workspaces or Applications box and select from a filtered list of matching names.

  3. In the Select an integration task panel, select a task and click Select.
    The name on the operator icon on the canvas changes to the name of the selected task.
  4. Under the Details tab of the Properties panel for the selected task operator:
    1. Rename the operator if you want.
    2. For Incoming link condition, you can select from the following run conditions. When the operators are connected, a colored line indicates the selected condition.
      • Always run: Runs the selected task regardless of the status of the upstream operator.

        A gray line indicates the Always run condition.

        You can't change the link condition for tasks that are connected directly to the Start operator.

      • Run on success of previous operator: Runs the selected task only when the upstream operator status is 'success.'

        A green line with a check mark indicates a successful condition.

      • Run on failure of previous operator: Runs the selected task only when the upstream operator status is 'failure.'

        A red line with an 'x' indicates a failed condition.

  5. Under the Configuration tab of the Properties panel, you can:
    1. Configure run options to specify how to handle task runs that fail.
    2. Configure incoming parameter values.
  6. Under the Output tab of the Properties panel, you can view the outputs that can be used as inputs for the next operator in the pipeline.
  7. Under the Validation tab of the Properties panel, you can validate the task to check for errors and warnings in configured parameter values, if applicable.

Pipeline Task Operator

A pipeline task operator lets you run a pipeline within another pipeline.

Adding and configuring a pipeline task operator

A pipeline task operator that's connected directly to the start operator always runs.

For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.

  1. From the Operators panel, drag the Pipeline operator onto the canvas.
    The operator icon on the canvas shows a default name, indicating that the operator is not yet bound to a task in the workspace.
  2. Under the Details tab of the Properties panel for the unbound task operator, click Select. In the panel that appears, use the menu to choose the type of task to add, and then select the task.

    Design tasks: To use a task in a project, first select a project from the Projects list menu. You can also start typing a project name to filter the list and select from a filtered list of matching names. Only projects in the current workspace are available for selection.

    Published tasks: Published tasks from the latest, updated Application in the current workspace and compartment are listed for selection. You can use the menus to switch to another Application in the current workspace. You can also start typing a name in the Workspaces or Applications box and select from a filtered list of matching names.

  3. In the Select a pipeline task panel, select a task and click Select.
    The name on the operator icon on the canvas changes to the name of the selected task.
  4. Under the Details tab of the Properties panel for the selected task operator:
    1. Rename the operator if you want.
    2. For Incoming link condition, you can select from the following run conditions. When the operators are connected, a colored line indicates the selected condition.
      • Always run: Runs the selected task regardless of the status of the upstream operator.

        A gray line indicates the Always run condition.

        You can't change the link condition for tasks that are connected directly to the Start operator.

      • Run on success of previous operator: Runs the selected task only when the upstream operator status is 'success.'

        A green line with a check mark indicates a successful condition.

      • Run on failure of previous operator: Runs the selected task only when the upstream operator status is 'failure.'

        A red line with an 'x' indicates a failed condition.

  5. Under the Configuration tab of the Properties panel, you can:
    1. Configure run options to specify how to handle task runs that fail.
    2. Configure incoming parameter values.
  6. Under the Output tab of the Properties panel, you can view the outputs that can be used as inputs for the next operator in the pipeline.
  7. Under the Validation tab of the Properties panel, you can validate the task to check for errors and warnings in configured parameter values, if applicable.

SQL Task Operator

A SQL task operator lets you run a SQL object such as a stored procedure.

Adding and configuring a SQL task operator

A SQL task operator that's connected directly to the start operator always runs.

For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.

  1. From the Operators panel, drag the SQL operator onto the canvas.
    The operator icon on the canvas shows a default name, indicating that the operator is not yet bound to a task in the workspace.
  2. Under the Details tab of the Properties panel for the unbound task operator, click Select. In the panel that appears, use the menu to choose the type of task to add, and then select the task.

    Design tasks: To use a task in a project, first select a project from the Projects list menu. You can also start typing a project name to filter the list and select from a filtered list of matching names. Only projects in the current workspace are available for selection.

    Published tasks: Published tasks from the latest, updated Application in the current workspace and compartment are listed for selection. You can use the menus to switch to another Application in the current workspace. You can also start typing a name in the Workspaces or Applications box and select from a filtered list of matching names.

  3. In the Select a SQL task panel, select a task and click Select.
    The name on the operator icon on the canvas changes to the name of the selected task.
  4. Under the Details tab of the Properties panel for the selected task operator:
    1. Rename the operator if you want.
    2. For Incoming link condition, you can select from the following run conditions. When the operators are connected, a colored line indicates the selected condition.
      • Always run: Runs the selected task regardless of the status of the upstream operator.

        A gray line indicates the Always run condition.

        You can't change the link condition for tasks that are connected directly to the Start operator.

      • Run on success of previous operator: Runs the selected task only when the upstream operator status is 'success.'

        A green line with a check mark indicates a successful condition.

      • Run on failure of previous operator: Runs the selected task only when the upstream operator status is 'failure.'

        A red line with an 'x' indicates a failed condition.

  5. Under the Configuration tab of the Properties panel, you can:
    1. Configure run options to specify how to handle task runs that fail.
    2. Configure incoming parameter values.

      The variables defined in a stored procedure are exposed as input, output, and in-out parameters. Only input parameters can be configured.

      Note

      The configured value of an input parameter must match the defined data type of that parameter. For example, you cannot provide a String value for an input parameter whose data type is NUMERIC. Also, the configured value of a NUMERIC data type input parameter cannot be NULL at runtime.
  6. Under the Output tab of the Properties panel, you can view the outputs that can be used as inputs for the next operator in the pipeline.

    Both system-generated output parameters and output parameters from the SQL stored procedure are shown. See also Output Parameters.

  7. Under the Validation tab of the Properties panel, you can validate the task to check for errors and warnings in configured parameter values, if applicable.
    Note

    A SQL task fails to run if configured NUMERIC data type input parameters have NULL as the default value. To avoid task run failures, change a NULL value to 0 (zero).
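
The two notes about NUMERIC input parameters can be expressed as a simple pre-run check. This is a sketch of the stated rules only, not the validation that Data Integration performs; the function name and the parameter name P_BATCH_ID are hypothetical.

```python
from numbers import Number


def check_input_parameter(name, data_type, value):
    """Return a problem description, or None when the value looks acceptable."""
    if data_type == "NUMERIC":
        if value is None:
            return f"{name}: NUMERIC input parameter cannot be NULL at runtime"
        # bool is a Number subclass in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Number):
            return f"{name}: expected a NUMERIC value, got {type(value).__name__}"
    return None


print(check_input_parameter("P_BATCH_ID", "NUMERIC", None))
print(check_input_parameter("P_BATCH_ID", "NUMERIC", "abc"))
print(check_input_parameter("P_BATCH_ID", "NUMERIC", 0))
```

The last call returns None, which matches the advice to replace a NULL default with 0.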

OCI Data Flow Task Operator

An OCI Data Flow task operator lets you run an OCI Data Flow application in a pipeline.

Adding and configuring an OCI Data Flow task operator

An OCI Data Flow task operator that's connected directly to the start operator always runs.

For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.

  1. From the Operators panel, drag the OCI Data Flow operator onto the canvas.
    The operator icon on the canvas shows a default name, indicating that the operator is not yet bound to a task in the workspace.
  2. Under the Details tab of the Properties panel for the unbound task operator, click Select. In the panel that appears, use the menu to choose the type of task to add, and then select the task.

    Design tasks: To use a task in a project, first select a project from the Projects list menu. You can also start typing a project name to filter the list and select from a filtered list of matching names. Only projects in the current workspace are available for selection.

    Published tasks: Published tasks from the latest, updated Application in the current workspace and compartment are listed for selection. You can use the menus to switch to another Application in the current workspace, or another workspace in the same compartment or another compartment. You can also start typing a name in the Workspaces or Applications box and select from a filtered list of matching names.

  3. In the Select an OCI Data Flow task panel, select a task and click Select.
    The name on the operator icon on the canvas changes to the name of the selected task.
  4. Under the Details tab of the Properties panel for the selected task operator:
    1. Rename the operator if you want.
    2. For Incoming link condition, you can select from the following run conditions. When the operators are connected, a colored line indicates the selected condition.
      • Always run: Runs the selected task regardless of the status of the upstream operator.

        A gray line indicates the Always run condition.

        You can't change the link condition for tasks that are connected directly to the Start operator.

      • Run on success of previous operator: Runs the selected task only when the upstream operator status is 'success.'

        A green line with a check mark indicates a successful condition.

      • Run on failure of previous operator: Runs the selected task only when the upstream operator status is 'failure.'

        A red line with an 'x' indicates a failed condition.

  5. Under the Configuration tab of the Properties panel, you can:
    1. Configure incoming parameter values.

      The values of parameters that are assigned to property values in the underlying OCI Data Flow task can be reconfigured.

    2. Configure run options to specify how to handle task runs that fail.
  6. Under the Output tab of the Properties panel, you can view the outputs that can be used as inputs for the next operator in the pipeline.

    Both system-generated output parameters and output parameters from the application are shown. See also Output Parameters.

  7. Under the Validation tab of the Properties panel, you can validate the task to check for errors and warnings in configured parameter values, if applicable.

REST Task Operator

A REST task operator lets you run a REST API endpoint in a pipeline.

Adding and configuring a REST task operator

A REST task operator that's connected directly to the start operator always runs.

For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.

  1. From the Operators panel, drag the REST operator onto the canvas.
    The operator icon on the canvas shows a default name, indicating that the operator is not yet bound to a task in the workspace.
  2. Under the Details tab of the Properties panel for the unbound task operator, click Select. In the panel that appears, use the menu to choose the type of task to add, and then select the task.

    Design tasks: To use a task in a project, first select a project from the Projects list menu. You can also start typing a project name to filter the list and select from a filtered list of matching names. Only projects in the current workspace are available for selection.

    Published tasks: Published tasks from the latest, updated Application in the current workspace and compartment are listed for selection. You can use the menus to switch to another Application in the current workspace, or another workspace in the same compartment or another compartment. You can also start typing a name in the Workspaces or Applications box and select from a filtered list of matching names.

  3. In the Select a REST task panel, select a task and click Select.
    The name on the operator icon on the canvas changes to the name of the selected task.
  4. Under the Details tab of the Properties panel for the selected task operator:
    1. Rename the operator if you want.
    2. For Incoming link condition, you can select from the following run conditions. When the operators are connected, a colored line indicates the selected condition.
      • Always run: Runs the selected task regardless of the status of the upstream operator.

        A gray line indicates the Always run condition.

        You can't change the link condition for tasks that are connected directly to the Start operator.

      • Run on success of previous operator: Runs the selected task only when the upstream operator status is 'success.'

        A green line with a check mark indicates a successful condition.

      • Run on failure of previous operator: Runs the selected task only when the upstream operator status is 'failure.'

        A red line with an 'x' indicates a failed condition.

  5. Under the Configuration tab of the Properties panel, you can:
    1. Configure incoming parameter values.

      The values of URL parameters and other task parameters that are defined in the underlying REST task can be reconfigured.

    2. Configure run options to specify how to handle task runs that fail.
  6. Under the Output tab of the Properties panel, you can view the outputs that can be used as inputs for the next operator in the pipeline.

    Both system-generated output parameters and output parameters from the REST response are shown. See also Output Parameters.

  7. Under the Validation tab of the Properties panel, you can validate the task to check for errors and warnings in configured parameter values, if applicable.
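
To picture what reconfiguring URL parameters means, the following sketch overlays operator-level values on the defaults defined in a task and substitutes them into a URL. The ${NAME} placeholder form is whatever Python's string.Template accepts and is used here only as an assumption; it is not necessarily the syntax that a REST task uses. The URL, parameter names, and function name are hypothetical.

```python
from string import Template
from urllib.parse import quote


def resolve_url(url_template, task_defaults, operator_values):
    """Substitute parameter values into a URL; operator values override defaults."""
    values = {**task_defaults, **operator_values}
    # Percent-encode each value so it is safe to place in a URL path segment.
    encoded = {name: quote(str(value), safe="") for name, value in values.items()}
    return Template(url_template).substitute(encoded)


url = resolve_url(
    "https://example.com/api/${VERSION}/jobs/${JOB_ID}",
    task_defaults={"VERSION": "v1", "JOB_ID": "0"},
    operator_values={"JOB_ID": "42"},
)
print(url)
```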