Using Pipeline Operators
In Data Integration, pipeline operators represent different tasks and activities that can be used in a pipeline.
The types of task operators you can use in a pipeline are:
- Integration and data loader task operators that let you run data integration solutions within a pipeline. You can configure inputs for the operators. Task operator inputs are similar to parameters that are defined at a task or data flow level.
- SQL task operators that let you run SQL stored procedures within a pipeline. You can configure values for the parameters in the stored procedures.
- OCI Data Flow task operators that let you run OCI Data Flow applications within a pipeline.
- REST task operators that let you run REST API endpoints within a pipeline. You can reconfigure the values of any of the parameters that are used in the REST task.
- Pipeline task operators that let you run another pipeline within a pipeline.
For all task operators, you can select design-time tasks from projects in the current workspace, and published tasks from any Application in the current workspace. With published REST tasks and OCI Data Flow tasks, you can also select a task from any Application in another workspace in the same compartment or another compartment.
For tasks that run in parallel, you can use a merge operator and specify a condition to handle subsequent downstream operations. To take output from any operator and pass it to the next operator, you can use an expression operator.
You use a designer that's similar to the data flow designer to create a pipeline. The designer opens with a start operator and an end operator already placed on the canvas for you. There can only be one start operator and one end operator in a pipeline. A pipeline must include at least one task operator to be valid. You can add any number of tasks, then connect them in a sequence between the start operator and end operator. From the Operators panel, drag operators onto the canvas to design the pipeline. Then use the Properties panel to configure the properties for each operator.
Tasks that are connected directly to the start operator always run. Subsequent tasks in the sequence can be configured to run based on the condition of the previous operator. For example, consider a pipeline that has the sequence Start > Task A > Task B > End. Task A always runs. For Task B, you can use the Incoming Link Condition property on the Properties panel to configure the task to always run or to run only when the status of Task A meets a specific run condition.
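The run-condition logic described above can be sketched in a few lines of Python. This is an illustrative model only; the condition names (`ALWAYS`, `SUCCESS`, `FAILURE`) and status strings are assumptions, not the exact values the product uses.

```python
# Sketch of how an incoming link condition might gate a downstream task.
# Condition and status names here are illustrative, not product values.

def should_run(condition: str, previous_status: str) -> bool:
    """Return True if a task should run, given its incoming link
    condition and the run status of the previous operator."""
    if condition == "ALWAYS":
        return True
    if condition == "SUCCESS":
        return previous_status == "SUCCESS"
    if condition == "FAILURE":
        return previous_status == "FAILURE"
    raise ValueError(f"Unknown condition: {condition}")

# Pipeline: Start > Task A > Task B > End.
# Task A always runs; suppose its run fails.
task_a_status = "FAILURE"
print(should_run("ALWAYS", task_a_status))   # Task B set to always run
print(should_run("SUCCESS", task_a_status))  # Task B runs only on success
```

In this model, Task B with the `ALWAYS` condition still runs after Task A fails, while Task B with the `SUCCESS` condition is skipped.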
To connect operators, hover over an operator until you see the connector (small circle) on the right side of the operator. Then drag the connector to the next operator you want to connect to. A connection is valid when a line connects the operators after you drop the connector.
In general, an operator has one inbound port and one or more outbound ports through which the flow passes in the pipeline. For example, you can connect the outbound port of a single SQL task operator to the inbound ports of two separate expression operators. Only the end operator and the merge operator can have multiple inbound ports.
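The inbound-port rule can be sketched as a small validity check. The operator kind names below are illustrative labels, not identifiers from the product:

```python
# Sketch of the connection rule: most operators accept a single inbound
# link; only the end and merge operators accept several.
# Operator kind names are illustrative.

MULTI_INBOUND = {"END", "MERGE"}

def can_connect(target_kind: str, existing_inbound: int) -> bool:
    """Return True if one more inbound link may be added to the target."""
    if target_kind in MULTI_INBOUND:
        return True
    return existing_inbound == 0

print(can_connect("EXPRESSION", 0))  # True: first inbound link
print(can_connect("EXPRESSION", 1))  # False: only one inbound allowed
print(can_connect("MERGE", 3))       # True: merge accepts many inbound
```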
You can quickly duplicate a task or expression operator that has been added to a pipeline. To duplicate the operator, right-click the operator icon on the canvas and select Duplicate from the menu that appears. Then rename the identifier of the duplicated operator in the Properties panel. If the original operator is connected to other operators, the connections and any references to a previous operator's outputs are not copied to the duplicated operator.
Start Operator and End Operator
When you begin to create a pipeline, the designer opens with a start operator and an end operator already placed on the canvas for you. There can only be one start operator and one end operator in a pipeline.
The start operator does not have any properties that you can configure.
With the end operator, you can configure the Incoming link condition property to specify one of the following rules for the status of a pipeline task run:
- All completed: The pipeline task status displays as Success when all tasks in the pipeline complete, even if one or more of the tasks fail.
- All success: The pipeline task status displays as Success when all tasks in the pipeline complete successfully.
- All failed: The pipeline task status displays as Success when all tasks in the pipeline fail.
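The three end-operator rules can be sketched as a function that maps task run outcomes to an overall pipeline status. The rule and status names are illustrative only:

```python
# Sketch of the end operator's incoming link condition rules.
# Rule and status names are illustrative, not product values.

def pipeline_status(rule: str, task_statuses: list) -> str:
    """Compute the pipeline task status from individual task outcomes."""
    completed = all(s in ("SUCCESS", "FAILURE") for s in task_statuses)
    if rule == "ALL_COMPLETED":
        ok = completed
    elif rule == "ALL_SUCCESS":
        ok = all(s == "SUCCESS" for s in task_statuses)
    elif rule == "ALL_FAILED":
        ok = all(s == "FAILURE" for s in task_statuses)
    else:
        raise ValueError(f"Unknown rule: {rule}")
    return "SUCCESS" if ok else "FAILURE"

runs = ["SUCCESS", "FAILURE"]  # one task failed, but both completed
print(pipeline_status("ALL_COMPLETED", runs))  # SUCCESS
print(pipeline_status("ALL_SUCCESS", runs))    # FAILURE
```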
Merge Operator
For tasks that run in parallel, you can use the merge operator and specify a condition to decide how to handle subsequent downstream operations.
A merge operator can have multiple input links (upstream) and multiple output links (downstream).
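A merge condition can be modeled as a predicate over the statuses of the parallel upstream branches; when it is satisfied, downstream operators proceed. The condition names below are illustrative assumptions:

```python
# Sketch of merge-operator semantics: parallel upstream branches feed
# one merge, whose condition decides whether downstream tasks run.
# Condition names are illustrative, not product values.

def merge_satisfied(condition: str, branch_statuses: list) -> bool:
    if condition == "ALL_SUCCESS":
        return all(s == "SUCCESS" for s in branch_statuses)
    if condition == "ONE_SUCCESS":
        return any(s == "SUCCESS" for s in branch_statuses)
    if condition == "ALL_COMPLETE":
        return all(s in ("SUCCESS", "FAILURE") for s in branch_statuses)
    raise ValueError(f"Unknown condition: {condition}")

parallel = ["SUCCESS", "FAILURE", "SUCCESS"]
print(merge_satisfied("ONE_SUCCESS", parallel))  # True
print(merge_satisfied("ALL_SUCCESS", parallel))  # False
```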
Expression Operator
A pipeline expression operator lets you create new, derivative fields in a pipeline, similar to an expression operator in a data flow.
Unlike a data flow expression operator, a pipeline expression operator does not operate on data. A pipeline expression operator lets you operate on output from the previous operator, pipeline parameters, and output generated by the system.
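To make the distinction concrete, a pipeline expression can be thought of as deriving a new value from upstream outputs and pipeline parameters rather than from row-level data. The context keys and the derived field in this sketch are hypothetical:

```python
# Sketch: a pipeline expression derives a new field from the previous
# operator's output and a pipeline parameter, not from row-level data.
# The context keys and field names are hypothetical examples.

context = {
    "TASK_A.OUTPUT.ROW_COUNT": 120,   # output from the previous operator
    "PIPELINE.PARAM.THRESHOLD": 100,  # a pipeline parameter
}

def evaluate(ctx: dict) -> dict:
    """Derive a new field comparing an upstream output to a parameter."""
    return {
        "ROWS_OVER_THRESHOLD": ctx["TASK_A.OUTPUT.ROW_COUNT"]
                               - ctx["PIPELINE.PARAM.THRESHOLD"]
    }

print(evaluate(context))  # {'ROWS_OVER_THRESHOLD': 20}
```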
Use the Expression Builder to visually select elements to create an expression for an expression operator in a pipeline. You can also enter an expression manually in the editor.
The Expression Builder is a section in the Add Expression panel.
You can delete expressions when you no longer need them.
- On the canvas of a pipeline, select an expression operator.
- With the expression operator in focus, under the Details tab of the Properties panel, select the expression you want to delete, then click Delete.
- In the Delete Expression dialog box, verify that you want to delete this expression, and then click Delete.
Decision Operator
Use the decision operator to write a Boolean condition that determines the branching flow in the pipeline. The branching is based on three possible outcomes: TRUE, FALSE, and ERROR.
A decision operator has one input link (upstream) and three output links (downstream).
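The three-way branching can be sketched as follows. This is a minimal model, assuming that a failure to evaluate the condition routes the flow to the ERROR link:

```python
# Sketch of decision-operator branching: a Boolean condition selects the
# TRUE or FALSE output link; an evaluation failure routes to ERROR.

def decide(condition, ctx) -> str:
    """Return the name of the output link the flow should follow."""
    try:
        return "TRUE" if bool(condition(ctx)) else "FALSE"
    except Exception:
        return "ERROR"

ctx = {"row_count": 0}
print(decide(lambda c: c["row_count"] > 10, ctx))    # FALSE
print(decide(lambda c: c["missing_key"] > 10, ctx))  # ERROR: bad lookup
```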
Data Loader Task Operator
A data loader task operator lets you run a data loader task within a pipeline.
A data loader task operator that's connected directly to the start operator always runs.
For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.
Integration Task Operator
An integration task operator lets you run a data flow that's configured for a specific context. The data flow must be wrapped in an integration task.
An integration task operator that's connected directly to the start operator always runs.
For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.
Pipeline Task Operator
A pipeline task operator lets you run a pipeline within another pipeline.
A pipeline task operator that's connected directly to the start operator always runs.
For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.
SQL Task Operator
A SQL task operator lets you run a SQL object such as a stored procedure.
A SQL task operator that's connected directly to the start operator always runs.
For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.
OCI Data Flow Task Operator
An OCI Data Flow task operator lets you run an OCI Data Flow application in a pipeline.
A task operator that's connected directly to the start operator always runs.
For a task operator that's not connected directly to the start operator, you can use the Incoming link condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.
REST Task Operator
A REST task operator lets you run a REST API endpoint in a pipeline.
A task operator that's connected directly to the start operator always runs.
For a task operator that's not connected directly to the start operator, you can use the Incoming Link Condition property to configure the task to always run or to run only when the status of the previous operator meets a specific run condition.