You can view these Events in the Events service. Set up rules to take actions when these
events are emitted, for example, emailing you a JSON file or triggering a Function. When you create rules based on Event Type, select
Data Flow as the Service Name. The available actions are described in the
Events Overview section of the Events documentation.
Event Types for Applications 🔗
Data Flow emits events, in the form of a JSON file,
when an Application is created, deleted, or updated.
An Application is an infinitely reusable Spark application template consisting of a Spark
application, its dependencies, default parameters, and a default runtime resource
specification. After a developer creates a Data Flow
Application, anyone can use it without worrying about the complexities of deploying it,
setting it up, or running it.
Data Flow emits events, in the form of a JSON file,
when a create Run request begins and when the resulting run ends.
Every time a Data Flow Application is run, a Run is
created. The Data Flow Run captures the Application's output,
logs, and statistics, which are automatically and securely stored. Output is saved so that
anyone with the correct permissions can view it using the UI or REST API. Runs give you secure
access to the Spark UI for debugging and diagnostics.
Application Events

Run - Begin
Event Type: com.oraclecloud.dataflow.createrun.begin
Description: Emitted when a request to trigger a Data Flow run is submitted successfully.

Run - End
Event Type: com.oraclecloud.dataflow.createrun.end
Description: Emitted when processing of the submitted run request is complete and the run has transitioned to a terminal state: SUCCEEDED, CANCELED, FAILED, or STOPPED.
The Data Flow
Run-End event is created when the Data Flow
Run reaches a terminal state of SUCCEEDED, CANCELED,
FAILED, or STOPPED. The Run-End event has
the following extra fields on which the Events service can create rule filters:
lifecycleState is the Data Flow run's lifecycle state.
type is the Data Flow run type.
language is the corresponding Spark code language.
sparkVersion is the Spark version used by the Data Flow run.
applicationId is the OCID of the corresponding Data Flow application for the Data Flow run.
tenantId is the OCID of the tenant that submitted the run.
The possible values for these fields are as
follows:
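To illustrate how these extra fields can drive filtering, here is a minimal sketch that matches a Run-End event against its lifecycleState. The payload shape (a data.additionalDetails object) and all values shown are illustrative assumptions, not an exhaustive schema:

```python
# Minimal sketch: filtering Data Flow Run-End events by their extra fields.
# The envelope shape and the field values below are illustrative assumptions.

def matches_failed_run(event: dict) -> bool:
    """Return True for Run-End events whose run reached the FAILED state."""
    details = event.get("data", {}).get("additionalDetails", {})
    return (
        event.get("eventType") == "com.oraclecloud.dataflow.createrun.end"
        and details.get("lifecycleState") == "FAILED"
    )

# Hypothetical Run-End event payload carrying the extra rule-filter fields.
sample_event = {
    "eventType": "com.oraclecloud.dataflow.createrun.end",
    "data": {
        "additionalDetails": {
            "lifecycleState": "FAILED",  # terminal state of the run
            "type": "BATCH",             # Data Flow run type (illustrative)
            "language": "PYTHON",        # Spark code language (illustrative)
            "sparkVersion": "3.2.1",     # Spark version used (illustrative)
            "applicationId": "ocid1.dataflowapplication.oc1..example",
            "tenantId": "ocid1.tenancy.oc1..example",
        }
    },
}

print(matches_failed_run(sample_event))  # True
```

In the Events service itself, the equivalent check is expressed as a rule filter on these attributes rather than in code; the sketch only shows which fields such a filter would inspect.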
Learn about the Spark-related metrics available from the oci_dataflow metric
namespace.
Metrics Overview 🔗
The Data Flow metrics help you monitor the number of tasks
that completed or failed and the amount of data involved. They
are free service metrics, available from Service Metrics or
Metrics Explorer. See Viewing the Metrics for more
information.
Terminology 🔗
These terms help you understand what is available with Data Flow metrics.
Namespace:
A namespace is a container for Data Flow metrics. The namespace
identifies the service sending the metrics. The namespace for Data Flow is
oci_dataflow.
Metrics:
Metrics are the fundamental concept in telemetry and monitoring. Metrics
define a time-series set of data points. Each metric is uniquely defined
by:
namespace
metric name
compartment identifier
a set of one or more dimensions
a unit of measure
Each data point has a timestamp, a value, and a count associated with
it.
Dimensions:
A dimension is a key-value pair that defines the characteristics associated
with the metric. Data Flow has five
dimensions:
resourceId: The OCID of a Data Flow Run instance.
resourceName: The name you've given the Run
resource. It's not guaranteed to be unique.
applicationId: The OCID of a Data Flow Application
instance.
applicationName: The name you've given the
Application resource. It's not guaranteed to be unique or
final.
executorId: A Spark cluster consists of a driver
and one or more executors. The driver has executorId =
driver; the executors have executorId =
1, 2, 3, ..., n.
Statistics:
Statistics are metric data aggregations over specified periods of time.
Aggregations are done using the namespace, metric name, dimensions, and the
data point unit of measure within a specified time period.
Alarms:
Alarms are used to automate operations monitoring and performance. An alarm
tracks changes that occur over a specific period of time and performs one or
more defined actions, based on the rules defined for the metric.
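The terms above come together in a Monitoring Query Language (MQL) expression: the namespace and metric name select the time series, dimensions filter it, and a statistic aggregates it over an interval. A minimal sketch that assembles such a query string; the run OCID is hypothetical:

```python
# Minimal sketch: assembling an MQL expression for a Data Flow metric, of the
# general form metric[interval]{dimension = "value"}.statistic()
# The namespace (oci_dataflow) is supplied separately when the query is
# submitted to the Monitoring service.

def build_mql(metric: str, interval: str, dimensions: dict, statistic: str) -> str:
    """Build an MQL query string from its parts (dimensions sorted for stability)."""
    filters = ", ".join(f'{k} = "{v}"' for k, v in sorted(dimensions.items()))
    return f"{metric}[{interval}]{{{filters}}}.{statistic}()"

# Hypothetical run OCID, filtered down to the driver's CPU utilization.
query = build_mql(
    "CpuUtilization",
    "1m",
    {"resourceId": "ocid1.dataflowrun.oc1..example", "executorId": "driver"},
    "mean",
)
print(query)
# CpuUtilization[1m]{executorId = "driver", resourceId = "ocid1.dataflowrun.oc1..example"}.mean()
```

The same expression can be pasted into Metrics Explorer's advanced mode or used as the query of an alarm.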
Prerequisites 🔗
To monitor resources in Data Flow, you must be
given the required type of access in a policy written by an administrator.
The policy must give you access to the monitoring services and the resources being monitored.
This applies whether you're using the Console or the REST API with an SDK, CLI, or another tool. If you try to perform
an action, and get a message that you don't have permission or are
unauthorized, confirm with your administrator the type of access you have
been granted and which compartment to work in. For more information on user
authorizations for monitoring, see the Authentication and Authorization
section for the related service: Monitoring or Notifications.
Available Metrics 🔗
Here are the metrics available for Data Flow. The
control plane metrics are listed first, then the data plane metrics.
Control Plane Metrics

RunTotalStartUpTime (Run Startup Time)
Dimensions: resourceId, resourceName, applicationId, applicationName
Statistic: Mean
Description: The overall startup time for a run, including resource assignment, Spark job startup, and the time the run waits in queues internal to the service.

RunExecutionTime (Run Execution Time)
Dimensions: resourceId, resourceName, applicationId, applicationName
Statistic: Mean
Description: The time it takes to complete a run, from the time it starts until the time it completes.

RunTotalTime (Total Run Time)
Dimensions: resourceId, resourceName, applicationId, applicationName
Statistic: Mean
Description: The sum of the run startup time and the run execution time.

RunSucceeded (Run Succeeded)
Dimensions: resourceId, resourceName, applicationId, applicationName
Statistic: Count
Description: Whether the run finished successfully.

RunFailed (Run Failed)
Dimensions: resourceId, resourceName, applicationId, applicationName
Statistic: Count
Description: Whether the run failed to start.
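As the control plane metrics indicate, RunTotalTime is simply the sum of the run startup time and the run execution time. A small worked example with hypothetical timings:

```python
# RunTotalTime = RunTotalStartUpTime + RunExecutionTime, per the table above.
# The timings below are hypothetical, in seconds.
run_total_startup_time = 95.0  # resource assignment + Spark startup + queueing
run_execution_time = 340.0     # from the start of the run until it completes

run_total_time = run_total_startup_time + run_execution_time
print(run_total_time)  # 435.0
```

Comparing RunTotalStartUpTime against RunTotalTime in this way shows how much of a run's wall-clock time was spent before any Spark work began.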
Data Plane Metrics

CpuUtilization (CPU Utilization)
Dimensions: resourceId, resourceName, applicationId, applicationName, executorId
Statistic: Percent
Description: The CPU utilization of the container allocated to the driver or executor, as a percentage.

DiskReadBytes (Disk Read Bytes)
Dimensions: resourceId, resourceName, applicationId, applicationName, executorId
Statistic: Sum
Description: The number of bytes read from all block devices by the container allocated to the driver or executor in a given time interval.

DiskWriteBytes (Disk Write Bytes)
Dimensions: resourceId, resourceName, applicationId, applicationName, executorId
Statistic: Sum
Description: The number of bytes written to all block devices by the container allocated to the driver or executor in a given time interval.

FileSystemUtilization (File System Utilization)
Dimensions: resourceId, resourceName, applicationId, applicationName, executorId
Statistic: Percent
Description: The file system usage of the container allocated to the driver or executor, as a percentage.

GcCpuUtilization (GC CPU Utilization)
Dimensions: resourceId, resourceName, applicationId, applicationName, executorId
Statistic: Percent
Description: The CPU usage by the Java garbage collector of the driver or executor, as a percentage.

MemoryUtilization (Memory Utilization)
Dimensions: resourceId, resourceName, applicationId, applicationName, executorId
Statistic: Percent
Description: The memory usage of the container allocated to the driver or executor, as a percentage.

NetworkReceiveBytes (Network Receive Bytes)
Dimensions: resourceId, resourceName, applicationId, applicationName, executorId
Statistic: Sum
Description: The number of bytes received on the network interface by the container allocated to the driver or executor in a given time interval.

NetworkTransmitBytes (Network Transmit Bytes)
Dimensions: resourceId, resourceName, applicationId, applicationName, executorId
Statistic: Sum
Description: The number of bytes transmitted from the network interface by the container allocated to the driver or executor in a given time interval.
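The byte-count metrics above (disk and network) are reported as sums over a time interval, so comparing runs of different lengths often means converting them to an average rate. A minimal sketch with hypothetical data points:

```python
# Minimal sketch: converting an interval Sum metric (e.g. NetworkReceiveBytes)
# into an average throughput in bytes per second. The data points are
# hypothetical (interval_seconds, bytes_in_interval) pairs.

def average_rate(datapoints: list[tuple[int, int]]) -> float:
    """Average bytes per second across all reported intervals."""
    total_seconds = sum(seconds for seconds, _ in datapoints)
    total_bytes = sum(byte_count for _, byte_count in datapoints)
    return total_bytes / total_seconds

# Three hypothetical one-minute intervals of NetworkReceiveBytes.
points = [(60, 6_000_000), (60, 12_000_000), (60, 9_000_000)]
print(average_rate(points))  # 150000.0
```

The Monitoring service can perform the same normalization server-side via the rate statistic in a query, so this conversion is only needed when post-processing raw sums.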
Viewing the Metrics 🔗
You can view Data Flow metrics in various
ways.
From the Console, select the
navigation menu, click Observability &
Management, and under Monitoring,
select Service Metrics. See Overview of Monitoring for how to use these
metrics.
From the Console, select the
navigation menu, click Observability &
Management, and under Monitoring,
select Metrics Explorer. See Overview of Monitoring for how to use these
metrics.
From the Console, select the
navigation menu, click Data Flow, and select
Runs. Under Resources, select
Metrics, and you see the metrics specific to
this Run. Set the Start time and End time as
appropriate, or a time period from Quick Selects. For
each chart, you can specify an Interval and the
Options as to how to display each
metric.
From the Console, select the
navigation menu, select Data Flow, and select
Applications. You see the metrics specific to
the Runs of this Application. Set the Start time and
End time as appropriate, or a time period
from Quick Selects. For each chart, you can specify
an Interval and a Statistic, and the
Options as to how to display each
metric.