Operations Insights Metrics
You can use metrics, alarms, and notifications to monitor for conditions where incoming data for any Operations Insights-enabled target has been delayed for the last one or two days.
This topic covers the metrics emitted by the Operations Insights service.
Overview of Operations Insights Metrics
Operations Insights relies on a constant flow of data coming from a variety of sources such as Autonomous DBs and Enterprise Manager targets such as hosts and databases.
Required Policies
To monitor resources, you must be given the required type of access in a policy. The policy must give you access to the monitoring services as well as the resources being monitored. If you try to perform an action and get a message that you don’t have permission or are unauthorized, confirm with your administrator the type of access you've been granted and which compartment you should work in. For more information on user authorizations for monitoring, see the Authentication and Authorization section for the related service: Monitoring or Notifications.
For information on required Operations Insights policies, see Set Up Groups and Policies.
The following topics are covered:
Dimensions Common Across Operations Insights Metrics
Common Dimensions Across Metrics
The following table shows dimensions common across all metrics emitted by Operations Insights except WarehouseCpuUtilization.
Dimensions | Description |
---|---|
resourceId | Operations Insights ID for the target. |
resourceDisplayName | Display name of the target. |
resourceType | Type of resource. For example: ADB-S, ATP-D, EXTERNAL-HOST, EXTERNAL-PDB, EXTERNAL-NONCDB |
telemetrySourceType | The source of the metric: CloudInfrastructure, EnterpriseManager, AgentService. |
telemetrySourceIdentifier | Depending on the telemetrySourceIdentifier, this field will contain one of the following: |
telemetrySourceEntityIdentifier | Depending on the telemetrySourceEntityIdentifier, this field will contain one of the following: |
associatedOCIResourceId | The ADW OCID. This will only be populated for Autonomous Database targets. |
sourceMetricName | Source metric name for which the delay is being reported. |
Metrics
All Operations Insights Metrics
The following table shows all metrics emitted by Operations Insights.
Metric Name | Specific Dimensions | Description |
---|---|---|
DataFlowDelayInHrs | dataProcessingFrequencyInHrs - Frequency of data processing in hours.
Note
For additional dimensions for this metric, see Dimensions Common Across Operations Insights Metrics. |
Number of hours since the data was last processed for a given target and metric. The DataFlowDelayInHrs metric lets you monitor for data flow interruptions across all enabled targets and quickly identify which sources are having problems. |
WarehouseCpuUtilization |
resourceId - Operations Insights ID for the warehouse.
resourceDisplayName - Display name of the Operations Insights warehouse. |
CPU utilization, as a percentage, of the ADW provisioned for the Operations Insights warehouse. |
DaysToReachHighUtilization |
resourceMetric -
aggregateDataMeasure - Indicates what underlying aggregate measure is being used in the forecast. Currently this can be
forecastModel - Indicates which forecast model is being used in the forecast. Currently this can be
exceededForecastWindow - Indicates whether the number of days returned is equivalent to the number of days being forecasted. This should be used in the alarms, like so:
Note
For additional dimensions for this metric, see Dimensions Common Across Operations Insights Metrics. |
Days to reach high utilization (above default setting of 75%) for a given resource type and resource metric.
To change the utilization thresholds from their default settings, see Changing Utilization Thresholds. |
DaysToReachLowUtilization |
resourceMetric -
aggregateDataMeasure - Indicates what underlying aggregate measure is being used in the forecast. Currently this can be
forecastModel - Indicates which forecast model is being used in the forecast. Currently this can be
exceededForecastWindow - Indicates whether the number of days returned is equivalent to the number of days being forecasted. This should be used in the alarms, like so:
Note
For additional dimensions for this metric, see Dimensions Common Across Operations Insights Metrics. |
Days to reach low utilization (below default setting of 25%) for a given resource type and resource metric.
To change the utilization thresholds from their default settings, see Changing Utilization Thresholds. |
SQL Related Metric
The NumSqlsNeedingAttention metric assists you with SQL tuning and performance by allowing you to set alarms notifying you when SQL statements require attention.
See Specific Alarm Conditions (SQL Alarms) for examples of setting up alarms under various conditions.
The following table shows the metric related to SQL alarms.
Metric Name | Specific Dimensions | Description |
---|---|---|
NumSqlsNeedingAttention |
isDegraded (0,1) - set to 1 if response time percent change > 20% over the last 24 hours
isVariant (0,1) - set to 1 if SQL variability is > 1.66 over the last 24 hours
isInefficient (0,1) - set to 1 if inefficiency > 20% over the last 24 hours
isPlanChanged (0,1) - set to 1 if the SQL plan has changed over the interval
isIncreasingIo (0,1) - set to 1 if IO increase > 50% over the last 24 hours
isIncreasingCpu (0,1) - set to 1 if CPU increase > 50% over the last 24 hours
isIncreasingWait (0,1) - set to 1 if Wait increase > 50% over the last 24 hours
Note
For additional dimensions for this metric, see Dimensions Common Across Operations Insights Metrics. |
|
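The threshold rules listed for the NumSqlsNeedingAttention dimensions can be pictured with a small Python sketch. This is only an illustration of the documented thresholds, not how the service computes them: the SqlStats class and its field names are hypothetical, and only the dimension names and threshold values come from the table above.

```python
from dataclasses import dataclass


@dataclass
class SqlStats:
    """Hypothetical 24-hour summary for one SQL statement (illustrative field names)."""
    response_time_pct_change: float  # percent change in response time
    variability: float               # SQL variability score
    inefficiency_pct: float          # inefficiency, in percent
    plan_changed: bool               # True if the plan changed over the interval
    io_pct_increase: float           # percent increase in IO
    cpu_pct_increase: float          # percent increase in CPU
    wait_pct_increase: float         # percent increase in wait


def attention_flags(s: SqlStats) -> dict:
    """Map a summary onto the 0/1 dimension values described in the table."""
    return {
        "isDegraded": int(s.response_time_pct_change > 20),
        "isVariant": int(s.variability > 1.66),
        "isInefficient": int(s.inefficiency_pct > 20),
        "isPlanChanged": int(s.plan_changed),
        "isIncreasingIo": int(s.io_pct_increase > 50),
        "isIncreasingCpu": int(s.cpu_pct_increase > 50),
        "isIncreasingWait": int(s.wait_pct_increase > 50),
    }


# Example: response time up 35%, plan changed, CPU up 60%.
flags = attention_flags(SqlStats(35.0, 1.2, 5.0, True, 10.0, 60.0, 0.0))
print(flags)
```

In this example isDegraded, isPlanChanged, and isIncreasingCpu are 1 and the remaining flags are 0, so the statement would count toward an alarm that filters on any of those three dimensions.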
Data Flow Metric
Operations Insights consumes data coming from different types of sources such as Autonomous Databases, Enterprise Manager targets (databases, hosts, Exadata, etc.), and Management Agent targets (external databases, hosts, etc.). The data flow metric allows you to set up alarms in the event that data from these sources has stopped for the last 1 or 2 days.
For examples on setting up alarms for the data flow metric, see Specific Alarm Conditions (Data Flow Delays).
Metric Name | Dimensions | Description |
---|---|---|
DataFlowDelayInHrs |
sourceIdentifier - This will be the Enterprise Manager Bridge Id for Enterprise Manager targets, the agent Id for agent-based targets, and the OCID of the ADW for Autonomous Database targets.
sourceEntityIdentifier - This will be the Enterprise Manager target GUID for Enterprise Manager targets, and the Cloud Infrastructure database Id for Management Agent based targets.
associatedResourceId - This will only be populated for Autonomous Database targets and it will be the OCID of the Autonomous Database.
dataProcessingFrequencyInHrs - Frequency of data processing in hours.
Note
For additional dimensions for this metric, see Dimensions Common Across Operations Insights Metrics. |
Number of hours since the data was last processed for a given target and metric. |
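As a worked example of what the metric value represents, the following Python sketch computes the hours elapsed since a last-processed timestamp. The function name and the sample timestamps are made up for illustration; the service emits this value itself, so you would not normally compute it.

```python
from datetime import datetime, timezone


def data_flow_delay_hrs(last_processed: datetime, now: datetime) -> float:
    """Hours elapsed between the last processing time and `now`."""
    return (now - last_processed).total_seconds() / 3600.0


# Data last processed on Jan 1 at 06:00 UTC, checked on Jan 3 at 09:00 UTC.
last = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 3, 9, 0, tzinfo=timezone.utc)
delay = data_flow_delay_hrs(last, now)
print(delay)  # 51.0
```

A value of 51 hours is more than two days, which is the kind of gap the data flow alarms are meant to catch.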
Data Flow Metric Examples
The following table shows the possible values of the dataProcessingFrequencyInHrs dimension for different resource types.
dataProcessingFrequencyInHrs Value | Resource Example | telemetrySourceType | Description |
---|---|---|---|
1.00 | Enterprise Manager managed DB | EnterpriseManager | Loads every hour to process performance metric data accumulated in the Object Storage bucket for Enterprise Manager managed DB targets. |
3.00 | Autonomous DB, Database Cloud Service DB, Enterprise Manager managed DB, Enterprise Manager managed host, Exadata Cell, Cloud Infrastructure DB | CloudInfrastructure, EnterpriseManager, AgentService | Loads every 3 hours to get hourly performance metric data from the Monitoring service (for Cloud Infrastructure and Autonomous DBs) or the Object Storage bucket (for Enterprise Manager managed targets), or to generate hourly rollups from the raw data ingested via ingestion APIs. |
12.00 | Autonomous DB, Database Cloud Service DB, Enterprise Manager managed DB, Cloud Infrastructure DB | CloudInfrastructure, EnterpriseManager, AgentService | Every 12 hours, loads daily performance metrics data by reading raw data from the Operations Insights data store. |
24.00 | Enterprise Manager managed DB, Enterprise Manager managed host, Exadata Cell | EnterpriseManager | Every 24 hours, two ETLs run to process data for Enterprise Manager managed targets. One loads daily performance metrics data from the Object Storage bucket; the other loads configuration metrics data from the Object Storage bucket. |
Oracle Database Cloud (DBCS) Metric
The MetricCollectionErrors metric reports the number of collection errors for a given target and metric.
Metric Name | Specific Dimensions | Description |
---|---|---|
MetricCollectionErrors |
associatedResourceId - This will be the DBAAS OCID for the resource.
sourceMetricName - Name of the metric collection which is failing. This dimension can be one of the following:
ErrorCategory - DatabaseConnection/QueryExecution
Cause - <actual ORA error code if available> or NA, for example, ORA-12850
Note
For additional dimensions for this metric, see Dimensions Common Across Operations Insights Metrics. |
Number of collection errors for given target and metric. |
Create Alarms
Setting Alarms
When a metric condition is met, you can use the Monitoring service's alarm system to alert interested parties to conditions. You can create alarms on individual resources or on an entire compartment.
Operations Insights provides convenient access to the Monitoring service's alarm creation functionality directly from any fleet resource page.
- From the left pane, click Administration.
- Click a fleet resource (Database Fleet, Host Fleet, Exadata Fleet, or Operations Insights Warehouse).
- Click the Action menu (vertical ellipsis) for a specific resource and select Add Alarms. The Add Alarms to Metrics region displays. Expand the description region below each metric to view suggested trigger parameters as well as key dimensions.
- Click Add Alarm. You'll be taken to the Monitoring service Create Alarm page with the required metric details already populated.
Note
By default, an alarm applies to an individual resource. If you want the alarm to apply to an entire compartment, remove the resourceId dimension.
- Under Notification > Destinations, select a topic or channel to use for sending notifications when the alarm is triggered. Alternatively, you can create a topic.
- Provide an alarm name and set the suggested threshold and trigger delay.
- Click Save alarm.
Specific Alarm Conditions
SQL Alarms
You can create alarms for conditions defined on the NumSqlsNeedingAttention metric. Alarms must be created in a specific way for them to clear properly. The following examples illustrate how to trigger an alarm under various alert conditions.
Alarm Condition | MQL Alarm Definition |
---|---|
You want to trigger an alarm if the total number of SQL statements across all resources, which are both degraded and have a plan change, is greater than 5. |
|
You want to trigger an alarm whenever any resource has a plan change. |
|
You want to trigger an alarm whenever a specific resource has a plan change. |
|
Similar patterns can be used for any of the dimensions. In general, to trigger an alarm on a specific condition, the generic alarm definition syntax would look like the following:
NumSqlsNeedingAttention[3h]{dim1="val1", dim2="val2", ...}.absent()==0 && NumSqlsNeedingAttention[3h]{dim1="val1", dim2="val2", ...}.sum() > 5
You must specify both an absent condition and a threshold condition as shown above, and the dimension specification must be the same in both clauses. Change only the dimensions or the threshold value as needed and leave the other values as they are.
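Because the dimension specification has to be identical in both clauses, it can help to generate the query from a single dimension set rather than typing it twice. The following Python sketch follows the generic syntax above; the helper name sql_attention_alarm is hypothetical, and the quoted value "1" for the flag dimensions is an assumption based on the (0,1) values listed for this metric.

```python
def sql_attention_alarm(dimensions: dict, threshold: int, interval: str = "3h") -> str:
    """Build the two-clause query: an absent() check plus a sum() threshold."""
    # Render the dimension filter once so both clauses use exactly the same text.
    dims = ", ".join(f'{name}="{value}"' for name, value in dimensions.items())
    selector = f"NumSqlsNeedingAttention[{interval}]{{{dims}}}"
    return f"{selector}.absent()==0 && {selector}.sum() > {threshold}"


# Example: statements that are both degraded and have a plan change, more than 5 in total.
query = sql_attention_alarm({"isDegraded": "1", "isPlanChanged": "1"}, threshold=5)
print(query)
```

The printed string can be pasted into the alarm's query editor as a starting point; adjust the dimensions and threshold, and leave the rest of the pattern unchanged.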
Data Flow Delays
You can create alarms for conditions defined on the DataFlowDelayInHrs metric. The following table shows some recommended alarms you can set up, along with a corresponding Monitoring Query Language (MQL) example that you can use as a template to define your alarms. For more information about setting up alarms, see Managing Alarms.
Alarm Name | MQL Alarm Definition | Description |
---|---|---|
DataFlowSourceAlarmFor1HrData |
DataFlowDelayInHrs[1h]{dataProcessingFrequencyInHrs="1.00"}.grouping(telemetrySource, sourceIdentifier).mean() > 48 Pending duration: 1h |
For a given sourceType and sourceIdentifier with a 1-hour data processing frequency, the mean value (across targets) of DataFlowDelayInHrs is greater than 48 hours for 6 continuous hours. This indicates that the problem is at the level of the whole source. |
DataFlowResourceAlarmFor1HrData |
DataFlowDelayInHrs[1h]{dataProcessingFrequencyInHrs="1.00"}.grouping(telemetrySource, resourceId, resourceDisplayName, sourceIdentifier).max() > 24 Pending duration: 1h |
For a given sourceType, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 24 hours continuously for 1 day, for data that is processed every 1 hour. |
DataFlowResourceAlarmFor3HrData |
DataFlowDelayInHrs[3h]{dataProcessingFrequencyInHrs="3.00"}.grouping(telemetrySource, resourceId, sourceIdentifier).max() > 48 Pending duration: 1h |
For a given sourceType, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 48 hours continuously for 1 day, for data that is processed every 3 hours. |
DataFlowResourceAlarmForDailyData |
DataFlowDelayInHrs[3h]{dataProcessingFrequencyInHrs="24.00"}.grouping(telemetrySource, resourceId, sourceIdentifier).mean() > 72 Pending duration: 1h |
For a given sourceType, resource, and sourceIdentifier, DataFlowDelayInHrs is more than 72 hours continuously for 1 day, for data that is processed every 24 hours. |
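The difference between a source-level alarm (mean across all targets of a source) and a resource-level alarm (one target at a time) can be seen in a small local simulation. This Python sketch only mimics the grouping and threshold logic on made-up samples; it ignores the query interval and the pending duration, and the identifiers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (sourceIdentifier, resourceId, DataFlowDelayInHrs)
samples = [
    ("em-bridge-1", "db-a", 60.0),
    ("em-bridge-1", "db-b", 52.0),
    ("agent-7", "db-c", 2.0),
    ("agent-7", "db-d", 30.0),
]

# Group delays by source, as grouping(..., sourceIdentifier) would.
by_source = defaultdict(list)
for source, _resource, delay in samples:
    by_source[source].append(delay)

# Source-level check: mean delay across the source's targets is above 48 hours.
source_alerts = sorted(s for s, delays in by_source.items() if mean(delays) > 48)

# Resource-level check: an individual target's delay is above 24 hours.
resource_alerts = sorted(r for _s, r, delay in samples if delay > 24)

print(source_alerts)    # ['em-bridge-1']
print(resource_alerts)  # ['db-a', 'db-b', 'db-d']
```

Here db-d trips the resource-level check even though its source, agent-7, has a healthy mean, which is why it is useful to define both kinds of alarm.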
About Forecast Issues
Operations Insights provides metrics to help you configure alarms for high (default: above 75%) or low (default: below 25%) utilization for a given resource and resource metric. You can also customize these forecast metric thresholds. Setting threshold values that are relevant to a specific target type gives you more granular and more accurate capacity forecasts, and lets you manage resources more proactively. For more information on setting threshold values, see Changing Utilization Thresholds.
The forecast metrics are generated using at most 100 days of historical data and a forecast window of 90 days. You can verify the forecast from the Operations Insights console by selecting 1 year in the Time Range Filter and High or Low utilization for 90 days, as shown below.



The following table shows sample recommended alarms you can set up, along with a corresponding Monitoring Query Language (MQL) example that you can use as a template to define your alarms. For more information about setting up alarms, see Managing Alarms.
Alarm Name | MQL | Description |
---|---|---|
DaysToReachHighUtilizationStorageLessThan30D |
DaysToReachHighUtilization[1D]{resourceMetric="STORAGE", resourceType="Database", exceededForecastWindow="false"}.grouping(telemetrySource, resourceId).mean() < 30 |
For sourceType, resourceType, resourceMetric and sourceIdentifier, DaysToReachHighUtilization is less than 30 days. |
DaysToReachHighUtilizationExaStorage |
DaysToReachHighUtilization[1D]{resourceMetric="STORAGE", resourceType="Database", exceededForecastWindow="false"}.grouping(telemetrySource, resourceId).mean() < 30 |
For sourceType, resourceType, resourceMetric and sourceIdentifier, DaysToReachHighUtilization is less than 30 days. |
For linear and seasonality-aware forecasts, the forecast window is 90 days. This means that if a specific resource has a forecast of more than 90 days, the metric value will show 91 days by default. For AutoML, the forecast window depends on the number of data points available.
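The capping behavior described above, and the reason the sample alarms filter on exceededForecastWindow="false", can be sketched in Python as follows. This is an interpretation rather than the service's implementation: the function names are hypothetical, and the assumption that exceededForecastWindow is "true" exactly when the value is capped is inferred from the alarm examples.

```python
FORECAST_WINDOW_DAYS = 90  # window for linear and seasonality-aware forecasts


def reported_days(forecast_days: float):
    """Return (metric value, exceededForecastWindow) for a forecast in days."""
    if forecast_days > FORECAST_WINDOW_DAYS:
        # Beyond the window, the value is pinned at window + 1 (91 by default).
        return FORECAST_WINDOW_DAYS + 1, "true"  # assumed flag value
    return forecast_days, "false"


def should_alarm(forecast_days: float, limit_days: int = 30) -> bool:
    """Mimic an alarm of the form {exceededForecastWindow="false"} ... < limit."""
    value, exceeded = reported_days(forecast_days)
    return exceeded == "false" and value < limit_days


print(reported_days(120))  # (91, 'true')
print(should_alarm(12))    # True
print(should_alarm(120))   # False
```

Filtering on the flag keeps capped values out of the alarm evaluation, so a placeholder of 91 days is never mistaken for a real forecast.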
Using the Console
To view metric charts by dimension
- Open the navigation menu and click Observability & Management. Under Monitoring, click Service Metrics.
- For Metric Namespace, select oci_operations_insights.
- For Dimensions, click Add.
- For Dimension Name, select a dimension and then select a Dimension Value.
Add more dimensions as needed.
- Click Done.
The Service Metrics page displays a default set of charts for the selected metric namespace and dimension. You can also use the Monitoring service to create custom queries.
For more information about monitoring metrics and using alarms, see Monitoring. For information about notifications for alarms, see Notifications Overview.
To view metric charts using Metrics Explorer
- Open the navigation menu and click Observability & Management. Under Monitoring, click Metrics Explorer.
The Metrics Explorer page displays an empty chart with fields to build a query.
- Select a compartment.
- From Metric Namespace, select oci_operations_insights.
- From Metric Name, select a metric.
- (Optional) Refine your query.
For instructions, see To create a query.
- Click Update Chart.
The chart shows the results of your new query. You can optionally add more queries by clicking Add Query below the chart.
For more information about monitoring metrics and using alarms, see Monitoring. For information about notifications for alarms, see Notifications Overview.
Using the API
For information about using the API and signing requests, see REST APIs and Security Credentials. For information about SDKs, see Software Development Kits and Command Line Interface.
Use the following APIs for monitoring:
- Monitoring API for metrics and alarms
- Notifications API for notifications (used with alarms)
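As a rough sketch of what an alarm definition looks like when sent to the Monitoring API, the following Python snippet assembles a request body for a CreateAlarm call. The field names follow the CreateAlarmDetails schema as best understood here and should be checked against the Monitoring API reference; the helper name and the OCIDs are placeholders, and request signing and the HTTP call itself are left out.

```python
import json


def create_alarm_body(name, compartment_id, topic_id, query,
                      pending="PT1H", severity="WARNING"):
    """Assemble a request body for a Monitoring API CreateAlarm call (sketch)."""
    return {
        "displayName": name,
        "compartmentId": compartment_id,        # where the alarm itself lives
        "metricCompartmentId": compartment_id,  # where the metric is emitted
        "namespace": "oci_operations_insights",
        "query": query,
        "severity": severity,
        "pendingDuration": pending,             # ISO 8601 duration; PT1H = 1 hour
        "destinations": [topic_id],             # Notifications topic OCID(s)
        "isEnabled": True,
    }


body = create_alarm_body(
    name="DataFlowResourceAlarmFor3HrData",
    compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
    topic_id="ocid1.onstopic.oc1..example",           # placeholder OCID
    query='DataFlowDelayInHrs[3h]{dataProcessingFrequencyInHrs="3.00"}'
          '.grouping(telemetrySource, resourceId, sourceIdentifier).max() > 48',
)
print(json.dumps(body, indent=2))
```

The same body could be passed to an SDK or the CLI instead of a raw REST call; in either case, the query string is the part you would take from the alarm tables in this topic.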