Setting Up Alarms

You can use the Oracle Cloud Infrastructure Monitoring service to generate alarms when metrics cross thresholds.

First, familiarize yourself with the Monitoring service concepts and features by reviewing Overview of Monitoring. For more information about setting up alarms, see Managing Alarms. To construct advanced queries for both monitoring and alarms, see Monitoring Query Language (MQL) Reference.

Also make sure that you've set the appropriate policies to use alarm rules. Refer to Getting Started.

Before you proceed, create an alarm destination (for example, the Notification service) as well as the topic(s) that define who receives these alarms.

The following table lists metric details you will need to create alarm rules for metrics used in Stack Monitoring.

Resource Type Metric Namespace Alarm Rule Resource Group Alarm Rules Metrics Reference
Host oracle_appmgmt host Hosts Host Metrics
Non-container, container, and pluggable Oracle Databases oracle_oci_database n/a Oracle Database Oracle Database
Oracle Database System, ASM, Cluster, and Listener oracle_oci_database_cluster oracle_asm, oracle_cluster, oracle_db_node, oracle_lsnr Oracle Database Oracle Database Cluster

Oracle WebLogic Domain / Oracle WebLogic Cluster oracle_appmgmt weblogic_cluster Oracle Weblogic Server WebLogic Metrics
Oracle WebLogic Server oracle_appmgmt weblogic_j2eeserver Oracle Weblogic Server WebLogic Metrics
Oracle HTTP Server (OHS) oracle_appmgmt oracle_http_server Oracle HTTP Server (OHS) Oracle HTTP Server (OHS) Metrics
Oracle Identity Manager (OIM) oracle_appmgmt oracle_oim / oracle_oim_cluster Oracle Identity Manager (OIM) Oracle Identity Manager (OIM)
Oracle Access Manager (OAM) oracle_appmgmt oracle_oam / oracle_oam_cluster Oracle Access Manager (OAM) Oracle Access Manager (OAM)
Oracle E-Business Suite oracle_appmgmt ebs_instance Oracle E-Business Suite E-Business Suite Metrics
EBS Application Listener oracle_appmgmt oracle_ebs_app_lsnr Oracle E-Business Suite E-Business Suite Metrics
EBS Concurrent Processing oracle_appmgmt oracle_ebs_conc_mgmt_service Concurrent Processing E-Business Suite Metrics
EBS Concurrent Processing - Specialized oracle_appmgmt oracle_ebs_conc_mgmt_service_specialized Concurrent Processing E-Business Suite Metrics
EBS Concurrent Processing Node oracle_appmgmt oracle_ebs_cp_node Oracle E-Business Suite E-Business Suite Metrics
EBS Forms System oracle_appmgmt oracle_ebs_forms_system Oracle E-Business Suite E-Business Suite Metrics
EBS Workflow Agent Listener oracle_appmgmt oracle_ebs_wf_agent_lsnr Oracle E-Business Suite E-Business Suite Metrics
EBS Workflow Background Engine oracle_appmgmt oracle_ebs_wf_bkgd_engine Oracle E-Business Suite E-Business Suite Metrics
EBS Workflow Group oracle_appmgmt oracle_ebs_wf_group Oracle E-Business Suite E-Business Suite Metrics
EBS Workflow Notification Mailer oracle_appmgmt oracle_ebs_wf_notification_mailer Workflow Notification Mailer E-Business Suite Metrics
Apache Tomcat oracle_appmgmt apache_tomcat Apache Tomcat Apache Tomcat Metrics
Microsoft SQL Server oracle_appmgmt sql_server Microsoft SQL Server Microsoft SQL Server Metrics
PeopleSoft Application Server Domain oracle_appmgmt oracle_psft_appserv PeopleSoft PeopleSoft Metrics
PeopleSoft Process Scheduler Domain oracle_appmgmt oracle_psft_prcs PeopleSoft PeopleSoft Metrics
PeopleSoft PIA oracle_appmgmt oracle_psft_pia PeopleSoft PeopleSoft Metrics
PeopleSoft Elasticsearch oracle_appmgmt elastic_search PeopleSoft PeopleSoft Metrics
PeopleSoft Process Monitor oracle_appmgmt oracle_psft_prcm PeopleSoft PeopleSoft Metrics
Apache HTTP Server oracle_appmgmt apache_http_server Apache HTTP Server Apache HTTP Server Metrics
OUD Directory Server oracle_appmgmt oud_directory Oracle Unified Directory Oracle Unified Directory Metrics
OUD Proxy Server oracle_appmgmt oud_proxy Oracle Unified Directory Oracle Unified Directory Metrics
OUD Replication Gateway oracle_appmgmt oud_gateway Oracle Unified Directory Oracle Unified Directory Metrics
GoldenGate oracle_appmgmt oracle_goldengate Oracle GoldenGate Oracle GoldenGate Metrics
GoldenGate ServiceManager oracle_appmgmt oracle_goldengate_service_manager Oracle GoldenGate Oracle GoldenGate Metrics
GoldenGate AdminServer oracle_appmgmt oracle_goldengate_admin_server Oracle GoldenGate Oracle GoldenGate Metrics
GoldenGate Performance Metric Server oracle_appmgmt oracle_goldengate_pm_server Oracle GoldenGate Oracle GoldenGate Metrics
GoldenGate Extract oracle_appmgmt oracle_goldengate_extract Oracle GoldenGate Oracle GoldenGate Metrics
GoldenGate Replicat oracle_appmgmt oracle_goldengate_replicat Oracle GoldenGate Oracle GoldenGate Metrics
GoldenGate DistributionServer oracle_appmgmt oracle_goldengate_distribution_server Oracle GoldenGate Oracle GoldenGate Metrics
GoldenGate Distribution Path oracle_appmgmt oracle_goldengate_distribution_path Oracle GoldenGate Oracle GoldenGate Metrics

GoldenGate Receiver Server oracle_appmgmt oracle_goldengate_receiver_server Oracle GoldenGate Oracle GoldenGate Metrics
GoldenGate Receiver Path oracle_appmgmt oracle_goldengate_receiver_path Oracle GoldenGate Oracle GoldenGate Metrics
Custom Resource oracle_appmgmt custom_resource Process-based Custom Resource Sample Alarm Rules Process-based Custom Resource Metrics
Oracle Service Bus oracle_appmgmt oracle_servicebus Oracle Service Bus (OSB) Oracle Service Bus (OSB)
Microsoft IIS oracle_appmgmt microsoft_iis Microsoft IIS Microsoft IIS Metrics
Microsoft IIS Website oracle_appmgmt microsoft_iis_website Microsoft IIS Microsoft IIS Metrics

Best practices for common alarm scenarios

  1. Create your alarm rules in the same compartment where you have discovered your resources.
  2. To set up an alarm rule that generates an alarm when a resource is down, specify the appropriate metric namespace and resource group, and use the following metric and trigger rule:

    Metric Name: MonitoringStatus

    Trigger rule:

    • Operator: equal to

    • Value: 0

    • Trigger delay minutes: 3

  3. To set up an alarm rule that triggers for individual resource instances, in addition to choosing the metric, you must also add metric dimensions that uniquely identify the resource.

    To uniquely identify a resource instance:

    1. You can use resourceName and resourceType OR
    2. You can use resourceId

      Most metrics define additional dimensions that can be used to set advanced alarms.

  4. Always refer to the metric description in the Metric Reference and check the evaluation time period (how often each metric is collected). When setting up an alarm, provide the same value as the alarm Interval value. To do this, select Switch to Advanced Mode at the top-right corner of the alarm creation page. You can then enter advanced MQL in the Query code editor section of the advanced mode page.
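
The dimension-based filtering described in the best practices above can be sketched as a small query builder. This is only an illustrative helper, not part of any Oracle tooling: the function name build_mql and the placeholder values in angle brackets are assumptions, and the output simply follows the MQL pattern used by the sample rules on this page.

```python
def build_mql(metric, interval, statistic, operator, threshold, dimensions=None):
    """Assemble an MQL alarm query string from its parts (hypothetical helper)."""
    dims = ""
    if dimensions:
        # Render each dimension as: name = "value"
        pairs = ", ".join(f'{name} = "{value}"' for name, value in dimensions.items())
        dims = "{" + pairs + "}"
    return f"{metric}[{interval}]{dims}.{statistic}() {operator} {threshold}"


# Resource-down rule for every resource in the namespace and resource group.
down_query = build_mql("MonitoringStatus", "1m", "mean", "!=", 1)

# Option 1: identify one instance by resourceName and resourceType (placeholders).
by_name = build_mql(
    "CpuUtilization", "1m", "mean", ">", 80,
    {"resourceName": "<RESOURCE NAME>", "resourceType": "<RESOURCE TYPE>"},
)

# Option 2: identify one instance by resourceId (placeholder).
by_id = build_mql(
    "CpuUtilization", "1m", "mean", ">", 80,
    {"resourceId": "<RESOURCE OCID>"},
)

print(down_query)
print(by_name)
print(by_id)
```

The resulting strings can be pasted into the Query code editor in advanced mode after replacing the placeholders with real values.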

Hosts

Sample Alarm Rule: Host Monitoring

  • Resource Type: Host
  • Metric Namespace: oracle_appmgmt
  • Resource Group: host
Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Host Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
- - Critical alarm for any host in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CpuUtilization

Warning MQL:

CpuUtilization[1m]{type="Total"}.mean() > 80

Critical MQL:

CpuUtilization[1m]{type="Total"}.mean() > 90
> 80 > 90 Warning alarm for any host in a given compartment reporting over 80% CPU utilization for past 5 minutes.

Critical alarm for any host in a given compartment reporting over 90% CPU utilization for past 5 minutes.

15

High Memory Utilization (Warning/Critical)

Metric name: MemoryUtilization

Warning MQL:

MemoryUtilization[1m]{type="Logical"}.mean() > 80

Critical MQL:

MemoryUtilization[1m]{type="Logical"}.mean() > 90
> 80 > 90 Warning alarm for any host in a given compartment reporting over 80% memory utilization for past 5 minutes.

Critical alarm for any host in a given compartment reporting over 90% memory utilization for past 5 minutes.

15

Filesystem Utilization (Warning/Critical)

Metric name: FilesystemUtilization

Warning MQL:

FilesystemUtilization[1m].mean() > 80

Critical MQL:

FilesystemUtilization[1m].mean() > 90
> 80 > 90 Warning alarm for any filesystem on any host in a given compartment reporting over 80% filesystem utilization.

Critical alarm for any filesystem on any host in a given compartment reporting over 90% filesystem utilization.

Note

For monitoring selected file systems, you can further specify the fileSystemName dimension and customize your alarms to your specific needs. For example, the MQL FilesystemUtilization[1m]{fileSystemName = "/", osType = "Linux"}.mean() > 80 applies only to root filesystems on Linux hosts in the given compartment.
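
The effect of the trigger delay recommended for MonitoringStatus alarms can be illustrated with a small simulation. The sketch below is a simplified model, not the Monitoring service's actual evaluation logic: it assumes one status sample per minute (None stands in for an absent sample) and treats the alarm as firing only once the condition has held for the configured number of consecutive minutes. The function names and sample data are hypothetical.

```python
def breaches(sample):
    """True when a one-minute sample is absent (None) or not equal to 1 (up)."""
    return sample is None or sample != 1


def firing_minutes(samples, trigger_delay=3):
    """Return the minute indexes at which the modeled alarm is firing.

    Simplified assumption: the alarm fires once the condition has held
    for `trigger_delay` consecutive one-minute evaluations.
    """
    firing = []
    streak = 0
    for minute, sample in enumerate(samples):
        streak = streak + 1 if breaches(sample) else 0
        if streak >= trigger_delay:
            firing.append(minute)
    return firing


blip = [1, 1, 0, 1, 1, 1]          # a single missed or down sample
outage = [1, 0, None, None, 0, 1]  # down or absent for four minutes

print(firing_minutes(blip))                   # []
print(firing_minutes(outage))                 # [3, 4]
print(firing_minutes(blip, trigger_delay=1))  # [2]
```

In this model, a one-minute blip never fires with a delay of 3, while a sustained outage still does, which is the trade-off the note describes.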

Oracle Database

Sample Alarm Rule: Non-Container Database

  • Resource Type: Non-Container DB

  • Metric Namespace: oracle_oci_database

  • Resource Group: n/a

Evaluation time period (minutes) Alarm Rule(metric or MQL) Warning Critical DBM Recommended Value Used? Description
30 minutes

Metric: StorageUtilizationByTablespace

  • Dimension: tablespaceContents = PERMANENT

OR

Warning MQL:

StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 75

Critical MQL:

StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 85

>75 >85 Y Warning and Critical alarm rule conditions for permanent tablespaces whose utilization is greater than 75% or 85% over the past 30 minutes.
24 hours InvalidObjects >150 >200    
15 minutes BlockingSessions >1 >10 Y Warning and Critical alarm rule conditions to trigger an alarm when the number of blocking sessions is greater than 1 or 10 over the past 15 minutes.
15 minutes UsableFRA <20 <10   Warning and Critical alarm rule conditions to trigger an alarm when the percentage of usable fast recovery area is less than 20% or 10% over the past 15 minutes.
5 minutes ProcessLimitUtilization >70 >80 Y Warning and Critical alarm rule conditions to trigger an alarm when the process utilization (%) is greater than 70% or 80% over the past 5 minutes.
5 minutes SessionLimitUtilization >90 >97    
5 minutes CPUUtilization >80 >85 Y  
5 minutes FRAUtilization >70 >75 Y  
5 minutes StorageUtilization >75 >85 Y  
1 minute MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() n/a n/a   Critical alarm for any Non-Container Oracle Database reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
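
Note that the thresholds in the table above do not all point in the same direction: most rules fire when a value rises above the threshold, but UsableFRA fires when it drops below. The following sketch shows one way to reason about that when testing values by hand. It uses three of the threshold pairs from the Non-Container Database table; the RULES structure and the classify function are hypothetical and are not how the Monitoring service itself is implemented.

```python
import operator

# (comparison, warning threshold, critical threshold), copied from the table above.
RULES = {
    "ProcessLimitUtilization": (operator.gt, 70, 80),
    "BlockingSessions": (operator.gt, 1, 10),
    "UsableFRA": (operator.lt, 20, 10),  # lower is worse for this metric
}


def classify(metric, value):
    """Return the severity a value would map to under the sample thresholds."""
    compare, warning, critical = RULES[metric]
    # Check the critical threshold first because it implies the warning one.
    if compare(value, critical):
        return "CRITICAL"
    if compare(value, warning):
        return "WARNING"
    return "OK"


print(classify("ProcessLimitUtilization", 75))  # WARNING
print(classify("UsableFRA", 15))                # WARNING
print(classify("UsableFRA", 5))                 # CRITICAL
print(classify("BlockingSessions", 1))          # OK
```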

Sample Alarm Rule: Container Database

  • Resource Type: Container DB

  • Metric Namespace: oracle_oci_database

  • Resource Group: n/a

Evaluation time period (minutes) Alarm Rule(metric or MQL) Warning Critical DBM Recommended Value Used? Description
1 minute MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() n/a n/a   Critical alarm for any Container Oracle Database reporting being down or not reporting status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
30 minutes

Metric: StorageUtilizationByTablespace

  • Dimension: tablespaceContents = PERMANENT

OR

Warning MQL:

StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 75

Critical MQL:

StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 85

>75 >85 Y Warning and Critical alarm rule conditions for permanent tablespaces whose utilization is greater than 75% or 85% over the past 30 minutes.
5 minutes ProcessLimitUtilization >70 >80 Y Warning and Critical alarm rule conditions to trigger an alarm when the process utilization (%) is greater than 70% or 80% over the past 5 minutes.
5 minutes SessionLimitUtilization >90 >97    
15 minutes UsableFRA <20 <10   Warning and Critical alarm rule conditions to trigger an alarm when the percentage of usable fast recovery area is less than 20% or 10% over the past 15 minutes.
5 minutes CPUUtilization >80 >85 Y  
5 minutes FRAUtilization >70 >75 Y  
5 minutes StorageUtilization >75 >85 Y  

Sample Alarm Rule: Pluggable Database

  • Resource Type: Pluggable DB

  • Metric Namespace: oracle_oci_database

  • Resource Group: n/a

Evaluation time period (minutes) Alarm Rule(metric or MQL) Warning Critical DBM Recommended Value Used? Description
1 minute MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() n/a n/a   Critical alarm for any Pluggable Oracle Database reporting being down or not reporting status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5 minutes CPUUtilization >80 >85 Y  
5 minutes StorageUtilization >75 >85 Y  
15 minutes BlockingSessions >1 >10 Y Warning and Critical alarm rule conditions to trigger an alarm when the number of blocking sessions is greater than 1 or 10 over the past 15 minutes.
24 hours InvalidObjects >150 >200    
30 minutes

Metric: StorageUtilizationByTablespace

  • Dimension: tablespaceContents = PERMANENT

OR

Warning MQL:

StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 75

Critical MQL:

StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 85

>75 >85 Y Warning and Critical alarm rule conditions for permanent tablespaces whose utilization is greater than 75% or 85% over the past 30 minutes.

Sample Alarm Rule: ASM/ASM Instance

  • Resource Type: ASM

  • Metric Namespace: oracle_oci_database_cluster

  • Resource Group: oracle_asm

Evaluation time period (minutes) Alarm Rule(metric or MQL) Warning Critical Description
1 minute MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() - - Critical alarm for any ASM instance reporting being down or not reporting status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
30 minutes DiskGroupUtilization >85 >95  
30 minutes DiskUtilization >85 >95  

Sample Alarm Rule: ASM Cluster

  • Resource Type: ASM Cluster

  • Metric Namespace: oracle_oci_database_cluster

  • Resource Group: oracle_cluster

Evaluation time period (minutes) Alarm Rule(metric or MQL) Warning Critical Description
1 minute MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() - - Critical alarm for any ASM Cluster reporting being down or not reporting status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
30 minutes DiskGroupUtilization >85 >95  
30 minutes DiskUtilization >85 >95  

Sample Alarm Rule: Listener

  • Resource Type: Listener

  • Metric Namespace: oracle_oci_database_cluster

  • Resource Group: oracle_lsnr

Evaluation time period (minutes) Alarm Rule(metric or MQL) Warning Critical Description
1 minute MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() - - Critical alarm for any Listener reporting being down or not reporting status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5 minutes RefusedConnections >1 >5  

E-Business Suite

Sample Alarm Rule: EBS

  • Resource Type: Oracle E-Business Suite
  • Metric Namespace: oracle_appmgmt
  • Resource Group: ebs_instance
Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
15

Executed Programs By Running Time (ms)

Metric name: ExecutedProgramsByRunningTime

MQL:

ExecutedProgramsByRunningTime[15m].mean() > 4000

Tip1:

You can filter the alarm to a specific application by adding a ProgramName or ProgramShortName dimension filter.

> 4000 > 40000 The running time of the requests
15

Completed Requests By Application (ratio)

Metric name: CompletedRequestsByApplication

Dimension name: Category

Dimension value: Error

MQL:

CompletedRequestsByApplication[15m]{Category = "Error"}.mean() > 0.001

Tip1:

You can filter the alarm to a specific application by adding an ApplicationName dimension filter.

MQL:

CompletedRequestsByApplication[15m]{Category = "Error", ApplicationName = "<YOUR APP NAME>"}.mean() > 0.001
> 0.001 > 0.0025

The ratio of requests that completed with error compared to all requests in given collection interval.

This means that if more than 0.1% of requests fail, you get a warning alarm; if more than 0.25% fail, you get a critical alarm.

15

Active User Sessions

Metric name: ActiveUserSessions

MQL:

ActiveUserSessions[15m].mean() > 200
> 200 > 250 The number of active user sessions
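
The relationship between the ratio thresholds for CompletedRequestsByApplication (0.001 and 0.0025) and the percentages quoted in the description (0.1% and 0.25%) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the metric already reports a ratio, so the errored and total counts, and the function name error_severity, are hypothetical inputs used to show how the thresholds translate.

```python
WARNING_RATIO = 0.001    # 0.1% of requests completed with error
CRITICAL_RATIO = 0.0025  # 0.25% of requests completed with error


def error_severity(errored, total):
    """Map hypothetical request counts to the severity the sample thresholds imply."""
    if total == 0:
        return "OK"  # no requests in the interval, so nothing to compare
    ratio = errored / total
    if ratio > CRITICAL_RATIO:
        return "CRITICAL"
    if ratio > WARNING_RATIO:
        return "WARNING"
    return "OK"


print(error_severity(0, 1000))  # OK
print(error_severity(2, 1000))  # WARNING  (0.2% is above 0.1%)
print(error_severity(3, 1000))  # CRITICAL (0.3% is above 0.25%)
```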

Sample Alarm Rule: EBS Application Listener

Resource Type: EBS Application Listener

Metric Namespace: oracle_appmgmt

Resource Group: oracle_ebs_app_lsnr

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Monitoring Status

MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
n/a 0 Critical alarm for EBS Application Listener in a given compartment reporting being down or not reporting status for over 1min.

Sample Alarm Rule: EBS Concurrent Processing

Resource Type: EBS Concurrent Processing

Metric Namespace: oracle_appmgmt

Resource Group: oracle_ebs_conc_mgmt_service

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Monitoring Status

Metric name: MonitoringStatus

MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
N/A 0 The availability status.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
15

Concurrent Requests Error Rate

Metric name: CompletedConcurrentRequests

Dimension name: "State"

Dimension value: "Errored"

MQL:

CompletedConcurrentRequests[15m]{State = "Errored"}.mean() > 0.001
> 0.001 > 0.0025 The rate of requests that completed with errors on an hourly basis. If multiplied by 100, becomes a percentage.
15

Concurrent Requests Warning Rate

Metric name: CompletedConcurrentRequests

Dimension name: "State"

Dimension value: "WithWarning"

MQL:

CompletedConcurrentRequests[15m]{State = "WithWarning"}.mean() > 0.0015
> 0.0015 > 0.003 The rate of requests that completed with warning on an hourly basis. If multiplied by 100, becomes a percentage.
15

Concurrent Requests Completed Successfully (ops/evaluation time period)

Metric name: CompletedConcurrentRequests

Dimension name: "State"

Dimension value: "Successful"

MQL:

CompletedConcurrentRequests[15m]{State = "Successful"}.sum() > 2500
> 625 > 2500 The rate of requests that completed successfully per evaluation time period (15 minutes by default).
15

Concurrent Requests Running

Metric name: ConcurrentRequestsByStatus

Dimension name: "State"

Dimension value: "Running"

MQL:

ConcurrentRequestsByStatus[15m]{State = "Running"}.mean() > 100
> 2500 > 10000 The number of running requests by user.
15

Concurrent Requests Pending - Normal

Metric name: ConcurrentRequestsByStatus

Dimension name: "State"

Dimension value: "PendingNormal"

MQL:

ConcurrentRequestsByStatus[15m]{State = "PendingNormal"}.mean() > 100
> 2500 > 10000 The number of pending requests by user.
15

Concurrent Requests Pending - Standby

Metric name: ConcurrentRequestsByStatus

Dimension name: "State"

Dimension value: "PendingStandBy"

MQL:

ConcurrentRequestsByStatus[15m]{State = "PendingStandBy"}.mean() > 100
> 100 > 500 The number of requests in pending stand-by status.
15

Concurrent Requests Inactive - No Manager

Metric name: ConcurrentRequestsByStatus

Dimension name: "State"

Dimension value: "InactiveNoManager"

MQL:

ConcurrentRequestsByStatus[15m]{State = "InactiveNoManager"}.mean() > 100
> 100 > 500 The number of requests in inactive no manager status.
15

Concurrent Requests Inactive - On Hold

Metric name: ConcurrentRequestsByStatus

Dimension name: "State"

Dimension value: "InactiveOnHold"

MQL:

ConcurrentRequestsByStatus[15m]{State = "InactiveOnHold"}.mean() > 100
> 100 > 500 The number of requests in inactive on hold status.
5

Long Running Concurrent Requests (ms)

Metric name: LongActiveConcurrentRequests

MQL:

LongActiveConcurrentRequests[5m].mean() > 43200000

Tip1:

You can filter the alarm to Running or Pending requests by adding a Phase dimension filter.

MQL:

LongActiveConcurrentRequests[5m]{Phase = "Running"}.mean() > 43200000

Tip2:

You can further filter by a specific program by adding a ProgramName or ProgramShortName dimension filter.

MQL:

LongActiveConcurrentRequests[5m]{Phase = "Running", ProgramShortName = "<PROGRAM SHORT NAME>"}.mean() > 43200000
> 43200000 > 86400000 The elapsed time in ms for a pending or running request. Only the top 10 requests are tracked. The suggested thresholds raise a Warning alarm after 12 hours and a Critical alarm after 24 hours.
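
Because LongActiveConcurrentRequests is reported in milliseconds, thresholds expressed in hours need to be converted before they go into the MQL. The sketch below shows the arithmetic behind the 12-hour and 24-hour values above and how a different limit could be derived. The helper names are hypothetical, and the query string only mirrors the pattern of the samples in this section.

```python
MS_PER_HOUR = 60 * 60 * 1000  # minutes * seconds * milliseconds


def hours_to_ms(hours):
    """Convert a threshold in whole hours to milliseconds."""
    return hours * MS_PER_HOUR


def long_running_mql(hours, phase="Running"):
    """Build a query in the same shape as the sample rule (hypothetical helper)."""
    return (
        f'LongActiveConcurrentRequests[5m]{{Phase = "{phase}"}}.mean() '
        f"> {hours_to_ms(hours)}"
    )


print(hours_to_ms(12))      # 43200000
print(hours_to_ms(24))      # 86400000
print(long_running_mql(6))  # a stricter, 6-hour variant of the warning rule
```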

EBS Concurrent Processing - Specialized

Resource Type: EBS Concurrent Processing - Specialized

Metric Namespace: oracle_appmgmt

Resource Group: oracle_ebs_conc_mgmt_service_specialized

Metric Metric Display Name Unit Description Collection Frequency Dimension Resource Name
MonitoringStatus Availability status

Status of the resource. Values are:

1 = Up

0 = Down

Only if ALL other managers are up, status is up. If only one manager is down, overall status is down.

1 min NA oracle_ebs_conc_mgmt_service_specialized
ConcurrentProcesingComponentStatus Concurrent Manager Status status Availability of concurrent manager 1 min Concurrent Queue Name, Description, Host Name oracle_ebs_conc_mgmt_service_specialized
CapacityUtilizationOfConcurrentManagers Concurrent Manager Capacity Utilization percent Percentage of max processes running. If manager's max processes is 10 and 5 are running, capacity utilization is 50% 1 min Manager Name oracle_ebs_conc_mgmt_service_specialized
ManagerMaxProcesses Concurrent Manager Max Processes count Maximum number of processes to be in the manager's queue. 1 min Manager Name oracle_ebs_conc_mgmt_service_specialized
ManagerRunningProcesses Concurrent Manager Running Processes count Number of running processes in the manager's queue 1 min Manager Name oracle_ebs_conc_mgmt_service_specialized

Sample Alarm Rule: EBS Workflow Notification Mailer

Resource Type: EBS Workflow Notification Mailer

Metric Namespace: oracle_appmgmt

Resource Group: oracle_ebs_wf_notification_mailer

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Monitoring Status

Metric name: MonitoringStatus

MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
N/A 0 Critical alarm for EBS Concurrent Processing Specialized in a given compartment reporting being down or not reporting status for over 1min.
1

Metric name: Concurrent Manager Capacity Utilization

MQL:

CapacityUtilizationOfConcurrentManagers[1m].mean() < 100
< 50 < 100 Percentage of capacity utilization of all enabled managers.

Apache Tomcat

Sample Alarm Rule: Apache Tomcat

Resource Type: Apache Tomcat

Metric Namespace: oracle_appmgmt

Resource Group: apache_tomcat

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1-5

Apache Tomcat Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
- - Critical alarm for any Apache Tomcat in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
15

High CPU Utilization (Warning/Critical)

Metric name: CPUUtilization

Warning MQL:

CPUUtilization[1m].mean() > 80

Critical MQL:

CPUUtilization[1m].mean() > 90
>80 >90

Warning alarm for any Apache Tomcat in a given compartment reporting over 80% CPU utilization for past 15 minutes.

Critical alarm for any Apache Tomcat in a given compartment reporting over 90% CPU utilization for past 15 minutes.

10

High JVM Heap Memory Utilization (Warning/Critical)

Metric name: JVMMemoryUtilization

Warning MQL:

JVMMemoryUtilization[1m]{Type = "Heap"}.mean() > 80

Critical MQL:

JVMMemoryUtilization[1m]{Type = "Heap"}.mean() > 90
>80 >90

Warning alarm for any Apache Tomcat in a given compartment reporting over 80% JVM heap memory utilization for past 10 minutes.

Critical alarm for any Apache Tomcat in a given compartment reporting over 90% JVM heap memory utilization for past 10 minutes.

1-5

High Web Request Processing Time (Warning/Critical)

Metric name: WebRequestProcessingTime

Warning MQL:

WebRequestProcessingTime[1m].mean() > 1500

Critical MQL:

WebRequestProcessingTime[1m].mean() > 3000
>1500 >3000

Warning alarm for any Apache Tomcat in a given compartment reporting over 1500ms mean web request processing time for past 1-5 minutes.

Critical alarm for any Apache Tomcat in a given compartment reporting over 3000ms mean web request processing time for past 1-5 minutes.

Microsoft SQL Server

Sample Alarm Rules: Microsoft SQL Server

Resource Type: Microsoft SQL Server

Metric Namespace: oracle_appmgmt

Resource Group: sql_server

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1-5

SQL Server Availability Status

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
- - Critical alarm for any SQL Server in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
15

SQL Server CPU Utilization(%) (Warning/Critical)

Metric name: CPUUtilization

Warning MQL:

CpuUtilization[15m].mean() > 80

Critical MQL:

CpuUtilization[15m].mean() > 95
>80 >95 Warning alarm for any SQL Server in a given compartment reporting over 80% CPU utilization for past 15 minutes.

Critical alarm for any SQL Server in a given compartment reporting over 95% CPU utilization for past 15 minutes.

PeopleSoft

PeopleSoft Application Server

  • Resource Type: PeopleSoft Application Server Domain
  • Metric Namespace: oracle_appmgmt
  • Resource Group: oracle_psft_appserv
Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
5

Health (status)

Metric name: Health

Warning MQL:

Health[1m]{HealthState = "Warning"}.mean() == 1

Critical MQL:

Health[1m]{HealthState = "Critical"}.mean() == 1
warning = 1 critical = 1

Overall health of the application server domain.

A warning alarm will be fired if the state 'warning' is equal to 1.

A critical alarm will be fired if the state 'critical' is equal to 1.

5

Load (status)

Metric name: Load

Warning MQL:

Load[1m]{LoadState = "Medium"}.mean() == 1

Critical MQL:

Load[1m]{LoadState = "Heavy"}.mean() == 1
medium = 1 heavy = 1

Overall load of the application server domain.

A warning alarm will be fired if the state 'medium' is equal to 1.

A critical alarm will be fired if the state 'heavy' is equal to 1.

5

Average Service Request Execution Time (ms)

Metric name: AverageServiceRequestExecutionTime

Warning MQL:

AverageServiceRequestExecutionTime[5m].mean() > 1000
> 1000 NA

Average time in milliseconds it takes to execute a service request.

A warning alarm is fired when, on average, a request takes more than one second (1000 ms) to execute.

5

Queued Processes for Application Server (count)

Metric name: QueuedTuxedoProcesses

Dimension name: Category

Dimension value: ApplicationServer

Critical MQL:

QueuedTuxedoProcesses[5m]{Category = "ApplicationServer"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the Application Server. More than 1 process in queue will fire a critical alarm.
5

Queued Processes for BRK Handler (count)

Metric name: QueuedTuxedoProcesses

Dimension name: Category

Dimension value: BRKHandler

Critical MQL:

QueuedTuxedoProcesses[5m]{Category = "BRKHandler"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the BRK Handler. More than 1 process in queue will fire a critical alarm.
5

Queued Processes for BRK Dispatcher (count)

Metric name: QueuedTuxedoProcesses

Dimension name: Category

Dimension value: BRKDispatcher

Critical MQL:

QueuedTuxedoProcesses[5m]{Category = "BRKDispatcher"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the BRK Dispatcher. More than 1 process in queue will fire a critical alarm.
5

Queued Processes for PUB Dispatcher (count)

Metric name: QueuedTuxedoProcesses

Dimension name: Category

Dimension value: PUBDispatcher

Critical MQL:

QueuedTuxedoProcesses[5m]{Category = "PUBDispatcher"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the PUB Dispatcher. More than 1 process in queue will fire a critical alarm.
5

Queued Processes for PUB Handler (count)

Metric name: QueuedTuxedoProcesses

Dimension name: Category

Dimension value: PUBHandler

Critical MQL:

QueuedTuxedoProcesses[5m]{Category = "PUBHandler"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the PUB Handler. More than 1 process in queue will fire a critical alarm.
5

Queued Processes for SUB Dispatcher (count)

Metric name: QueuedTuxedoProcesses

Dimension name: Category

Dimension value: SUBDispatcher

Critical MQL:

QueuedTuxedoProcesses[5m]{Category = "SUBDispatcher"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the SUB Dispatcher. More than 1 process in queue will fire a critical alarm.
5

Queued Processes for SUB Handler (count)

Metric name: QueuedTuxedoProcesses

Dimension name: Category

Dimension value: SUBHandler

Critical MQL:

QueuedTuxedoProcesses[5m]{Category = "SUBHandler"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the SUB Handler. More than 1 process in queue will fire a critical alarm.
5

Failed Server Processes (count)

Metric name: FailedServerProcesses

Critical MQL:

FailedServerProcesses[5m].mean() > 0
NA > 0 Number of server processes that have failed or are down within the domain. If any server process fails, a critical alarm will be fired.
15

State Files (count)

Metric name: PeopleToolsStateFiles

Warning MQL:

PeopleToolsStateFiles[15m].mean() > 0
> 0 NA Number of PeopleTools state files generated in the domain logs directory. If any state file is generated, a warning alarm will be fired.
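
The seven QueuedTuxedoProcesses rules above differ only in the value of the Category dimension, so the queries can be generated rather than typed individually. The following sketch assumes the category values listed in the table; the function name and the returned dictionary are hypothetical conveniences, and the output would still need to be entered into each alarm rule.

```python
# Category dimension values taken from the sample rules above.
CATEGORIES = [
    "ApplicationServer",
    "BRKHandler",
    "BRKDispatcher",
    "PUBDispatcher",
    "PUBHandler",
    "SUBDispatcher",
    "SUBHandler",
]


def queued_process_queries(categories, threshold=1):
    """Return one critical MQL query per Category value."""
    return {
        category: (
            f'QueuedTuxedoProcesses[5m]{{Category = "{category}"}}.mean() '
            f"> {threshold}"
        )
        for category in categories
    }


queries = queued_process_queries(CATEGORIES)
for category, query in queries.items():
    print(f"{category}: {query}")
```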

PeopleSoft Process Scheduler

  • Resource Type: PeopleSoft Process Scheduler Domain
  • Metric Namespace: oracle_appmgmt
  • Resource Group: oracle_psft_prcs
Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
5

Health (status)

Metric name: Health

Warning MQL:

Health[1m]{HealthState = "Warning"}.mean() == 1

Critical MQL:

Health[1m]{HealthState = "Critical"}.mean() == 1
warning = 1 critical = 1

Overall health of the process scheduler domain.

A warning alarm will be fired if the state 'warning' is equal to 1.

A critical alarm will be fired if the state 'critical' is equal to 1.

5

Load (status)

Metric name: Load

Warning MQL:

Load[1m]{LoadState = "Medium"}.mean() == 1

Critical MQL:

Load[1m]{LoadState = "Heavy"}.mean() == 1
medium = 1 heavy = 1

Overall load of the process scheduler domain.

A warning alarm will be fired if the state 'medium' is equal to 1.

A critical alarm will be fired if the state 'heavy' is equal to 1.

5

Queued Processes for PSPRCSRV (count)

Metric name: QueuedTuxedoProcesses

Dimension name: ProcessType

Dimension value: PSPRCSRV

Critical MQL:

QueuedTuxedoProcesses[5m]{ProcessType = "PSPRCSRV"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the process scheduler (PSPRCSRV). More than 1 process in queue will fire a critical alarm.
5

Queued Processes for PSDSTSRV (count)

Metric name: QueuedTuxedoProcesses

Dimension name: ProcessType

Dimension value: PSDSTSRV

Critical MQL:

QueuedTuxedoProcesses[5m]{ProcessType = "PSDSTSRV"}.mean() > 1
NA > 1 Number of processes that are currently in queue for the distribution server (PSDSTSRV). More than 1 process in queue will fire a critical alarm.
5

Failed Processes (count)

Metric name: FailedProcesses

Critical MQL:

FailedProcesses[5m].mean() > 0
NA > 0 Number of server processes that have failed or are down within the domain. If any server process fails, a critical alarm will be fired.
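
The queries in this table share one shape: a metric name, an interval, an optional dimension filter, a statistic, and a comparison. The Python sketch below assembles such a string from its parts. The helper name build_mql and its parameters are hypothetical and are not part of any Oracle tooling; always check the resulting query against the Monitoring Query Language (MQL) Reference.

```python
def build_mql(metric, interval, statistic, operator, threshold, dimensions=None):
    """Assemble 'Metric[interval]{dims}.statistic() operator threshold'."""
    dim_part = ""
    if dimensions:
        # Each dimension filter is rendered as: Name = "value"
        pairs = ", ".join(f'{name} = "{value}"' for name, value in dimensions.items())
        dim_part = "{" + pairs + "}"
    return f"{metric}[{interval}]{dim_part}.{statistic}() {operator} {threshold}"


# Critical rule for the process scheduler queue, as listed in the table above.
queue_alarm = build_mql(
    "QueuedTuxedoProcesses", "5m", "mean", ">", 1,
    dimensions={"ProcessType": "PSPRCSRV"},
)
print(queue_alarm)
```

Keeping the dimension filter as a dictionary makes it easy to reuse the same call for PSDSTSRV by changing only the dimension value.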

PeopleSoft PIA

  • Resource Type: PeopleSoft PIA
  • Metric Namespace: oracle_appmgmt
  • Resource Group: oracle_psft_pia
Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
5

Health (status)

Metric name: Health

Warning MQL:

Health[1m]{HealthState = "Warning"}.mean() = 1

Critical MQL:

Health[1m]{HealthState = "Critical"}.mean() = 1
warning = 1 critical = 1

Overall health of the PIA.

A warning alarm will be fired if the state 'warning' is equal to 1.

A critical alarm will be fired if the state 'critical' is equal to 1.

5

Load (status)

Metric name: Load

Warning MQL:

Load[1m]{LoadState = "Medium"}.mean() = 1

Critical MQL:

Load[1m]{LoadState = "Heavy"}.mean() = 1
medium = 1 heavy = 1

Overall load of the PIA.

A warning alarm will be fired if the state 'medium' is equal to 1.

A critical alarm will be fired if the state 'heavy' is equal to 1.

5

Wait State Sockets (count)

Metric name: WaitStateSockets

Warning MQL:

WaitStateSockets[5m].mean() > 100
> 100 NA Number of web server sockets that are in WAIT state. If more than 100 web server sockets are in WAIT state, a warning alarm will be fired.
5

Fatal Errors (count)

Metric name: FatalErrors

Warning MQL:

FatalErrors[5m].mean() > 0
> 0 NA Number of fatal errors in the JOLTService servlet logs. If any error occurs in the JOLTService servlet, a warning alarm will be fired.
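
The Health and Load rules above each test a single value of a state dimension (for example HealthState = "Warning") and fire when that state reports 1. The sketch below shows one way to reason about how the two Health rules combine into a single severity. The function name and the dictionary input are assumptions made for illustration; in the Monitoring service each alarm is evaluated on its own.

```python
# Hypothetical mapping from HealthState dimension values to alarm severity.
SEVERITY_BY_STATE = {"Warning": "WARNING", "Critical": "CRITICAL"}


def health_severity(state_means):
    """Return the most severe state whose mean equals 1, or None.

    state_means maps a HealthState value to the mean reported for that
    state over the evaluation interval.
    """
    for state in ("Critical", "Warning"):  # check the more severe state first
        if state_means.get(state) == 1:
            return SEVERITY_BY_STATE[state]
    return None


print(health_severity({"Warning": 1, "Critical": 0}))
print(health_severity({"Warning": 0, "Critical": 0}))
```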

PeopleSoft Elasticsearch

  • Resource Type: PeopleSoft Elasticsearch
  • Metric Namespace: oracle_appmgmt
  • Resource Group: elastic_search
Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Metric name: Cluster Health

Warning MQL:

ClusterHealth[1m]{Status = "Yellow"}.mean() = 1

Critical MQL:

ClusterHealth[1m]{Status = "Red"}.mean() = 1
yellow = 1 red = 1

Overall health of the Elasticsearch cluster.

A warning alert will be triggered if the status 'yellow' is equal to 1.

A critical alert will be triggered if the status 'red' is equal to 1.

10

Metric name: Memory Utilization

Warning MQL:

MemoryUsage[10m].mean() > 80

Critical MQL:

MemoryUsage[10m].mean() > 90
> 80 > 90

Maximum configured heap of the Elasticsearch node.

A warning alert will be triggered if the memory utilization is greater than 80%.

A critical alert will be triggered if the memory utilization is greater than 90%.

PeopleSoft Process Monitor

  • Resource Type: PeopleSoft Process Monitor
  • Metric Namespace: oracle_appmgmt
  • Resource Group: oracle_psft_prcm
Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
5

Metric Name: Active Distribution State

MQL:

ActiveDistributionState[5m]{State = "NotPosted"}.mean() > 1
NA > 1

A critical alert will be triggered if more than one process is in the distribution 'NotPosted' state.

5

Metric Name: Run Status

MQL:

ActiveRunState[5m]{State = "NoSuccess"}.mean() > 1
NA > 1

A critical alert will be triggered if more than one process is in the run 'NoSuccess' state.

5

Metric Name: Run Status

MQL:

RunStatus[5m]{Status = "Error"}.mean() > 0
NA > 0

A critical alert will be triggered if any process is in the run 'Error' state.

Oracle Weblogic Server

Sample Alarm Rule: Oracle Weblogic Server

  • Resource Type: Oracle Weblogic Server
  • Metric Namespace: oracle_appmgmt
  • Resource Group: weblogic_j2eeserver
Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

WebLogic Server Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()

- - Critical alarm for any WebLogic Server in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CpuUtilization

Warning MQL:

CpuUtilization[5m].mean() > 80

Critical MQL:

CpuUtilization[5m].mean() > 90

> 80 > 90

Warning alarm for any WebLogic Server in a given compartment reporting over 80% CPU utilization for past 5 minutes.

Critical alarm for any WebLogic Server in a given compartment reporting over 90% CPU utilization for past 5 minutes.

5

High Heap Utilization (Warning/Critical)

Metric name: MemoryUtilization

Warning MQL:

MemoryUtilization[5m]{Type = "Heap"}.mean() > 80

Critical MQL:

MemoryUtilization[5m]{Type = "Heap"}.mean() > 90

> 80 > 90

Warning alarm for any WebLogic Server in a given compartment reporting over 80% Heap utilization for past 5 minutes.

Critical alarm for any WebLogic Server in a given compartment reporting over 90% Heap utilization for past 5 minutes.

5

Work Manager Stuck Threads (Warning/Critical)

Metric name: WorkManagerStuckThreads

> 10 > 15

Warning alarm for any WebLogic Server in a given compartment reporting more than 10 work manager stuck threads for past 5 minutes.

Critical alarm for any WebLogic Server in a given compartment reporting more than 15 work manager stuck threads for past 5 minutes.

15

Connection Requests Waiting

Metric name: ServerConnectionPoolConnections

Warning MQL:

ServerConnectionPoolConnections[15m].mean() > 1

Critical MQL:

ServerConnectionPoolConnections[15m].mean() > 2

>1 >2  
20

Web Request Processing Time

Metric name: WebRequestProcessingTime

>10000 >15000  
5

Active Thread Pool Threads

Metric name: ThreadPoolThreads

>1000 >1250  
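
Three rows in the table above (Work Manager Stuck Threads, Web Request Processing Time, and Active Thread Pool Threads) list thresholds but no MQL. The sketch below derives candidate queries for them, assuming they follow the same Metric[interval].mean() > threshold pattern as the other rows and that the interval equals the listed evaluation time period. Both points are assumptions, so verify the generated queries in the Console before relying on them.

```python
# (metric name, evaluation period in minutes, warning threshold, critical threshold)
# Values are copied from the table above; the query pattern is assumed.
ROWS = [
    ("WorkManagerStuckThreads", 5, 10, 15),
    ("WebRequestProcessingTime", 20, 10000, 15000),
    ("ThreadPoolThreads", 5, 1000, 1250),
]


def threshold_queries(metric, minutes, warning, critical):
    """Return the warning and critical query strings for one table row."""
    base = f"{metric}[{minutes}m].mean()"
    return {"warning": f"{base} > {warning}", "critical": f"{base} > {critical}"}


for row in ROWS:
    queries = threshold_queries(*row)
    print(queries["warning"])
    print(queries["critical"])
```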

Sample Alarm Rule: Oracle Weblogic Server Cluster

  • Resource Type: Oracle Weblogic Server Cluster

  • Metric Namespace: oracle_appmgmt

  • Resource Group: weblogic_cluster

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

WebLogic Cluster Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
- - Critical alarm for any WebLogic Cluster in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
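
The note above recommends a trigger delay of 3 minutes so that a single missed or failed status sample does not raise a notification. The sketch below is a simplified model of that behaviour: the alarm fires only once the condition has held for the configured number of consecutive one-minute evaluations. The function names and the sample lists are invented for illustration, and the real service may evaluate intervals differently.

```python
def breaches(sample):
    """True when MonitoringStatus is absent (None) or not equal to 1."""
    return sample is None or sample != 1


def first_firing_minute(samples, trigger_delay_minutes=3):
    """Return the 0-based minute at which the alarm would fire, or None."""
    consecutive = 0
    for minute, sample in enumerate(samples):
        # Count consecutive breaching minutes; any healthy sample resets the count.
        consecutive = consecutive + 1 if breaches(sample) else 0
        if consecutive >= trigger_delay_minutes:
            return minute
    return None


blips = [1, 0, 1, 1, None, 1]   # isolated failures, never 3 in a row
outage = [1, 0, 0, None, 1]     # down or absent for 3 consecutive minutes
print(first_firing_minute(blips))
print(first_firing_minute(outage))
```

With a delay of 1 the first list would fire at the first failed sample, which is the kind of false notification the note is trying to avoid.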

Sample Alarm Rules: Oracle HTTP Server (OHS)

  • Resource Type: Oracle HTTP Server

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_http_server

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1 (or more to ignore random connectivity blips, e.g. 5 min)

Oracle HTTP Server Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
- - Critical alarm for any Oracle HTTP Server in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CPUUtilization

Warning MQL:

CPUUtilization[5m].mean() > 80

Critical MQL:

CPUUtilization[5m].mean() > 90
>80 >90

Warning alarm for any Oracle HTTP Server in a given compartment reporting over 80% CPU utilization for past 5 minutes.

Critical alarm for any Oracle HTTP Server in a given compartment reporting over 90% CPU utilization for past 5 minutes.

5

High Memory Utilization (Warning/Critical)

Metric name: MemoryUtilization

Warning MQL:

MemoryUtilization[5m].mean() > 80

Critical MQL:

MemoryUtilization[5m].mean() > 90
>80 >90

Warning alarm for any Oracle HTTP Server in a given compartment reporting over 80% memory utilization for past 5 minutes.

Critical alarm for any Oracle HTTP Server in a given compartment reporting over 90% memory utilization for past 5 minutes.

1-5

High Web Request Processing Time (Warning/Critical)

Metric name: WebRequestProcessingTime

Warning MQL:

WebRequestProcessingTime[1m].mean() > 1500

Critical MQL:

WebRequestProcessingTime[1m].mean() > 3000
>1500 >3000

Warning alarm for any Oracle HTTP Server in a given compartment reporting over 1500ms mean web request processing time for past 1-5 minutes.

Critical alarm for any Oracle HTTP Server in a given compartment reporting over 3000ms mean web request processing time for past 1-5 minutes.
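
The interval in square brackets and the mean() statistic mean that datapoints are averaged per interval before the threshold is applied, so one slow request does not necessarily trip a rule. The sketch below illustrates this with invented per-minute samples of web request processing time. It assumes fixed windows aligned to minute 0, which is a simplification and may not match how the service aligns intervals.

```python
from statistics import mean


def windowed_means(samples, window_minutes=5):
    """Group (minute, value) samples into fixed windows and average each one."""
    buckets = {}
    for minute, value in samples:
        start = minute - minute % window_minutes  # first minute of the window
        buckets.setdefault(start, []).append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Invented samples in milliseconds: one spike in the first window,
# sustained slowness in the second.
samples = [(0, 1000), (1, 1200), (2, 4000), (3, 1100), (4, 1200),
           (5, 3500), (6, 3600)]

for start, average in windowed_means(samples).items():
    level = "critical" if average > 3000 else "warning" if average > 1500 else "ok"
    print(start, average, level)
```

In this made-up data the 4000 ms spike raises the first window's mean only into the warning range, while the second window exceeds the critical threshold.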

Oracle Identity Manager (OIM)

Sample Alarm Rule: Oracle Identity Manager (OIM)

  • Resource Type: Oracle Identity Manager / Oracle Identity Manager Cluster

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_oim / oracle_oim_cluster

Evaluation Time Period (minutes) Alarm Warning Critical Description
1

Metric name: Monitoring Status

MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
---- < 1

Availability status of the OIM cluster/server.

A critical alert will be triggered if the response value is other than 1.

Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
15

Metric Name: Orchestration - Average Execution Time

MQL:

Orchestration-AverageExecutionTime[15m].mean() > 300
Orchestration-AverageExecutionTime[15m].mean() > 500
> 300 > 500

Orchestration Average Execution Time

A warning alert will be triggered if the orchestration average execution time is greater than 300 ms.

A critical alert will be triggered if the orchestration average execution time is greater than 500 ms.

Oracle Access Manager (OAM)

Sample Alarm Rule: Oracle Access Manager (OAM)

  • Resource Type: Oracle Access Manager / Oracle Access Manager Cluster

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_oam / oracle_oam_cluster

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Metric name: Monitoring Status

MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
---- < 1

Availability status of the OAM cluster/server.

A critical alert will be triggered if the response value is other than 1.

Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5

Metric Name: Authorization Latency

MQL:

authorizationLatency[5m].mean() > 500
authorizationLatency[5m].mean() > 800
> 500 > 800

Authorization Latency

A warning alert will be triggered if the authorization latency is greater than 500 ms.

A critical alert will be triggered if the authorization latency is greater than 800 ms.

Apache HTTP Server

  • Resource Type: Apache HTTP Server

  • Metric Namespace: oracle_appmgmt

  • Resource Group: apache_http_server

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1 (or more to ignore random connectivity blips, e.g. 5 min)

Apache HTTP Server Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()
- - Critical alarm for any Apache HTTP Server in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
10

High Memory Utilization (Warning/Critical)

Metric name: MemoryUtilization

Warning MQL:

MemoryUtilization[10m].mean() > 80

Critical MQL:

MemoryUtilization[10m].mean() > 90
>80 >90

Warning alarm for any Apache HTTP Server in a given compartment reporting over 80% memory utilization for past 10 minutes.

Critical alarm for any Apache HTTP Server in a given compartment reporting over 90% memory utilization for past 10 minutes.

1-5

High Web Request Processing Time (Warning/Critical)

Metric name: WebRequestProcessingTime

Warning MQL:

WebRequestProcessingTime[1m].mean() > 1500

Critical MQL:

WebRequestProcessingTime[1m].mean() > 3000
>1500 >3000

Warning alarm for any Apache HTTP Server in a given compartment reporting over 1500ms mean web request processing time for past 1-5 minutes.

Critical alarm for any Apache HTTP Server in a given compartment reporting over 3000ms mean web request processing time for past 1-5 minutes.
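
To create one of these rules outside the Console, the namespace, resource group, query, and trigger delay all have to be carried in the alarm definition. The sketch below builds a plain dictionary shaped like the Monitoring API's CreateAlarmDetails request body. The field names reflect my reading of that API and should be checked against the API reference; the helper name and the OCID values are placeholders, and the trigger delay is expressed as an ISO 8601 duration such as PT3M.

```python
import json


def alarm_payload(display_name, query, severity, resource_group,
                  compartment_id, topic_id, trigger_delay_minutes=1):
    """Build a request body shaped like CreateAlarmDetails (field names assumed)."""
    return {
        "displayName": display_name,
        "compartmentId": compartment_id,
        "metricCompartmentId": compartment_id,
        "namespace": "oracle_appmgmt",
        "resourceGroup": resource_group,
        "query": query,
        "severity": severity,
        # Trigger delay as an ISO 8601 duration, for example PT3M for 3 minutes.
        "pendingDuration": f"PT{trigger_delay_minutes}M",
        "destinations": [topic_id],
        "isEnabled": True,
    }


payload = alarm_payload(
    display_name="Apache HTTP Server Down",
    query="MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()",
    severity="CRITICAL",
    resource_group="apache_http_server",
    compartment_id="ocid1.compartment.oc1..exampleuniqueID",  # placeholder OCID
    topic_id="ocid1.onstopic.oc1..exampleuniqueID",           # placeholder OCID
    trigger_delay_minutes=3,
)
print(json.dumps(payload, indent=2))
```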

Oracle Unified Directory

Sample Alarm Rule: Oracle Unified Directory (OUD)

  • Resource Type: Oracle Unified Directory

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oud_directory, oud_proxy, oud_gateway

Alarm Warning Critical Description

Metric name: Monitoring Status

MQL:

oud_base_status[1m].mean() != 1 || oud_base_status[1m].absent()
---- < 1

Availability status of the OUD server.

A critical alert will be triggered if the response value is less than 1.

Metric Name: Connection Handler State

MQL:

ConnectionHandlerState[1m].mean() < 1
---- <1

Connection Handler State

A critical alert will be triggered if the Connection Handler State is less than 1.

Metric Name: Backend Entries

MQL:

BackendEntries[5m].mean() > 30
BackendEntries[5m].mean() > 50
> 30 >50

Backend Entries

A warning alert will be triggered if the Backend Entries is greater than 30.

A critical alert will be triggered if the Backend Entries is greater than 50.

Metric Name: Connection Handler All Resident Time

MQL:

ConnectionHandlerAllResidentTime[5m].mean() > 60
ConnectionHandlerAllResidentTime[5m].mean() > 90
> 60 > 90

Connection Handler All Resident Time

A warning alert will be triggered if the Connection Handler All Resident Time is greater than 60.

A critical alert will be triggered if the Connection Handler All Resident Time is greater than 90.

Metric Name: Connection Handler Connections

MQL:

ConnectionHandlerConnections[5m].mean() > 30
ConnectionHandlerConnections[5m].mean() > 50
> 30 >50

Connection Handler Connections

A warning alert will be triggered if the Connection Handler Connections are greater than 30.

A critical alert will be triggered if the Connection Handler Connections are greater than 50.

Metric Name: JVM Used Memory

MQL:

JVMUsedMemory[5m].mean() > 1.5
JVMUsedMemory[5m].mean() > 3
> 1.5 > 3

JVM Used Memory

A warning alert will be triggered if the JVM Used Memory is greater than 1.5 MB.

A critical alert will be triggered if the JVM Used Memory is greater than 3 MB.

Metric Name: OS Used Memory

MQL:

OSUsedMemory[5m].mean() > 1.5
OSUsedMemory[5m].mean() > 3
> 1.5 > 3

OS Used Memory

A warning alert will be triggered if the OS Used Memory is greater than 1.5 MB.

A critical alert will be triggered if the OS Used Memory is greater than 3 MB.

Metric Name: Replication Domain State

MQL:

ReplicationDomainState[5m].mean() < 1
---- < 1

Replication Domain State

A critical alert will be triggered if the Replication Domain State is less than 1.

Metric Name: WFE Resident Time Operations Total Time

MQL:

WFEResidentTimeOperationsTotalTime[5m].mean() > 60
WFEResidentTimeOperationsTotalTime[5m].mean() > 90
> 60 > 90

WFE Resident Time Operations Total Time

A warning alert will be triggered if the WFE Resident Time Operations Total Time is greater than 60.

A critical alert will be triggered if the WFE Resident Time Operations Total Time is greater than 90.

Metric Name: Work Queue Current Backlog

MQL:

WorkQueueCurrentBacklog[5m].mean() > 15
WorkQueueCurrentBacklog[5m].mean() > 30
> 15 > 30

Work Queue Current Backlog

A warning alert will be triggered if the Work Queue Current Backlog is greater than 15.

A critical alert will be triggered if the Work Queue Current Backlog is greater than 30.

Metric Name: Extension LDAP Connections

MQL:

ExtensionLDAPConnections[5m].mean() > 30
ExtensionLDAPConnections[5m].mean() > 50
> 30 > 50

Extension LDAP Connections

A warning alert will be triggered if the Extension LDAP Connections are greater than 30.

A critical alert will be triggered if the Extension LDAP Connections are greater than 50.

Metric Name: Extension LDAP Operations Total Response Time

MQL:

ExtensionLDAPOperationsTotalResponseTime[5m].mean() > 60
ExtensionLDAPOperationsTotalResponseTime[5m].mean() > 90
> 60 > 90

Extension LDAP Operations Total Response Time

A warning alert will be triggered if the Extension LDAP Operations Total Response Time is greater than 60.

A critical alert will be triggered if the Extension LDAP Operations Total Response Time is greater than 90.
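
Each rule above states its threshold twice, once in the MQL and once in the Warning and Critical columns, so it is worth checking that the two agree before copying a query. The sketch below parses a simple threshold query with a regular expression and compares it with the documented values. The pattern covers only single-condition queries of the form shown in this table (not compound queries joined with ||), and the function names are hypothetical.

```python
import re

# Matches: Metric[interval]{optional dims}.statistic() operator number
MQL_PATTERN = re.compile(
    r"^(?P<metric>[\w-]+)"
    r"\[(?P<interval>\d+[mhd])\]"
    r"(?:\{(?P<dimensions>[^}]*)\})?"
    r"\.(?P<statistic>\w+)\(\)"
    r"\s*(?P<operator>>=|<=|==|!=|>|<|=)"
    r"\s*(?P<threshold>-?\d+(?:\.\d+)?)$"
)


def parse_threshold_query(query):
    """Split a simple threshold query into its named parts."""
    match = MQL_PATTERN.match(query.strip())
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    parts = match.groupdict()
    parts["threshold"] = float(parts["threshold"])
    return parts


def mismatches(queries, documented):
    """Return the levels whose query threshold differs from the documented one."""
    return [level for level, query in queries.items()
            if parse_threshold_query(query)["threshold"] != documented[level]]


queries = {
    "warning": "WorkQueueCurrentBacklog[5m].mean() > 15",
    "critical": "WorkQueueCurrentBacklog[5m].mean() > 30",
}
print(mismatches(queries, {"warning": 15, "critical": 30}))
```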

Oracle GoldenGate

Sample Alarm Rule: Goldengate

  • Resource Type: oracle_goldengate

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.

Sample Alarm Rule: Goldengate AdminServer

  • Resource Type: Goldengate Admin Server

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_admin_server

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Admin Server Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate AdminServer in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CPU utilization

Warning MQL:

CpuTimeUtilizationPercentage[5m].mean() > 80

Critical MQL:

CpuTimeUtilizationPercentage[5m].mean() > 90
80 90 Warning alarm for any Goldengate Admin Server in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Admin Server in a given compartment reporting over 90% CPU utilization for past 5 minutes.
5

Private memory (Warning/Critical)

Metric name: Private memory

Warning MQL:

PrivateMemory[5m].mean() > 30

Critical MQL:

PrivateMemory[5m].mean() > 40
30 40 A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes.
5

I/O read rate (Warning/Critical)

Metric name: I/O read rate

Warning MQL:

IOReadRate[5m].mean() > 10

Critical MQL:

IOReadRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes.
5

I/O write rate (Warning/Critical)

Metric name: I/O write rate

Warning MQL:

IOWriteRate[5m].mean() > 10

Critical MQL:

IOWriteRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes.
5

Dropped packet rate (Warning/Critical)

Metric name: Dropped packet rate

Warning MQL:

DroppedPacketRate[5m].mean() > 30

Critical MQL:

DroppedPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Missing packet rate (Warning/Critical)

Metric name: Missing packet rate

Warning MQL:

MissingPacketRate[5m].mean() > 30

Critical MQL:

MissingPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Packet error rate (Warning/Critical)

Metric name: Packet error rate

Warning MQL:

PacketErrorRate[5m].mean() > 30

Critical MQL:

PacketErrorRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Packet receive rate (Warning/Critical)

Metric name: Packet receive rate

Warning MQL:

PacketReceiveRate[5m].mean() > 30

Critical MQL:

PacketReceiveRate[5m].mean() > 40
30 40

A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for past 5 minutes.

A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for past 5 minutes.

Sample Alarm Rule: Goldengate Distribution Service

  • Resource Type: Goldengate Distribution Service

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_distribution_server

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Distribution Service

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate Distribution Service in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CPU utilization

Warning MQL:

CpuTimeUtilizationPercentage[5m].mean() > 80

Critical MQL:

CpuTimeUtilizationPercentage[5m].mean() > 90
80 90 Warning alarm for any Goldengate Distribution Service in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Distribution Service in a given compartment reporting over 90% CPU utilization for past 5 minutes.
5

Private memory (Warning/Critical)

Metric name: Private memory

Warning MQL:

PrivateMemory[5m].mean() > 30

Critical MQL:

PrivateMemory[5m].mean() > 40
30 40 A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes.
5

I/O read rate (Warning/Critical)

Metric name: I/O read rate

Warning MQL:

IOReadRate[5m].mean() > 10

Critical MQL:

IOReadRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes.
5

I/O write rate (Warning/Critical)

Metric name: I/O write rate

Warning MQL:

IOWriteRate[5m].mean() > 10

Critical MQL:

IOWriteRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes.
5

Dropped packet rate (Warning/Critical)

Metric name: Dropped packet rate

Warning MQL:

DroppedPacketRate[5m].mean() > 30

Critical MQL:

DroppedPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Missing packet rate (Warning/Critical)

Metric name: Missing packet rate

Warning MQL:

MissingPacketRate[5m].mean() > 30

Critical MQL:

MissingPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Packet error rate (Warning/Critical)

Metric name: Packet error rate

Warning MQL:

PacketErrorRate[5m].mean() > 30

Critical MQL:

PacketErrorRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Packet receive rate (Warning/Critical)

Metric name: Packet receive rate

Warning MQL:

PacketReceiveRate[5m].mean() > 30

Critical MQL:

PacketReceiveRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for past 5 minutes.

Sample Alarm Rule: Goldengate Receiver Service

  • Resource Type: Goldengate Receiver Service

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_receiver_server

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Receiver Service

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate Receiver Service in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CPU utilization

Warning MQL:

CpuTimeUtilizationPercentage[5m].mean() > 80

Critical MQL:

CpuTimeUtilizationPercentage[5m].mean() > 90
80 90 Warning alarm for any Goldengate Receiver Service in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Receiver Service in a given compartment reporting over 90% CPU utilization for past 5 minutes.
5

Private memory (Warning/Critical)

Metric name: Private memory

Warning MQL:

PrivateMemory[5m].mean() > 30

Critical MQL:

PrivateMemory[5m].mean() > 40
30 40 A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes.
5

I/O read rate (Warning/Critical)

Metric name: I/O read rate

Warning MQL:

IOReadRate[5m].mean() > 10

Critical MQL:

IOReadRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes.
5

I/O write rate (Warning/Critical)

Metric name: I/O write rate

Warning MQL:

IOWriteRate[5m].mean() > 10

Critical MQL:

IOWriteRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes.
5

Dropped packet rate (Warning/Critical)

Metric name: Dropped packet rate

Warning MQL:

DroppedPacketRate[5m].mean() > 30

Critical MQL:

DroppedPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Missing packet rate (Warning/Critical)

Metric name: Missing packet rate

Warning MQL:

MissingPacketRate[5m].mean() > 30

Critical MQL:

MissingPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Packet error rate (Warning/Critical)

Metric name: Packet error rate

Warning MQL:

PacketErrorRate[5m].mean() > 30

Critical MQL:

PacketErrorRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for past 5 minutes.
5

Packet receive rate (Warning/Critical)

Metric name: Packet receive rate

Warning MQL:

PacketReceiveRate[5m].mean() > 30

Critical MQL:

PacketReceiveRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for past 5 minutes.

Sample Alarm Rule: Goldengate Service Manager

  • Resource Type: Goldengate Service Manager

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_service_manager

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Service Manager

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate Service Manager in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
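
The Admin Server, Distribution Service, and Receiver Service tables earlier in this section list the same warning and critical thresholds for the same eight metrics, differing only in resource group. The sketch below expands one shared threshold table across those three resource groups instead of repeating each rule by hand. The Rule class and expand_rules function are hypothetical names; the metric names, thresholds, and resource groups are copied from the tables.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Rule:
    resource_group: str
    severity: str
    query: str


RESOURCE_GROUPS = [
    "oracle_goldengate_admin_server",
    "oracle_goldengate_distribution_server",
    "oracle_goldengate_receiver_server",
]

# metric name -> (warning threshold, critical threshold)
THRESHOLDS = {
    "CpuTimeUtilizationPercentage": (80, 90),
    "PrivateMemory": (30, 40),
    "IOReadRate": (10, 20),
    "IOWriteRate": (10, 20),
    "DroppedPacketRate": (30, 40),
    "MissingPacketRate": (30, 40),
    "PacketErrorRate": (30, 40),
    "PacketReceiveRate": (30, 40),
}


def expand_rules(resource_groups, thresholds, interval="5m"):
    """Create one warning and one critical rule per resource group and metric."""
    rules = []
    for group, (metric, (warning, critical)) in product(resource_groups, thresholds.items()):
        base = f"{metric}[{interval}].mean()"
        rules.append(Rule(group, "WARNING", f"{base} > {warning}"))
        rules.append(Rule(group, "CRITICAL", f"{base} > {critical}"))
    return rules


RULES = expand_rules(RESOURCE_GROUPS, THRESHOLDS)
print(len(RULES))
```

Keeping the thresholds in one place means a tuning change, such as raising the PrivateMemory warning level, is made once and applied to every resource group.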

Sample Alarm Rule: Goldengate Performance Metric Service

  • Resource Type: Goldengate Performance Metric Service

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_pm_server

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Performance Metric Service

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate Performance Metric Service in a given compartment reporting being down or not reporting status for over 1min.
Note

When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CPU utilization

Warning MQL:

CpuTimeUtilizationPercentage[5m].mean() > 80

Critical MQL:

CpuTimeUtilizationPercentage[5m].mean() > 90
80 90 Warning alarm for any Goldengate Performance Metric Service in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Performance Metric Service in a given compartment reporting over 90% CPU utilization for past 5 minutes.
5

Private memory (Warning/Critical)

Metric name: Private memory

Warning MQL:

PrivateMemory[5m].mean() > 30

Critical MQL:

PrivateMemory[5m].mean() > 40
30 40 A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes.
5

I/O read rate (Warning/Critical)

Metric name: I/O read rate

Warning MQL:

IOReadRate[5m].mean() > 10

Critical MQL:

IOReadRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes.
5

I/O write rate (Warning/Critical)

Metric name: I/O write rate

Warning MQL:

IOWriteRate[5m].mean() > 10

Critical MQL:

IOWriteRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes.

Sample Alarm Rule: Goldengate Extract

  • Resource Type: Goldengate Extract

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_extract

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Extract

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate Extract in a given compartment that is down or has not reported status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to reduce the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CPU utilization

Warning MQL:

CpuTimeUtilizationPercentage[5m].mean() > 80

Critical MQL:

CpuTimeUtilizationPercentage[5m].mean() > 90
80 90 Warning alarm for any Goldengate Extract in a given compartment reporting over 80% CPU utilization for the past 5 minutes. Critical alarm for any Goldengate Extract in a given compartment reporting over 90% CPU utilization for the past 5 minutes.
5

Private memory (Warning/Critical)

Metric name: Private memory

Warning MQL:

PrivateMemory[5m].mean() > 30

Critical MQL:

PrivateMemory[5m].mean() > 40
30 40 A warning alert will be triggered if the Private memory mean is greater than 30 GB for the past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for the past 5 minutes.
5

I/O read rate (Warning/Critical)

Metric name: I/O read rate

Warning MQL:

IOReadRate[5m].mean() > 10

Critical MQL:

IOReadRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for the past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for the past 5 minutes.
5

I/O write rate (Warning/Critical)

Metric name: I/O write rate

Warning MQL:

IOWriteRate[5m].mean() > 10

Critical MQL:

IOWriteRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for the past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for the past 5 minutes.
5

Dropped packet rate (Warning/Critical)

Metric name: Dropped packet rate

Warning MQL:

DroppedPacketRate[5m].mean() > 30

Critical MQL:

DroppedPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for the past 5 minutes.
5

Missing packet rate (Warning/Critical)

Metric name: Missing packet rate

Warning MQL:

MissingPacketRate[5m].mean() > 30

Critical MQL:

MissingPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for the past 5 minutes.
5

Packet error rate (Warning/Critical)

Metric name: Packet error rate

Warning MQL:

PacketErrorRate[5m].mean() > 30

Critical MQL:

PacketErrorRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for the past 5 minutes.
5

Packet receive rate (Warning/Critical)

Metric name: Packet receive rate

Warning MQL:

PacketReceiveRate[5m].mean() > 30

Critical MQL:

PacketReceiveRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for the past 5 minutes.
5

Mapped delete rate (Warning/Critical)

Metric name: Mapped delete rate

Warning MQL:

MappedDeleteRate[5m].mean() > 30

Critical MQL:

MappedDeleteRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Mapped delete rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped delete rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Mapped insert rate (Warning/Critical)

Metric name: Mapped insert rate

Warning MQL:

MappedInsertRate[5m].mean() > 30

Critical MQL:

MappedInsertRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Mapped insert rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped insert rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Mapped truncate rate (Warning/Critical)

Metric name: Mapped truncate rate

Warning MQL:

MappedTruncateRate[5m].mean() > 30

Critical MQL:

MappedTruncateRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Mapped truncate rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped truncate rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Mapped update rate (Warning/Critical)

Metric name: Mapped update rate

Warning MQL:

MappedUpdateRate[5m].mean() > 30

Critical MQL:

MappedUpdateRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Mapped update rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped update rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Discard rate (Warning/Critical)

Metric name: Discard rate

Warning MQL:

DiscardRate[5m].mean() > 30

Critical MQL:

DiscardRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Discard rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Discard rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Ignore rate (Warning/Critical)

Metric name: Ignore rate

Warning MQL:

IgnoreRate[5m].mean() > 30

Critical MQL:

IgnoreRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Ignore rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Ignore rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Lag (Warning/Critical)

Metric name: Lag

Warning MQL:

Lag[5m].mean() > 10

Critical MQL:

Lag[5m].mean() > 20
10 20 A warning alert will be triggered if the Lag mean is greater than 10 Sec for the past 5 minutes. A critical alert will be triggered if the Lag mean is greater than 20 Sec for the past 5 minutes.
5

Operations rate (Warning/Critical)

Metric name: Operations rate

Warning MQL:

OperationsPerSec[5m].mean() > 20

Critical MQL:

OperationsPerSec[5m].mean() > 30
20 30 A warning alert will be triggered if the Operations rate mean is greater than 20 Ops/sec for the past 5 minutes. A critical alert will be triggered if the Operations rate mean is greater than 30 Ops/sec for the past 5 minutes.
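
The MonitoringStatus rule and its Trigger Delay note can be captured as an alarm definition. The Python sketch below only assembles and prints such a definition; it does not call any service. The field names (displayName, pendingDuration, and so on) are meant to mirror the alarm resource of the Monitoring API and should be checked against the API reference before use. Mapping Trigger Delay Minutes to a pendingDuration of "PT3M" is an assumption, the helper name monitoring_status_alarm is hypothetical, and the OCIDs are placeholders.

```python
import json


def monitoring_status_alarm(compartment_id, topic_id, resource_type, resource_group):
    """Return a dict shaped like an alarm definition (field names are assumptions)."""
    return {
        "displayName": f"{resource_type} down",
        "compartmentId": compartment_id,
        "metricCompartmentId": compartment_id,
        "namespace": "oracle_appmgmt",
        "resourceGroup": resource_group,
        # Critical MQL from the sample rule: down, or no status reported
        "query": "MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()",
        "severity": "CRITICAL",
        # Assumed equivalent of Trigger Delay Minutes = 3 (ISO 8601 duration)
        "pendingDuration": "PT3M",
        "destinations": [topic_id],
        "isEnabled": True,
    }


alarm = monitoring_status_alarm(
    compartment_id="ocid1.compartment.oc1..example",  # placeholder OCID
    topic_id="ocid1.onstopic.oc1..example",           # placeholder OCID
    resource_type="Goldengate Extract",
    resource_group="oracle_goldengate_extract",
)
print(json.dumps(alarm, indent=2))
```

The same helper could be reused for the other resource types in this section by changing resource_type and resource_group.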

Sample Alarm Rule: Goldengate Replicat

  • Resource Type: Goldengate Replicat

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_replicat

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Replicat

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate Replicat in a given compartment that is down or has not reported status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to reduce the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CPU utilization

Warning MQL:

CpuTimeUtilizationPercentage[5m].mean() > 80

Critical MQL:

CpuTimeUtilizationPercentage[5m].mean() > 90
80 90 Warning alarm for any Goldengate Replicat in a given compartment reporting over 80% CPU utilization for the past 5 minutes. Critical alarm for any Goldengate Replicat in a given compartment reporting over 90% CPU utilization for the past 5 minutes.
5

Private memory (Warning/Critical)

Metric name: Private memory

Warning MQL:

PrivateMemory[5m].mean() > 30

Critical MQL:

PrivateMemory[5m].mean() > 40
30 40 A warning alert will be triggered if the Private memory mean is greater than 30 GB for the past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for the past 5 minutes.
5

I/O read rate (Warning/Critical)

Metric name: I/O read rate

Warning MQL:

IOReadRate[5m].mean() > 10

Critical MQL:

IOReadRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for the past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for the past 5 minutes.
5

I/O write rate (Warning/Critical)

Metric name: I/O write rate

Warning MQL:

IOWriteRate[5m].mean() > 10

Critical MQL:

IOWriteRate[5m].mean() > 20
10 20 A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for the past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for the past 5 minutes.
5

Dropped packet rate (Warning/Critical)

Metric name: Dropped packet rate

Warning MQL:

DroppedPacketRate[5m].mean() > 30

Critical MQL:

DroppedPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for the past 5 minutes.
5

Missing packet rate (Warning/Critical)

Metric name: Missing packet rate

Warning MQL:

MissingPacketRate[5m].mean() > 30

Critical MQL:

MissingPacketRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for the past 5 minutes.
5

Packet error rate (Warning/Critical)

Metric name: Packet error rate

Warning MQL:

PacketErrorRate[5m].mean() > 30

Critical MQL:

PacketErrorRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for the past 5 minutes.
5

Packet receive rate (Warning/Critical)

Metric name: Packet receive rate

Warning MQL:

PacketReceiveRate[5m].mean() > 30

Critical MQL:

PacketReceiveRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for the past 5 minutes.
5

Mapped delete rate (Warning/Critical)

Metric name: Mapped delete rate

Warning MQL:

MappedDeleteRate[5m].mean() > 30

Critical MQL:

MappedDeleteRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Mapped delete rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped delete rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Mapped insert rate (Warning/Critical)

Metric name: Mapped insert rate

Warning MQL:

MappedInsertRate[5m].mean() > 30

Critical MQL:

MappedInsertRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Mapped insert rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped insert rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Mapped truncate rate (Warning/Critical)

Metric name: Mapped truncate rate

Warning MQL:

MappedTruncateRate[5m].mean() > 30

Critical MQL:

MappedTruncateRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Mapped truncate rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped truncate rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Mapped update rate (Warning/Critical)

Metric name: Mapped update rate

Warning MQL:

MappedUpdateRate[5m].mean() > 30

Critical MQL:

MappedUpdateRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Mapped update rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped update rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Discard rate (Warning/Critical)

Metric name: Discard rate

Warning MQL:

DiscardRate[5m].mean() > 30

Critical MQL:

DiscardRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Discard rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Discard rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Ignore rate (Warning/Critical)

Metric name: Ignore rate

Warning MQL:

IgnoreRate[5m].mean() > 30

Critical MQL:

IgnoreRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Ignore rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Ignore rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Lag (Warning/Critical)

Metric name: Lag

Warning MQL:

Lag[5m].mean() > 10

Critical MQL:

Lag[5m].mean() > 20
10 20 A warning alert will be triggered if the Lag mean is greater than 10 Sec for the past 5 minutes. A critical alert will be triggered if the Lag mean is greater than 20 Sec for the past 5 minutes.
5

Operations rate (Warning/Critical)

Metric name: Operations rate

Warning MQL:

OperationsPerSec[5m].mean() > 20

Critical MQL:

OperationsPerSec[5m].mean() > 30
20 30 A warning alert will be triggered if the Operations rate mean is greater than 20 Ops/sec for the past 5 minutes. A critical alert will be triggered if the Operations rate mean is greater than 30 Ops/sec for the past 5 minutes.
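
The Warning and Critical rules in these tables all share one pattern: a metric name, a 5-minute interval, the mean statistic, and a threshold. As a rough sketch, the Python below generates both query strings from a metric name and its two thresholds, which can help keep the MQL in step with the Warning and Critical columns. The function name threshold_queries and the SAMPLE_RULES mapping are illustrative only and are not part of any Oracle tooling.

```python
def threshold_queries(metric, warning, critical, interval="5m", statistic="mean"):
    """Build the Warning and Critical MQL expressions for one metric."""
    if critical <= warning:
        raise ValueError("critical threshold must be above the warning threshold")
    base = f"{metric}[{interval}].{statistic}()"
    return {
        "warning": f"{base} > {warning}",
        "critical": f"{base} > {critical}",
    }


# (warning, critical) pairs copied from the Warning / Critical columns above
SAMPLE_RULES = {
    "CpuTimeUtilizationPercentage": (80, 90),
    "PrivateMemory": (30, 40),
    "IOReadRate": (10, 20),
}

for metric, (warning, critical) in SAMPLE_RULES.items():
    queries = threshold_queries(metric, warning, critical)
    print(queries["warning"])
    print(queries["critical"])
```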

Sample Alarm Rule: Goldengate Distribution Path

  • Resource Type: Goldengate Distribution Path

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_distribution_path

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Distribution Path

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate Distribution Path in a given compartment that is down or has not reported status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to reduce the possibility of false notifications.
5

Lag (Warning/Critical)

Metric name: Lag

Warning MQL:

Lag[5m].mean() > 10

Critical MQL:

Lag[5m].mean() > 20
10 20 A warning alert will be triggered if the Lag mean is greater than 10 Sec for the past 5 minutes. A critical alert will be triggered if the Lag mean is greater than 20 Sec for the past 5 minutes.
5

Network sent rate (Warning/Critical)

Metric name: Network sent rate

Warning MQL:

NetworkSentRate[5m].mean() > 30

Critical MQL:

NetworkSentRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Network sent rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Network sent rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Network receive rate (Warning/Critical)

Metric name: Network receive rate

Warning MQL:

NetworkReceiveRate[5m].mean() > 30

Critical MQL:

NetworkReceiveRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Network receive rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Network receive rate mean is greater than 40 MB/sec for the past 5 minutes.

Sample Alarm Rule: Goldengate Receiver Path

  • Resource Type: Goldengate Receiver Path

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_goldengate_receiver_path

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Goldengate Receiver Path

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Goldengate Receiver Path in a given compartment that is down or has not reported status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to reduce the possibility of false notifications.
5

Lag (Warning/Critical)

Metric name: Lag

Warning MQL:

Lag[5m].mean() > 10

Critical MQL:

Lag[5m].mean() > 20
10 20 A warning alert will be triggered if the Lag mean is greater than 10 Sec for the past 5 minutes. A critical alert will be triggered if the Lag mean is greater than 20 Sec for the past 5 minutes.
5

Network sent rate (Warning/Critical)

Metric name: Network sent rate

Warning MQL:

NetworkSentRate[5m].mean() > 30

Critical MQL:

NetworkSentRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Network sent rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Network sent rate mean is greater than 40 MB/sec for the past 5 minutes.
5

Network receive rate (Warning/Critical)

Metric name: Network receive rate

Warning MQL:

NetworkReceiveRate[5m].mean() > 30

Critical MQL:

NetworkReceiveRate[5m].mean() > 40
30 40 A warning alert will be triggered if the Network receive rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Network receive rate mean is greater than 40 MB/sec for the past 5 minutes.

Process-based Custom Resource Sample Alarm Rules

  • Resource Type: Custom Resource

  • Metric Namespace: oracle_appmgmt

  • Resource Group: custom_resource

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Custom Resource Down

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() 
    Critical alarm for any custom resource in a given compartment that is down or has not reported status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to reduce the possibility of false notifications.
5

High CPU Utilization (Warning/Critical)

Metric name: CpuUtilization

Warning MQL:

CpuUtilization[1m].mean() > 80

Critical MQL:

CpuUtilization[1m].mean() > 90
>80 >90

Warning alarm for any custom resource in a given compartment reporting over 80% CPU utilization over 5 minutes.

Critical alarm for any custom resource in a given compartment reporting over 90% CPU utilization over 5 minutes.

15

High Memory Utilization (Warning/Critical)

Metric Name: MemoryUtilization

Warning MQL:

MemoryUtilization[1m].mean() > 80

Critical MQL:

MemoryUtilization[1m].mean() > 90
>80 >90

Warning alarm for any custom resource in a given compartment reporting over 80% memory utilization over 15 minutes.

Critical alarm for any custom resource in a given compartment reporting over 90% memory utilization over 15 minutes.

Oracle Service Bus (OSB)

  • Resource Type: Oracle Service Bus

  • Metric Namespace: oracle_appmgmt

  • Resource Group: oracle_servicebus

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1

Metric name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()
    Critical alarm for any Service Bus in a given compartment that is down or has not reported status for over 1 minute.
Note

When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to reduce the possibility of false notifications.
5

Metric name: ServiceBusErrors

Critical MQL:

ServiceBusErrors[1m].mean() > 0
  >0 Critical alarm for any Service Bus in a given compartment that reports errors in any of its OSB services for over 5 minutes.

Microsoft IIS

  • Resource Type: IIS

  • Metric Namespace: oracle_appmgmt

  • Resource Group: microsoft_iis

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1 (or more to ignore random connectivity blips, e.g. 5 min)

IIS Down

Metric Name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()

    Critical alarm raised when a connection to IIS can no longer be established.
5

ASP.Net Worker Process Restart

Metric Name: ASPDotNetWorkerProcessRestarts

Critical MQL:

ASPDotNetWorkerProcessRestarts[1m].mean() > 1

>1   Critical alarm that reports ASP.Net worker process restarts. Restarts can have a number of causes and can lead to issues such as degraded performance and information loss.
5

ASP.Net Requests Queued

Metric Name: ASPDotNetRequests.Type.Queued

Warning MQL:

ASPDotNetRequests.Type.Queued[1m].mean() > 5

Critical MQL:

ASPDotNetRequests.Type.Queued[1m].mean() > 10

>5 >10 Warning and critical thresholds that indicate incoming HTTP requests are being queued because of load.
5

ASP.Net Error Rate

Metric Name: ErrorRate

Warning MQL:

ErrorRate[1m].mean() > 1%*

Critical MQL:

ErrorRate[1m].mean() > 2%*

> 1%* > 2%* Warning and critical thresholds that alert you when the error rate of an ASP.Net application rises above a set level. This metric is reported in errors/second, so set the thresholds based on the average total number of requests the application receives. For example, if it usually receives 100 requests/sec, we suggest 1 error/sec for warning and 2 errors/sec for critical.
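
Because ErrorRate is reported in errors/second, the 1% and 2% placeholders have to be converted into absolute numbers before they can be used in MQL. A minimal Python sketch of that arithmetic follows, assuming you already know your average request rate; error_rate_thresholds is a hypothetical helper name.

```python
def error_rate_thresholds(avg_requests_per_sec, warning_pct=1.0, critical_pct=2.0):
    """Convert percentage targets into errors/second thresholds."""
    warning = avg_requests_per_sec * warning_pct / 100
    critical = avg_requests_per_sec * critical_pct / 100
    return warning, critical


# Example from the table: an application that averages 100 requests/sec
warning, critical = error_rate_thresholds(100)
print(f"ErrorRate[1m].mean() > {warning:g}")   # ErrorRate[1m].mean() > 1
print(f"ErrorRate[1m].mean() > {critical:g}")  # ErrorRate[1m].mean() > 2
```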
  • Resource Type: IIS Website

  • Metric Namespace: oracle_appmgmt

  • Resource Group: microsoft_iis

Evaluation Time Period (minutes) Alarm Rule Warning Critical Description
1 (or more to ignore random connectivity blips, e.g. 5 min)

IIS Website Down

Metric Name: MonitoringStatus

Critical MQL:

MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent()

    Critical alarm raised when a connection to the IIS Website can no longer be established.
5

WWW Current Connections

Metric Name: CurrentConnections.Service.WWW

Warning MQL:

CurrentConnections.Service.WWW[1m].mean() > 90%*

Critical MQL:

CurrentConnections.Service.WWW[1m].mean() > 95%*

> 90%* > 95%* Warning and critical thresholds that alert you when the number of connections is approaching the maximum. Set the thresholds to 90% and 95% of your maximum connections. The metric is an absolute number, so the values are unique to each environment. For example, if 200 total connections are allowed, we suggest 180 for warning and 190 for critical.
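
The 90% and 95% placeholders likewise need to become connection counts for your own site. As a sketch, assuming the configured maximum number of connections is known, the Python below derives the two counts with integer arithmetic and prints the resulting queries; connection_thresholds is an illustrative name only.

```python
def connection_thresholds(max_connections, warning_pct=90, critical_pct=95):
    """Return (warning, critical) connection counts, rounded down."""
    warning = max_connections * warning_pct // 100
    critical = max_connections * critical_pct // 100
    return warning, critical


# Example from the table: 200 total connections allowed
warning, critical = connection_thresholds(200)
print(f"CurrentConnections.Service.WWW[1m].mean() > {warning}")   # ... > 180
print(f"CurrentConnections.Service.WWW[1m].mean() > {critical}")  # ... > 190
```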

Metric Extensions

You can create alarm rules to trigger alarms when metric values from your Metric Extensions cross thresholds. Use the same general workflow that you would follow to create an alarm rule for built-in metrics for your resources. The main difference is in the Metric description section.

  • Compartment: choose the compartment of the resource on which the Metric Extension was enabled
  • Metric namespace: select oracle_metric_extensions_appmgmt
  • Resource group: the resource type of the resource on which the metric extension was deployed.

Creating an Alarm rule for a Metric Extension of a host is shown in the image below:


[Image: creating alarm rules for metric extensions]