Setting Up Alarms
You can use the Oracle Cloud Infrastructure Monitoring service to generate alarms when metrics cross thresholds.
First, familiarize yourself with the Monitoring service concepts and features by reviewing Overview of Monitoring. For more information about setting up alarms, see Managing Alarms. To construct advanced queries for both monitoring and alarms, see Monitoring Query Language (MQL) Reference.
Also make sure that you've set the appropriate policies to use alarm rules. Refer to Getting Started.
Before you proceed, you should have created an alarm destination, for example the Notification service, as well as the topics that define who will receive these alarms.
The following table lists metric details you will need to create alarm rules for metrics used in Stack Monitoring.
Resource Type | Metric Namespace | Alarm Rule Resource Group | Alarm Rules | Metrics Reference |
---|---|---|---|---|
Host | oracle_appmgmt | host | Hosts | Host Metrics |
Non-container, container, and pluggable Oracle Databases | oracle_oci_database | n/a | Oracle Database | Oracle Database |
Oracle Database System, ASM, Cluster, and Listener | oracle_oci_database_cluster | oracle_asm, oracle_cluster, oracle_db_node, oracle_lsnr | Oracle Database | Oracle Database Cluster |
Oracle WebLogic Domain, Oracle WebLogic Cluster | oracle_appmgmt | weblogic_cluster | Oracle Weblogic Server | WebLogic Metrics |
Oracle WebLogic Server | oracle_appmgmt | weblogic_j2eeserver | Oracle Weblogic Server | WebLogic Metrics |
Oracle HTTP Server (OHS) | oracle_appmgmt | oracle_http_server | Oracle HTTP Server (OHS) | Oracle HTTP Server (OHS) Metrics |
Oracle Identity Manager (OIM) | oracle_appmgmt | oracle_oim / oracle_oim_cluster | Oracle Identity Manager (OIM) | Oracle Identity Manager (OIM) |
Oracle Access Manager (OAM) | oracle_appmgmt | oracle_oam / oracle_oam_cluster | Oracle Access Manager (OAM) | Oracle Access Manager (OAM) |
Oracle E-Business Suite | oracle_appmgmt | ebs_instance | Oracle E-Business Suite | E-Business Suite Metrics |
EBS Application Listener | oracle_appmgmt | oracle_ebs_app_lsnr | Oracle E-Business Suite | E-Business Suite Metrics |
EBS Concurrent Processing | oracle_appmgmt | oracle_ebs_conc_mgmt_service | Concurrent Processing | E-Business Suite Metrics |
EBS Concurrent Processing - Specialized | oracle_appmgmt | oracle_ebs_conc_mgmt_service_specialized | Concurrent Processing | E-Business Suite Metrics |
EBS Concurrent Processing Node | oracle_appmgmt | oracle_ebs_cp_node | Oracle E-Business Suite | E-Business Suite Metrics |
EBS Forms System | oracle_appmgmt | oracle_ebs_forms_system | Oracle E-Business Suite | E-Business Suite Metrics |
EBS Workflow Agent Listener | oracle_appmgmt | oracle_ebs_wf_agent_lsnr | Oracle E-Business Suite | E-Business Suite Metrics |
EBS Workflow Background Engine | oracle_appmgmt | oracle_ebs_wf_bkgd_engine | Oracle E-Business Suite | E-Business Suite Metrics |
EBS Workflow Group | oracle_appmgmt | oracle_ebs_wf_group | Oracle E-Business Suite | E-Business Suite Metrics |
EBS Workflow Notification Mailer | oracle_appmgmt | oracle_ebs_wf_notification_mailer | Workflow Notification Mailer | E-Business Suite Metrics |
Apache Tomcat | oracle_appmgmt | apache_tomcat | Apache Tomcat | Apache Tomcat Metrics |
Microsoft SQL Server | oracle_appmgmt | sql_server | Microsoft SQL Server | Microsoft SQL Server Metrics |
PeopleSoft Application Server Domain | oracle_appmgmt | oracle_psft_appserv | PeopleSoft | PeopleSoft Metrics |
PeopleSoft Process Scheduler Domain | oracle_appmgmt | oracle_psft_prcs | PeopleSoft | PeopleSoft Metrics |
PeopleSoft PIA | oracle_appmgmt | oracle_psft_pia | PeopleSoft | PeopleSoft Metrics |
PeopleSoft Elasticsearch | oracle_appmgmt | elastic_search | PeopleSoft | PeopleSoft Metrics |
PeopleSoft Process Monitor | oracle_appmgmt | oracle_psft_prcm | PeopleSoft | PeopleSoft Metrics |
Apache HTTP Server | oracle_appmgmt | apache_http_server | Apache HTTP Server | Apache HTTP Server Metrics |
OUD Directory Server | oracle_appmgmt | oud_directory | Oracle Unified Directory | Oracle Unified Directory Metrics |
OUD Proxy Server | oracle_appmgmt | oud_proxy | Oracle Unified Directory | Oracle Unified Directory Metrics |
OUD Replication Gateway | oracle_appmgmt | oud_gateway | Oracle Unified Directory | Oracle Unified Directory Metrics |
GoldenGate | oracle_appmgmt | oracle_goldengate | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate ServiceManager | oracle_appmgmt | oracle_goldengate_service_manager | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate AdminServer | oracle_appmgmt | oracle_goldengate_admin_server | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate Performance Metric Server | oracle_appmgmt | oracle_goldengate_pm_server | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate Extract | oracle_appmgmt | oracle_goldengate_extract | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate Replicat | oracle_appmgmt | oracle_goldengate_replicat | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate DistributionServer | oracle_appmgmt | oracle_goldengate_distribution_server | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate Distribution Path | oracle_appmgmt | oracle_goldengate_distribution_path | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate Receiver Server | oracle_appmgmt | oracle_goldengate_receiver_server | Oracle GoldenGate | Oracle GoldenGate Metrics |
GoldenGate Receiver Path | oracle_appmgmt | oracle_goldengate_receiver_path | Oracle GoldenGate | Oracle GoldenGate Metrics |
Custom Resource | oracle_appmgmt | custom_resource | Process-based Custom Resource Sample Alarm Rules | Process-based Custom Resource Metrics |
Oracle Service Bus | oracle_appmgmt | oracle_servicebus | Oracle Service Bus (OSB) | Oracle Service Bus (OSB) |
Microsoft IIS | oracle_appmgmt | microsoft_iis | Microsoft IIS | Microsoft IIS Metrics |
Microsoft IIS Website | oracle_appmgmt | microsoft_iis_website | Microsoft IIS | Microsoft IIS Metrics |
Best practices for common alarm scenarios
- Create your alarm rules in the same compartment where you have discovered your resources.
- To set up an alarm rule that generates an alarm when a resource is down, specify the appropriate metric namespace and resource group, and use the following metric and trigger rule:
Metric Name: MonitoringStatus
Trigger rule:
- Operator: equal to
- Value: 0
- Trigger delay minutes: 3
- To set up an alarm rule that triggers for individual resource instances, in addition to choosing the metric, you also have to add metric dimensions that uniquely identify the resource.
To uniquely identify a resource instance:
- You can use resourceName and resourceType, OR
- You can use resourceId
Most metrics define additional dimensions that can be used to set advanced alarms.
- Always refer to the metric description in the Metric Reference and check the evaluation time period (how often each metric is collected). When setting up alarms, make sure you provide the same value as the alarm Interval value. You can do this by selecting Switch to Advanced Mode at the top-right corner of the alarm creation page, then entering advanced MQL in the Query code editor section of the advanced mode page.
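The practices above can be sketched as a small helper that assembles the MQL text you would paste into the Query code editor. This is an illustrative sketch, not part of any Oracle SDK: the function names `build_mql` and `build_down_mql` are hypothetical, and the helper only formats strings in the same shape as the sample queries in this document.

```python
def build_mql(metric, interval, statistic, operator, threshold, dimensions=None):
    """Format an MQL alarm condition, e.g. CpuUtilization[5m].mean() > 90.

    Hypothetical helper: it only builds the query text and does not call
    any OCI API.
    """
    dim_filter = ""
    if dimensions:
        # Dimensions narrow the alarm, for example to one resource instance.
        pairs = ", ".join(f'{name} = "{value}"' for name, value in dimensions.items())
        dim_filter = "{" + pairs + "}"
    return f"{metric}[{interval}]{dim_filter}.{statistic}() {operator} {threshold}"


def build_down_mql(interval="1m"):
    """Format the 'resource is down or not reporting' condition used in the samples."""
    status = f"MonitoringStatus[{interval}]"
    return f"{status}.mean() != 1 || {status}.absent()"


if __name__ == "__main__":
    # Alarm on a single filesystem, using dimensions named in this document.
    print(build_mql("FilesystemUtilization", "1m", "mean", ">", 80,
                    {"fileSystemName": "/", "osType": "Linux"}))
    print(build_down_mql())
```

To target a single resource instance, pass either `resourceId` or the pair `resourceName` and `resourceType` as dimensions, as described in the list above.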
Hosts
Sample Alarm Rule: Host Monitoring
- Resource Type: Host
- Metric Namespace: oracle_appmgmt
- Resource Group: host
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 | Host Down Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() | - | - | Critical alarm for any host in a given compartment that reports being down or does not report status for over 1 minute. Note: When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications. |
5 | High CPU Utilization (Warning/Critical) Metric name: CpuUtilization Warning MQL: CpuUtilization[1m]{type="Total"}.mean() > 80 Critical MQL: CpuUtilization[1m]{type="Total"}.mean() > 90 | > 80 | > 90 | Warning alarm for any host in a given compartment reporting over 80% CPU utilization for the past 5 minutes. Critical alarm for any host in a given compartment reporting over 90% CPU utilization for the past 5 minutes. |
15 | High Memory Utilization (Warning/Critical) Metric name: MemoryUtilization Warning MQL: MemoryUtilization[1m]{type="Logical"}.mean() > 80 Critical MQL: MemoryUtilization[1m]{type="Logical"}.mean() > 90 | > 80 | > 90 | Warning alarm for any host in a given compartment reporting over 80% memory utilization for the past 15 minutes. Critical alarm for any host in a given compartment reporting over 90% memory utilization for the past 15 minutes. |
15 | Filesystem Utilization (Warning/Critical) Metric name: FilesystemUtilization Warning MQL: FilesystemUtilization[1m].mean() > 80 Critical MQL: FilesystemUtilization[1m].mean() > 90 | > 80 | > 90 | Warning alarm for any filesystem on any host in a given compartment reporting over 80% filesystem utilization. Critical alarm for any filesystem on any host in a given compartment reporting over 90% filesystem utilization. Note: To monitor selected file systems, you can further specify the fileSystemName dimension and customize your alarms to your specific needs. For example, the MQL FilesystemUtilization[1m]{fileSystemName = "/", osType = "Linux"}.mean() > 80 applies only to root filesystems on Linux hosts in a given compartment. |
Oracle Database
Sample Alarm Rule: Non-Container Database
- Resource Type: Non-Container DB
- Metric Namespace: oracle_oci_database
- Resource Group: n/a
Evaluation time period | Alarm Rule (metric or MQL) | Warning | Critical | DBM Recommended Value Used? | Description |
---|---|---|---|---|---|
30 minutes | Metric: StorageUtilizationByTablespace OR Warning MQL: StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 75 Critical MQL: StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 85 | >75 | >85 | Y | Warning and Critical alarm rule conditions for permanent tablespaces whose utilization is greater than 75% or 85% over the past 30 minutes. |
24 hours | InvalidObjects | >150 | >200 | | |
15 minutes | BlockingSessions | >1 | >10 | Y | Warning and Critical alarm rule conditions to trigger an alarm when the number of blocking sessions is greater than 1 or 10 over the past 15 minutes. |
15 minutes | UsableFRA | <20 | <10 | | Warning and Critical alarm rule conditions to trigger an alarm when the percentage of usable fast recovery area is less than 20% or 10% over the past 15 minutes. |
5 minutes | ProcessLimitUtilization | >70 | >80 | Y | Warning and Critical alarm rule conditions to trigger an alarm when the process utilization (%) is greater than 70% or 80% over the past 5 minutes. |
5 minutes | SessionLimitUtilization | >90 | >97 | | |
5 minutes | CPUUtilization | >80 | >85 | Y | |
5 minutes | FRAUtilization | >70 | >75 | Y | |
5 minutes | StorageUtilization | >75 | >85 | Y | |
1 minute | MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() | n/a | n/a | | Critical alarm for any Non-Container Oracle Database that reports being down or does not report status for over 1 minute. Note: When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications. |
Sample Alarm Rule: Container Database
- Resource Type: Container DB
- Metric Namespace: oracle_oci_database
- Resource Group: n/a
Evaluation time period | Alarm Rule (metric or MQL) | Warning | Critical | DBM Recommended Value Used? | Description |
---|---|---|---|---|---|
1 minute | MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() | n/a | n/a | | Critical alarm for any Container Oracle Database that reports being down or does not report status for over 1 minute. Note: When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications. |
30 minutes | Metric: StorageUtilizationByTablespace OR Warning MQL: StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 75 Critical MQL: StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 85 | >75 | >85 | Y | Warning and Critical alarm rule conditions for permanent tablespaces whose utilization is greater than 75% or 85% over the past 30 minutes. |
5 minutes | ProcessLimitUtilization | >70 | >80 | Y | Warning and Critical alarm rule conditions to trigger an alarm when the process utilization (%) is greater than 70% or 80% over the past 5 minutes. |
5 minutes | SessionLimitUtilization | >90 | >97 | | |
15 minutes | UsableFRA | <20 | <10 | | Warning and Critical alarm rule conditions to trigger an alarm when the percentage of usable fast recovery area is less than 20% or 10% over the past 15 minutes. |
5 minutes | CPUUtilization | >80 | >85 | Y | |
5 minutes | FRAUtilization | >70 | >75 | Y | |
5 minutes | StorageUtilization | >75 | >85 | Y | |
Sample Alarm Rule: Pluggable Database
- Resource Type: Pluggable DB
- Metric Namespace: oracle_oci_database
- Resource Group: n/a
Evaluation time period | Alarm Rule (metric or MQL) | Warning | Critical | DBM Recommended Value Used? | Description |
---|---|---|---|---|---|
1 minute | MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() | n/a | n/a | | Critical alarm for any Pluggable Oracle Database that reports being down or does not report status for over 1 minute. Note: When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications. |
5 minutes | CPUUtilization | >80 | >85 | Y | |
5 minutes | StorageUtilization | >75 | >85 | Y | |
15 minutes | BlockingSessions | >1 | >10 | Y | Warning and Critical alarm rule conditions to trigger an alarm when the number of blocking sessions is greater than 1 or 10 over the past 15 minutes. |
24 hours | InvalidObjects | >150 | >200 | | |
30 minutes | Metric: StorageUtilizationByTablespace OR Warning MQL: StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 75 Critical MQL: StorageUtilizationByTablespace[1m]{tablespaceContents = "PERMANENT"}.mean() > 85 | >75 | >85 | Y | Warning and Critical alarm rule conditions for permanent tablespaces whose utilization is greater than 75% or 85% over the past 30 minutes. |
Sample Alarm Rule: ASM/ASM Instance
- Resource Type: ASM
- Metric Namespace: oracle_oci_database_cluster
- Resource Group: oracle_asm
Evaluation time period | Alarm Rule (metric or MQL) | Warning | Critical | Description |
---|---|---|---|---|
1 minute | MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() | - | - | Critical alarm for any ASM instance that reports being down or does not report status for over 1 minute. Note: When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications. |
30 minutes | | >85 | >95 | |
30 minutes | | >85 | >95 | |
- Resource Type: ASM Cluster
- Metric Namespace: oracle_oci_database_cluster
- Resource Group: oracle_cluster
Evaluation time period | Alarm Rule (metric or MQL) | Warning | Critical | Description |
---|---|---|---|---|
1 minute | MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() | - | - | Critical alarm for any cluster that reports being down or does not report status for over 1 minute. Note: When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications. |
30 minutes | DiskGroupUtilization | >85 | >95 | |
30 minutes | DiskUtilization | >85 | >95 | |
Sample Alarm Rule: Listener
- Resource Type: Listener
- Metric Namespace: oracle_oci_database_cluster
- Resource Group: oracle_lsnr
Evaluation time period | Alarm Rule (metric or MQL) | Warning | Critical | Description |
---|---|---|---|---|
1 minute | MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() | - | - | Critical alarm for any listener that reports being down or does not report status for over 1 minute. Note: When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications. |
5 minutes | RefusedConnections | >1 | >5 | |
E-Business Suite
Sample Alarm Rule: EBS
- Resource Type: Oracle E-Business Suite
- Metric Namespace: oracle_appmgmt
- Resource Group: ebs_instance
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
15 |
Executed Programs By Running Time (ms) Metric name: ExecutedProgramsByRunningTime MQL: ExecutedProgramsByRunningTime[15m].mean() > 4000
Tip: You can filter the alarm to a specific program by adding a ProgramName or ProgramShortName dimension filter. |
> 4000 | > 40000 | The running time of the requests |
15 |
Completed Requests By Application (ratio) Metric name: CompletedRequestsByApplication Dimension name: Category Dimension value: Error MQL: CompletedRequestsByApplication[15m]{Category = "Error"}.mean() > 0.001 Tip: You can filter the alarm to a specific application by adding an ApplicationName dimension filter. MQL: CompletedRequestsByApplication[15m]{Category = "Error", ApplicationName = "<YOUR APP NAME>"}.mean() > 0.001 |
> 0.001 | > 0.0025 |
The ratio of requests that completed with an error to all requests in a given collection interval. If more than 0.1% of requests fail, you get a warning alarm; if more than 0.25% fail, you get a critical alarm. |
15 |
Active User Sessions Metric name: ActiveUserSessions MQL: ActiveUserSessions[15m].mean() > 200 |
> 200 | > 250 | The number of active user sessions |
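Because the error metric is a ratio rather than a percentage, it can help to check how a given request count maps onto the thresholds. The sketch below is illustrative only: `error_ratio` and `classify_error_ratio` are hypothetical names, and the default thresholds are the sample Warning (0.001) and Critical (0.0025) values from the table above.

```python
def error_ratio(errored, total):
    """Ratio of errored requests to all requests; 0.0 when nothing ran."""
    return errored / total if total else 0.0


def classify_error_ratio(ratio, warning=0.001, critical=0.0025):
    """Map a ratio onto the sample thresholds (0.1% warning, 0.25% critical)."""
    if ratio > critical:
        return "critical"
    if ratio > warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # 2 errors out of 1000 requests is 0.2%, above the warning threshold only.
    print(classify_error_ratio(error_ratio(2, 1000)))
    # 3 errors out of 1000 requests is 0.3%, above the critical threshold.
    print(classify_error_ratio(error_ratio(3, 1000)))
```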
Sample Alarm Rule: EBS Application Listener
Resource Type: EBS Application Listener
Metric Namespace: oracle_appmgmt
Resource Group: oracle_ebs_app_lsnr
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Monitoring Status MQL:
|
n/a | 0 | Critical alarm for any EBS Application Listener in a given compartment that reports being down or does not report status for over 1 minute. |
Sample Alarm Rule: EBS Concurrent Processing
Resource Type: EBS Concurrent Processing
Metric Namespace: oracle_appmgmt
Resource Group: oracle_ebs_conc_mgmt_service
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Monitoring Status Metric name: MonitoringStatus MQL: MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() |
N/A | 0 | The availability status.
Note
When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
15 |
Concurrent Requests Error Rate Metric name: CompletedConcurrentRequests Dimension name: "State" Dimension value: "Errored" MQL: CompletedConcurrentRequests[15m]{State = "Errored"}.mean() > 0.001 |
> 0.001 | > 0.0025 | The rate of requests that completed with errors on an hourly basis. If multiplied by 100, becomes a percentage. |
15 |
Concurrent Requests Warning Rate Metric name: CompletedConcurrentRequests Dimension name: "State" Dimension value: "WithWarning" MQL: CompletedConcurrentRequests[15m]{State = "WithWarning"}.mean() > 0.0015 |
> 0.0015 | > 0.003 | The rate of requests that completed with warning on an hourly basis. If multiplied by 100, becomes a percentage. |
15 |
Concurrent Requests Completed Successfully (ops/evaluation time period) Metric name: CompletedConcurrentRequests Dimension name: "State" Dimension value: "Successful" MQL: CompletedConcurrentRequests[15m]{State = "Successful"}.sum() > 2500 |
> 625 | > 2500 | The number of requests that completed successfully per evaluation time period (15 minutes by default). |
15 |
Concurrent Requests Running Metric name: ConcurrentRequestsByStatus Dimension name: "State" Dimension value: "Running" MQL:
|
> 2500 | > 10000 | The number of running requests by user. |
15 |
Concurrent Requests Pending - Normal Metric name: ConcurrentRequestsByStatus Dimension name: "State" Dimension value: "PendingNormal" MQL:
|
> 2500 | > 10000 | The number of pending requests by user. |
15 |
Concurrent Requests Pending - Standby Metric name: ConcurrentRequestsByStatus Dimension name: "State" Dimension value: "PendingStandBy" MQL:
|
> 100 | > 500 | The number of requests in pending stand-by status. |
15 |
Concurrent Requests Inactive - No Manager Metric name: ConcurrentRequestsByStatus Dimension name: "State" Dimension value: "InactiveNoManager" MQL:
|
> 100 | > 500 | The number of requests in inactive no manager status. |
15 |
Concurrent Requests Inactive - On Hold Metric name: ConcurrentRequestsByStatus Dimension name: "State" Dimension value: "InactiveOnHold" MQL:
|
> 100 | > 500 | The number of requests in inactive on hold status. |
5 |
Metric name: LongActiveConcurrentRequests MQL:
Tip1: You can filter the alarm to a Running or Pending request by adding Phase dimension filter. MQL:
Tip2: You can further filter by specific program by adding ProgramName or ProgramShortName dimension filter. MQL:
|
> 43200000 | > 86400000 | The elapsed time in ms for a pending or running request. Only the top 10 requests are tracked. The suggested thresholds raise a Warning after 12 hours and a Critical after 24 hours. |
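The millisecond thresholds for LongActiveConcurrentRequests are easier to adjust when derived from hours. A minimal sketch of that arithmetic, where `hours_to_ms` is a hypothetical helper name:

```python
MS_PER_HOUR = 60 * 60 * 1000  # minutes * seconds * milliseconds


def hours_to_ms(hours):
    """Convert a duration in hours to whole milliseconds."""
    return int(hours * MS_PER_HOUR)


if __name__ == "__main__":
    # Sample thresholds from the table: Warning after 12 hours, Critical after 24.
    print(hours_to_ms(12))
    print(hours_to_ms(24))
```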
EBS Concurrent Processing - Specialized
Resource Type: EBS Concurrent Processing - Specialized
Metric Namespace: oracle_appmgmt
Resource Group: oracle_ebs_conc_mgmt_service_specialized
Metric | Metric Display Name | Unit | Description | Collection Frequency | Dimension | Resource Name |
---|---|---|---|---|---|---|
MonitoringStatus | Availability | status | Status of the resource. Values are: 1 = Up, 0 = Down. The status is Up only if all managers are up; if even one manager is down, the overall status is Down. | 1 min | NA | oracle_ebs_conc_mgmt_service_specialized |
ConcurrentProcesingComponentStatus | Concurrent Manager Status | status | Availability of concurrent manager | 1 min | Concurrent Queue Name, Description, Host Name | oracle_ebs_conc_mgmt_service_specialized |
CapacityUtilizationOfConcurrentManagers | Concurrent Manager Capacity Utilization | percent | Percentage of max processes running. If manager's max processes is 10 and 5 are running, capacity utilization is 50% | 1 min | Manager Name | oracle_ebs_conc_mgmt_service_specialized |
ManagerMaxProcesses | Concurrent Manager Max Processes | count | Maximum number of processes to be in the manager's queue. | 1 min | Manager Name | oracle_ebs_conc_mgmt_service_specialized |
ManagerRunningProcesses | Concurrent Manager Running Processes | count | Number of running processes in the manager's queue | 1 min | Manager Name | oracle_ebs_conc_mgmt_service_specialized |
Sample Alarm Rule: EBS Workflow Notification Mailer
Resource Type: EBS Workflow Notification Mailer
Metric Namespace: oracle_appmgmt
Resource Group: oracle_ebs_wf_notification_mailer
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Metric name: Monitoring Status Monitoring Status MQL:
|
N/A | 0 | Critical alarm for EBS Concurrent Processing Specialized in a given compartment reporting being down or not reporting status for over 1min. |
1 |
Metric name: Concurrent Manager Capacity Utilization MQL:
|
< 50 | < 100 | Percentage of capacity utilization of all enabled managers. |
Apache Tomcat
Sample Alarm Rule: Apache Tomcat
Resource Type: Apache Tomcat
Metric Namespace: oracle_appmgmt
Resource Group: apache_tomcat
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1-5 |
Apache Tomcat Down Metric name: MonitoringStatus Critical MQL:
|
- | - | Critical alarm for any Apache Tomcat in a given compartment that reports being down or does not report status for over 1 minute.
Note
When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
15 |
High CPU Utilization (Warning/Critical) Metric name: CPUUtilization Warning MQL:
Critical MQL:
|
>80 | >90 |
Warning alarm for any Apache Tomcat in a given compartment reporting over 80% CPU utilization for past 15 minutes. Critical alarm for any Apache Tomcat in a given compartment reporting over 90% CPU utilization for past 15 minutes. |
10 |
High JVM Heap Memory Utilization (Warning/Critical) Metric name: JVMMemoryUtilization Warning MQL:
Critical MQL:
|
>80 | >90 |
Warning alarm for any Apache Tomcat in a given compartment reporting over 80% JVM heap memory utilization for the past 10 minutes. Critical alarm for any Apache Tomcat in a given compartment reporting over 90% JVM heap memory utilization for the past 10 minutes. |
1-5 |
High Web Request Processing Time (Warning/Critical) Metric name: WebRequestProcessingTime Warning MQL:
Critical MQL:
|
>1500 | >3000 |
Warning alarm for any Apache Tomcat in a given compartment reporting over 1500 ms mean web request processing time for the past 1-5 minutes. Critical alarm for any Apache Tomcat in a given compartment reporting over 3000 ms mean web request processing time for the past 1-5 minutes. |
Microsoft SQL Server
Sample Alarm Rules: Microsoft SQL Server
Resource Type: Microsoft SQL Server
Metric Namespace: oracle_appmgmt
Resource Group: sql_server
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1-5 |
SQL Server Availability Status Metric name: MonitoringStatus Critical MQL:
|
- | - | Critical alarm for any SQL Server in a given compartment that reports being down or does not report status for over 1 minute.
Note
When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
15 |
SQL Server CPU Utilization(%) (Warning/Critical) Metric name: CPUUtilization Warning MQL:
Critical MQL:
|
>80 | >95 | Warning alarm for any SQL Server in a given compartment reporting over 80% CPU utilization for the past 15 minutes. Critical alarm for any SQL Server in a given compartment reporting over 95% CPU utilization for the past 15 minutes. |
PeopleSoft
PeopleSoft Application Server
- Resource Type: PeopleSoft Application Server Domain
- Metric Namespace: oracle_appmgmt
- Resource Group: oracle_psft_appserv
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
5 |
Health (status) Metric name: Health Warning MQL:
Critical MQL:
|
warning = 1 | critical = 1 |
Overall health of the application server domain. A warning alarm will be fired if the state 'warning' is equal to 1. A critical alarm will be fired if the state 'critical' is equal to 1. |
5 |
Load (status) Metric name: Load Warning MQL:
Critical MQL:
|
medium = 1 | heavy = 1 |
Overall load of the application server domain. A warning alarm will be fired if the state 'medium' is equal to 1. A critical alarm will be fired if the state 'heavy' is equal to 1. |
5 |
Average Service Request Execution Time (ms) Metric name: AverageServiceRequestExecutionTime Warning MQL:
|
> 1000 | NA |
Average time in milliseconds it takes to execute a service request. A warning alarm is fired when, on average, a request takes more than one second (1000 ms) to execute. |
5 |
Queued Processes for Application Server (count) Metric name: QueuedTuxedoProcesses Dimension name: Category Dimension value: ApplicationServer Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the Application Server. More than 1 process in queue will fire a critical alarm. |
5 |
Queued Processes for BRK Handler (count) Metric name: QueuedTuxedoProcesses Dimension name: Category Dimension value: BRKHandler Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the BRK Handler. More than 1 process in queue will fire a critical alarm. |
5 |
Queued Processes for BRK Dispatcher (count) Metric name: QueuedTuxedoProcesses Dimension name: Category Dimension value: BRKDispatcher Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the BRK Dispatcher. More than 1 process in queue will fire a critical alarm. |
5 |
Queued Processes for PUB Dispatcher (count) Metric name: QueuedTuxedoProcesses Dimension name: Category Dimension value: PUBDispatcher Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the PUB Dispatcher. More than 1 process in queue will fire a critical alarm. |
5 |
Queued Processes for PUB Handler (count) Metric name: QueuedTuxedoProcesses Dimension name: Category Dimension value: PUBHandler Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the PUB Handler. More than 1 process in queue will fire a critical alarm. |
5 |
Queued Processes for SUB Dispatcher (count) Metric name: QueuedTuxedoProcesses Dimension name: Category Dimension value: SUBDispatcher Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the SUB Dispatcher. More than 1 process in queue will fire a critical alarm. |
5 |
Queued Processes for SUB Handler (count) Metric name: QueuedTuxedoProcesses Dimension name: Category Dimension value: SUBHandler Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the SUB Handler. More than 1 process in queue will fire a critical alarm. |
5 |
Failed Server Processes (count) Metric name: FailedServerProcesses Critical MQL:
|
NA | > 0 | Number of server processes that have failed or are down within the domain. If any server process fails, a critical alarm will be fired. |
15 |
State Files (count) Metric name: PeopleToolsStateFiles Warning MQL:
|
> 0 | NA | Number of PeopleTools state files generated in the domain logs directory. If any state file is generated, a warning alarm will be fired. |
PeopleSoft Process Scheduler
- Resource Type: PeopleSoft Process Scheduler Domain
- Metric Namespace: oracle_appmgmt
- Resource Group: oracle_psft_prcs
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
5 |
Health (status) Metric name: Health Warning MQL:
Critical MQL:
|
warning = 1 | critical = 1 |
Overall health of the process scheduler domain. A warning alarm will be fired if the state 'warning' is equal to 1. A critical alarm will be fired if the state 'critical' is equal to 1. |
5 |
Load (status) Metric name: Load Warning MQL:
Critical MQL:
|
medium = 1 | heavy = 1 |
Overall load of the process scheduler domain. A warning alarm will be fired if the state 'medium' is equal to 1. A critical alarm will be fired if the state 'heavy' is equal to 1. |
5 |
Queued Processes for PSPRCSRV (count) Metric name: QueuedTuxedoProcesses Dimension name: ProcessType Dimension value: PSPRCSRV Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the process scheduler (PSPRCSRV). More than 1 process in queue will fire a critical alarm. |
5 |
Queued Processes for PSDSTSRV (count) Metric name: QueuedTuxedoProcesses Dimension name: ProcessType Dimension value: PSDSTSRV Critical MQL:
|
NA | > 1 | Number of processes that are currently in queue for the distribution server (PSDSTSRV). More than 1 process in queue will fire a critical alarm. |
5 |
Failed Processes (count) Metric name: FailedProcesses Critical MQL:
|
NA | > 0 | Number of server processes that have failed or are down within the domain. If any server process fails, a critical alarm will be fired. |
PeopleSoft PIA
- Resource Type: PeopleSoft PIA
- Metric Namespace: oracle_appmgmt
- Resource Group: oracle_psft_pia
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
5 |
Health (status) Metric name: Health Warning MQL:
Critical MQL:
|
warning = 1 | critical = 1 |
Overall health of the PIA. A warning alarm will be fired if the state 'warning' is equal to 1. A critical alarm will be fired if the state 'critical' is equal to 1. |
5 |
Load (status) Metric name: Load Warning MQL:
Critical MQL:
|
medium = 1 | heavy = 1 |
Overall load of the PIA. A warning alarm will be fired if the state 'medium' is equal to 1. A critical alarm will be fired if the state 'heavy' is equal to 1. |
5 |
Wait State Sockets (count) Metric name: WaitStateSockets Warning MQL:
|
> 100 | NA | Number of web server sockets that are in WAIT state. If more than 100 web server sockets are in WAIT state, a warning alarm will be fired. |
5 |
Fatal Errors (count) Metric name: FatalErrors Warning MQL:
|
> 0 | NA | Number of fatal errors in the JOLTService servlet logs. If any error occurs in the JOLTService servlet, a warning alarm will be fired. |
PeopleSoft Elasticsearch
- Resource Type: PeopleSoft Elasticsearch
- Metric Namespace: oracle_appmgmt
- Resource Group: elastic_search
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Metric name: Cluster Health MQL:
|
yellow = 1 | red = 1 |
Overall health of the Elasticsearch cluster. A warning alert will be triggered if the status 'yellow' is equal to 1. A critical alert will be triggered if the status 'red' is equal to 1. |
10 |
Metric name: Memory Utilization MQL:
|
> 80 | > 90 |
Maximum configured heap of the Elasticsearch node. A warning alert will be triggered if the memory utilization is greater than 80%. A critical alert will be triggered if the memory utilization is greater than 90%. |
PeopleSoft Process Monitor
- Resource Type: PeopleSoft Process Monitor
- Metric Namespace: oracle_appmgmt
- Resource Group: oracle_psft_prcm
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
5 |
Metric Name: Active Distribution State MQL:
|
NA | > 1 |
A critical alert will be triggered if too many processes are in the distribution not posted state. |
5 |
Metric Name: Run Status MQL:
|
NA | > 1 |
A critical alert will be triggered if too many processes are in the run no success state. |
5 |
Metric Name: Run Status MQL:
|
NA | > 0 |
A critical alert will be triggered if too many processes are in the run error state. |
Oracle Weblogic Server
Sample Alarm Rule: Oracle Weblogic Server
- Resource Type: Oracle WebLogic Server
- Metric Namespace: oracle_appmgmt
- Resource Group: weblogic_j2eeserver
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
WebLogic Server Down Metric name: MonitoringStatus Critical MQL:
|
- | - | Critical alarm for any WebLogic Server in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
5 |
High CPU Utilization (Warning/Critical) Metric name: CpuUtilization Warning MQL:
Critical MQL:
|
> 80 | > 90 |
Warning alarm for any WebLogic Server in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any WebLogic Server in a given compartment reporting over 90% CPU utilization for past 5 minutes. |
5 |
High Heap Utilization (Warning/Critical) Metric name: MemoryUtilization Warning MQL:
{Type = "Heap"} Critical MQL:
{Type = "Heap"} |
> 80 | > 90 |
Warning alarm for any WebLogic Server in a given compartment reporting over 80% Heap utilization for past 5 minutes. Critical alarm for any WebLogic Server in a given compartment reporting over 90% Heap utilization for past 5 minutes. |
5 |
Work Manager Stuck Threads (Warning/Critical) Metric name: WorkManagerStuckThreads |
> 10 | > 15 |
Warning alarm for any WebLogic Server in a given compartment reporting more than 10 work manager stuck threads for past 5 minutes. Critical alarm for any WebLogic Server in a given compartment reporting more than 15 work manager stuck threads for past 5 minutes. |
15 |
Connection Requests Waiting Metric name: ServerConnectionPoolConnections Warning MQL: ServerConnectionPoolConnections[15m].mean() > 1 Critical MQL: ServerConnectionPoolConnections
|
>1 | >2 | |
20 |
Web Request Processing Time Metric name: WebRequestProcessingTime |
>10000 | >15000 | |
5 |
Active Thread Pool Threads Metric name: ThreadPoolThreads |
>1000 | >1250 |
Sample Alarm Rule: Oracle Weblogic Server Cluster
-
Resource Type: Oracle Weblogic Server Cluster
-
Metric Namespace: oracle_appmgmt
-
Resource Group: weblogic_cluster
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
WebLogic Cluster Down Metric name: MonitoringStatus Critical MQL:
|
- | - | Critical alarm for any WebLogic Cluster in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
Sample Alarm Rules: Oracle HTTP Server (OHS)
-
Resource Type: Oracle HTTP Server
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_http_server
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 (or more to ignore random connectivity blips, e.g. 5 min) |
Oracle HTTP Server Down Metric name: MonitoringStatus Critical MQL:
|
- | - | Critical alarm for any Oracle HTTP Server in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
5 |
High CPU Utilization (Warning/Critical) Metric name: CPUUtilization Warning MQL:
Critical MQL:
|
>80 | >90 |
Warning alarm for any Oracle HTTP Server in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Oracle HTTP Server in a given compartment reporting over 90% CPU utilization for past 5 minutes. |
5 |
High Memory Utilization (Warning/Critical) Metric name: MemoryUtilization Warning MQL:
Critical MQL:
|
>80 | >90 |
Warning alarm for any Oracle HTTP Server in a given compartment reporting over 80% memory utilization for past 5 minutes. Critical alarm for any Oracle HTTP Server in a given compartment reporting over 90% memory utilization for past 5 minutes. |
1-5 |
High Web Request Processing Time (Warning/Critical) Metric name: WebRequestProcessingTime Warning MQL:
Critical MQL:
|
>1500 | >3000 |
Warning alarm for any Oracle HTTP Server in a given compartment reporting over 1500ms mean web request processing time for past 1-5 minutes. Critical alarm for any Oracle HTTP Server in a given compartment reporting over 3000ms mean web request processing time for past 1-5 minutes. |
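The warning and critical thresholds in the table above are compared against the mean of the datapoints in one evaluation window. The following Python sketch illustrates that comparison locally, using the 1500 ms and 3000 ms web request processing time thresholds. It is a simplified model, not the Monitoring service's evaluation logic: `classify` is a hypothetical helper that reports only the highest matching severity, whereas separately defined warning and critical alarms are each evaluated on their own.

```python
from statistics import mean


def classify(samples, warning, critical):
    """Return the highest severity matched by the mean of one evaluation window."""
    if not samples:
        raise ValueError("no datapoints in the evaluation window")
    average = mean(samples)
    if average > critical:
        return "CRITICAL"
    if average > warning:
        return "WARNING"
    return "OK"


# Hypothetical WebRequestProcessingTime datapoints (ms) for one window.
window = [1200, 1400, 2500]
print(classify(window, warning=1500, critical=3000))
```

Because the comparison uses the window mean, a single slow request does not necessarily trigger an alarm. A longer evaluation period smooths out short spikes further.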
Oracle Identity Manager (OIM)
Sample Alarm Rule: Oracle Identity Manager (OIM)
-
Resource Type: Oracle Identity Manager / Oracle Identity Manager Cluster
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_oim / oracle_oim_cluster
Evaluation Time Period (minutes) | Alarm | Warning | Critical | Description |
---|---|---|---|---|
1 |
Metric name: Monitoring Status MQL:
|
---- | < 1 |
Availability status of the OIM cluster/server. A critical alert will be triggered if the response value is other than 1. Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
15 |
Metric Name: Orchestration - Average Execution Time MQL:
|
> 300 | > 500 |
Orchestration Average Execution Time. A warning alert will be triggered if the orchestration average execution time is greater than 300 ms. A critical alert will be triggered if the orchestration average execution time is greater than 500 ms. |
Oracle Access Manager (OAM)
Sample Alarm Rule: Oracle Access Manager (OAM)
-
Resource Type: Oracle Access Manager / Oracle Access Manager Cluster
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_oam / oracle_oam_cluster
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Metric name: Monitoring Status MQL:
|
---- | < 1 |
Availability status of the OAM cluster/server. A critical alert will be triggered if the response value is other than 1. Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
5 |
Metric Name: Authorization Latency MQL:
|
> 500 | > 800 |
Authorization Latency. A warning alert will be triggered if the authorization latency is greater than 500 ms. A critical alert will be triggered if the authorization latency is greater than 800 ms. |
Apache HTTP Server
Resource Type: Apache HTTP Server
Metric Namespace: oracle_appmgmt
Resource Group: apache_http_server
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 (or more to ignore random connectivity blips, e.g. 5 min) |
Apache HTTP Server Down Metric name: MonitoringStatus Critical MQL:
|
- | - | Critical alarm for any Apache HTTP Server in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
10 |
High Memory Utilization (Warning/Critical) Metric name: MemoryUtilization Warning MQL:
Critical MQL:
|
>80 | >90 |
Warning alarm for any Apache HTTP Server in a given compartment reporting over 80% memory utilization for past 10 minutes. Critical alarm for any Apache HTTP Server in a given compartment reporting over 90% memory utilization for past 10 minutes. |
1-5 |
High Web Request Processing Time (Warning/Critical) Metric name: WebRequestProcessingTime Warning MQL:
Critical MQL:
|
>1500 | >3000 |
Warning alarm for any Apache HTTP Server in a given compartment reporting over 1500ms mean web request processing time for past 1-5 minutes. Critical alarm for any Apache HTTP Server in a given compartment reporting over 3000ms mean web request processing time for past 1-5 minutes. |
Oracle Unified Directory
Sample Alarm Rule: Oracle Unified Directory (OUD)
-
Resource Type: Oracle Unified Directory
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oud_directory, oud_proxy, oud_gateway
Alarm | Warning | Critical | Description |
---|---|---|---|
Metric name: Monitoring Status MQL: oud_base_status[1m].mean() != 1 || oud_base_status[1m].absent() |
---- | < 1 |
Availability status of the OUD server. A critical alert will be triggered if the response value is less than 1. |
Metric Name: Connection Handler State MQL: ConnectionHandlerState[1m].mean() < 1 |
---- | <1 |
Connection Handler State A critical alert will be triggered if the Connection Handler State is less than 1.
|
Metric Name: Backend Entries MQL: BackendEntries[5m].mean() > 30 BackendEntries[5m].mean() > 50 |
> 30 | >50 |
Backend Entries A warning alert will be triggered if the Backend Entries is greater than 30. A critical alert will be triggered if the Backend Entries is greater than 50. |
Metric Name: Connection Handler All Resident Time MQL: ConnectionHandlerAllResidentTime[5m].mean() > 300 ConnectionHandlerAllResidentTime[5m].mean() > 500 |
> 60 | > 90 |
Connection Handler All Resident Time A warning alert will be triggered if the Connection Handler All Resident Time is greater than 60. A critical alert will be triggered if the Connection Handler All Resident Time is greater than 90.
|
Metric Name: Connection Handler Connections MQL: ConnectionHandlerConnections[5m].mean() > 30 ConnectionHandlerConnections[5m].mean() > 50 |
> 30 | >50 |
Connection Handler Connections A warning alert will be triggered if the Connection Handler Connections are greater than 30. A critical alert will be triggered if the Connection Handler Connections are greater than 50.
|
Metric Name: JVM Used Memory MQL: JVMUsedMemory[5m].mean() > 1.5 JVMUsedMemory[5m].mean() > 3 |
> 1.5 | > 3 |
JVM Used Memory A warning alert will be triggered if the JVM Used Memory is greater than 1.5 MB. A critical alert will be triggered if the JVM Used Memory is greater than 3 MB.
|
Metric Name: OS Used Memory MQL: OSUsedMemory[5m].mean() > 1.5 OSUsedMemory[5m].mean() > 3 |
> 1.5 | > 3 |
OS Used Memory A warning alert will be triggered if the OS Used Memory is greater than 1.5 MB. A critical alert will be triggered if the OS Used Memory is greater than 3 MB.
|
Metric Name: Replication Domain State MQL: ReplicationDomainState[5m].mean() < 1 |
---- | < 1 |
Replication Domain State A critical alert will be triggered if the Replication Domain State is less than 1.
|
Metric Name: WFE Resident Time Operations Total Time MQL: WFEResidentTimeOperationsTotalTime[5m].mean() > 60 WFEResidentTimeOperationsTotalTime[5m].mean() > 90 |
> 60 | > 90 |
WFE Resident Time Operations Total Time A warning alert will be triggered if the WFE Resident Time Operations Total Time is greater than 60. A critical alert will be triggered if the WFE Resident Time Operations Total Time is greater than 90.
|
Metric Name: Work Queue Current Backlog MQL: WorkQueueCurrentBacklog[5m].mean() > 15 WorkQueueCurrentBacklog[5m].mean() > 30 |
> 15 | > 30 |
Work Queue Current Backlog A warning alert will be triggered if the Work Queue Current Backlog is greater than 15. A critical alert will be triggered if the Work Queue Current Backlog is greater than 30.
|
Metric Name: Extension LDAP Connections MQL: ExtensionLDAPConnections[5m].mean() > 30 ExtensionLDAPConnections[5m].mean() > 50 |
> 30 | > 50 |
Extension LDAP Connections A warning alert will be triggered if the Extension LDAP Connections are greater than 30. A critical alert will be triggered if the Extension LDAP Connections are greater than 50.
|
Metric Name: Extension LDAP Operations Total Response Time MQL: ExtensionLDAPOperationsTotalResponseTime[5m].mean() > 60 ExtensionLDAPOperationsTotalResponseTime[5m].mean() > 90 |
> 60 | > 90 |
Extension LDAP Operations Total Response Time A warning alert will be triggered if the Extension LDAP Operations Total Response Time is greater than 60. A critical alert will be triggered if the Extension LDAP Operations Total Response Time is greater than 90. |
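Before creating alarms from the table above, it can help to confirm that the thresholds in each MQL expression match the Warning and Critical columns. For example, the Connection Handler All Resident Time row lists 300 and 500 in its MQL but 60 and 90 in its threshold columns. The following Python sketch is one way to flag such rows. It is an illustrative check only: `comparisons` and `is_consistent` are hypothetical helpers, and the regular expression handles only simple numeric comparisons.

```python
import re

# Matches a comparison such as "> 300", ">50" or "!= 1".
COMPARISON = re.compile(r"(<=|>=|==|!=|<|>)\s*(\d+(?:\.\d+)?)")


def comparisons(text):
    """Return every (operator, number) pair found in an MQL string or table cell."""
    return [(op, float(number)) for op, number in COMPARISON.findall(text)]


def is_consistent(mql, warning_cell, critical_cell):
    """True when the MQL comparisons equal the Warning and Critical cells, in order."""
    return comparisons(mql) == comparisons(warning_cell) + comparisons(critical_cell)


# Rows copied from the table: (metric, MQL, Warning cell, Critical cell).
rows = [
    ("Backend Entries",
     "BackendEntries[5m].mean() > 30 BackendEntries[5m].mean() > 50",
     "> 30", ">50"),
    ("Connection Handler All Resident Time",
     "ConnectionHandlerAllResidentTime[5m].mean() > 300 "
     "ConnectionHandlerAllResidentTime[5m].mean() > 500",
     "> 60", "> 90"),
]

for name, mql, warning, critical in rows:
    status = "consistent" if is_consistent(mql, warning, critical) else "check thresholds"
    print(f"{name}: {status}")
```

A flagged row only indicates a difference between the two sources. Decide which threshold suits your environment before you create the alarm.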
Oracle GoldenGate
Sample Alarm Rule: Goldengate
-
Resource Type: oracle_goldengate
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Down Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
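The down rule above combines the critical MQL expression with a trigger delay of 3 minutes. The following Python sketch collects those settings into a single dictionary that could be reviewed or passed to your own tooling. The key names (`displayName`, `pendingDuration`, and so on) and the mapping of Trigger Delay Minutes to an ISO 8601 duration are assumptions modeled on the Monitoring alarm definition; verify them against the Monitoring API reference before use. `trigger_delay` and `down_alarm` are hypothetical helpers.

```python
DOWN_QUERY = "MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent()"


def trigger_delay(minutes):
    """Format a trigger delay in minutes as an ISO 8601 duration, for example PT3M."""
    if minutes < 1:
        raise ValueError("trigger delay must be at least 1 minute")
    return f"PT{minutes}M"


def down_alarm(display_name, resource_group, delay_minutes=3):
    """Describe a critical 'down' alarm; the key names are assumed, not confirmed."""
    return {
        "displayName": display_name,
        "namespace": "oracle_appmgmt",
        "resourceGroup": resource_group,
        "query": DOWN_QUERY,
        "severity": "CRITICAL",
        "pendingDuration": trigger_delay(delay_minutes),
    }


alarm = down_alarm("Goldengate Down", "oracle_goldengate")
for key, value in alarm.items():
    print(f"{key}: {value}")
```

The same helper can describe the other MonitoringStatus rules in this section by changing the display name and resource group.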
Sample Alarm Rule: Goldengate AdminServer
-
Resource Type: Goldengate Admin Server
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_admin_server
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Admin Server Down Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate AdminServer in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
5 |
High CPU Utilization (Warning/Critical) Metric name: CPU utilization Warning MQL: CpuTimeUtilizationPercentage[5m].mean() > 80 Critical MQL: CpuTimeUtilizationPercentage[5m].mean() > 90 |
80 | 90 | Warning alarm for any Goldengate Admin Server in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Admin Server in a given compartment reporting over 90% CPU utilization for past 5 minutes. |
5 |
Private memory (Warning/Critical) Metric name: Private memory Warning MQL: PrivateMemory[5m].mean() > 30 Critical MQL: PrivateMemory[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes. |
5 |
I/O read rate (Warning/Critical) Metric name: I/O read rate Warning MQL: IOReadRate[5m].mean() > 10 Critical MQL: IOReadRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
I/O write rate (Warning/Critical) Metric name: I/O write rate Warning MQL: IOWriteRate[5m].mean() > 10 Critical MQL: IOWriteRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
Dropped packet rate (Warning/Critical) Metric name: Dropped packet rate Warning MQL: DroppedPacketRate[5m].mean() > 30 Critical MQL: DroppedPacketRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Missing packet rate (Warning/Critical) Metric name: Missing packet rate Warning MQL: MissingPacketRate[5m].mean() > 30 Critical MQL: MissingPacketRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Packet error rate (Warning/Critical) Metric name: Packet error rate Warning MQL: PacketErrorRate[5m].mean() > 30 Critical MQL: PacketErrorRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Packet receive rate (Warning/Critical) Metric name: Packet receive rate Warning MQL: PacketReceiveRate[5m].mean() > 30 Critical MQL: PacketReceiveRate[5m].mean() > 40 |
30 | 40 |
A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for past 5 minutes. |
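Each performance rule in the table above follows the same pattern: one metric, a 5 minute mean, and a warning and critical threshold. The following Python sketch generates both expressions from a small threshold table, so that the two severities cannot drift apart. The metric names and thresholds are copied from the table; `severity_queries` is a hypothetical helper, and the check that the warning threshold is lower than the critical threshold is an added safeguard, not a service requirement.

```python
# Metric name -> (warning threshold, critical threshold), copied from the table.
THRESHOLDS = {
    "CpuTimeUtilizationPercentage": (80, 90),
    "PrivateMemory": (30, 40),
    "IOReadRate": (10, 20),
    "IOWriteRate": (10, 20),
    "DroppedPacketRate": (30, 40),
}


def severity_queries(metric, warning, critical, interval="5m"):
    """Return the warning and critical MQL expressions for one metric."""
    if warning >= critical:
        raise ValueError(f"{metric}: warning threshold must be below critical")
    return {
        "WARNING": f"{metric}[{interval}].mean() > {warning}",
        "CRITICAL": f"{metric}[{interval}].mean() > {critical}",
    }


for metric, (warning, critical) in THRESHOLDS.items():
    for severity, query in severity_queries(metric, warning, critical).items():
        print(f"{severity:8} {query}")
```

Adjust the thresholds in one place to tune both alarms for a metric together.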
Sample Alarm Rule: Goldengate Distribution Service
-
Resource Type: Goldengate Distribution Service
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_distribution_server
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Distribution Service Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate Distribution Service in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
5 |
High CPU Utilization (Warning/Critical) Metric name: CPU utilization Warning MQL: CpuTimeUtilizationPercentage[5m].mean() > 80 Critical MQL: CpuTimeUtilizationPercentage[5m].mean() > 90 |
80 | 90 | Warning alarm for any Goldengate Distribution Service in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Distribution Service in a given compartment reporting over 90% CPU utilization for past 5 minutes. |
5 |
Private memory (Warning/Critical) Metric name: Private memory Warning MQL: PrivateMemory[5m].mean() > 30 Critical MQL: PrivateMemory[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes. |
5 |
I/O read rate (Warning/Critical) Metric name: I/O read rate Warning MQL: IOReadRate[5m].mean() > 10 Critical MQL: IOReadRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
I/O write rate (Warning/Critical) Metric name: I/O write rate Warning MQL: IOWriteRate[5m].mean() > 10 Critical MQL: IOWriteRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
Dropped packet rate (Warning/Critical) Metric name: Dropped packet rate Warning MQL: DroppedPacketRate[5m].mean() > 30 Critical MQL: DroppedPacketRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Missing packet rate (Warning/Critical) Metric name: Missing packet rate Warning MQL: MissingPacketRate[5m].mean() > 30 Critical MQL: MissingPacketRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Packet error rate (Warning/Critical) Metric name: Packet error rate Warning MQL: PacketErrorRate[5m].mean() > 30 Critical MQL: PacketErrorRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Packet receive rate (Warning/Critical) Metric name: Packet receive rate Warning MQL: PacketReceiveRate[5m].mean() > 30 Critical MQL: PacketReceiveRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for past 5 minutes. |
Sample Alarm Rule: Goldengate Receiver Service
-
Resource Type: Goldengate Receiver Service
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_receiver_server
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Receiver Service Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate Receiver Service in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
5 |
High CPU Utilization (Warning/Critical) Metric name: CPU utilization Warning MQL: CpuTimeUtilizationPercentage[5m].mean() > 80 Critical MQL: CpuTimeUtilizationPercentage[5m].mean() > 90 |
80 | 90 | Warning alarm for any Goldengate Receiver Service in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Receiver Service in a given compartment reporting over 90% CPU utilization for past 5 minutes. |
5 |
Private memory (Warning/Critical) Metric name: Private memory Warning MQL: PrivateMemory[5m].mean() > 30 Critical MQL: PrivateMemory[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes. |
5 |
I/O read rate (Warning/Critical) Metric name: I/O read rate Warning MQL: IOReadRate[5m].mean() > 10 Critical MQL: IOReadRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
I/O write rate (Warning/Critical) Metric name: I/O write rate Warning MQL: IOWriteRate[5m].mean() > 10 Critical MQL: IOWriteRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
Dropped packet rate (Warning/Critical) Metric name: Dropped packet rate Warning MQL: DroppedPacketRate[5m].mean() > 30 Critical MQL: DroppedPacketRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Missing packet rate (Warning/Critical) Metric name: Missing packet rate Warning MQL: MissingPacketRate[5m].mean() > 30 Critical MQL: MissingPacketRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Packet error rate (Warning/Critical) Metric name: Packet error rate Warning MQL: PacketErrorRate[5m].mean() > 30 Critical MQL: PacketErrorRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 |
Packet receive rate (Warning/Critical) Metric name: Packet receive rate Warning MQL: PacketReceiveRate[5m].mean() > 30 Critical MQL: PacketReceiveRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for past 5 minutes. |
Sample Alarm Rule: Goldengate Service Manager
-
Resource Type: Goldengate Service Manager
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_service_manager
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Service Manager Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate Service Manager in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
Sample Alarm Rule: Goldengate Performance Metric Service
-
Resource Type: Goldengate Performance Metric Service
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_pm_server
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Performance Metric Service Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate Performance Metric Service in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
5 |
High CPU Utilization (Warning/Critical) Metric name: CPU utilization Warning MQL: CpuTimeUtilizationPercentage[5m].mean() > 80 Critical MQL: CpuTimeUtilizationPercentage[5m].mean() > 90 |
80 | 90 | Warning alarm for any Goldengate Performance Metric Service in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Performance Metric Service in a given compartment reporting over 90% CPU utilization for past 5 minutes. |
5 |
Private memory (Warning/Critical) Metric name: Private memory Warning MQL: PrivateMemory[5m].mean() > 30 Critical MQL: PrivateMemory[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes. |
5 |
I/O read rate (Warning/Critical) Metric name: I/O read rate Warning MQL: IOReadRate[5m].mean() > 10 Critical MQL: IOReadRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
I/O write rate (Warning/Critical) Metric name: I/O write rate Warning MQL: IOWriteRate[5m].mean() > 10 Critical MQL: IOWriteRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes. |
Sample Alarm Rule: Goldengate Extract
-
Resource Type: Goldengate Extract
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_extract
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Extract Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate Extract in a given compartment reporting being down or not reporting status for over 1min.
Note
When configuring an alarm for MonitoringStatus define Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
5 |
High CPU Utilization (Warning/Critical) Metric name: CPU utilization Warning MQL: CpuTimeUtilizationPercentage[5m].mean() > 80 Critical MQL: CpuTimeUtilizationPercentage[5m].mean() > 90 |
80 | 90 | Warning alarm for any Goldengate Extract in a given compartment reporting over 80% CPU utilization for past 5 minutes. Critical alarm for any Goldengate Extract in a given compartment reporting over 90% CPU utilization for past 5 minutes. |
5 |
Private memory (Warning/Critical) Metric name: Private memory Warning MQL: PrivateMemory[5m].mean() > 30 Critical MQL: PrivateMemory[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Private memory mean is greater than 30 GB for past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for past 5 minutes. |
5 |
I/O read rate (Warning/Critical) Metric name: I/O read rate Warning MQL: IOReadRate[5m].mean() > 10 Critical MQL: IOReadRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
I/O write rate (Warning/Critical) Metric name: I/O write rate Warning MQL: IOWriteRate[5m].mean() > 10 Critical MQL: IOWriteRate[5m].mean() > 20 |
10 | 20 | A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for past 5 minutes. |
5 |
Dropped packet rate (Warning/Critical) Metric name: Dropped packet rate Warning MQL: DroppedPacketRate[5m].mean() > 30 Critical MQL: DroppedPacketRate[5m].mean() > 40 |
30 | 40 | A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for past 5 minutes. |
5 | Missing packet rate (Warning/Critical) Metric name: Missing packet rate Warning MQL: MissingPacketRate[5m].mean() > 30 Critical MQL: MissingPacketRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for the past 5 minutes. |
5 | Packet error rate (Warning/Critical) Metric name: Packet error rate Warning MQL: PacketErrorRate[5m].mean() > 30 Critical MQL: PacketErrorRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for the past 5 minutes. |
5 | Packet receive rate (Warning/Critical) Metric name: Packet receive rate Warning MQL: PacketReceiveRate[5m].mean() > 30 Critical MQL: PacketReceiveRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for the past 5 minutes. |
5 | Mapped delete rate (Warning/Critical) Metric name: Mapped delete rate Warning MQL: MappedDeleteRate[5m].mean() > 30 Critical MQL: MappedDeleteRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Mapped delete rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped delete rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Mapped insert rate (Warning/Critical) Metric name: Mapped insert rate Warning MQL: MappedInsertRate[5m].mean() > 30 Critical MQL: MappedInsertRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Mapped insert rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped insert rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Mapped truncate rate (Warning/Critical) Metric name: Mapped truncate rate Warning MQL: MappedTruncateRate[5m].mean() > 30 Critical MQL: MappedTruncateRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Mapped truncate rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped truncate rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Mapped update rate (Warning/Critical) Metric name: Mapped update rate Warning MQL: MappedUpdateRate[5m].mean() > 30 Critical MQL: MappedUpdateRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Mapped update rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped update rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Discard rate (Warning/Critical) Metric name: Discard rate Warning MQL: DiscardRate[5m].mean() > 30 Critical MQL: DiscardRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Discard rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Discard rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Ignore rate (Warning/Critical) Metric name: Ignore rate Warning MQL: IgnoreRate[5m].mean() > 30 Critical MQL: IgnoreRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Ignore rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Ignore rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Lag (Warning/Critical) Metric name: Lag Warning MQL: Lag[5m].mean() > 10 Critical MQL: Lag[5m].mean() > 20 | 10 | 20 | A warning alert will be triggered if the Lag mean is greater than 10 sec for the past 5 minutes. A critical alert will be triggered if the Lag mean is greater than 20 sec for the past 5 minutes. |
5 | Operations rate (Warning/Critical) Metric name: Operations rate Warning MQL: OperationsPerSec[5m].mean() > 20 Critical MQL: OperationsPerSec[5m].mean() > 30 | 20 | 30 | A warning alert will be triggered if the Operations rate mean is greater than 20 Ops/sec for the past 5 minutes. A critical alert will be triggered if the Operations rate mean is greater than 30 Ops/sec for the past 5 minutes. |
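The threshold rules in the table above all share one MQL shape: the mean of a metric over the evaluation period, compared against a threshold. A minimal Python sketch of that pattern follows. The `ThresholdRule` class and its field names are hypothetical helpers for illustration and are not part of any Oracle SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """One row of a sample alarm table (illustrative helper, not an Oracle API)."""
    metric: str        # metric name as used in MQL, e.g. "IOReadRate"
    interval_min: int  # evaluation time period in minutes
    warning: float     # warning threshold
    critical: float    # critical threshold

    def _mql(self, threshold: float) -> str:
        # Mean over the evaluation window compared against the threshold
        return f"{self.metric}[{self.interval_min}m].mean() > {threshold:g}"

    def warning_mql(self) -> str:
        return self._mql(self.warning)

    def critical_mql(self) -> str:
        return self._mql(self.critical)


rule = ThresholdRule(metric="IOReadRate", interval_min=5, warning=10, critical=20)
print(rule.warning_mql())   # IOReadRate[5m].mean() > 10
print(rule.critical_mql())  # IOReadRate[5m].mean() > 20
```

Generating both queries from one set of values keeps the query text and the documented thresholds from drifting apart.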
Sample Alarm Rule: Goldengate Replicat
-
Resource Type: Goldengate Replicat
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_replicat
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Replicat Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate Replicat in a given compartment that is reported as down or has not reported status for over 1 minute.
Note
When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
||
5 | High CPU Utilization (Warning/Critical) Metric name: CPU utilization Warning MQL: CpuTimeUtilizationPercentage[5m].mean() > 80 Critical MQL: CpuTimeUtilizationPercentage[5m].mean() > 90 | 80 | 90 | Warning alarm for any Goldengate Replicat in a given compartment reporting over 80% CPU utilization for the past 5 minutes. Critical alarm for any Goldengate Replicat in a given compartment reporting over 90% CPU utilization for the past 5 minutes. |
5 | Private memory (Warning/Critical) Metric name: Private memory Warning MQL: PrivateMemory[5m].mean() > 30 Critical MQL: PrivateMemory[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Private memory mean is greater than 30 GB for the past 5 minutes. A critical alert will be triggered if the Private memory mean is greater than 40 GB for the past 5 minutes. |
5 | I/O read rate (Warning/Critical) Metric name: I/O read rate Warning MQL: IOReadRate[5m].mean() > 10 Critical MQL: IOReadRate[5m].mean() > 20 | 10 | 20 | A warning alert will be triggered if the I/O read rate mean is greater than 10 MB/sec for the past 5 minutes. A critical alert will be triggered if the I/O read rate mean is greater than 20 MB/sec for the past 5 minutes. |
5 | I/O write rate (Warning/Critical) Metric name: I/O write rate Warning MQL: IOWriteRate[5m].mean() > 10 Critical MQL: IOWriteRate[5m].mean() > 20 | 10 | 20 | A warning alert will be triggered if the I/O write rate mean is greater than 10 MB/sec for the past 5 minutes. A critical alert will be triggered if the I/O write rate mean is greater than 20 MB/sec for the past 5 minutes. |
5 | Dropped packet rate (Warning/Critical) Metric name: Dropped packet rate Warning MQL: DroppedPacketRate[5m].mean() > 30 Critical MQL: DroppedPacketRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Dropped packet rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Dropped packet rate mean is greater than 40 Msgs/min for the past 5 minutes. |
5 | Missing packet rate (Warning/Critical) Metric name: Missing packet rate Warning MQL: MissingPacketRate[5m].mean() > 30 Critical MQL: MissingPacketRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Missing packet rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Missing packet rate mean is greater than 40 Msgs/min for the past 5 minutes. |
5 | Packet error rate (Warning/Critical) Metric name: Packet error rate Warning MQL: PacketErrorRate[5m].mean() > 30 Critical MQL: PacketErrorRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Packet error rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Packet error rate mean is greater than 40 Msgs/min for the past 5 minutes. |
5 | Packet receive rate (Warning/Critical) Metric name: Packet receive rate Warning MQL: PacketReceiveRate[5m].mean() > 30 Critical MQL: PacketReceiveRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Packet receive rate mean is greater than 30 Msgs/min for the past 5 minutes. A critical alert will be triggered if the Packet receive rate mean is greater than 40 Msgs/min for the past 5 minutes. |
5 | Mapped delete rate (Warning/Critical) Metric name: Mapped delete rate Warning MQL: MappedDeleteRate[5m].mean() > 30 Critical MQL: MappedDeleteRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Mapped delete rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped delete rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Mapped insert rate (Warning/Critical) Metric name: Mapped insert rate Warning MQL: MappedInsertRate[5m].mean() > 30 Critical MQL: MappedInsertRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Mapped insert rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped insert rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Mapped truncate rate (Warning/Critical) Metric name: Mapped truncate rate Warning MQL: MappedTruncateRate[5m].mean() > 30 Critical MQL: MappedTruncateRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Mapped truncate rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped truncate rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Mapped update rate (Warning/Critical) Metric name: Mapped update rate Warning MQL: MappedUpdateRate[5m].mean() > 30 Critical MQL: MappedUpdateRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Mapped update rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Mapped update rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Discard rate (Warning/Critical) Metric name: Discard rate Warning MQL: DiscardRate[5m].mean() > 30 Critical MQL: DiscardRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Discard rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Discard rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Ignore rate (Warning/Critical) Metric name: Ignore rate Warning MQL: IgnoreRate[5m].mean() > 30 Critical MQL: IgnoreRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Ignore rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Ignore rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Lag (Warning/Critical) Metric name: Lag Warning MQL: Lag[5m].mean() > 10 Critical MQL: Lag[5m].mean() > 20 | 10 | 20 | A warning alert will be triggered if the Lag mean is greater than 10 sec for the past 5 minutes. A critical alert will be triggered if the Lag mean is greater than 20 sec for the past 5 minutes. |
5 | Operations rate (Warning/Critical) Metric name: Operations rate Warning MQL: OperationsPerSec[5m].mean() > 20 Critical MQL: OperationsPerSec[5m].mean() > 30 | 20 | 30 | A warning alert will be triggered if the Operations rate mean is greater than 20 Ops/sec for the past 5 minutes. A critical alert will be triggered if the Operations rate mean is greater than 30 Ops/sec for the past 5 minutes. |
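When adapting these sample rules, it can help to check that a query string agrees with the evaluation period and threshold listed for its row. The sketch below parses only the simple `Metric[Nm].statistic() > value` form used in these tables. The function names are illustrative assumptions, and queries with dimension filters or compound conditions are deliberately not handled.

```python
import re

# Matches the simple pattern used in the tables, e.g. "Lag[5m].mean() > 10"
_SIMPLE_MQL = re.compile(
    r"^\s*(?P<metric>[A-Za-z_][\w.]*)\[(?P<minutes>\d+)m\]\.(?P<stat>\w+)\(\)"
    r"\s*(?P<op>>=|<=|==|!=|>|<)\s*(?P<value>\d+(?:\.\d+)?)\s*$"
)


def parse_threshold_mql(query: str) -> dict:
    """Split a simple threshold query into its parts (hypothetical helper)."""
    match = _SIMPLE_MQL.match(query)
    if match is None:
        raise ValueError(f"unsupported query form: {query!r}")
    return {
        "metric": match["metric"],
        "minutes": int(match["minutes"]),
        "stat": match["stat"],
        "op": match["op"],
        "value": float(match["value"]),
    }


def matches_row(query: str, period_minutes: int, threshold: float) -> bool:
    """True if the query uses the row's evaluation period and threshold."""
    parsed = parse_threshold_mql(query)
    return parsed["minutes"] == period_minutes and parsed["value"] == float(threshold)


print(matches_row("IOReadRate[5m].mean() > 10", period_minutes=5, threshold=10))
print(matches_row("IOReadRate[5m].mean() > 10", period_minutes=5, threshold=20))
```

The first call agrees with its row and the second does not, which is the kind of mismatch worth catching before an alarm rule is saved.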
Sample Alarm Rule: Goldengate Distribution Path
-
Resource Type: Goldengate Distribution Path
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_distribution_path
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Distribution Path Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate Distribution Path in a given compartment that is reported as down or has not reported status for over 1 minute.
Note
When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
||
5 | Lag (Warning/Critical) Metric name: Lag Warning MQL: Lag[5m].mean() > 10 Critical MQL: Lag[5m].mean() > 20 | 10 | 20 | A warning alert will be triggered if the Lag mean is greater than 10 sec for the past 5 minutes. A critical alert will be triggered if the Lag mean is greater than 20 sec for the past 5 minutes. |
5 | Network sent rate (Warning/Critical) Metric name: Network sent rate Warning MQL: NetworkSentRate[5m].mean() > 30 Critical MQL: NetworkSentRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Network sent rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Network sent rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Network receive rate (Warning/Critical) Metric name: Network receive rate Warning MQL: NetworkReceiveRate[5m].mean() > 30 Critical MQL: NetworkReceiveRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Network receive rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Network receive rate mean is greater than 40 MB/sec for the past 5 minutes. |
Sample Alarm Rule: Goldengate Receiver Path
-
Resource Type: Goldengate Receiver Path
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_goldengate_receiver_path
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Goldengate Receiver Path Metric name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() == 0 || MonitoringStatus[1m].absent() |
Critical alarm for any Goldengate Receiver Path in a given compartment that is reported as down or has not reported status for over 1 minute.
Note
When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
||
5 | Lag (Warning/Critical) Metric name: Lag Warning MQL: Lag[5m].mean() > 10 Critical MQL: Lag[5m].mean() > 20 | 10 | 20 | A warning alert will be triggered if the Lag mean is greater than 10 sec for the past 5 minutes. A critical alert will be triggered if the Lag mean is greater than 20 sec for the past 5 minutes. |
5 | Network sent rate (Warning/Critical) Metric name: Network sent rate Warning MQL: NetworkSentRate[5m].mean() > 30 Critical MQL: NetworkSentRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Network sent rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Network sent rate mean is greater than 40 MB/sec for the past 5 minutes. |
5 | Network receive rate (Warning/Critical) Metric name: Network receive rate Warning MQL: NetworkReceiveRate[5m].mean() > 30 Critical MQL: NetworkReceiveRate[5m].mean() > 40 | 30 | 40 | A warning alert will be triggered if the Network receive rate mean is greater than 30 MB/sec for the past 5 minutes. A critical alert will be triggered if the Network receive rate mean is greater than 40 MB/sec for the past 5 minutes. |
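The MonitoringStatus rule at the top of this table combines two conditions: the resource reports a status of 0, or it reports nothing at all. The Python sketch below assembles that query together with the recommended 3-minute trigger delay. The dictionary keys are illustrative and only loosely modeled on alarm settings, and expressing the trigger delay as an ISO 8601 duration such as `PT3M` is an assumption to verify against the Monitoring API reference.

```python
def status_down_mql(interval_min: int = 1, metric: str = "MonitoringStatus") -> str:
    """Build the 'down or not reporting' query used in the status rows."""
    window = f"{metric}[{interval_min}m]"
    # Fires when the mean status is 0 (down) or when no data points arrived
    return f"{window}.mean() == 0 || {window}.absent()"


def trigger_delay_iso8601(minutes: int) -> str:
    """Express a trigger delay in minutes as an ISO 8601 duration (assumed format)."""
    if minutes < 1:
        raise ValueError("trigger delay must be at least 1 minute")
    return f"PT{minutes}M"


# Hypothetical alarm settings for a Goldengate Receiver Path status alarm
alarm_settings = {
    "namespace": "oracle_appmgmt",
    "resource_group": "oracle_goldengate_receiver_path",
    "query": status_down_mql(interval_min=1),
    "severity": "CRITICAL",
    "pending_duration": trigger_delay_iso8601(3),  # Trigger Delay Minutes = 3
}

for key, value in alarm_settings.items():
    print(f"{key}: {value}")
```

The delay means the condition must hold for three consecutive minutes before a notification is sent, which filters out single missed data points.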
Process-based Custom Resource Sample Alarm Rules
-
Resource Type: Custom Resource
-
Metric Namespace: oracle_appmgmt
-
Resource Group: custom_resource
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Custom Resource Down Metric name: MonitoringStatus Critical MQL:
|
Critical alarm for any custom resource in a given compartment being down or not reporting status for over 1 minute.
Note
When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
||
5 |
High CPU Utilization (Warning/Critical) Metric name: CpuUtilization Warning MQL:
Critical MQL:
|
>80 | >90 |
Warning alarm for any custom resource in a given compartment reporting over 80% CPU utilization over 5 minutes. Critical alarm for any custom resource in a given compartment reporting over 90% CPU utilization over 5 minutes. |
15 |
High Memory Utilization (Warning/Critical) Metric Name: MemoryUtilization Warning MQL:
Critical MQL:
|
>80 | >90 |
Warning alarm for any custom resource in a given compartment reporting over 80% memory utilization over 15 minutes. Critical alarm for any custom resource in a given compartment reporting over 90% memory utilization over 15 minutes. |
Oracle Service Bus (OSB)
-
Resource Type: Oracle Service Bus
-
Metric Namespace: oracle_appmgmt
-
Resource Group: oracle_servicebus
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 |
Metric name: MonitoringStatus Critical MQL:
|
Critical alarm for any Service Bus in a given compartment that is reported as down or has not reported status for over 1 minute.
Note
When configuring an alarm for MonitoringStatus, set Trigger Delay Minutes to 3 to decrease the possibility of false notifications.
|
||
5 |
Metric name: ServiceBusErrors Critical MQL:
|
>0 | Critical alarm for any Service Bus in a given compartment that reports errors in any of its OSB services for over 5 minutes. |
Microsoft IIS
-
Resource Type: IIS
-
Metric Namespace: oracle_appmgmt
-
Resource Group: microsoft_iis
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 (or more to ignore random connectivity blips, e.g. 5 min) |
IIS Down Metric Name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() |
Critical Alarm for no longer being able to connect to IIS | ||
5 |
ASP.Net Worker Process Restart Metric Name: ASPDotNetWorkerProcessRestarts Critical MQL: ASPDotNetWorkerProcessRestarts[1m].mean() > 1 |
>1 | Critical alarm indicating that worker process restarts have occurred. Restarts can have a number of causes and can lead to problems such as degraded performance and information loss. | |
5 |
ASP.Net Requests Queued Metric Name: ASPDotNetRequests.Type.Queued Warning MQL: ASPDotNetRequests.Type.Queued[1m].mean() > 5 Critical MQL: ASPDotNetRequests.Type.Queued[1m].mean() > 10 |
>5 | >10 | Warning and critical thresholds indicating that incoming HTTP requests are being queued due to load. |
5 |
ASP.Net Error Rate Metric Name: ErrorRate Warning MQL: ErrorRate[1m].mean() > 1%* Critical MQL: ErrorRate[1m].mean() > 2%* |
> 1%* | > 2%* | Warning and critical thresholds that alert the user when the error rate of an ASP.Net application is too high. This metric is reported in errors/second, so set the threshold based on the average total request rate. For example, if the application usually receives 100 requests/sec, we suggest 1 error/sec for warning and 2 errors/sec for critical. |
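Because the ErrorRate thresholds are percentages of request volume rather than fixed numbers, they have to be converted to errors/second before they can be used in a query. A small Python sketch of that arithmetic follows. The function name and its parameters are hypothetical, and the average request rate is a value you would measure for your own application.

```python
def error_rate_thresholds(avg_requests_per_sec: float,
                          warning_pct: float = 1.0,
                          critical_pct: float = 2.0) -> tuple[float, float]:
    """Convert percentage thresholds into errors/second (illustrative helper)."""
    if avg_requests_per_sec <= 0:
        raise ValueError("average request rate must be positive")
    warning = avg_requests_per_sec * warning_pct / 100
    critical = avg_requests_per_sec * critical_pct / 100
    return warning, critical


# An application that usually receives 100 requests/sec
warning, critical = error_rate_thresholds(100)
print(f"Warning MQL:  ErrorRate[1m].mean() > {warning:g}")
print(f"Critical MQL: ErrorRate[1m].mean() > {critical:g}")
```

For 100 requests/sec this yields the thresholds of 1 and 2 errors/sec suggested in the table.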
-
Resource Type: IIS Website
-
Metric Namespace: oracle_appmgmt
-
Resource Group: microsoft_iis
Evaluation Time Period (minutes) | Alarm Rule | Warning | Critical | Description |
---|---|---|---|---|
1 (or more to ignore random connectivity blips, e.g. 5 min) |
IIS Website Down Metric Name: MonitoringStatus Critical MQL: MonitoringStatus[1m].mean() != 1 || MonitoringStatus[1m].absent() |
Critical Alarm for no longer being able to connect to IIS Website | ||
5 |
WWW Current Connections Metric Name: CurrentConnections.Service.WWW Warning MQL: CurrentConnections.Service.WWW[1m].mean() > 90%* Critical MQL: CurrentConnections.Service.WWW[1m].mean() > 95%* |
> 90%* | > 95%* | Warning and critical thresholds that alert the user when the number of connections is approaching the maximum. Set the thresholds to 90% and 95% of the maximum number of connections allowed. The metric is an absolute number, so the threshold values are specific to each environment. For example, if 200 total connections are allowed, we suggest 180 for warning and 190 for critical. |
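The connection thresholds depend on the maximum number of connections configured for the website, so the queries need to be filled in per environment. The following Python sketch derives both queries from that maximum. The helper name is hypothetical, and the metric name and `[1m]` window are copied from the sample MQL above.

```python
def connection_alarm_queries(max_connections: int,
                             warning_pct: int = 90,
                             critical_pct: int = 95) -> dict[str, str]:
    """Build warning/critical queries from a site's connection limit (illustrative)."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    metric = "CurrentConnections.Service.WWW[1m].mean()"
    # Integer arithmetic keeps the thresholds as whole connection counts
    warning = max_connections * warning_pct // 100
    critical = max_connections * critical_pct // 100
    return {
        "warning": f"{metric} > {warning}",
        "critical": f"{metric} > {critical}",
    }


# A website that allows 200 total connections
queries = connection_alarm_queries(200)
print(queries["warning"])
print(queries["critical"])
```

With a limit of 200 connections, the thresholds come out to 180 and 190, matching the example in the table.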
Metric Extensions
You can create alarm rules to trigger alarms when metric values from your Metric Extensions cross thresholds. Use the same general workflow that you would follow to create an alarm rule for built-in metrics for your resources. The main difference is in the Metric description section.
- Compartment: choose the compartment of the resource on which the Metric Extension was enabled.
- Metric namespace: select oracle_metric_extensions_appmgmt.
- Resource group: the resource type of the resource on which the Metric Extension was deployed.
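As a rough illustration of how these three settings differ from a built-in metric alarm, the Python sketch below groups them into one record. The `MetricDescription` class is a hypothetical helper, and the compartment and metric name shown are placeholders rather than real values. The `host` resource group is taken from the metric details table for the Host resource type.

```python
from dataclasses import dataclass

# Namespace used for all Metric Extension metrics, per the list above
METRIC_EXTENSION_NAMESPACE = "oracle_metric_extensions_appmgmt"


@dataclass(frozen=True)
class MetricDescription:
    """Fields of the Metric description section (illustrative, not an Oracle API)."""
    compartment: str      # compartment of the resource with the Metric Extension
    resource_group: str   # resource type of that resource, e.g. "host"
    metric_name: str      # name of the Metric Extension metric
    namespace: str = METRIC_EXTENSION_NAMESPACE


host_extension = MetricDescription(
    compartment="<compartment of the monitored host>",  # placeholder
    resource_group="host",
    metric_name="ME_ExampleMetric",                      # hypothetical metric name
)
print(host_extension)
```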
Creating an Alarm rule for a Metric Extension of a host is shown in the image below: