Monitor Metrics for VM Cluster Resources

You can monitor the health, capacity, and performance of your VM clusters and databases with metrics, alarms, and notifications. You can use the Oracle Cloud Infrastructure Console, Monitoring APIs, or Database Management APIs to view metrics.

Note: To view metrics you must have the required access as specified in an Oracle Cloud Infrastructure policy (whether you're using the Console, the REST API, or another tool). See Getting Started with Policies for information on policies.

WARNING:

Metrics, events, and audit events will not be sent if Cluster Ready Services (CRS) is not running before Autonomous Health Framework (AHF) starts.

Prerequisites for Using Metrics

The following prerequisites are required for the metrics to flow out of the VM cluster.

  1. Metrics on the VM clusters depend on the Oracle Trace File Analyzer (TFA) agent. Ensure that this component is up and running. AHF version 22.2.4 or higher is required for capturing metrics from the VM clusters. To start, stop, or check the status of TFA, see Manage Oracle Trace File Analyzer.
  2. To view the metrics on the Oracle Cloud Infrastructure Console, the TFA flag defaultocimonitoring must be set to ON. This flag is ON by default, so no action is normally required. If you do not see metrics on the Console, check as the root user on the guest VM whether the flag is set to ON:
    tfactl get defaultocimonitoring
    .---------------------------------------------------------------------.
    |                             <host name>                             |
    +-------------------------------------------------------------+-------+
    | Configuration Parameter                                     | Value |
    +-------------------------------------------------------------+-------+
    | Send CEF metrics to OCI Monitoring ( defaultOciMonitoring ) | ON    |
    '-------------------------------------------------------------+-------'
    If the defaultocimonitoring flag is set to OFF, then run the tfactl set defaultocimonitoring=on or tfactl set defaultocimonitoring=ON command to turn it on:
    tfactl set defaultocimonitoring=on
    Successfully set defaultOciMonitoring=ON
    .---------------------------------------------------------------------.
    |                             <host name>                             |
    +-------------------------------------------------------------+-------+
    | Configuration Parameter                                     | Value |
    +-------------------------------------------------------------+-------+
    | Send CEF metrics to OCI Monitoring ( defaultOciMonitoring ) | ON    |
    '-------------------------------------------------------------+-------'
  3. The following network configurations are required.
    1. Egress rules for outgoing traffic: The default egress rules are sufficient to enable the required network path. For more information, see Default Security List. If you have blocked outgoing traffic by modifying the default egress rules on your Virtual Cloud Network (VCN), you will need to revert the settings to allow outgoing traffic. The default egress rule allowing outgoing traffic (as shown in Rules Required for both Client and Backup Networks) is as follows:
      • Stateless: No (all rules must be stateful)
      • Destination Type: CIDR
      • Destination CIDR: All <region> Services in Oracle Services Network
      • IP Protocol: TCP, destination port 443 (HTTPS)
    2. Public IP or Service Gateway: The compute instance must have either a public IP address or a service gateway to be able to send compute instance metrics to the Monitoring service.

      If the instance does not have a public IP address, set up a service gateway on the virtual cloud network (VCN). The service gateway lets the instance send compute instance metrics to the Monitoring service without the traffic going over the internet. Here are special notes for setting up the service gateway to access the Monitoring service:

      1. When creating the service gateway, enable the service label called All <region> Services in Oracle Services Network. It includes the Monitoring service.
      2. When setting up routing for the subnet that contains the instance, set up a route rule with Target Type set to Service Gateway, and the Destination Service set to All <region> Services in Oracle Services Network.

        For detailed instructions, see Access to Oracle Services: Service Gateway.
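
The flag check in prerequisite 2 can also be scripted. The following is a minimal Python sketch, assuming the output of tfactl get defaultocimonitoring has already been captured as text in the table layout shown above. The function name oci_monitoring_enabled is hypothetical and is not part of TFA.

```python
def oci_monitoring_enabled(tfactl_output: str) -> bool:
    """Return True if the defaultOciMonitoring row of the table reads ON.

    Assumes the two-column table layout printed by
    `tfactl get defaultocimonitoring` (parameter | value).
    """
    for line in tfactl_output.splitlines():
        # Split a row such as "| <description> ( defaultOciMonitoring ) | ON |"
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and "defaultocimonitoring" in cells[0].lower():
            return cells[1].upper() == "ON"
    raise ValueError("defaultOciMonitoring row not found in output")


# Sample text copied from the table layout above (host name elided)
sample = """\
| Configuration Parameter                                     | Value |
+-------------------------------------------------------------+-------+
| Send CEF metrics to OCI Monitoring ( defaultOciMonitoring ) | ON    |
"""

if __name__ == "__main__":
    print(oci_monitoring_enabled(sample))
```

If the function returns False, the tfactl set defaultocimonitoring=on command described in prerequisite 2 applies.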

View Metrics for VM Cluster

Perform the following steps to view the metrics for Guest VMs using the console.

Note

When there is a network problem and Oracle Trace File Analyzer (TFA) is unable to post metrics, TFA will wait for one hour before attempting to retry posting the metrics. This is required to avoid creating a backlog of metrics processing on TFA.

Potentially one hour of metrics will be lost between network restore and the first metric posted.

  1. Open the navigation menu. Click Oracle Database, then click Oracle Exadata Database Service on Dedicated Infrastructure.
  2. Choose your Compartment. A list of VM clusters is displayed.
  3. In the list of VM clusters, click the VM cluster for which you want to view the metrics. Details of the VM cluster you selected are displayed.
  4. In the Resources section, click Metrics.

    A chart for each metric is displayed. By default, the metrics for the last one hour are displayed.

    You can only select the oci_database_cluster namespace from the Metric namespace drop-down.

  5. If you want to change the interval, select the required start time and end time. Alternatively, you can select the interval from the Quick Selects drop down menu. The metrics are refreshed immediately for the selected interval.
  6. For each metric, you can choose the interval and statistic independently.
    • Interval - The time period for which the metric is calculated.
    • Statistic - The mathematical method by which the metric is calculated.
  7. For each metric, you can choose the following options from the 'Options' drop down menu.
    • View Query in Metrics Explorer
    • Copy Chart URL
    • Copy Query (MQL)
    • Create an Alarm on this Query
    • Table View

For detailed information on various options for viewing the metrics chart, see Viewing Default Metric Charts.
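
The Interval and Statistic choices in step 6 are the same two parts that appear in a Monitoring Query Language (MQL) expression. As a rough Python sketch, assuming the basic MQL form metric[interval].statistic(), a query string could be composed as follows. The helper build_mql is hypothetical, CpuUtilization is used only as an illustrative metric name, and the statistic list is a subset chosen for the example.

```python
import re

# A subset of MQL statistics, kept small for illustration
ALLOWED_STATISTICS = {"mean", "max", "min", "sum", "count", "rate"}


def build_mql(metric: str, interval: str = "1m", statistic: str = "mean") -> str:
    """Compose a basic MQL expression: metric[interval].statistic()."""
    # Intervals are written as a number plus a unit: minutes, hours, or days
    if not re.fullmatch(r"\d+[mhd]", interval):
        raise ValueError(f"unexpected interval: {interval!r}")
    if statistic not in ALLOWED_STATISTICS:
        raise ValueError(f"unexpected statistic: {statistic!r}")
    return f"{metric}[{interval}].{statistic}()"


if __name__ == "__main__":
    # Illustrative metric name; check the metrics listed for your namespace
    print(build_mql("CpuUtilization", "5m", "max"))  # prints CpuUtilization[5m].max()
```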

View Metrics for a Database

Perform the following steps to view the metrics for a database using the console.

Note

When there is a network problem and Oracle Trace File Analyzer (TFA) is unable to post metrics, TFA will wait for one hour before attempting to retry posting the metrics. This is required to avoid creating a backlog of metrics processing on TFA.

Potentially one hour of metrics will be lost between network restore and the first metric posted.

  1. Open the navigation menu. Click Oracle Database, then click Exadata on Oracle Public Cloud.
  2. Choose your Compartment. A list of VM clusters is displayed.
  3. In the list of VM clusters, click the VM cluster that contains the database for which you want to view the metrics. Details of the VM cluster you selected are displayed.
  4. In the list of databases, click the database for which you want to view the metrics.
  5. In the Resources section, click Metrics.

    A chart for each metric is displayed. By default, the metrics for the last one hour are displayed.

  6. From the Metric namespace drop-down, select the namespace whose metrics you want to view.
    Note

    • When Database Management is enabled, you can choose either the oci_database or the oracle_oci_database namespace.
    • When Database Management is disabled, you can view metrics only from the oci_database namespace.
  7. If you want to change the interval, select the required start time and end time. Alternatively, you can select the interval from the Quick Selects drop down menu. The metrics are refreshed immediately for the selected interval.
  8. For each metric, you can choose the interval and statistic independently.
    • Interval - The time period for which the metric is calculated.
    • Statistic - The mathematical method by which the metric is calculated.
  9. For each metric, you can choose the following options from the 'Options' drop down menu.
    • View Query in Metrics Explorer
    • Copy Chart URL
    • Copy Query (MQL)
    • Create an Alarm on this Query
    • Table View

For detailed information on various options for viewing the metrics chart, see Viewing Default Metric Charts.

View Metrics for a PDB

  1. Open the navigation menu. Click Oracle Database, then click Exadata on Oracle Public Cloud.
  2. Choose your Compartment. A list of VM clusters is displayed.
  3. In the list of VM clusters, click the VM cluster that contains the database for which you want to view the metrics. Details of the VM cluster you selected are displayed.
  4. In the list of databases, click the database that contains the PDB for which you want to view the metrics.
  5. Under Resources, click Pluggable Databases.
  6. In the list of pluggable databases, click the PDB for which you want to view the metrics.
  7. From the Metric namespace drop-down, select the namespace whose metrics you want to view.
    Note

    • When Database Management is enabled, you can select the oracle_oci_database namespace.
    • When Database Management is disabled, the system displays a banner asking you to enable Database Management to provide metrics.
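
The namespace rules stated for VM clusters, databases, and PDBs can be summarized in a small lookup. This Python sketch only restates the behavior described in these sections. The function name and the resource labels ("vm_cluster", "database", "pdb") are hypothetical and chosen for illustration.

```python
from typing import List


def available_namespaces(resource: str, db_management_enabled: bool) -> List[str]:
    """Return the metric namespaces selectable in the Console for a resource."""
    if resource == "vm_cluster":
        # VM cluster metrics are published in a single namespace
        return ["oci_database_cluster"]
    if resource == "database":
        if db_management_enabled:
            return ["oci_database", "oracle_oci_database"]
        return ["oci_database"]
    if resource == "pdb":
        # Without Database Management, the Console shows a banner instead of metrics
        return ["oracle_oci_database"] if db_management_enabled else []
    raise ValueError(f"unknown resource type: {resource!r}")


if __name__ == "__main__":
    print(available_namespaces("pdb", db_management_enabled=False))  # prints []
```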

View Metrics for VM Clusters in a Compartment

Perform the following steps to view the metrics for VM clusters in a compartment using the console.

Note

When there is a network problem and Oracle Trace File Analyzer (TFA) is unable to post metrics, TFA will wait for one hour before attempting to retry posting the metrics. This is required to avoid creating a backlog of metrics processing on TFA.

Potentially one hour of metrics will be lost between network restore and the first metric posted.

  1. Open the Oracle Cloud Infrastructure Console by clicking the menu icon next to Oracle Cloud.
  2. From the left navigation list click Observability & Management.
  3. Under Monitoring, click Service Metrics.
  4. On the Service Metrics page, under Compartment select your compartment.
  5. On the Service Metrics page, under Metric Namespace select oci_database_cluster.
  6. If there are multiple VM clusters in the compartment you can show metrics aggregated across the clusters by selecting Aggregate Metric Streams.
  7. If you want to limit the metrics you see, next to Dimensions click Add (click Edit if you have already added dimensions).
  8. In the Dimension Name field select a dimension.
  9. In the Dimension Value field select a value.
  10. Click Done.
  11. In the Edit dimensions dialog click +Additional Dimension to add an additional dimension. Click X to remove a dimension.
  12. To create an alarm on a specific metric, click Options and select Create an Alarm on this Query. See Managing Alarms for information on setting and using alarms.
Note

If you don't see any metrics, check the network settings and AHF version listed in the prerequisites section.

View Metrics for Databases in a Compartment

Perform the following steps to view the metrics for databases in a compartment using the console.

Note

When there is a network problem and Oracle Trace File Analyzer (TFA) is unable to post metrics, TFA will wait for one hour before attempting to retry posting the metrics. This is required to avoid creating a backlog of metrics processing on TFA.

Potentially one hour of metrics will be lost between network restore and the first metric posted.

  1. Open the Oracle Cloud Infrastructure Console by clicking the menu icon next to Oracle Cloud.
  2. From the left navigation list click Observability & Management.
  3. Under Monitoring, click Service Metrics.
  4. On the Service Metrics page, under Compartment select your compartment.
  5. On the Service Metrics page, under Metric Namespace select oci_database.
  6. If there are multiple databases in the compartment you can show metrics aggregated across the databases by selecting Aggregate Metric Streams.
  7. If you want to limit the metrics you see, next to Dimensions click Add (click Edit if you have already added dimensions).
  8. In the Dimension Name field select a dimension.
  9. In the Dimension Value field select a value.
  10. Click Done.
  11. In the Edit dimensions dialog click +Additional Dimension to add an additional dimension. Click X to remove a dimension.
  12. To create an alarm on a specific metric, click Options and select Create an Alarm on this Query. See Managing Alarms for information on setting and using alarms.
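
The dimension filter and the Aggregate Metric Streams option in the steps above have counterparts in MQL: a dimension selector in braces and the grouping() function. The Python sketch below assembles such an expression under that assumption. The helper build_query is hypothetical, and the metric name CpuUtilization and the dimension resourceName are illustrative placeholders. Use the metric and dimension names that the Service Metrics page lists for your namespace.

```python
from typing import Dict, Optional


def build_query(
    metric: str,
    interval: str = "1m",
    statistic: str = "mean",
    dimensions: Optional[Dict[str, str]] = None,
    aggregate: bool = False,
) -> str:
    """Compose metric[interval]{dimensions}.grouping().statistic()."""
    selector = ""
    if dimensions:
        # Sort for a stable, reproducible query string
        pairs = ", ".join(
            f'{name} = "{value}"' for name, value in sorted(dimensions.items())
        )
        selector = "{" + pairs + "}"
    # grouping() combines all matching metric streams into one
    grouping = ".grouping()" if aggregate else ""
    return f"{metric}[{interval}]{selector}{grouping}.{statistic}()"


if __name__ == "__main__":
    # Placeholder dimension name and value, for illustration only
    print(build_query("CpuUtilization", dimensions={"resourceName": "db1"}, aggregate=True))
```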

Manage Oracle Trace File Analyzer

The deployment of the cloud-certified Autonomous Health Framework (AHF), which includes Oracle Trace File Analyzer, is managed by Oracle. You shouldn’t install this manually on the guest VMs.

  • To check the run status of Oracle Trace File Analyzer, run the tfactl status command as root or a non-root user:
    # tfactl status
    .-------------------------------------------------------------------------------------------------------.
    | Host           | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
    +----------------+---------------+--------+------+------------+----------------------+------------------+
    | node1          | RUNNING       |  41312 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE         |
    | node2          | RUNNING       | 272300 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE         |
    '----------------+---------------+--------+------+------------+----------------------+------------------'
  • To start the Oracle Trace File Analyzer daemon on the local node, run the tfactl start command as root:
    # tfactl start
    Starting TFA..
    Waiting up to 100 seconds for TFA to be started..
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    . . . . .
    Successfully started TFA Process..
    . . . . .
    TFA Started and listening for commands
  • To stop the Oracle Trace File Analyzer daemon on the local node, run the tfactl stop command as root:
    # tfactl stop
    Stopping TFA from the Command Line
    Nothing to do !
    Please wait while TFA stops
    Please wait while TFA stops
    TFA-00002 Oracle Trace File Analyzer (TFA) is not running
    TFA Stopped Successfully
    Successfully stopped TFA..
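
Because metrics require AHF 22.2.4 or higher, it can be useful to compare the Version column of tfactl status against that minimum. The following Python sketch does this on captured output, assuming the seven-column table layout shown above and assuming that the Version column reflects the installed AHF release. The function names are hypothetical.

```python
from typing import Dict, Tuple

MIN_AHF_VERSION = (22, 2, 4)  # minimum stated in the prerequisites


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '22.1.0.0.0' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def nodes_below_minimum(
    status_output: str, minimum: Tuple[int, ...] = MIN_AHF_VERSION
) -> Dict[str, str]:
    """Return {host: version} for nodes whose version is below the minimum."""
    outdated = {}
    for line in status_output.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Data rows have seven cells and a numeric PID; this skips the header row
        if len(cells) == 7 and cells[2].isdigit():
            host, version = cells[0], cells[4]
            if parse_version(version) < minimum:
                outdated[host] = version
    return outdated


# Rows in the layout of the status example above
sample_status = """\
| Host  | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
| node1 | RUNNING       |  41312 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE         |
| node2 | RUNNING       | 272300 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE         |
"""

if __name__ == "__main__":
    print(nodes_below_minimum(sample_status))
```

With the sample rows, both nodes are reported, because 22.1.0.0.0 is lower than 22.2.4.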

Manage Database Service Agent

View the /opt/oracle/dcs/log/dcs-agent.log file to identify issues with the agent.

  • To check the status of the Database Service Agent, run the systemctl status command:
    # systemctl status dbcsagent.service
    dbcsagent.service
    Loaded: loaded (/usr/lib/systemd/system/dbcsagent.service; enabled; vendor preset: disabled)
    Active: active (running) since Fri 2022-04-01 13:40:19 UTC; 6min ago
    Process: 9603 ExecStopPost=/bin/bash -c kill `ps -fu opc |grep "java.*dbcs-agent.*jar"|awk '{print $2}'` (code=exited, status=0/SUCCESS)
    Main PID: 10055 (sudo)
    CGroup: /system.slice/dbcsagent.service
    ‣ 10055 sudo -u opc /bin/bash -c umask 077; /bin/java
  • To start the agent if it is not running, run the systemctl start command as the root user:
    systemctl start dbcsagent.service
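
The status check above can be automated by reading the Active: line of the systemctl status output. The Python sketch below works on captured output text and assumes the standard systemd layout shown in the example. The function name agent_is_running is hypothetical.

```python
import re


def agent_is_running(status_output: str) -> bool:
    """Return True if the Active: line reports 'active (running)'."""
    # Matches e.g. "Active: active (running) since ..."
    match = re.search(
        r"^\s*Active:\s*(\S+)\s*\((\w+)\)", status_output, re.MULTILINE
    )
    if match is None:
        return False
    return match.group(1) == "active" and match.group(2) == "running"


# Lines in the layout of the status example above
sample_status = """\
dbcsagent.service
Loaded: loaded (/usr/lib/systemd/system/dbcsagent.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2022-04-01 13:40:19 UTC; 6min ago
"""

if __name__ == "__main__":
    print(agent_is_running(sample_status))  # prints True
```

When only a yes-or-no answer is needed, systemctl is-active dbcsagent.service reports the same state without the surrounding detail.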