Compute Instance Health Metrics

You can monitor the health, capacity, and performance of your compute virtual machine (VM) instances by using metrics, alarms, and notifications.

This topic describes the metrics emitted by the metric namespace oci_compute_instance_health.

Resources: Compute VM instances.

Overview of Metrics: oci_compute_instance_health

The following compute instance health metrics help you monitor the status, health, and accessibility of compute instances.

Instance accessibility status: The instance_accessibility_status metric lets you monitor whether a VM instance is unresponsive. Compute sends an Address Resolution Protocol (ARP) request to the instance's virtual network interface card (VNIC). If the ARP ping fails, the metric shows that the instance is unresponsive.

Note

The instance_accessibility_status metric doesn't determine or report the specific reason for the instance's unresponsiveness. The ARP test provides no insight into the possible issues with the instance's OS.

Instance file system status: The instance_file_system_status metric lets you monitor whether a VM instance has a file system anomaly. Compute analyzes VM kernel logs to determine volume status. If a volume is in an anomalous state, the metric reports the type of issue and the type of volume affected.

Note

The instance_file_system_status metric doesn't determine or report the specific reason for the instance's file system issue, and it doesn't report other issues with the instance's OS or volumes.

Using MQL to view instance_file_system_status

// This query does not specify the volume type, so it can be used for general monitoring of read-only volume issues. To get the volume type, inspect the "volumeType" dimension of the metric.
InstanceFileSystemStatus[5m]{resourceId = "YOUR-VM-OCID-IN-TENANCY"}.max()
// The following queries specify the volume type, so they can be used for targeted monitoring of one kind of volume.
InstanceFileSystemStatus[5m]{resourceId = "YOUR-VM-OCID-IN-TENANCY", volumeType = "BOOT_VOLUME"}.max()
InstanceFileSystemStatus[5m]{resourceId = "YOUR-VM-OCID-IN-TENANCY", volumeType = "DATA_VOLUME"}.max()
InstanceFileSystemStatus[5m]{resourceId = "YOUR-VM-OCID-IN-TENANCY", volumeType = "UNKNOWN"}.max()
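The queries above all share one shape, so they can be generated instead of edited by hand. The following Python sketch assembles the query text only; it does not call any Oracle Cloud Infrastructure service. The helper name build_fs_status_query and its parameters are hypothetical, and the helper quotes every dimension value, which is an assumption about what MQL expects.

```python
from typing import Optional

# Volume types documented for the volumeType dimension.
VOLUME_TYPES = ("BOOT_VOLUME", "DATA_VOLUME", "UNKNOWN")


def build_fs_status_query(
    resource_id: str,
    volume_type: Optional[str] = None,
    metric: str = "InstanceFileSystemStatus",
    interval: str = "5m",
) -> str:
    """Return MQL text for the file system status of one instance.

    Hypothetical helper: it only formats a string in the same shape as
    the sample queries. Omit volume_type to monitor all volume types.
    """
    if volume_type is not None and volume_type not in VOLUME_TYPES:
        raise ValueError("volume_type must be one of " + ", ".join(VOLUME_TYPES))

    # Each dimension filter is written as: name = "value"
    dimensions = ['resourceId = "' + resource_id + '"']
    if volume_type is not None:
        dimensions.append('volumeType = "' + volume_type + '"')

    return metric + "[" + interval + "]{" + ", ".join(dimensions) + "}.max()"


if __name__ == "__main__":
    print(build_fs_status_query("YOUR-VM-OCID-IN-TENANCY"))
    print(build_fs_status_query("YOUR-VM-OCID-IN-TENANCY", "BOOT_VOLUME"))
```

Validating the volume type up front keeps a typo from producing a query that silently matches no metric streams.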

Troubleshooting an unresponsive VM instance

Required IAM Policy

To monitor resources, you must be granted the required type of access in a policy written by an administrator, whether you're using the Console or the REST API with an SDK, CLI, or other tool. The policy must give you access to the monitoring services as well as the resources being monitored. If you try to perform an action and get a message that you don't have permission or are unauthorized, contact the administrator to find out what type of access you were granted and which compartment you need to work in. For more information about user authorizations for monitoring, see IAM Policies.

Available Metrics: oci_compute_instance_health

The metrics listed in the following table are automatically available for your instances. You do not need to enable monitoring on the instance to get these metrics.

You can also use the Monitoring service to create custom queries.

The metrics include the following dimensions:

resourceDisplayName: The friendly name of the instance.
resourceId: The OCID of the instance.
volumeType: The type of volume that has an issue. Possible values are BOOT_VOLUME, DATA_VOLUME, and UNKNOWN. A value of UNKNOWN means that the type of volume with the issue could not be determined.
issueType: The type of file system issue. The value is READ_ONLY when the instance volume is in read-only mode.


Metric	Metric Display Name	Unit	Description	Dimensions
`instance_accessibility_status`	Instance accessibility status	Count	The accessibility status of a VM instance. A value of 1 indicates that the instance is unresponsive due to an issue with the infrastructure or the instance itself. A value of 0 indicates that an accessibility issue has not been detected. If the instance is stopped, then the metric does not have a value.	`resourceDisplayName` `resourceId`
`instance_file_system_status`	Instance file system status	Count	The file system status of a VM instance. A value of 1 indicates that the instance has a file system issue due to the infrastructure or the instance itself. A value of 0 indicates that a file system issue has not been detected. If the instance is stopped, then the metric does not have a value.	`resourceDisplayName` `resourceId` `volumeType` `issueType`
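Because both metrics use the same 1, 0, or no-value convention, a small lookup can turn a datapoint into a readable status. The following Python sketch is illustrative only: describe_health_value is a hypothetical helper, and it assumes you have already retrieved the datapoint value (or None when no value was emitted) by some other means.

```python
from typing import Optional

# Meanings taken from the metric descriptions in the table above.
_MEANINGS = {
    "instance_accessibility_status": {
        0: "no accessibility issue detected",
        1: "instance is unresponsive",
    },
    "instance_file_system_status": {
        0: "no file system issue detected",
        1: "file system issue detected",
    },
}


def describe_health_value(metric: str, value: Optional[float]) -> str:
    """Translate one datapoint of a health metric into a short status.

    Hypothetical helper. Pass None when the metric has no value, which
    the table says happens when the instance is stopped.
    """
    if metric not in _MEANINGS:
        raise ValueError("unknown metric: " + metric)
    if value is None:
        return "no value (the instance may be stopped)"
    # Anything other than 0 or 1 is outside the documented range.
    return _MEANINGS[metric].get(value, "unexpected value: " + str(value))


if __name__ == "__main__":
    print(describe_health_value("instance_accessibility_status", 1))
    print(describe_health_value("instance_file_system_status", None))
```

Neither status says why the issue occurred; as noted earlier, the metrics do not report a root cause.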

Using the Console

To view compute health metrics for a single instance

To view compute health metrics for all instances in a compartment

Using the API

For information about using the API and signing requests, see REST API documentation and Security Credentials. For information about SDKs, see SDKs and the CLI.

Use the following APIs for monitoring:

Monitoring API for metrics and alarms
Notifications API for notifications (used with alarms)
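As a rough illustration of what a metrics query through the Monitoring API involves, the following Python sketch builds a request body for a one-hour window. It only constructs a dictionary and sends nothing. The function build_summarize_request is hypothetical, and the field names (namespace, query, startTime, endTime) are an assumption based on the Monitoring API's SummarizeMetricsData operation, so verify them against the API reference before use. Request signing and the compartment parameter are out of scope here.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

NAMESPACE = "oci_compute_instance_health"
_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamp in UTC


def build_summarize_request(
    query: str,
    end: Optional[datetime] = None,
    window_minutes: int = 60,
) -> Dict[str, str]:
    """Build a request body for querying health metrics.

    Hypothetical helper; the field names are assumed, not confirmed by
    this topic. `end` must be timezone-aware and defaults to now.
    """
    if end is None:
        end = datetime.now(timezone.utc)
    end = end.astimezone(timezone.utc)
    start = end - timedelta(minutes=window_minutes)
    return {
        "namespace": NAMESPACE,
        "query": query,
        "startTime": start.strftime(_TIME_FORMAT),
        "endTime": end.strftime(_TIME_FORMAT),
    }


if __name__ == "__main__":
    body = build_summarize_request(
        'InstanceFileSystemStatus[5m]{resourceId = "YOUR-VM-OCID-IN-TENANCY"}.max()'
    )
    print(json.dumps(body, indent=2))
```

In practice an SDK or the CLI handles request signing for you, so a body like this is mainly useful for understanding what those tools send.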
