Monitor a HeatWave Cluster
You can use Database Management to monitor a HeatWave cluster.
The information about a HeatWave cluster attached to a Database Management-enabled MySQL DB system is displayed on the DB system's MySQL database details page. To go to the MySQL database details page, click the name of the DB system on the MySQL HeatWave fleet summary page.
On the MySQL database details page, you can perform the following tasks to monitor HeatWave clusters:
- Click the HeatWave cluster information tab to view details such as the compartment and OCID of the DB system, number of nodes in the HeatWave cluster, and whether Lakehouse is enabled. In addition, you can:
- Click See details adjacent to Nodes to view information about the HeatWave nodes in the HeatWave cluster.
- Monitor the total number of open alarms and the number of alarms by severity for the DB system and the attached HeatWave cluster. Note that alarms are displayed in Database Management only if the OCID of the DB system is specified using the resourceId dimension when creating the alarm. You can click the number of alarms to open the Alarms panel and review the list of open alarms. For more information, see Monitor Alarms for MySQL HeatWave DB Systems.
- Click the HeatWave cluster tab in the Summary section to:
- Monitor the Health status timeline, which displays an overview of the health status of the HeatWave cluster during the selected time period. The health status indicates whether the HeatWave cluster is healthy, reloading data, recovering, or experiencing a node failure. The color of each block denotes the status, and the number of blocks denotes the time slots within the selected time period over which the status is checked. For example, if the default time period, Last 60 min, is selected, then each block represents a period of two minutes. The colors of the blocks in the Health status timeline denote the following:
- Green: HeatWave cluster is healthy.
- Blue: HeatWave cluster is reloading data.
- Amber: HeatWave cluster is recovering.
- Red: HeatWave cluster has failed. This status is displayed if even one of the nodes in the HeatWave cluster is in a failed state.
- Grey: Health status metric data is missing for the HeatWave cluster.
For more information on HeatWave cluster health status, see HeatWave Cluster Failure and Recovery.
- Monitor key HeatWave cluster metrics such as Memory (%) and CPU (%). The visual representation of the HeatWave cluster metrics helps you gain quick insight into the health, capacity, and performance of the HeatWave cluster.
The Memory (%) and CPU (%) metric charts display metrics at the HeatWave node level. You can select the Aggregate node metrics check box to view the aggregate metrics for all the nodes in the HeatWave cluster, or select a node in the Nodes drop-down list to view the metrics for that particular HeatWave node.
On the metric charts, you can:
- Hover the mouse to view additional details such as the name of the node and metric, date and time, and value.
- Hover the mouse on the tooltip adjacent to the name of the metric chart to view a brief description of the metric displayed in the chart.
- Filter the data by clicking the names of the nodes in the legend above the chart. This is applicable to the Memory (%) and CPU (%) metric charts, which display metrics at the HeatWave node level.
- Select an option in the drop-down list in the upper-right corner to view a different metric in the chart. This drop-down list is available for the Memory (%) chart and you can select Usage to view the Memory (GB) (allocated and used) metrics.
- Click Metrics on the left pane under Resources and then click the HeatWave cluster tab to monitor a wide range of HeatWave cluster metrics and investigate and analyze data using different indicators. The charts in this section include those displayed in the Summary section and other metric charts such as Network throughput. In addition to the charts displayed in this section by default, you also have the option of customizing and displaying the metrics that you want to monitor by selecting them from the Select charts drop-down list adjacent to the Time period field.
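
The Health status timeline described earlier can be read as a simple mapping from block color to status, plus a division of the selected time period into equal slots. The following Python sketch is illustrative only and is not part of Database Management: the names HEALTH_STATUS_BY_COLOR, ASSUMED_BLOCK_COUNT, block_duration, and describe_block are hypothetical, and the fixed count of 30 blocks is an assumption inferred from the Last 60 min example (two minutes per block), so it may not hold for other time periods.

```python
from datetime import timedelta

# Block colors and their meanings, as listed for the Health status timeline.
HEALTH_STATUS_BY_COLOR = {
    "green": "healthy",
    "blue": "reloading data",
    "amber": "recovering",
    "red": "failed (at least one node is in a failed state)",
    "grey": "health status metric data is missing",
}

# Assumption: the timeline is drawn with 30 blocks. This is inferred only from
# the "Last 60 min" example, where each block covers two minutes.
ASSUMED_BLOCK_COUNT = 30


def block_duration(period: timedelta,
                   block_count: int = ASSUMED_BLOCK_COUNT) -> timedelta:
    """Return the time slot covered by one block for the selected period."""
    if block_count <= 0:
        raise ValueError("block_count must be positive")
    return period / block_count


def describe_block(color: str) -> str:
    """Translate a block color into the health status it denotes."""
    return HEALTH_STATUS_BY_COLOR.get(color.strip().lower(), "unknown color")


if __name__ == "__main__":
    print(block_duration(timedelta(minutes=60)))
    print(describe_block("Amber"))
```

Under that assumption, a 60-minute period yields two-minute blocks, which matches the example in the timeline description. For any other time period, check the block size in the console instead of relying on this calculation.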