Health Monitor
Health Monitor collects statistics for system CPU usage, disk usage,
memory usage, and the MySQL server process (mysqld
). The statistics
are logged in history tables, for which Health Monitor takes samples once a minute,
and retains a day's worth of data. You can also access these statistics in tables
that are populated with current readings on demand when you query them.
Overview of Health Monitor
Health Monitor samples disk space information, system memory information,
and process memory information every sixty seconds, and stores a day's worth of data in
tables in the Performance Schema of the MySQL Server. If the available disk space or
available memory falls below certain preset thresholds, Health Monitor issues warnings and
logs it to the performance_schema.error_log
table.
Health Monitor uses MiB/GiB/TiB storage units where 100 GB = 93 GiB.
The total storage available to a DB system contains the following:
- Available storage: The disk space after the storage reserve is excluded from the total storage.
- Storage reserve: The disk space that is lesser of the
disk_low_space_level
anddisk_low_space_percent
configuration variables. The default value ofdisk_low_space_level
is 5,000 MiB, and the default value ofdisk_low_space_percent
is 4% of the total storage. - Critical storage: The disk space in the storage reserve below the
critical threshold, which is defined by the
disk_low_space_critical_level
configuration variable. This is the minimum viable disk space for the system to avoid disk exhaustion. The default critical storage space is 2,000 MiB.
Disk Space Warnings
When the disk space falls below certain percentage of the available storage,
Health Monitor issues the following warnings and logs it to the
performance_schema.error_log
table:
WARNING_DISK_USAGE_LEVEL_1
: When the disk space falls below 20% of the available storage.WARNING_DISK_USAGE_LEVEL_2
: When the disk space falls below 10% of the available storage.WARNING_DISK_USAGE_LEVEL_3
: When the disk space falls below 5% of the available storage.
For example, if the total storage is 100,000 MiB, and the storage reserve is lesser of the following two configuration variables:
disk_low_space_level
: The default value is 5,000 MiB.disk_low_space_percent
: It is 4% of the total storage, that is, 4% of 100,000 MiB = 4,000 MiB.
The storage reserve is 4,000 MiB, which is lesser of the two configuration variables.
You can find the available storage by excluding the storage reserve from the total storage:
100,000 - 4,000 = 96,000 MiB
Health Monitor issues the following warnings:
WARNING_DISK_USAGE_LEVEL_1
: When the disk space falls below 20% of the available storage, which is 20/100*96,000 = 19,200 MiB.WARNING_DISK_USAGE_LEVEL_2
: When the disk space falls below 10% of the available storage, which is 10/100*96,000 = 9,600 MiB.WARNING_DISK_USAGE_LEVEL_3
: When the disk space falls below 5% of the available storage, which is 5/100*96,000 = 4,800 MiB.
Significant or Sustained Shortage of Disk Space
Health Monitor defines significant or sustained shortage of disk space when the following conditions are met:
- Significant shortage: The disk space drops below the critical
threshold defined by
disk_low_space_critical_level
. - Sustained shortage: The disk space drops below the storage
reserve, and then remains in the storage reserve for the duration defined by the
disk_low_space_duration
configuration variable.
When there is a significant or sustained shortage of disk space, Health Monitor does the following:
SUPER_READ_ONLY=ON
: Sets the system variables,SUPER_READ_ONLY
, to ON in the MySQL Server. The MySQL Server rejects all new incoming SQL write statements (UPDATE
,INSERT
,DELETE
, andDDL
), regardless of users and privileges. Running transactions are allowed to complete, but new writes are prohibited until the disk space is available again. You cannot load data into the HeatWave cluster when the server is inSUPER_READ_ONLY
mode.In some cases, global read locks, ongoing commits, or metadata locks blocks the Health Monitor from setting the
SUPER_READ_ONLY
system variable. In such cases, if you set thehealth_monitor.disk_fallback_force
variable, the Health Monitor identifies and terminates active queries holding global locks to set theSUPER_READ_ONLY
system variable.OFFLINE_MODE=ON
: Sets the system variable,OFFLINE_MODE
, to ON in the MySQL Server. The MySQL Server disconnects client users who do not have theCONNECTION_ADMIN
privilege, terminates running statements and releases locks, and blocks new connections with an appropriate error.super_read_only_disk_full=ON
: Sets the status variable,super_read_only_disk_full
, to ON indicating that read-only mode was triggered due to low disk space. You can see the value ofsuper_read_only_disk_full
status variable with either of the following command:SHOW GLOBAL STATUS WHERE variable_name = 'super_read_only_disk_full';
SELECT * FROM performance_schema.global_status WHERE variable_name = 'super_read_only_disk_full';
System variables change behavior of the MySQL Server while status variables are static indicators of the MySQL Server state.
Recovery to Normal Operation
Once the SUPER_READ_ONLY
and
OFFLINE_MODE
are
set, Health Monitor does the following:
- Check the disk space every minute .
- Set
OFFLINE_MODE=OFF
if the available disk space recovers and remains abovedisk_recovery_level
fordisk_recovery_time_1
seconds or more. - Set
SUPER_READ_ONLY=OFF
if the available disk space recovers and remains abovedisk_recovery_level
for an additionaldisk_recovery_time_2
seconds or more.
Normal operations are resumed once the disk space remains above
disk_recovery_level
for disk_recovery_time_1
+
disk_recovery_time_2
seconds.
Once the disk space recovers, Health Monitor clears the system variables,
SUPER_READ_ONLY
, and OFFLINE_MODE
, and the status
variable, super_read_only_disk_full
. You do not have to restart the
MySQL instance to clear the variables.
If there is a significant or sustained shortage of disk space again, then the Health
Monitor again sets the system variables, SUPER_READ_ONLY
and
OFFLINE_MODE
, and the status variable,
super_read_only_disk_full
, to ON in the MySQL Server.
Before returning a primary instance of a high availability DB system to normal operating
mode, the Health Monitor verifies that the server is online with the group majority. The
Health Monitor never sets SUPER_READ_ONLY
or
OFFLINE_MODE
on the secondary instances.
Recovery Increment, Recovery Max Cycle, and Recovery Window
Each time the MySQL Server recovers from a significant or sustained shortage of disk
space, the Health Monitor increases the disk recovery time by
disk_recovery_increment
seconds.
For example, if disk_recovery_time_1
= 300 seconds, and
disk_recovery_time_2
= 600 seconds, and
disk_recovery_increment
= 400 seconds, then the disk recovery time
for each cycle is as following:
- Cycle 1:
disk_recovery_time_1
= 300 seconds,disk_recovery_time_2
= 600 seconds - Cycle 2:
disk_recovery_time_1
= (300 + 400) seconds,disk_recovery_time_2
= (600+400) seconds - Cycle 3:
disk_recovery_time_1
= (300+400+400) seconds,disk_recovery_time_2
= (600+400+400) seconds
The Health Monitor restores to the normal operating mode a maximum of
disk_recovery_max_cycles
times within a fixed window of
disk_recovery_window
seconds, from the last recovery cycle. For
example, if disk_recovery_max_cycles
is set to 5, and
disk_recovery_window
is set to 86,400 seconds (1 day), then the
Health Monitor can restore to the normal operating mode five times in a window of 86,400
seconds. If a significant or sustained shortage happens the sixth time in the same
window, the system variables, SUPER_READ_ONLY
and
OFFLINE_MODE
are set to ON.
The recovery count is reset after disk_recovery_window
seconds from the
last recovery.
Memory Warnings
When the available memory falls below certain thresholds, Health Monitor issues the following warnings:
WARNING_MEMORY_USAGE_LEVEL_1
: When the available memory falls below 1024 MiB.WARNING_MEMORY_USAGE_LEVEL_2
: When the available memory falls below 500 MiB.WARNING_MEMORY_USAGE_LEVEL_3
: When the available memory falls below 100 MiB.
Health Monitor logs the information regarding MySQL version, global memory usage, and various configuration settings in the error log. The information is logged as comma separated values for easy transfer into a spreadsheet for further analysis.
Related Topics
Viewing Health Monitor Tables
Health Monitor tables store monitoring data and statistics in MySQL Server's Performance Schema.
The health tables contain disk storage information, system memory information, and process memory information.
The statistics tables contain statistics for system CPU usage, disk usage,
memory usage, and the MySQL server process (mysqld
). Each statistics
table has two versions, an on-demand version, which is populated with current readings
when you query it, and a history version for which Health Monitor takes samples once a
minute, and retains a day's worth of data. Both versions of the table have the same
columns.
Using a Command-Line Client
Use a command-line client such as MySQL Client or MySQL Shell to view the Health Monitor tables.
- A command-line client such as MySQL Shell that is connected to the DB system. See Connecting to a DB System.
+-------------------------+---------------------+-------------+-----------------+-------------+-------------+
| DEVICE | TIMESTAMP | TOTAL_BYTES | AVAILABLE_BYTES | USE_PERCENT | MOUNT_POINT |
+-------------------------+---------------------+-------------+-----------------+-------------+-------------+
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:12:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:13:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:14:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:15:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:16:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:17:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:18:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:19:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:20:08 | 53656686592 | 51364134912 | 4.27 | /db |
| /dev/mapper/vg_db-lv_db | 2021-08-08 16:21:08 | 53656686592 | 51364134912 | 4.27 | /db |
Health Monitor Tables
Refer to the Health Monitor Tables to see the disk storage information, system memory information, and process memory information.
The complete list of Health Monitor tables is as following:
performance_schema.health_block_device
: Disk storage status information.performance_schema.health_system_memory
: System memory status information.performance_schema.health_process_memory
: Process memory status information.performance_schema.system_cpu_stats
: CPU usage statistics, populated on demand.performance_schema.system_cpu_stats_history
: Stored CPU usage statistics.performance_schema.system_disk_stats
: Disk usage statistics, populated on demand.performance_schema.system_disk_stats_history
: Stored disk usage statistics.performance_schema.system_memory_stats
: Memory usage statistics, populated on demand.performance_schema.system_memory_stats_history
: Stored memory usage statistics.performance_schema.system_process_stats
: Process and thread statistics, populated on demand.performance_schema.system_process_stats_history
: Stored process and thread statistics.performance_schema.error_log
: Warnings and errors. See Viewing the Error Log.
performance_schema.health_block_device Table
Health Monitor uses the
performance_schema.health_block_device
table in MySQL Server's
Performance Schema to store data on used and available disk space in a DB
system.
Table 17-1 health_block_device Performance Schema Table
Column | Description |
---|---|
DEVICE |
The device name for the disk. |
TIMESTAMP |
The time the data sample in this row was collected. |
TOTAL_BYTES |
The total amount of storage on the device, in bytes. |
AVAILABLE_BYTES |
The amount of available storage on the device, in bytes. |
USE_PERCENT |
The percentage of disk space in use. |
MOUNT_POINT |
The device's mount point. |
performance_schema.health_system_memory Table
Health Monitor uses the
performance_schema.health_system_memory
table in MySQL Server's
Performance Schema to store data on memory usage by the DB system
Table 17-2 health_system_memory Performance Schema Table
Column | Description |
---|---|
TIMESTAMP |
The time the data sample in this row was collected. |
TOTAL_MEMORY |
The total amount of memory for the system, in bytes. |
AVAILABLE |
The memory available for starting new applications without swapping, in bytes. |
USE_PERCENT |
The percentage of memory in use. |
MEMORY_FREE |
The amount of unused memory. |
MEMORY_FS_CACHE |
The filesystem page cache. |
SWAP_TOTAL |
The total amount of swap memory. |
SWAP_FREE |
The total amount of free swap memory. |
performance_schema.health_process_memory Table
Health Monitor uses the
performance_schema.health_process_memory
table in MySQL Server's
Performance Schema to store data on memory usage by the MySQL Server process (mysqld) in the
DB system.
Table 17-3 health_process_memory Performance Schema Table
Column | Description |
---|---|
TIMESTAMP |
The time the data sample in this row was collected. |
PROCESS_NAME |
The name of the MySQL Server process. |
PID |
The system identifier for the MySQL Server process. |
VM_RSS |
The resident set size of the MySQL server process, in bytes. |
VM_DATA |
The size of the MySQL Server process data segment, in bytes. |
VM_SWAP |
The size of the MySQL server process swap segment, in bytes. |
PAGE_FAULTS |
The number of page faults requiring disk I/O. |
performance_schema.system_cpu_stats Table
Health Monitor collects statistics for CPU usage by the DB system. The statistics are available in two tables in MySQL Server's Performance Schema, an on-demand table and a corresponding history table. Both tables have the same columns.
The performance_schema.system_cpu_stats
table in MySQL Server's
Performance Schema is populated with a data sample when you query it. The
performance_schema.system_cpu_stats_history
table stores data
samples regularly.
The statistics in these tables are sourced from the Linux system file
/proc/stat
.
Table 17-4 system_cpu_stats and system_cpu_stats_history Performance Schema Tables
Column | Description |
---|---|
TIMESTAMP |
The time the data sample in this row was collected. |
CPU |
The cpu row shows the aggregated CPU usage data, and each numbered row (for example, cpu0, cpu1) shows data for an individual process. All times are in milliseconds. |
USER_MS |
The time spent in user mode. |
NICE_MS |
The time spent in user mode with low priority. |
SYSTEM_MS |
The time spent in system mode. |
IDLE_MS |
The time spent in the idle task. |
IOWAIT_MS |
The time spent waiting for I/O to complete. |
IRQ_MS |
The time spent servicing hardware interrupts. |
SOFTIRQ_MS |
The time spent servicing software interrupts. |
STEAL_MS |
The time spent running other operating systems in a virtualized environment. |
GUEST_MS |
The time spent running a virtual CPU with normal priority. |
GUEST_NICE_MS |
The time spent running a virtual CPU with low priority. |
performance_schema.system_disk_stats Table
Health Monitor collects statistics for disk usage by the DB system. The statistics are available in two tables in MySQL Server's Performance Schema, an on-demand table and a corresponding history table. Both tables have the same columns.
The performance_schema.system_disk_stats
table in MySQL
Server's Performance Schema is populated with a data sample when you query it. The
performance_schema.system_disk_stats_history
table stores data
samples regularly.
The statistics in these tables are sourced from the Linux system file
/proc/diskstats
.
Table 17-5 system_disk_stats and system_disk_stats_history Performance Schema Tables
Column | Description |
---|---|
TIMESTAMP |
The time the data sample in this row was collected. |
DEVICE |
The device name for this disk. |
READS |
The number of reads completed successfully. |
READ_BYTES |
The number of bytes read. |
READ_TIME_MS |
The time spent reading, in milliseconds. |
WRITES |
The number of writes completed successfully. |
WRITE_BYTES |
The number of bytes written. |
WRITE_TIME_MS |
The time spent writing, in milliseconds. |
FLUSHES |
The number of flush requests completed successfully. |
FLUSH_TIME_MS |
The time spent flushing, in milliseconds. |
performance_schema.system_memory_stats Table
Health Monitor collects statistics for memory usage by the DB system. The statistics are available in two tables in MySQL Server's Performance Schema, an on-demand table and a corresponding history table. Both tables have the same columns.
The performance_schema.system_memory_stats
table in
MySQL Server's Performance Schema is populated with a data sample when you query it.
The performance_schema.system_memory_stats_history
table stores
data samples regularly.
The statistics in these tables are sourced from the Linux system file
/proc/meminfo
.
Table 17-6 system_memory_stats and system_memory_stats_history Performance Schema Tables
Column | Description |
---|---|
TIMESTAMP |
The time the data sample in this row was collected. |
TOTAL_BYTES |
The total usable system memory, in bytes. |
FREE_BYTES |
The unused memory. |
USED_BYTES |
The total used memory. |
AVAILABLE_BYTES |
The estimated memory available for starting new applications, in bytes. |
BUFFER_BYTES |
The amount of memory used for temporary storage for raw disk blocks. |
CACHED_BYTES |
The amount of memory used to cache files read from disk (the page cache). |
SLAB_BYTES |
The amount of memory used to cache in-kernel data structures. |
SWAP_TOTAL_BYTES |
The total amount of swap space available on disk. |
SWAP_FREE_BYTES |
The total amount of swap space that is currently unused. |
SWAP_USED_BYTES |
The total amount of memory that has been swapped out from RAM and is temporarily on disk. |
SWAP_IN |
The amount of memory swapped back into RAM from disk, in KB per second. |
SWAP_OUT |
The amount of memory swapped out to disk from RAM, in KB per second. |
performance_schema.system_process_stats Table
Health Monitor collects statistics for each thread used by the MySQL Server process in the DB system. The statistics are available in two tables in MySQL Server's Performance Schema, an on-demand table and a corresponding history table. Both tables have the same columns.
The performance_schema.system_process_stats
table in
MySQL Server's Performance Schema is populated with a data sample when you query it.
The performance_schema.system_process_stats_history
table stores
data samples regularly.
For process statistics, the on-demand table returns one row for each thread used by
the mysqld
process. However, the history table stores a single row
for each sample, with the aggregated totals for all the threads used by the
mysqld
process.
The statistics in these tables are sourced from the Linux system file
/proc/self/task/[pid]/stat
.
Table 17-7 system_process_stats and system_process_stats_history Performance Schema Tables
Column | Description |
---|---|
TIMESTAMP |
The time the data sample in this row was collected. |
PID |
The process ID. |
PROCESS_NAME |
The process name, which is
mysqld .
|
TID |
The thread ID for the individual thread shown in this row. |
THREAD_NAME |
The instrumented thread name for the individual thread shown in this row. |
STATE |
A character showing the thread state at the time of the
sample: running (R ), sleeping in an interruptible wait
(S ), waiting in an uninterruptible disk sleep
(D ), zombie thread (Z ), stopped on
a signal or trace stopped (T ), paging
(W ), dead (X ), wakekill
(K ), waking (W ), or parked
(P ).
|
UTIME_MS |
The time this thread spent in user mode, in milliseconds. |
STIME_MS |
The time this thread spent in kernel mode, in milliseconds. |
CUTIME_MS |
The time that child processes of this process spent in user mode, in milliseconds. |
CSTIME_MS |
The time that child processes of this process spent in kernel mode, in milliseconds. |
NUM_THREADS |
The nunmber of threads in this process. |
VSIZE_BYTES |
The virtual memory size in bytes. |
RSS_BYTES |
The number of bytes the process has in real memory (the resident set size). |
RSSLIM_BYTES |
The limit in bytes on the resident set size of the process. |
PROCESSOR |
The CPU number on which the thread last executed. |
DELAYACCT_BLKIO_MS |
The aggregated block I/O delays, in milliseconds. |
GUEST_TIME_MS |
The guest time for the process, in milliseconds. |
CGUEST_TIME_MS |
The guest time for the child processes of the process, in milliseconds. |
READ_BYTES |
The number of bytes that this process fetched from the storage layer. |
WRITE_BYTES |
The number of bytes that this process sent to the storage layer. |
Viewing Health Monitor Variables
The Health Monitor variables are configuration settings for the monitoring activity in the DB system. You cannot change the values of these variables.
Using a Command-Line Client
Use a command-line client such as MySQL Client or MySQL Shell to view the Health Monitor variables and their values.
- A running DB system.
- A command-line client such as MySQL Client or MySQL Shell that is connected to the DB system. See Connecting to a DB System.
+----------------------------------------------+-------+
| Variable_name | Value |
+----------------------------------------------+-------+
| health_monitor.disk_fallback_enable | ON |
| health_monitor.disk_fallback_force | OFF |
| health_monitor.disk_low_space_critical_level | 2000 |
| health_monitor.disk_low_space_duration | 300 |
| health_monitor.disk_low_space_level | 5000 |
| health_monitor.disk_low_space_percent | 4 |
| health_monitor.disk_monitored | /db |
| health_monitor.disk_recovery_enable | ON |
| health_monitor.disk_recovery_increment | 300 |
| health_monitor.disk_recovery_level | 5 |
| health_monitor.disk_recovery_max_cycles | 3 |
| health_monitor.disk_recovery_time_1 | 300 |
| health_monitor.disk_recovery_time_2 | 300 |
| health_monitor.disk_recovery_window | 86400 |
| health_monitor.disk_retention | 86400 |
| health_monitor.disk_running | ON |
| health_monitor.disk_sample_rate | 60 |
| health_monitor.disk_usage_warning_level_1 | 20 |
| health_monitor.disk_usage_warning_level_2 | 10 |
| health_monitor.disk_usage_warning_level_3 | 5 |
...
+----------------------------------------------+-------+
35 rows in set (0.01 sec)
Health Monitor Variables
All Health Monitor system variables are prefixed with
health_monitor
. You cannot alter the values of the variables from their
defaults in a DB system.
The
health_monitor.
prefix is omitted from the table below. For
example, disk_fallback_enable
is actually
health_monitor.disk_fallback_enable
.
Table 17-8 Health Monitor System Variables
System Variable | Default Value | Description |
---|---|---|
disk_fallback_enable |
ON | If there is a significant or sustained shortage of disk
space, Health Monitor sets the system variable,
SUPER_READ_ONLY and OFFLINE_MODE
to ON in the MySQL Server.
|
disk_fallback_force |
OFF | Terminate foreground queries that are holding global
metadata locks and retry setting
SUPER_READ_ONLY .
|
disk_low_space_critical_level |
2000 | The critical low disk space threshold in MiB. If the
available disk space falls below this value for any duration, Health
Monitor sets the SUPER_READ_ONLY and
OFFLINE_MODE modes on the MySQL Server.
|
disk_low_space_duration |
300 | The duration (in seconds) for which the available disk
space can remain below either disk_low_space_level or
disk_low_space_percent (whichever is lower) before
Health Monitor sets the SUPER_READ_ONLY and
OFFLINE_MODE modes on the MySQL Server.
|
disk_low_space_level |
5000 | The low space threshold for the storage reserve, in
MiB. If the available disk space drops below this percentage of total
available storage for longer than the value of
disk_low_space_duration , Health Monitor sets the
SUPER_READ_ONLY and OFFLINE_MODE
modes on the MySQL Server.
|
disk_low_space_percent |
4 | The low space threshold for the storage reserve,
expressed as a percentage. If the available disk space drops below this
percentage of total available storage for longer than the value of
disk_low_space_duration , Health Monitor sets the
SUPER_READ_ONLY and OFFLINE_MODE
modes on the MySQL Server.
|
disk_monitored |
/db | The mount point for disk statistics. |
disk_retention |
86400 |
How many seconds each data sample is retained for in the history tables for disk usage data. The minimum value is 1 second, the default is 86400 seconds (1 day), and the maximum value is 864000 seconds (10 days). If |
disk_recovery_enable |
ON | Set the variable to ON to enable recovery from a sustained or significant shortage and to OFF to disable recovery from a sustained or significant shortage. |
disk_recovery_level |
5% | The percentage of total disk space above the storage
reserve.
|
disk_recovery_time_1 |
300 seconds | The amount of time (in seconds) that the available
storage must remain above disk_recovery_level before
the Health Monitor sets OFFLINE_MODE = OFF .
|
disk_recovery_time_2 |
300 seconds | The additional time (in seconds) that the available
storage must remain above disk_recovery_level after
disk_recovery_time_1 before the Health Monitor sets
SUPER_READ_ONLY = OFF .
|
disk_recovery_max_cycles |
3 cycles | The maximum number of times that the MySQL server can be
placed back into normal operating mode within the
disk_recovery_cycle_window timeframe.
|
disk_recovery_window |
86400 seconds | The amount of time (in seconds) following the last successful recovery cycle after which the recovery count is reset. A value of 0 means that the recovery count is never reset. |
disk_recovery_increment |
300 seconds | The time (in seconds) to increase the
disk_recovery_time_1 and
disk_recovery_time_2 recovery timers for each
recovery cycle.
|
disk_running |
ON | The active state of the Health Monitor's disk monitor. |
disk_sample_rate |
60 | The frequency of disk data collection, in seconds. By default, data is collected every 60 seconds. The minimum sample rate is 1 (every second), and the maximum is 86400 (once a day). |
disk_usage_warning_level_1 |
20 | A percentage of available disk space above the defined
disk_low_space_level . If the available disk space
falls below this level, the warning
WARNING_DISK_USAGE_LEVEL_1 is raised.
|
disk_usage_warning_level_2 |
10 | A percentage of available disk space, above the defined
disk_low_space_level . If the available disk space
falls below this level, the warning
WARNING_DISK_USAGE_LEVEL_2 is raised.
|
disk_usage_warning_level_3 |
5 | A percentage of available disk space, above the defined
disk_low_space_level . If the available disk space
falls below this level, the warning
WARNING_DISK_USAGE_LEVEL_3 is raised.
|
memory_reporting |
ON | Enables reporting of memory data. |
memory_retention |
86400 |
How many seconds each data sample is retained for in the history tables for memory usage data. The minimum value is 1 second, the default is 86400 seconds (1 day), and the maximum value is 864000 seconds (10 days). |
memory_running |
ON | The active state of the Health Monitor's memory monitor. |
memory_sample_rate |
60 | The frequency of memory data collection, in seconds. By default, data is collected every 60 seconds. The minimum sample rate is 1 (every second), and the maximum is 86400 (once a day). |
memory_usage_warning_level_1 |
1024 | An amount of available memory (MiB). If the available
memory falls below this level, the warning
WARNING_MEMORY_USAGE_LEVEL_1 is raised.
|
memory_usage_warning_level_2 |
500 | An amount of available memory (MiB). If the available
memory falls below this level, the warning
WARNING_MEMORY_USAGE_LEVEL_2 is raised.
|
memory_usage_warning_level_3 |
100 | An amount of available memory (MiB). If the available
memory falls below this level, the warning
WARNING_MEMORY_USAGE_LEVEL_3 is raised.
|
status_interval |
10 | The frequency (in multiples of
disk_sample_rate ) of sending the status messages to
the error log table. For example, if status_interval =
10 and disk_sample_rate = 60 seconds, the
Health Monitor sends a status message to the error log table every 600
seconds.
|
system_cpu_stats_history |
ON | Whether system CPU usage statistics are collected and
stored in the Performance Schema table
system_cpu_stats_history . If these statistics are
not stored, you can still access a snapshot on demand by querying the
system_cpu_stats table.
|
system_disk_stats_history |
ON | Whether system disk usage statistics are collected and
stored in the Performance Schema table
system_memory_stats_history . If these statistics
are not stored, you can still access a snapshot on demand by querying
the system_memory_stats table.
|
system_memory_stats_history |
ON | Whether system memory usage statistics are collected and
stored in the Performance Schema table
system_disk_stats_history . If these statistics are
not stored, you can still access a snapshot on demand by querying the
system_disk_stats table.
|
system_process_stats_history |
ON | Whether MySQL server process (mysqld )
statistics are collected and stored in the Performance Schema table
system_process_stats_history . If these statistics
are not stored, you can still access a snapshot on demand by querying
the system_process_stats table.
|
system_retention |
86400 |
How many seconds each data sample is retained for in the history tables for system and process statistics. The minimum value is 1 second, the default is 86400 seconds (1 day), and the maximum value is 864000 seconds (10 days). |
system_running |
ON | Whether system statistics are collected by Health
Monitor. When this is set to ON , the statistics tables
for system CPU usage, disk usage, memory usage, and the MySQL server
process (mysqld ) are available, and the corresponding
history tables are populated if their system variables are set to
ON .
|
system_sample_rate |
60 | The frequency of system statistics collection, in seconds. By default, data is collected every 60 seconds. The minimum sample rate is 1 (every second), and the maximum is 86400 (once a day). |
Health Monitor Messages
The Health Monitor errors, warnings, and status updates are logged in
performance_schema.error_log
. Health Monitor abbreviates TiB as T, GiB
as G and MiB as M in the error log. For example, 306.6G = 306.6 GiB = 329.2 GB.
Table 17-9 Health Monitor Messages
Message Type | Description |
---|---|
Status Update | Health Monitor issues a routine status update according
to the number of seconds defined by sample_rate *
status_interval . For
example:
|
Threshold Warning | If the amount of available space falls below one of the
thresholds defined by disk_usage_warning_level_* , the
Health Monitor issues a warning. The following example shows the warning
for disk_usage_warning_level_2 , defined as 4% of the
total space above the low space
limit:
|
disk_low_space_duration Warning
|
If the available disk space drops below the low space
limit of 18 GiB, the Health Monitor starts the fallback timer and issues
a warning. For
example:
|
OFFLINE/SUPER_READ_ONLY Mode
Warning
|
If the available disk space remains below the low space limit for more than disk_low_space_duration seconds, the Health Monitor puts the server into SUPER_READ_ONLY and OFFLINE_MODE . For example: See Resolving SUPER_READ_ONLY and OFFLINE_MODE Issue.
|
Critical Warning | If the available disk space drops below
disk_low_space_critical_level for any length of
time, the Health Monitor puts the server into
SUPER_READ_ONLY mode. For
example:
|
Timer Canceled Warning | If the available space rises above the low space limit before disk_low_space_duration seconds completes, the fallback mode timer is canceled and the Health Monitor issues either a status message or a warning message if the available space is still within a warning range. For example:[Warning] [MY-013695] [Health] Disk: Low Level (18M): Fallback Timer Canceled [Warning] [MY-013695] [Health] Disk: Warning Level 1 (107M): monitored='/db', available=79M, total=466M, used=82.9%, low limit=18M, critical=10M, recovery=40M, warnings=107M/62M/40M |
Memory Warning | If the available memory falls below one of the thresholds defined by memory_usage_warning_level_* , the Health Monitor issues a warning which includes the memory diagnostics details in comma separated values (csv). The csv data in the following example has been removed due to space constraint. [Warning] [MY-013695] [Health] MEMORY: WARNING LEVEL 1 (1.0G): available=950M, total=7.5G, used=87.6%, mysqld=5.9G, warnings=1.0G/500M/100M [Warning] [MY-013695] [Health] [Warning] [MY-013695] [Health] ===== MEMORY DIAGNOSTICS BEGIN (csv) ===== --------------------------------------- TOTAL ALLOCATED MEMORY AND BUFFER SIZES --------------------------------------- keyword,value mysql_version,8.0.33 total_allocated,15121493496 innodb_buffer_pool_size,13958643712 join_buffer_size,262144 ---------------------------------------- MEMORY AGGREGATION PER LOGICAL COMPONENT ---------------------------------------- component,current_count,current_alloc,max_alloc,min_alloc ...<csv data>... ----------------------- MEMORY USAGE PER THREAD ----------------------- thread_id,user,current_count_used,current_allocated,current_avg_alloc,current_max_alloc,total_allocated ...<csv data>... ----------------------- MEMORY USAGE PER USER ----------------------- user,current_count_used,current_allocated,current_avg_alloc,current_max_alloc,total_allocated ...<csv data>... ------------------- PROCESSLIST DETAILS ------------------- thd_id,conn_id,user,db,command,state,time,current_statement,execution_engine,statement_latency,progress, lock_latency,cpu_latency,rows_examined,rows_sent,rows_affected,tmp_tables,tmp_disk_tables,full_scan, last_statement,last_statement_latency,current_memory,last_wait,last_wait_latency,source,trx_latency, trx_state,trx_autocommit,pid,program_name ...<csv data>... --------------- SESSION DETAILS --------------- thd_id,conn_id,user,db,command,state,time,current_statement,execution_engine,statement_latency,progress, lock_latency,cpu_latency,rows_examined,rows_sent,rows_affected,tmp_tables,tmp_disk_tables,full_scan, last_statement,last_statement_latency,current_memory,last_wait,last_wait_latency,source,trx_latency, trx_state,trx_autocommit,pid,program_name ...<csv data>... -------------------------------------------------------- CURRENT STATEMENT PER THREAD ORDERED BY STATEMENT MEMORY -------------------------------------------------------- THREAD_ID,EVENT_ID,END_EVENT_ID,EVENT_NAME,SOURCE,TIMER_START,TIMER_END,TIMER_WAIT,LOCK_TIME,SQL_TEXT, DIGEST,DIGEST_TEXT,CURRENT_SCHEMA,OBJECT_TYPE,OBJECT_SCHEMA,OBJECT_NAME,OBJECT_INSTANCE_BEGIN,MYSQL_ERRNO, RETURNED_SQLSTATE,MESSAGE_TEXT,ERRORS,WARNINGS,ROWS_AFFECTED,ROWS_SENT,ROWS_EXAMINED,CREATED_TMP_DISK_TABLES, CREATED_TMP_TABLES,SELECT_FULL_JOIN,SELECT_FULL_RANGE_JOIN,SELECT_RANGE,SELECT_RANGE_CHECK,SELECT_SCAN, SORT_MERGE_PASSES,SORT_RANGE,SORT_ROWS,SORT_SCAN,NO_INDEX_USED,NO_GOOD_INDEX_USED,NESTING_EVENT_ID, NESTING_EVENT_TYPE,NESTING_EVENT_LEVEL,STATEMENT_ID,CPU_TIME,MAX_CONTROLLED_MEMORY,MAX_TOTAL_MEMORY, EXECUTION_ENGINE ...<csv data>... ---------------------------------------------------------- CURRENT STATEMENT PER THREAD FOR THREADS USING MOST MEMORY ---------------------------------------------------------- THREAD_ID,EVENT_ID,END_EVENT_ID,EVENT_NAME,SOURCE,TIMER_START,TIMER_END,TIMER_WAIT,LOCK_TIME,SQL_TEXT, DIGEST,DIGEST_TEXT,CURRENT_SCHEMA,OBJECT_TYPE,OBJECT_SCHEMA,OBJECT_NAME,OBJECT_INSTANCE_BEGIN,MYSQL_ERRNO, RETURNED_SQLSTATE,MESSAGE_TEXT,ERRORS,WARNINGS,ROWS_AFFECTED,ROWS_SENT,ROWS_EXAMINED,CREATED_TMP_DISK_TABLES, CREATED_TMP_TABLES,SELECT_FULL_JOIN,SELECT_FULL_RANGE_JOIN,SELECT_RANGE,SELECT_RANGE_CHECK,SELECT_SCAN, SORT_MERGE_PASSES,SORT_RANGE,SORT_ROWS,SORT_SCAN,NO_INDEX_USED,NO_GOOD_INDEX_USED,NESTING_EVENT_ID, NESTING_EVENT_TYPE,NESTING_EVENT_LEVEL,STATEMENT_ID,CPU_TIME,MAX_CONTROLLED_MEMORY,MAX_TOTAL_MEMORY, EXECUTION_ENGINE,thread_id ...<csv data>... ---------------------------------------------------------- STATEMENT HISTORY PER THREAD FOR THREADS USING MOST MEMORY ---------------------------------------------------------- THREAD_ID,EVENT_ID,END_EVENT_ID,EVENT_NAME,SOURCE,TIMER_START,TIMER_END,TIMER_WAIT,LOCK_TIME,SQL_TEXT, DIGEST,DIGEST_TEXT,CURRENT_SCHEMA,OBJECT_TYPE,OBJECT_SCHEMA,OBJECT_NAME,OBJECT_INSTANCE_BEGIN,MYSQL_ERRNO, RETURNED_SQLSTATE,MESSAGE_TEXT,ERRORS,WARNINGS,ROWS_AFFECTED,ROWS_SENT,ROWS_EXAMINED,CREATED_TMP_DISK_TABLES, CREATED_TMP_TABLES,SELECT_FULL_JOIN,SELECT_FULL_RANGE_JOIN,SELECT_RANGE,SELECT_RANGE_CHECK,SELECT_SCAN, SORT_MERGE_PASSES,SORT_RANGE,SORT_ROWS,SORT_SCAN,NO_INDEX_USED,NO_GOOD_INDEX_USED,NESTING_EVENT_ID, NESTING_EVENT_TYPE,NESTING_EVENT_LEVEL,STATEMENT_ID,CPU_TIME,MAX_CONTROLLED_MEMORY,MAX_TOTAL_MEMORY, EXECUTION_ENGINE,thread_id ...<csv data>... ----------------------------- EVENTS ALLOCATING MOST MEMORY ----------------------------- processlist_id,processlist_user,thread_id,event_name,count_alloc,count_free,current_count_used, current_number_of_bytes_used ...<csv data>... ---------------------- BUFFER POOL STATISTICS ---------------------- POOL_ID,POOL_SIZE,FREE_BUFFERS,DATABASE_PAGES,OLD_DATABASE_PAGES,MODIFIED_DATABASE_PAGES,PENDING_DECOMPRESS, PENDING_READS,PENDING_FLUSH_LRU,PENDING_FLUSH_LIST,PAGES_MADE_YOUNG,PAGES_NOT_MADE_YOUNG,PAGES_MADE_YOUNG_RATE, PAGES_MADE_NOT_YOUNG_RATE,NUMBER_PAGES_READ,NUMBER_PAGES_CREATED,NUMBER_PAGES_WRITTEN,PAGES_READ_RATE, PAGES_CREATE_RATE,PAGES_WRITTEN_RATE,NUMBER_PAGES_GET,HIT_RATE,YOUNG_MAKE_PER_THOUSAND_GETS, NOT_YOUNG_MAKE_PER_THOUSAND_GETS,NUMBER_PAGES_READ_AHEAD,NUMBER_READ_AHEAD_EVICTED,READ_AHEAD_RATE, READ_AHEAD_EVICTED_RATE,LRU_IO_TOTAL,LRU_IO_CURRENT,UNCOMPRESS_TOTAL,UNCOMPRESS_CURRENT ...<csv data>... |
Related Topics