HeatWave Cluster Failure and Recovery

HeatWave triggers the recovery process when a HeatWave node fails, or when MySQL Server undergoes maintenance or a planned restart.

HeatWave regularly monitors the status of each HeatWave node. If a node has not responded after 60 seconds, HeatWave treats it as a HeatWave node failure.
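
The timeout rule can be illustrated with a minimal sketch. This is not HeatWave's internal implementation; `NodeStatus`, `is_node_failed`, and `failed_nodes` are hypothetical names, and treating exactly 60 seconds of silence as a failure is an assumption about the boundary.

```python
from dataclasses import dataclass

# Threshold taken from the text above; the constant name is hypothetical.
NODE_TIMEOUT_SECONDS = 60.0


@dataclass
class NodeStatus:
    """Hypothetical record of when a node last answered a status check."""
    node_id: str
    last_response_at: float  # timestamp in seconds


def is_node_failed(status: NodeStatus, now: float,
                   timeout: float = NODE_TIMEOUT_SECONDS) -> bool:
    """Return True when the node has been silent for `timeout` seconds or more.

    Assumption: silence of exactly `timeout` seconds counts as a failure.
    """
    return (now - status.last_response_at) >= timeout


def failed_nodes(statuses, now: float,
                 timeout: float = NODE_TIMEOUT_SECONDS):
    """Return the IDs of all nodes considered failed at time `now`."""
    return [s.node_id for s in statuses if is_node_failed(s, now, timeout)]
```

For example, given timestamps from a monotonic clock, `failed_nodes(statuses, now)` would list the nodes for which recovery should start.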

During the recovery process, HeatWave automatically attempts to bring the node online, reform the cluster, and reload data that was previously loaded. There are two ways of reloading data:

  • From the HeatWave storage layer: During recovery from a HeatWave node failure, HeatWave reloads data from the HeatWave storage layer, which is created when you enable the HeatWave cluster for the first time. To facilitate recovery, data is persisted to Object Storage when it is loaded into HeatWave and when data changes are propagated from the DB system to HeatWave. Loading data from Object Storage is faster than loading it from the DB system because the data is already in the HeatWave storage format and does not need to be converted.
  • From the DB system: During recovery from maintenance or a planned restart of MySQL Server, HeatWave reloads data from the DB system.
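
The relationship between the recovery trigger and the reload source described in the two items above can be summarized in a short sketch. The enum and function names are hypothetical and serve only to restate the rule; HeatWave makes this choice automatically.

```python
from enum import Enum


class RecoveryTrigger(Enum):
    """Hypothetical labels for the events that start recovery."""
    NODE_FAILURE = "node_failure"
    MAINTENANCE = "maintenance"
    PLANNED_RESTART = "planned_restart"


class ReloadSource(Enum):
    """Hypothetical labels for where data is reloaded from."""
    STORAGE_LAYER = "heatwave_storage_layer"  # backed by Object Storage
    DB_SYSTEM = "db_system"


def reload_source_for(trigger: RecoveryTrigger) -> ReloadSource:
    """Map a recovery trigger to the reload source described in the text."""
    if trigger is RecoveryTrigger.NODE_FAILURE:
        # Data is already in HeatWave storage format, so no conversion is needed.
        return ReloadSource.STORAGE_LAYER
    # Maintenance or a planned restart of MySQL Server.
    return ReloadSource.DB_SYSTEM
```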

The automatic reload during recovery fails if MySQL Server is in SUPER_READ_ONLY mode, because data cannot be loaded into HeatWave in that mode. Disable SUPER_READ_ONLY mode so that the data can be loaded. See Resolving SUPER_READ_ONLY and OFFLINE_MODE Issue.
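
One way to check the mode before retrying a load is to read the `super_read_only` system variable. The following is a minimal sketch, assuming a cursor from any Python DB-API 2.0 (PEP 249) driver that is already connected to the DB system; the function name `is_super_read_only` is hypothetical. The sketch only reads the variable and does not change it.

```python
def is_super_read_only(cursor) -> bool:
    """Return True if the server reports SUPER_READ_ONLY mode as enabled.

    `cursor` is assumed to be a DB-API 2.0 cursor connected to the DB system.
    """
    # super_read_only is a global MySQL system variable; it reads as 1 or 0.
    cursor.execute("SELECT @@global.super_read_only")
    row = cursor.fetchone()
    return row is not None and int(row[0]) == 1
```

If the function returns True, follow the referenced topic to resolve the cause before loading data again.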

When you unload a table, its data is removed from HeatWave and, in a background operation, from Object Storage as well.