Generating a Node Count Estimate

Estimate the number of HeatWave nodes required to run a workload. The number depends on the size of the tables and columns to be loaded and on the compression achieved in memory for this data.

When you start the service, the database tables on which HeatWave queries are run must be loaded into HeatWave cluster memory. Under-provisioning the HeatWave cluster causes data loads or query execution to fail due to space limitations. Over-provisioning the HeatWave cluster results in additional costs for unneeded resources. Based on the database tables you intend to load into memory, machine learning is used to estimate the number of HeatWave nodes you require.

Using the Console

Use the Console to generate a node count estimate while adding a HeatWave cluster to a DB system, or any time later to adjust the number of nodes as the data increases or decreases in size.

This task requires the following:
  • The data you intend to load into the HeatWave cluster is present in the DB system.
  • (Optional) Log in to your DB system and run ANALYZE TABLE on the tables you intend to load into the HeatWave cluster. Estimates should generally be valid without running ANALYZE TABLE, but running it ensures that the estimates are as accurate as possible.
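
The optional ANALYZE TABLE step can be scripted when many tables are involved. The following is a minimal sketch that only builds the statements as text. The helper name analyze_table_statements and the schema and table names are hypothetical, and you would run the resulting statements yourself from a MySQL client connected to the DB system.

```python
def analyze_table_statements(tables):
    """Build one ANALYZE TABLE statement per (schema, table) pair."""

    def quote(identifier):
        # Backtick-quote a MySQL identifier, doubling any embedded backticks.
        return "`" + identifier.replace("`", "``") + "`"

    return [
        f"ANALYZE TABLE {quote(schema)}.{quote(table)};"
        for schema, table in tables
    ]


# Hypothetical schema and table names, for illustration only.
for statement in analyze_table_statements([("sales", "orders"), ("sales", "customers")]):
    print(statement)
```

The sketch deliberately stops at generating text so that it does not assume any particular MySQL connector or credentials.
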
Do the following to generate a node count estimate:
  1. Open the navigation menu, and select Databases. Under HeatWave MySQL, click DB systems.
  2. In the HeatWave cluster filter, select Attached to filter the DB systems that have a HeatWave cluster attached.
  3. Click the name of your DB system to open the DB system details page.
  4. In the Resources list, click HeatWave cluster.
  5. In the HeatWave cluster information frame, click Add HeatWave cluster or Edit.
  6. In the Add HeatWave cluster or Edit HeatWave cluster panel, click Estimate node.
  7. On the Estimate node panel, click Generate estimate. If you recently generated a node count estimate, the previous estimate details are displayed. Click Regenerate estimate to create a new estimate.
    The operation may take several minutes depending on the size and properties of your data. When the operation completes, you get a response that contains the following details:
    • Name: Displays the name of the schema.
    • Memory estimate: Displays the estimated amount of memory required for the schema.
    • Information: Displays the number of tables in the schema and the number of tables with errors.
  8. Select the schemas that you want to include in the node count estimate.
    The estimate details in the Summary are adjusted automatically when you modify the schema selection.
  9. (Optional) Expand the schema rows to view information about individual tables. Deselect tables that you do not want to include in the estimate.
    Note

    The Information column reports errors if there are problems with a table. For example, an error is reported for tables with unsupported column data types, tables without a primary key, or tables with too many columns. Tables with errors are not included in the node count estimate. You can regenerate the node count estimate after resolving the errors. See Node Count Estimate Table Errors.
  10. (Optional) If you want to change the currently selected shape, select another shape for the HeatWave nodes.
  11. Review the estimate details in the Summary, which provides the following information:
    • Shape: Displays the selected HeatWave node shape.
    • CPU core count: Displays the CPU core count of the selected HeatWave node shape.
    • Memory size: Displays the memory size of the selected HeatWave node shape.
    • Max network bandwidth: Displays the maximum network bandwidth of the selected HeatWave node shape.
    • Node: Displays the estimated number of HeatWave nodes required based on the data size and the selected HeatWave node shape.
    • Total memory required: Displays the estimated amount of memory required for the HeatWave cluster based on the data size.
    • Total memory: Displays the total HeatWave cluster memory size, which is the memory size of the selected HeatWave node shape multiplied by the estimated number of nodes.
  12. (Optional) Click Show load command to view the load command.
    Note

    The load command is generated based on the schemas and tables selected for the node count estimate. You can use the command after the HeatWave cluster is provisioned to load the selected schemas and tables. You can run the command from any MySQL client that is connected to the DB system.
  13. Click Apply estimated node.
    Applying the estimate overwrites the shape and node count in the Add HeatWave cluster or Edit HeatWave cluster panel.
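
The Total memory value in the Summary is the memory size of the selected node shape multiplied by the estimated number of nodes. The sketch below reproduces that multiplication and adds a naive ceiling-based node count for sanity-checking a result. The ceiling calculation is an assumption for illustration only and is not the Console's estimator, which also accounts for factors such as in-memory compression. The function names and the 512 GB and 1300 GB figures are hypothetical.

```python
import math


def total_cluster_memory_gb(node_memory_gb, node_count):
    """Total memory: node shape memory size multiplied by the node count."""
    return node_memory_gb * node_count


def naive_node_count(total_memory_required_gb, node_memory_gb):
    """Assumption: smallest node count whose combined memory covers the requirement.

    This is a rough cross-check, not the estimate the Console generates.
    """
    if node_memory_gb <= 0:
        raise ValueError("node_memory_gb must be positive")
    return max(1, math.ceil(total_memory_required_gb / node_memory_gb))


# Hypothetical figures: 1300 GB required, 512 GB per node.
nodes = naive_node_count(1300, 512)
print(nodes)                                # 3
print(total_cluster_memory_gb(512, nodes))  # 1536
```

If the Console reports a node count that differs from this rough figure, rely on the Console, because it works from the actual table statistics.
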

Node Count Estimate Table Errors

While estimating node count, you may encounter table errors if certain conditions are not met.

Table 11-2 Node Count Estimate Table Errors

Table Error | Description
TOO MANY COLUMNS TO LOAD | The table has too many columns. The column limit is 1017.
ALL COLUMNS MARKED AS NOT SECONDARY | There are no columns to load. All table columns are defined as NOT SECONDARY.
CONTAINS VARLEN COLUMN WITH >65532 BYTES | A VARLEN column exceeds the 65532 byte limit. See VARLEN Encoding.
ESTIMATION COULD NOT BE CALCULATED | The estimate could not be calculated. For example, a table estimate may not be available if statistics for VARLEN columns are unavailable.
UNABLE TO LOAD TABLE WITHOUT PRIMARY KEY | A table must be defined with a primary key before it can be loaded into the HeatWave cluster.
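
Some of the conditions in this table can be checked before generating an estimate. The following is a minimal sketch that applies the documented limits to a hand-written description of a table. The Column type and the precheck_table function are hypothetical and are not part of HeatWave. The sketch also assumes that the column limit and the VARLEN limit apply only to columns that are not marked NOT SECONDARY, which the table above does not state explicitly. ESTIMATION COULD NOT BE CALCULATED is omitted because it depends on statistics that only the service can inspect.

```python
from dataclasses import dataclass

MAX_COLUMNS = 1017        # column limit from the table above
MAX_VARLEN_BYTES = 65532  # VARLEN byte limit from the table above


@dataclass
class Column:
    """Hypothetical description of one table column."""
    name: str
    varlen_bytes: int = 0        # 0 for columns that are not VARLEN
    not_secondary: bool = False  # True if the column is defined as NOT SECONDARY


def precheck_table(columns, has_primary_key):
    """Return the documented error names that would apply to a table."""
    errors = []
    # Assumption: only columns that would be loaded count toward the limits.
    loadable = [c for c in columns if not c.not_secondary]

    if len(loadable) > MAX_COLUMNS:
        errors.append("TOO MANY COLUMNS TO LOAD")
    if columns and not loadable:
        errors.append("ALL COLUMNS MARKED AS NOT SECONDARY")
    if any(c.varlen_bytes > MAX_VARLEN_BYTES for c in loadable):
        errors.append("CONTAINS VARLEN COLUMN WITH >65532 BYTES")
    if not has_primary_key:
        errors.append("UNABLE TO LOAD TABLE WITHOUT PRIMARY KEY")
    return errors


# Hypothetical table: one oversized VARLEN column and no primary key.
print(precheck_table([Column("id"), Column("notes", varlen_bytes=70000)], has_primary_key=False))
```

An empty list means only that none of these specific checks failed. The Information column in the Console remains the authoritative source of table errors.
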