Importing Data to Oracle HeatWave

Use MySQL Shell to connect to the database system and import your data to Oracle HeatWave with MySQL Shell's dump and dump loading utilities.

For instructions to set up client connections to a database system and to install MySQL Shell, see Connecting to Oracle HeatWave.

Here's a summary of a process with Oracle Database Service for Azure to export data from a local MySQL server instance, and import it to the database system for the HeatWave Cluster:

  • On your local network, use one of MySQL Shell's dump utilities to export the data from the MySQL server instance to a set of local dump files. Apply the compatibility options for MySQL Database Service.
  • Transfer the dump files with your preferred file transfer method to an Azure VM that can connect to the database system.
  • On the Azure VM, use MySQL Shell's dump loading utility to import the dump files to the database system, then use Oracle HeatWave's Auto Parallel Load feature to load the data into the HeatWave nodes.
If you want to try out this process with sample dump files already created by MySQL Shell's dump utilities, you can use the airportdb download (approximately 640MB in size). To download and unpack it, use these commands:

wget https://downloads.mysql.com/docs/airport-db.tar.gz
tar xvzf airport-db.tar.gz
or:

wget https://downloads.mysql.com/docs/airport-db.zip
unzip airport-db.zip
On Windows, use the URL given to download the Zip archive. Unpacking the compressed tar or Zip archive results in a directory named airport-db, which contains the data files.

Here are the process steps in detail:

  1. Use MySQL Shell's instance dump utility util.dumpInstance or schema dump utility util.dumpSchemas to dump the data from a MySQL server instance to a set of local dump files.
    For the reference documentation for MySQL Shell's dump utilities, see Instance Dump Utility, Schema Dump Utility, and Table Dump Utility. Here you can find the syntax to invoke the utilities, and the full list of options that you can add to structure your dump output, such as including and excluding tables.
    If you are trying out the process with airportdb, you can skip over this step, unless you want to try out the data transfer from your local network to the Azure VM. In that case, download and unpack the airportdb files on your local machine.
    1. On a machine on your local network, start MySQL Shell and select JavaScript mode by typing \js, or Python mode by typing \py. These examples are shown in JavaScript mode.
    2. Run the following command to start a global session by connecting to the endpoint of the MySQL Server instance:
      \connect <UserName>@<EndpointIPAddress>
      • \connect: The MySQL Shell command to establish a new connection.
      • <UserName>: The user name for an administrator account for the MySQL Server instance.
      • <EndpointIPAddress>: The IP address of the endpoint of the MySQL Server instance.
      Enter the password for the administrator account when prompted.
    3. Carry out a dry run of the dump process to find any issues for compatibility with a database system. You can fix these issues when you carry out the dump process for real.
      Use these options for the dry run:
      dryRun: true
      Use this option the first time you run the utility. It makes the utility carry out compatibility checks and list the results, but not actually dump the data.
      ocimds: true
      Use this option to enable the checks and modifications for compatibility with MySQL Database Service. In a dry run, the option makes the utility check for any issues in the data structure that need to be fixed before import.
      Specify a location where the dump files will be placed. In this example, the instance dump utility is being used to dump all the schemas from the instance to files in the folder C:/Users/hanna/worlddump, though the files are not being created yet:
      util.dumpInstance("C:/Users/hanna/worlddump", {dryRun: true, ocimds: true})
      You will get a response like this, with the issues listed in the dump output:
      
      Checking for compatibility with MySQL Database Service 8.0.30
      ...
      Compatibility issues with MySQL Database Service 8.0.30 were found. Please use the 
      'compatibility' option to apply compatibility adaptations to the dumped DDL.
      Util.dumpInstance: Compatibility issues were found (RuntimeError)
    4. Carry out the real dump process, removing the dryRun option, and applying any compatibility modifications that you were advised about in the dump output.
      Use these options for the real dump:
      ocimds: true
      Use this option to enable the checks and modifications for compatibility with MySQL Database Service. In the real dump process, the option makes changes to the dump metadata so that the import works in a database system, and checks for any remaining issues in the data structure that are not covered by the compatibility modifications.
      compatibility: [array of strings]
      Use this option to add compatibility fixes to the dump output. List the modification options that are requested in the results from the compatibility checks.
      For example:
      util.dumpInstance("C:/Users/hanna/worlddump", {
              > ocimds: true, compatibility: ["strip_definers", "strip_restricted_grants"]})
      The utility dumps the metadata and data to a set of dump files in the location that you stated.
  2. Use your preferred file transfer method to make a connection to an Azure VM that can connect to the database system. Transfer all of the dump files and the containing folder to the Azure VM.
    If you are trying out the process with airportdb, you can download and unpack the files on the Azure VM.
  3. Use MySQL Shell's dump loading utility util.loadDump on the Azure VM to upload the dump files to the database system.
    For the reference documentation for MySQL Shell's dump loading utility, with the syntax and more options that you can add to control the dump upload, see Dump Loading Utility.
    1. On the Azure VM, start MySQL Shell and select JavaScript mode by typing \js, or Python mode by typing \py. These examples are shown in JavaScript mode.
    2. Run the following command to start a global session by connecting to the endpoint of the database system:
      \connect <UserName>@<DBSystemEndpointIPAddress>
      • \connect: The MySQL Shell command to establish a new connection.
      • <UserName>: The user name for the administrator account that you set up for the HeatWave Cluster.
      • <DBSystemEndpointIPAddress>: The IP address of the endpoint of the database system. You can find this on the Connect tab on the deployment's details page.
      Enter the password for the administrator account when prompted.
    3. Run the dump loading utility to upload the dump files to the database system from the location where you placed them on the Azure VM.
      You can do a dry run first if you want to check that there are no issues. For example:
      util.loadDump("/mnt/data/worlddump", {dryRun:true})
      In this example of a real upload, the GTID set is transferred to the database system, and a progress state file is created so that you can cancel and resume the upload:
      util.loadDump("/mnt/data/worlddump", {progressFile: "loadprogress.json", updateGtidSet: append})
    4. When the upload is complete, you can verify that the data is present on the database system by switching MySQL Shell to SQL mode (type \sql and press Enter). In this mode, you can use MySQL statements to access the databases and tables.
  4. If you have already created the HeatWave nodes, use Oracle HeatWave's Auto Parallel Load feature to load the data into them. Auto Parallel Load facilitates the process of loading data into HeatWave by automating many of the steps involved. It can be run from any MySQL client or connector, and is implemented as a stored procedure named heatwave_load, which resides in the MySQL sys schema.
    1. On the Azure VM, start MySQL Shell and select SQL mode by typing \sql.
    2. Run the following command to start a global session by connecting to the endpoint of the database system:
      \connect <UserName>@<DBSystemEndpointIPAddress>
      • \connect: The MySQL Shell command to establish a new connection.
      • <UserName>: The user name for the administrator account that you set up for the HeatWave Cluster.
      • <DBSystemEndpointIPAddress>: The IP address of the endpoint of the database system. You can find this on the Connect tab on the deployment's details page.
      Enter the password for the administrator account when prompted.
    3. Call sys.heatwave_load to load the data into the HeatWave Cluster, naming the schema (database) that you want to load.
      For example, for the sample airportdb database, issue this command:
      CALL sys.heatwave_load(JSON_ARRAY("airportdb"),NULL);
Now that the data has been imported to the database system, you can use HeatWave Autopilot to estimate a suitable number of HeatWave nodes for the HeatWave Cluster. See Provisioning HeatWave Nodes for instructions.