Installing Conda Environments in a Notebook Session

To use conda environments in notebook sessions, you must install them.

You can install either a Data Science or a published conda environment by clicking Install in the environment card, and then copying and running the provided code snippet in a terminal window. The new environment is installed on the block volume under the /home/datascience/conda folder. The folder names in /home/datascience/conda correspond to the slugs of the installed conda environments.
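
To see which conda environments are already installed, you can list that folder in a terminal window; each subfolder name is an environment slug:

ls /home/datascience/conda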

When the environment is ready to be used as a notebook kernel, it's listed in the Installed Conda Environments tab of the Environment Explorer, and a kernel for that conda environment appears in the Notebook category of the JupyterLab Launcher tab. To start working in that conda environment, click the environment's kernel icon to open a new notebook file in a new tab.

Alternatively, you can open a new notebook by clicking File, selecting New, and then selecting a kernel for the notebook session.

Important

Because all installed conda environments are stored on the block volume under /home/datascience, they remain available whenever the session is activated. You don't need to reinstall conda environments after you deactivate and then reactivate the notebook session.

Install a conda environment using the odsc conda command in a JupyterLab terminal window:

odsc conda install --slug <slug>

The <slug> is the slug of the environment you want to install, as listed in the environment card in the Environment Explorer tab. You're prompted to change the version of the environment, which is optional. It can take a few seconds for the new kernel to appear in the JupyterLab Launcher tab.
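
For example, to install a Data Science environment whose card shows the slug generalml_p38_cpu_v1 (a hypothetical slug used only for illustration; use the slug shown in your environment card):

odsc conda install --slug generalml_p38_cpu_v1  # hypothetical slug, replace with yours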

By default, odsc conda looks for Data Science conda environments with a matching <slug> value, or matching <name> and <version> values. You can target an Object Storage bucket hosting published conda environments by adding the --override option. The command then looks for the target conda environment in the bucket defined in the custom config.yaml file that was created by odsc conda init. For example:

odsc conda install --override --slug <slug>

List all the supported install options with odsc conda install -h.
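
If no config.yaml file exists yet, create one first with odsc conda init so that --override knows which bucket to search. A minimal sketch, assuming the standard odsc conda init options and placeholder bucket and namespace values:

odsc conda init -b <your-bucket-name> -n <your-tenancy-namespace> -a resource_principal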

Conda environments can also be installed from tar files. Provide the URI of the tar file with the --uri option. It can be a local path, a pre-authenticated request (PAR) link, or an OCI link.

Installing from a local file:
odsc conda install --uri <path_to_the_local_environment_tar_file>
Installing with a PAR link:
odsc conda install --uri <http_link_to_the_environment_tar_file>
Installing with an OCI link using resource principal authentication:
odsc conda install --uri <oci://my-bucket@my-namespace/path_to_tar_file>
Important

Installing libraries in the base environment (Python 3) isn't recommended because they aren't persisted when the notebook session is reactivated. The best practice is to clone the base environment, and then install libraries into the clone.
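
A minimal sketch of that workflow using standard conda and pip commands (not the odsc CLI), assuming the base environment is registered with conda under the name base and using mybase as an arbitrary example name; creating the clone under /home/datascience/conda keeps it on the block volume:

conda create --clone base --prefix /home/datascience/conda/mybase  # clone the base environment onto the block volume
conda activate /home/datascience/conda/mybase  # switch to the cloned environment
pip install <package>  # install libraries into the clone, not into base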

Upgrading the PySpark Conda Environment

These steps are only required if you installed the older version of the PySpark conda environment and want to preserve it for potential future use. If you don't need the old environment and haven't made any custom Spark configurations, we recommend deleting it before installing the new version.

  1. Preparing for the PySpark conda environment update:
    • Open the Data Science notebook session.
    • Find the spark_conf_dir directory in your home directory, and then rename it to spark_conf_dir_v2. The rename action temporarily disables the pyspark32_p38_cpu_v2 environment (see the example commands after these steps).

      You can revert by renaming spark_conf_dir_v2 back to spark_conf_dir, after which the pyspark32_p38_cpu_v2 environment is operational again.

  2. Updating the PySpark conda environment:
    • Open a terminal and run the command:

      odsc conda install -s pyspark32_p38_cpu_v3

      The command installs the v3 conda environment and creates a new spark_conf_dir directory.

  3. Verifying the configuration changes:
    • If you made any custom changes to the old spark_conf_dir_v2 configuration, such as modifications to core-site.xml or spark-defaults.conf, ensure that these changes are copied to their respective files in the new spark_conf_dir directory (see the example commands after these steps).
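
The directory operations in these steps can be run from a JupyterLab terminal window. A minimal sketch, assuming the default locations in your home directory and that your custom changes are limited to spark-defaults.conf and core-site.xml:

# Step 1: rename the old configuration directory (temporarily disables pyspark32_p38_cpu_v2)
mv ~/spark_conf_dir ~/spark_conf_dir_v2

# Step 2: install the v3 environment, which creates a new ~/spark_conf_dir
odsc conda install -s pyspark32_p38_cpu_v3

# Step 3: compare the old and new configurations, then copy your custom changes into the new files
diff ~/spark_conf_dir_v2/spark-defaults.conf ~/spark_conf_dir/spark-defaults.conf
diff ~/spark_conf_dir_v2/core-site.xml ~/spark_conf_dir/core-site.xml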