In-App Features of ML JupyterLab
ML JupyterLab is an app in the AI/ML Accelerator package. A license is required to use the AI/ML Accelerator package. For more information, please contact DNAnexus Sales via [email protected].
ML JupyterLab retains the core features of a DXJupyterLab environment (for detailed information, please see here). To specialize it for ML work, the app adds several new features, listed below.
Kernel
ML JupyterLab uses a Python 3.10.18 kernel. This version provides a stable and widely used environment for data science and machine learning tasks. It is fully compatible with a wide range of data processing and machine learning libraries, including NumPy, Pandas, Scikit-learn, TensorFlow, and PyTorch, enabling efficient model development and experimentation.
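As a quick sanity check, a notebook cell can report the kernel's interpreter version and which of these libraries are importable. This is a stdlib-only sketch; the library list is illustrative, not exhaustive:

```python
import sys
import importlib.util

# Report the interpreter version this kernel runs on
print(f"Python {sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}")

# Check which common ML libraries can be imported in this environment
for lib in ["numpy", "pandas", "sklearn", "tensorflow", "torch"]:
    status = "available" if importlib.util.find_spec(lib) else "missing"
    print(f"{lib}: {status}")
```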
Managing Environments in ML JupyterLab: Conda and UV Support
Besides the default Python kernel with a ready-to-use, pre-installed ML environment to help users get started immediately, ML JupyterLab offers flexible and robust support for creating and managing custom virtual environments using Conda or UV.
This allows users to test new models that require dependencies different from the base environment, without interfering with the default setup.
Conda Environments
Conda is a powerful package and environment manager that is well-suited for managing complex dependencies across Python and non-Python packages. It is pre-installed in ML JupyterLab, allowing you to quickly create and manage isolated environments tailored to your ML workflows.
To create a new environment (e.g. Python 3.13) and register it as a Jupyter Notebook kernel, follow the steps below:
Create a new Conda environment named "conda-py313" with Python 3.13
conda create -n conda-py313 python=3.13 -y
Activate the newly created environment
conda activate conda-py313
Install the IPython kernel package inside the environment
conda install ipykernel -y
Register the environment as a new Jupyter kernel
python -m ipykernel install --user --name py313-conda --display-name "Python 3.13 (Conda)"
After completing these steps, you’ll see "Python 3.13 (Conda)" available as a kernel in the ML Jupyter Notebook interface. You can now use this environment to run code and install additional packages using conda install or pip install as needed.
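To confirm the registration worked, you can inspect the kernelspec that `ipykernel install --user` wrote to disk. This sketch assumes the default per-user kernelspec location and reuses the name py313-conda from the command above:

```python
import json
from pathlib import Path

# `python -m ipykernel install --user --name py313-conda` typically writes the spec here
spec_file = Path.home() / ".local/share/jupyter/kernels/py313-conda/kernel.json"

if spec_file.exists():
    spec = json.loads(spec_file.read_text())
    print("Display name:", spec.get("display_name"))
    print("Interpreter:", spec["argv"][0])
else:
    print(f"No kernelspec found at {spec_file}")
```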
For more information about the use of Conda, please visit the Conda Documentation.
UV Support
UV is a modern, high-performance Python package manager that supports fast, reproducible environment creation. In ML JupyterLab, UV is pre-installed, so you can start using it right away without any additional setup. It’s particularly effective for lightweight, Python-native workflows where speed and consistency are essential.
To create a new environment (e.g. Python 3.13) and register it as a Jupyter Notebook kernel, follow these steps:
Create a new virtual environment named "uv_env" with Python 3.13
uv venv uv_env --python=python3.13
Activate the new environment
source uv_env/bin/activate
Install the IPython kernel package so it can be added to Jupyter
uv pip install ipykernel
Register the environment as a new Jupyter kernel
python -m ipykernel install --user --name new_python313 --display-name "Python 3.13 (UV)"
Once completed, you'll see "Python 3.13 (UV)" as an option in the Jupyter Notebook kernel list. From there, you can install additional packages using uv pip install <package> as needed for your ML workflows.
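Once the kernel is active, a quick stdlib check in a notebook cell confirms that the notebook is really running inside the UV-created environment (a sketch; the exact paths depend on your session):

```python
import sys

# Inside a notebook running on the "Python 3.13 (UV)" kernel, confirm
# which interpreter and virtual environment the kernel is using
print("Executable:", sys.executable)
print("Environment prefix:", sys.prefix)
print("Inside a virtual env:", sys.prefix != sys.base_prefix)
```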
For more information about the use of UV, please visit the UV Documentation.
fsspec-dnanexus
fsspec-dnanexus is a Python library pre-installed on ML JupyterLab that abstracts file system operations, providing a unified API for interacting with DNAnexus project storage. It simplifies working with files in a DNAnexus project by allowing direct access, without downloading data to the local storage of the JupyterLab environment.
Here is an example of using fsspec-dnanexus to read a .csv file from a DNAnexus project:
import pandas as pd
df = pd.read_csv("dnanexus://my-dx-project:/folder/data.csv")
For the detailed usage, please refer to the Official PyPI page.
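Beyond read_csv, other fsspec operations work through the same protocol. Below is a guarded sketch that lists a project folder; it assumes the package's import name is fsspec_dnanexus, and the project and folder names are hypothetical:

```python
import importlib.util

# fsspec-dnanexus registers the "dnanexus" protocol with fsspec
# (assumption: the package is installed and a DNAnexus session is available)
if importlib.util.find_spec("fsspec_dnanexus"):
    import fsspec

    fs = fsspec.filesystem("dnanexus")
    # List files in a project folder (hypothetical project name)
    print(fs.ls("dnanexus://my-dx-project:/folder"))
else:
    print("fsspec-dnanexus is not installed in this environment")
```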
Environment Snapshots
Environment Snapshots in ML JupyterLab let you capture the complete state of your session—including installed libraries, configurations, and local files—so you can easily reuse it in future sessions without reinstalling everything.
An ML JupyterLab session runs in a Docker container, and an Environment Snapshot file is a tarball generated by saving the Docker container state (with the docker commit and docker save commands). Any installed packages and locally created files are saved to the snapshot, except for the directories /home/dnanexus and /mnt/, which are not included. The file is then uploaded to .Notebook_snapshots in the project and can be passed as an input the next time the app is started.
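Because /home/dnanexus and /mnt/ are excluded, anything you want to persist through a snapshot should live elsewhere in the container. The exclusion rule can be sketched with the stdlib as follows (the helper function is ours, not part of the app):

```python
from pathlib import Path

# Directories that ML JupyterLab Environment Snapshots do not capture
EXCLUDED = (Path("/home/dnanexus"), Path("/mnt"))

def captured_by_snapshot(path: str) -> bool:
    """Return True if `path` would be saved in an Environment Snapshot."""
    p = Path(path).resolve()
    return not any(p == ex or ex in p.parents for ex in EXCLUDED)

print(captured_by_snapshot("/opt/conda/envs/conda-py313"))   # saved
print(captured_by_snapshot("/home/dnanexus/notebook.ipynb"))  # not saved
print(captured_by_snapshot("/mnt/project/data.csv"))          # not saved
```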
Below are the steps to create and reuse an Environment Snapshot:
1. Prepare & Install Additional Required Libraries
To install additional packages, open a Terminal in your ML JupyterLab session. Then, use the usual installation commands:
pip install your-library==version
conda install -c conda-forge your-package
Run imports or simple test code in your notebook to confirm proper installation.
2. Create a Snapshot of Your Environment
Select DNAnexus → Create Snapshot from the ML JupyterLab menu.

The snapshot process generates a tarball file which is saved under the .Notebook_snapshots directory in your DNAnexus project.
3. Reuse the Snapshot in Future Jobs
When launching a new ML JupyterLab job, select your snapshot (the previously saved .tar.gz file) under the Snapshot Image field in the launch dialog.
Note: When using a snapshot in a new ML JupyterLab job, the saved environment will be automatically installed on all nodes of the job, ensuring consistent dependencies across the entire cluster.
Ray dashboard
ML JupyterLab provides access to the Ray Dashboard, a powerful tool for monitoring and managing distributed applications built using Ray. The dashboard gives users real-time insights into their Ray clusters and distributed applications.
Key Features of the Ray Dashboard:
Cluster Monitoring: Get an overview of the state of your Ray cluster, including node health, task statuses, and resource utilization.
Task and Actor Management: Track the progress of tasks and actors across the cluster, enabling users to identify bottlenecks or performance issues.
Resource Utilization: Monitor how resources such as CPU, memory, and GPUs are being used by your distributed tasks.
Logs and Debugging: Access logs and other debugging tools to troubleshoot and optimize your workflows.
To access the Ray Dashboard:
Open the Homepage: This is the interface where applications and services are available to launch.
Click the Ray Dashboard Icon: Clicking the icon automatically opens the Ray Dashboard in a new tab in JupyterLab.
Explore the Dashboard: Once the dashboard is open, you can navigate through the various sections like Overview, Nodes, Actors, Tasks, etc.
Ray Dashboard Tab Overview:
Below is the overview of each section of the Ray Dashboard. For more detailed information, please refer to the Ray Dashboard Documentation.
| Section | What it Does | Use Case |
| --- | --- | --- |
| Overview | Summarizes key information about the Ray cluster, including resource usage, number of active jobs, actors, and nodes. | Quickly check the overall health and activity of the Ray cluster directly within ML JupyterLab. Useful as a starting point for deeper diagnostics. |
| Jobs | Displays a list of submitted Ray jobs along with their status (running, succeeded, failed), runtime environment, and timestamps. | Track job execution in real time. Helps users debug failed jobs or confirm successful task completion without leaving JupyterLab. |
| Serve | Shows the status and configuration of Ray Serve deployments, including endpoints, replica counts, and health. | Monitor deployed machine learning models or APIs, check routing logic, and scale deployments to meet demand. Ideal for users serving models via Ray Serve. |
| Cluster | Provides details about each node in the Ray cluster: available resources, current usage, and node status. | View how resources (CPU, memory, GPU) are distributed across nodes. Useful for optimizing workload placement or identifying underperforming nodes. |
| Actors | Lists all active and historical Ray actors, showing their state, creation tasks, resource usage, and ownership. | Useful for debugging stateful components, such as streaming pipelines or long-lived agents, especially if something gets stuck or fails silently. |
| Metrics | Presents system- and application-level metrics, including CPU usage, memory, GPU, and custom user-defined metrics. | Visualize performance trends over time. Helps with performance tuning and detecting memory leaks or CPU bottlenecks. |
| Logs | Aggregates and displays logs from Ray drivers, workers, and components in a searchable and filterable view. | Essential for debugging errors, investigating crashes, or understanding unexpected behavior during execution. Users can check logs directly from JupyterLab. |
Script Server - Execute a command on all workers
ML JupyterLab provides a built-in Script Server that allows you to interact directly with all compute nodes in your job via preconfigured scripts or arbitrary commands. This feature streamlines cluster operations such as job submission, resource monitoring, file transfer, package installation, and environment management.
The integrated Script Server lets you communicate with the cluster's nodes with no terminal access or complex commands required. With just a few clicks, you can:
Run parallel shell commands across nodes (via PDSH)
Restart and manage distributed systems like Ray clusters
Monitor or control cluster behavior in real time
How to Use the Script Server
1. Access the Script Server Panel:
Open the Homepage: This is the interface where applications and services are available to launch.
Click the Script Server icon: Clicking the icon should automatically open the Script Server in a new tab of the web browser.
2. Select a Script to Run
Choose either "PDSH" or "Restart Ray Cluster" from the list in the left sidebar to execute your desired operation.
PDSH: to run any shell command simultaneously on multiple compute nodes. By default, the head node directory “/scratch” is mounted to all worker nodes.
Restart Ray Cluster: to restart all Ray processes across the cluster. Useful if the cluster becomes unresponsive or needs to be reinitialized after job failures.
Example Use Case - Installing additional packages on all workers using PDSH
# Install additional packages on all workers using PDSH
uv pip install --system 'accelerate>=0.11' 'skorch'
Resources
To create a support ticket if there are technical issues:
Go to the Help header (same section where Projects and Tools are) inside the platform
Select “Contact Support”
Fill in the Subject and Message to submit a support ticket.