Getting Started in ML JupyterLab
ML JupyterLab is an app in the AI/ML Accelerator package. A license is required to use the AI/ML Accelerator package. For more information, please contact DNAnexus Sales.
ML JupyterLab is a purpose-built JupyterLab instance on the DNAnexus Platform. It inherits all the capabilities of a standard JupyterLab, plus specialized features for AI/ML development. This section gives you a quick start on launching ML JupyterLab.
Find ML JupyterLab in your Tools Library: open your Tools Library and search for “AI/ML Accelerator - ML JupyterLab”. If you cannot find it, you may need to obtain a license.
Set the Required Inputs
Docker Image:
When you launch ML JupyterLab, it will ask you to pick a Docker image from a list of prebuilt environments tailored to AI/ML development. Pick the standard option, buildspec-1.0/ray-2.32.0-py310-cpu, or one of the other options if you need a GPU-ready setup.
Currently, we support Ray version 2.32 as a distributed engine.
Available Docker images:
General Python 3.10 with CPU Support:
buildspec-1.0/ray-2.32.0-py310-cpu
Pytorch with GPU Support:
buildspec-1.0/ray-2.32.0-py310-gpu-pytorch
TensorFlow with GPU Support:
buildspec-1.0/ray-2.32.0-py310-gpu-tensorflow
Each image is optimized for specific workloads; the packages included in each Docker image, along with their versions, are listed in the Pre-installed ML packages section.
Duration:
This parameter sets the duration (in minutes) for which your environment will remain active. The expected runtime should be specified based on how long you plan to work with the environment, the size of the dataset, or the complexity of the tasks you will be running.
For example, larger datasets or more complex computations may require a longer runtime.
If you are unsure about the duration, use the default value; you can change this parameter inside the app later.
Set the Optional Parameters
Instance Type & Initial Instance Count
This input is crucial if you want to develop AI/ML models on large datasets that need intensive computing power. ML JupyterLab has a built-in Ray cluster, and this architecture can help create a workspace with a very large number of CPUs and a large amount of RAM.
To find this input, click the instance icon in the top right corner of the input panel, which opens automatically when you launch ML JupyterLab.
By default, ML JupyterLab uses two mem2_ssd1_v2_x4 instances (input parameters: Initial Instance Count: 2, Instance Type: mem2_ssd1_v2_x4). Because the head node is dedicated to job distribution, this default provides only one worker. Therefore, by default, your ML JupyterLab has 4 cores and 15.6 GB of memory.
You can change the Instance Type or Initial Instance Count to obtain the computing power that you want. For example, to launch an ML JupyterLab with 512 cores, you can set Instance Type to mem4_ssd1_x128 and Initial Instance Count to 5.
This setting helps create compute-intensive environments that are impossible to achieve with a single node.
Note: If you are working with GPU instance types, avoid using mem1_ssd1_gpu2_x8 and mem1_ssd1_gpu2_x32.
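The sizing arithmetic above can be sketched in a few lines of Python. This is an illustrative aid only: the per-instance core and memory figures below are assumptions taken from the examples in this section (check the DNAnexus instance-type documentation for exact specifications), and the function name is hypothetical.

```python
# Sketch of Ray cluster sizing: the head node is reserved for job
# distribution, so only (count - 1) instances contribute worker resources.
# The specs below are illustrative assumptions, not authoritative figures.
INSTANCE_SPECS = {
    "mem2_ssd1_v2_x4": {"cores": 4, "memory_gb": 15.6},   # default instance
    "mem4_ssd1_x128": {"cores": 128, "memory_gb": 1952},  # assumed spec
}

def worker_resources(instance_type, initial_instance_count):
    """Total cores and memory available to workers (head node excluded)."""
    spec = INSTANCE_SPECS[instance_type]
    workers = initial_instance_count - 1
    return workers * spec["cores"], workers * spec["memory_gb"]

# Default launch: 2 x mem2_ssd1_v2_x4 -> one worker with 4 cores.
print(worker_resources("mem2_ssd1_v2_x4", 2))  # -> (4, 15.6)

# The 512-core example: 5 x mem4_ssd1_x128 -> 4 workers x 128 cores.
print(worker_resources("mem4_ssd1_x128", 5)[0])  # -> 512
```

Note that one instance is always "lost" to the head node, so to hit a target core count you need one more instance than the raw division suggests.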
Additional Requirements
This is an optional input. It takes a text file containing a list of libraries and packages to install in your environment, in addition to those already provided.
The file should be formatted as a plain text document, with each package listed on a new line. Each line can specify a package name and optionally its version.
For example:
numpy==1.21.0
pandas>=1.3.0
scikit-learn
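As a minimal sketch of how such a file is interpreted, the snippet below splits each line into a package name and an optional version specifier using only the standard library. Real pip parsing is richer (extras, environment markers, comments mid-line); this only covers the simple cases shown above, and the function name is illustrative.

```python
import re

# One requirement per line: a package name, optionally followed by a
# version specifier such as ==1.21.0 or >=1.3.0.
LINE_RE = re.compile(r"^([A-Za-z0-9._-]+)\s*([<>=!~]=?.*)?$")

def parse_requirements(text):
    """Return (name, specifier) pairs; specifier is '' when unpinned."""
    reqs = []
    for line in text.splitlines():
        line = line.split("#")[0].strip()   # drop comments and blank lines
        if not line:
            continue
        m = LINE_RE.match(line)
        if m:
            reqs.append((m.group(1), (m.group(2) or "").strip()))
    return reqs

example = """numpy==1.21.0
pandas>=1.3.0
scikit-learn
"""
print(parse_requirements(example))
# -> [('numpy', '==1.21.0'), ('pandas', '>=1.3.0'), ('scikit-learn', '')]
```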
Wheel Files to be Installed
This is an optional input. This input allows you to specify an array of wheel files (.whl) that need to be installed as part of the setup for your JupyterLab job.
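Wheel filenames encode compatibility information in their name, following the standard wheel naming convention ({distribution}-{version}[-{build}]-{python tag}-{abi tag}-{platform tag}.whl). The sketch below decomposes such a filename; the example file is hypothetical and the function name is illustrative.

```python
# Sketch: split a wheel filename into its standard components.
def parse_wheel_name(filename):
    stem = filename[:-len(".whl")]
    parts = stem.split("-")
    if len(parts) == 5:
        dist, version, py_tag, abi_tag, plat_tag = parts
    elif len(parts) == 6:                      # optional build tag present
        dist, version, _build, py_tag, abi_tag, plat_tag = parts
    else:
        raise ValueError(f"unexpected wheel name: {filename}")
    return {"distribution": dist, "version": version,
            "python": py_tag, "abi": abi_tag, "platform": plat_tag}

# A pure-Python wheel installable on any platform (hypothetical package):
info = parse_wheel_name("mypkg-1.0.0-py3-none-any.whl")
print(info["python"], info["platform"])  # -> py3 any
```

Checking the python and platform tags of a wheel against your chosen Docker image (Python 3.10, Linux) before uploading can save a failed environment setup.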
Opening the Worker URL: Once your ML JupyterLab is launched, you will be redirected to the Monitor screen. From there, click on the Open button.
Use the Open button in the Worker URL to use ML JupyterLab
Even when the Job State is “Running”, it might take a few more minutes for the Platform to set up ML JupyterLab. If the job is not ready yet, you will see the screen below. In that case, simply reload your browser after a few minutes.
The waiting screen of ML JupyterLab when the instance is not ready
dx run app-ml_jupyterlab_ray_cluster \
-icluster_image='buildspec-1.0/ray-2.32.0-py310-cpu' \
--name='My first ML-JupyterLab'
Once the Job State is Running, you can get the Worker URL with:
dx describe job-xxxx --json | jq -r .httpsApp.dns.url
To create a support ticket if there are technical issues:
Go to the Help menu (in the same header section as Projects and Tools) on the Platform
Select “Contact Support”
Fill in the Subject and Message to submit a support ticket.
You can customize this list based on the specific packages you need for your project. The system will automatically resolve dependencies when installing these libraries. This format follows the standard pip requirements file format.
Wheel files (.whl) are pre-built Python package distributions, enabling faster and more reliable installations compared to source distributions.
You can also launch your job from the command line using the dx-toolkit, as shown in the dx run example above.