Launching a ML JupyterLab Job

ML JupyterLab is an app in the AI/ML Accelerator package. A license is required to use the AI/ML Accelerator package. For more information, please contact DNAnexus Sales via sales@dnanexus.com.

ML JupyterLab is essentially a purpose-built JupyterLab instance on the DNAnexus Platform. It inherits all the capabilities of a standard JupyterLab, plus specialized features for AI/ML development. This section gives you a quick start on launching ML JupyterLab.

To Launch with GUI

  1. Find ML JupyterLab in your Tools Library: Simply open your Tools Library and search for “AI/ML Accelerator - ML JupyterLab”. If you cannot find it, you may need to obtain a license.

  2. Set the Required Inputs

  • Docker Image:

    • When you launch ML JupyterLab, it will ask you to pick a Docker image from a list. These are prebuilt environments tailored to AI/ML development. You can pick the standard option, buildspec-1.0/ray-2.32.0-py310-cpu, or another option if you need a GPU-enabled setup.

    • Currently, we support Ray version 2.32 as the distributed engine.

      • Available Docker images:

        • General Python 3.10 with CPU Support:

          • buildspec-1.0/ray-2.32.0-py310-cpu

        • PyTorch with GPU Support:

          • buildspec-1.0/ray-2.32.0-py310-gpu-pytorch

        • TensorFlow with GPU Support:

          • buildspec-1.0/ray-2.32.0-py310-gpu-tensorflow

      • Each image is optimized for specific workloads; the packages included in each Docker image, and their versions, are listed in the Pre-installed ML Packages section.

  • Duration:

    • This parameter sets the duration (in minutes) for which your environment will remain active. The expected runtime should be specified based on how long you plan to work with the environment, the size of the dataset, or the complexity of the tasks you will be running.

    • For example, larger datasets or more complex computations may require a longer runtime.

    • If you are unsure about the duration, use the default value and you can change this parameter inside the app later.
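For reference, the duration can also be set when launching from the command line (covered in To Launch with CLI below). The sketch here is a minimal, hedged example: it assumes the input is named duration, as the GUI label suggests; confirm the actual input name with dx run app-ml_jupyterlab_ray_cluster -h.

# 'duration' is an assumed input name; verify with `dx run app-ml_jupyterlab_ray_cluster -h`
dx run app-ml_jupyterlab_ray_cluster \
  -icluster_image='buildspec-1.0/ray-2.32.0-py310-cpu' \
  -iduration=240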

  3. Set the Optional Parameters

  • Instance Type & Initial Instance Count

    • This input is crucial if you want to develop AI/ML models on large datasets that require intensive computing power. ML JupyterLab has a built-in Ray cluster, an architecture that can provide a single workspace with a very large number of CPUs and a large amount of RAM.

    • To find this input, click the instance icon in the top right corner of the input panel, which opens automatically when you launch ML JupyterLab.

    • By default, ML JupyterLab uses two mem2_ssd1_v2_x4 instances (Input Parameters: Initial Instance Count: 2, Instance Type: mem2_ssd1_v2_x4). Because the head node is dedicated to job distribution, this default provides only one worker; as a result, your ML JupyterLab has 15.6 GB of memory and 4 cores by default.

    • You can change the Instance Type or Initial Instance Count to obtain the computing power that you want. For example, to launch an ML JupyterLab with 512 worker cores, set Instance Type to mem4_ssd1_x128 and Initial Instance Count to 5: five 128-core instances, minus the head node, leave four workers totaling 512 cores.

    • This setting helps create computing-intensive environments that are impossible to achieve with a single node.

    • Note: If you are working with GPU instance types, avoid using mem1_ssd1_gpu2_x8 and mem1_ssd1_gpu2_x32.

  • Additional Requirements

    • This is an optional input. It takes a text file containing a list of libraries and packages to install in your environment, in addition to the ones that are already provided (see the upload sketch after this list).

    • The file should be formatted as a plain text document, with each package listed on a new line. Each line can specify a package name and, optionally, its version.

    • For example:

      numpy==1.21.0
      pandas>=1.3.0
      scikit-learn

    • You can customize this list based on the specific packages you need for your project. The system automatically resolves dependencies when installing these libraries. This format follows the PIP v24.2 standard.

  • Wheel Files to be Installed

    • This is an optional input. It allows you to specify an array of wheel files (.whl) to be installed as part of the setup for your JupyterLab job. Wheel files are pre-built Python package distributions, enabling faster and more reliable installations compared to source distributions.
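As a hedged illustration of preparing the Additional Requirements input from the command line, the sketch below creates the requirements file described above and uploads it to your project with dx upload; the exact app input to pass the file under may vary by app version, so check dx run app-ml_jupyterlab_ray_cluster -h.

# Create a plain-text requirements file, one package per line
cat > additional_requirements.txt <<'EOF'
numpy==1.21.0
pandas>=1.3.0
scikit-learn
EOF

# Upload it to the current project; pass the resulting file ID to the
# app's Additional Requirements input (input name varies; see dx run -h)
dx upload additional_requirements.txt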

  4. Opening the Worker URL: Once your ML JupyterLab has launched, you will be redirected to the Monitor screen. From there, click the Open button.

  • Use the Open button next to the Worker URL to open ML JupyterLab.

  • Even when the Job State is “Running”, it might take a few more minutes for the Platform to set up ML JupyterLab. If the job is not yet ready, you will see the screen below; in that case, simply reload your browser after a few minutes.

The waiting screen of ML JupyterLab when the instance is not ready

To Launch with CLI

You can also launch your job with dxtoolkit:

dx run app-ml_jupyterlab_ray_cluster \
  -icluster_image='buildspec-1.0/ray-2.32.0-py310-cpu' \
  --name='My first ML-JupyterLab'
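If you want to mirror the GUI's optional instance settings from the CLI, a minimal sketch is shown below. It assumes the app accepts the standard dx-toolkit cluster flags --instance-type and --instance-count; verify the exact options with dx run app-ml_jupyterlab_ray_cluster -h.

# Assumed flags; confirm with `dx run app-ml_jupyterlab_ray_cluster -h`
dx run app-ml_jupyterlab_ray_cluster \
  -icluster_image='buildspec-1.0/ray-2.32.0-py310-gpu-pytorch' \
  --instance-type mem4_ssd1_x128 \
  --instance-count 5 \
  --name='ML-JupyterLab with 512 worker cores'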

Once the Job State is Running, you can get the Worker URL with:

dx describe job-xxxx --json | jq -r .httpsApp.dns.url
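If you prefer not to watch the Monitor screen, a small shell sketch like the one below waits for the job to reach the running state and then prints the Worker URL. It uses only dx describe and jq, both shown above; replace job-xxxx with your job ID.

# Wait until the job is running, then print the Worker URL
while [ "$(dx describe job-xxxx --json | jq -r .state)" != "running" ]; do
  sleep 30
done
dx describe job-xxxx --json | jq -r .httpsApp.dns.url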

Resources

To create a support ticket if there are technical issues:

  1. Go to the Help header (in the same section as Projects and Tools) inside the Platform.

  2. Select “Contact Support”.

  3. Fill in the Subject and Message to submit a support ticket.

For complete details about the app, see the Full Documentation.
