Getting Started with MLflow


AI/ML Accelerator - MLflow is built specifically to track your ML experiments on the DNAnexus Platform from within the ML JupyterLab environment (another app in the AI/ML Accelerator package). A license is required to use the AI/ML Accelerator package. For more information, please contact DNAnexus Sales via sales@dnanexus.com.

Setting Up MLflow on DNAnexus

Normally, to start tracking your ML project with MLflow, you first need to set an MLflow Tracking URI. When MLflow runs in the ML JupyterLab environment, however, the Tracking URI is set automatically; you only need to define an MLflow Experiment and start logging your models.
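If you want to confirm that the Tracking URI has already been set in your session, a quick optional check (shown here for illustration, not a required step) is:

import mlflow

print(mlflow.get_tracking_uri())  # shows the URI that was set automatically for this session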

As another use case, if you already have ML models stored in the DNAnexus project (i.e. in the ./mlflow folder) and you just want to access the GUI of the MLflow Tracking Server, we recommend running an MLflow job. See the Using MLflow Tracking Server section for more details.

Defining an MLflow Experiment

To organize distinct runs of a specific project or idea, you need to create an Experiment that groups all related iterations (runs) together.

Define your MLflow Experiment with the command below. If the Experiment already exists, the same command selects it. Assigning a unique and meaningful name to the Experiment makes it easier to stay organized and simplifies locating runs in the future.

import mlflow

mlflow.set_experiment("TCGA Breast Cancer")

Logging the Models

Autologging with Popular Machine Learning Frameworks

Autologging simplifies the process of tracking machine learning experiments by automatically capturing key parameters, metrics, and artifacts during training. With support for popular frameworks like sklearn or pycaret, you can enable autologging to seamlessly log model configurations, performance metrics, and artifacts without requiring extensive manual input. This feature ensures consistent experiment tracking while reducing the effort needed to set up logging.

To start autologging, run the command below.

mlflow.autolog()
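Below is a minimal end-to-end sketch of autologging, assuming scikit-learn is available in the ML JupyterLab session; the dataset, model, and run name are illustrative only.

import mlflow
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

mlflow.set_experiment("TCGA Breast Cancer")
mlflow.autolog()  # automatically capture parameters, metrics, and the fitted model

# Illustrative dataset and model; substitute your own training code
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="rf-autolog-demo"):
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    model.score(X_test, y_test)  # evaluation metrics from score() are also captured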

Manual Logging

For advanced use cases, manual logging provides granular control over the data recorded in MLflow. Using functions like mlflow.log_params(), you can explicitly log custom parameters, while mlflow.sklearn.log_model() saves and registers trained models along with their metadata. This approach is ideal for scenarios requiring highly customized logging workflows or integrations with bespoke pipelines.
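Below is a minimal sketch of manual logging, again assuming a scikit-learn model; the parameter values, metric name, and run name are illustrative only.

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

mlflow.set_experiment("TCGA Breast Cancer")

# Illustrative dataset and model; substitute your own training code
X, y = load_breast_cancer(return_X_y=True)

with mlflow.start_run(run_name="manual-logging-demo"):
    params = {"C": 1.0, "max_iter": 1000}
    mlflow.log_params(params)                                # explicitly log custom parameters
    model = LogisticRegression(**params).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))   # explicitly log a custom metric
    mlflow.sklearn.log_model(model, "model")                 # save the trained model as a run artifact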

Resources

To create a support ticket if there are technical issues:

  1. Go to the Help header (same section where Projects and Tools are) inside the platform

  2. Select "Contact Support"

  3. Fill in the Subject and Message to submit a support ticket.
