Getting Started with MLflow


AI/ML Accelerator - MLflow is built specifically to track your ML experiments on the DNAnexus Platform from within the ML JupyterLab environment (another app in the AI/ML Accelerator package). A license is required to use the AI/ML Accelerator package. For more information, please contact DNAnexus Sales via sales@dnanexus.com.

Setting Up MLflow on DNAnexus

Normally, to start tracking your ML project with MLflow, you first need to set an MLflow Tracking URI. When MLflow runs in the ML JupyterLab environment, however, the Tracking URI is set automatically; you only need to define an MLflow Experiment and start logging your models.
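If you want to confirm that the Tracking URI has already been set in your session, a quick optional check (shown here for illustration, not a required step) is:

import mlflow

print(mlflow.get_tracking_uri())  # shows the URI that was set automatically for this session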

As another use case, if you already have ML models stored in the DNAnexus project (i.e. in the ./mlflow folder) and you just want to access the GUI of the MLflow Tracking Server, we recommend running an MLflow job. See the Using MLflow Tracking Server section for more details.

Defining an MLflow Experiment

To organize distinct runs of a specific project or idea, you need to create an Experiment that groups all related iterations (runs) together.

Define your MLflow Experiment with the command below. If the Experiment already exists, the same command selects it. Assigning a unique and meaningful name to the Experiment makes it easier to stay organized and simplifies locating runs in the future.

import mlflow

mlflow.set_experiment("TCGA Breast Cancer")

Logging the Models

Autologging with Popular Machine Learning Frameworks

Autologging simplifies the process of tracking machine learning experiments by automatically capturing key parameters, metrics, and artifacts during training. With support for popular frameworks like sklearn or pycaret, you can enable autologging to seamlessly log model configurations, performance metrics, and artifacts without requiring extensive manual input. This feature ensures consistent experiment tracking while reducing the effort needed to set up logging.

To start autologging, run the command below.

mlflow.autolog()
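Below is a minimal end-to-end sketch of autologging, assuming scikit-learn is available in the ML JupyterLab session; the dataset, model, and run name are illustrative only.

import mlflow
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

mlflow.set_experiment("TCGA Breast Cancer")
mlflow.autolog()  # automatically capture parameters, metrics, and the fitted model

# Illustrative dataset and model; substitute your own training code
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="rf-autolog-demo"):
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    model.score(X_test, y_test)  # evaluation metrics from score() are also captured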

Manual Logging

For advanced use cases, manual logging provides granular control over the data recorded in MLflow. Using functions like mlflow.log_params(), you can explicitly log custom parameters, while mlflow.sklearn.log_model() saves and registers trained models along with their metadata. This approach is ideal for scenarios requiring highly customized logging workflows or integrations with bespoke pipelines.
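Below is a minimal sketch of manual logging, again assuming a scikit-learn model; the parameter values, metric name, and run name are illustrative only.

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

mlflow.set_experiment("TCGA Breast Cancer")

# Illustrative dataset and model; substitute your own training code
X, y = load_breast_cancer(return_X_y=True)

with mlflow.start_run(run_name="manual-logging-demo"):
    params = {"C": 1.0, "max_iter": 1000}
    mlflow.log_params(params)                                # explicitly log custom parameters
    model = LogisticRegression(**params).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))   # explicitly log a custom metric
    mlflow.sklearn.log_model(model, "model")                 # save the trained model as a run artifact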

Resources

To create a support ticket if there are technical issues:

  1. Go to the Help header (same section where Projects and Tools are) inside the platform

  2. Select "Contact Support"

  3. Fill in the Subject and Message to submit a support ticket.
