Academy Documentation

Running a JupyterLab Notebook

Last updated 4 months ago

Before you begin, review the overview documentation and log onto the DNAnexus Platform.

Best Practices

  1. Use a DNAnexus JupyterLab notebook so that your work is saved onto the platform easily.

    1. When you open your JupyterLab session, select the option to start a DNAnexus JupyterLab notebook.

    2. You can do that by selecting either of these 2 options:

  2. Remember to save your notebooks and anything you want to export out of the notebook.

  3. You can view the data in your notebook space and the data in your DNAnexus project space separately here:

Helpful Tips

  1. All notebooks that are stored in the project have DX before the name. Example below:

  2. When you run code blocks, remember that JupyterLab lets you run them out of order. Pay attention to the numbers beside the code blocks, which show the order in which they were run. These are highlighted in gold below:

  3. If you write primarily in Python or R, you can put %%bash at the top of a code block to "switch" that block to bash scripting.

  4. Notebook locking: only one user at a time can edit a notebook from project storage. While a user is editing a notebook, it is locked and others cannot edit it.

    1. Once the notebook is saved and its kernel is shut down in JupyterLab, others can access it.

    2. To unlock it, close the notebook and then use the screenshot below to confirm that the kernel has been shut down.
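
The tip about running code blocks out of order can be illustrated with a small sketch. It models each notebook cell as a function that reads and writes one shared namespace, in the same way that cells share the state of a single kernel. The cell functions and the `run_cells` helper are hypothetical names used only for this illustration.

```python
# Hypothetical illustration: each "cell" mutates a shared namespace,
# just as notebook cells share the state of one kernel.

def cell_load(ns):
    """Cell A: define the starting data."""
    ns["counts"] = [1, 2, 3]

def cell_double(ns):
    """Cell B: transform the data; depends on Cell A having run first."""
    ns["counts"] = [c * 2 for c in ns["counts"]]

def run_cells(cells):
    """Run the cells in the given order against a fresh namespace."""
    ns = {}
    for cell in cells:
        cell(ns)
    return ns

# Running top to bottom gives the expected result.
in_order = run_cells([cell_load, cell_double])

# Re-running Cell B a second time silently changes the data again.
rerun = run_cells([cell_load, cell_double, cell_double])

# Running Cell B before Cell A fails because the data does not exist yet.
try:
    run_cells([cell_double, cell_load])
    out_of_order_error = None
except KeyError as exc:
    out_of_order_error = exc

print(in_order["counts"])                 # [2, 4, 6]
print(rerun["counts"])                    # [4, 8, 12]
print(type(out_of_order_error).__name__)  # KeyError
```

In a real notebook the last case would surface as a NameError rather than a KeyError, but the cause is the same: a block ran before the block it depends on.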

Setting Up/ Running a Notebook

  1. Download or access data files in the JupyterLab environment

    1. To download:

      dx download "PATH"
    2. To access data in the JupyterLab environment:

      1. This access is read-only.

      2. It does not reflect recent changes in the file system.

      3. To read from Project storage, add /mnt/project/ to the front of your path:

        data = pd.read_csv("/mnt/project/PATH.csv")

  2. Import the data

    import pandas as pd
    NAME = pd.read_csv("PATH.csv")
  3. Do analysis

    1. This is where you add the code chunks that you need for the rest of your analysis.

  4. Upload Results back to Project Space

    %%bash
    dx upload FILE --destination users/YOUR_ID/
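
The steps above can be sketched in Python as a minimal example, assuming the project is mounted at /mnt/project and the dx command is available in the session. The helper names (`mounted_path`, `load_table`, `upload_command`, `upload`) and the `results.csv` file name are hypothetical and not part of the platform.

```python
import subprocess
from pathlib import PurePosixPath

# Read-only project mount described in step 1.
PROJECT_MOUNT = PurePosixPath("/mnt/project")

def mounted_path(project_relative_path):
    """Add /mnt/project/ to the front of a project-relative path."""
    return str(PROJECT_MOUNT / project_relative_path.lstrip("/"))

def load_table(project_relative_path):
    """Read a CSV from the mounted project (requires pandas)."""
    import pandas as pd  # imported here so the path helpers work without pandas
    return pd.read_csv(mounted_path(project_relative_path))

def upload_command(local_file, destination):
    """Build the same dx upload command shown in step 4."""
    return ["dx", "upload", local_file, "--destination", destination]

def upload(local_file, destination):
    """Run the upload; raises CalledProcessError if dx exits with an error."""
    subprocess.run(upload_command(local_file, destination), check=True)

print(mounted_path("PATH.csv"))
print(upload_command("results.csv", "users/YOUR_ID/"))
```

Inside a notebook, the pieces would be used in order: `df = load_table("PATH.csv")`, then the analysis, then `df.to_csv("results.csv")` followed by `upload("results.csv", "users/YOUR_ID/")`. Because the mount is read-only, results are written to the local notebook space first and then uploaded.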

Opening Notebooks from Project Storage

  • Notebooks can also be opened directly from project storage.

  • When you save in JupyterLab, the notebook is uploaded to the platform as a new file. This goes back to the concept of immutability: an existing file is not overwritten.

  • The old version of the notebook goes into the .Notebook_archive/ folder in the project.

Installing Software Packages

  • Install packages as usual with package managers such as pip install (Python) or install.packages() (R).

Best Practices

  1. Use the correct base image (Spark or Jupyter)

  2. Install all software using a separate Jupyter Notebook

  3. Use version tags when possible

    1. pip install <PACKAGE>==<VERSION>

  4. R: Install from CRAN URL

    1. install.packages(packageurl, repos=NULL, type="source")

  5. Rename the snapshot image and move it out of the Notebook_Snapshot/ folder
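
The version-tag practice above can also be applied from Python inside the setup notebook. The following is a minimal sketch, assuming pip is available to the notebook kernel. The helper names are hypothetical, and the package name and version in the usage note are placeholders.

```python
import subprocess
import sys
from importlib import metadata

def pinned_requirement(package, version):
    """Format a requirement as <PACKAGE>==<VERSION>."""
    return f"{package}=={version}"

def pip_install_command(package, version):
    """Build a pip command that targets the kernel's own Python environment."""
    return [sys.executable, "-m", "pip", "install",
            pinned_requirement(package, version)]

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def ensure_pinned(package, version):
    """Install the pinned version only if a different one (or none) is present."""
    if installed_version(package) != version:
        subprocess.run(pip_install_command(package, version), check=True)
    return installed_version(package)
```

For example, `ensure_pinned("<PACKAGE>", "<VERSION>")` would install the requested version and return what is installed afterwards. The comparison is a plain string match, so "1.0" and "1.0.0" are treated as different versions.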

Image Snapshots

To Create a Snapshot:

To Use a Snapshot on a New Notebook

Snapshot Best Practices

  1. Don't save data in your snapshot: it uses storage space and increases costs.

  2. Snapshots can be large and use storage space, so create them only when you need them.

  3. Rename the snapshot according to your organization's naming conventions so that you can remember what it refers to when you return to the project in the future.
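
One way to act on the first two points is to look for large files before creating a snapshot. The sketch below is a hypothetical helper, not a platform feature. Which directories end up in a snapshot is an assumption to verify against the platform documentation, and the 100 MB threshold is arbitrary.

```python
import os

def files_larger_than(root, min_bytes):
    """Return (path, size) pairs under root of at least min_bytes, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so link targets are not counted
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file disappeared or is unreadable
            if size >= min_bytes:
                found.append((path, size))
    return sorted(found, key=lambda item: item[1], reverse=True)

# Assumed threshold: report anything of 100 MB or more in the current directory.
for path, size in files_larger_than(".", 100 * 1024 * 1024):
    print(f"{size / 1024 ** 2:.1f} MB  {path}")
```

Any data files it reports can be uploaded to the project and removed from the notebook space before the snapshot is taken.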

Supplemental Information

Resources

To create a support ticket if there are technical issues:

  1. Go to the Help header (same section where Projects and Tools are) inside the platform

  2. Select "Contact Support"

  3. Fill in the Subject and Message to submit a support ticket.

A number of packages are preinstalled, based on the instance type. See the List of Preinstalled Packages.

Example package URL for the R install.packages(packageurl, repos=NULL, type="source") command:

packageurl <- "http://cran.r-project.org/src/contrib/Archive/ggplot2/ggplot2_0.9.1.tar.gz"

Related documentation:

  • Running JupyterLabs with Papermill

  • dx extract dataset

  • Spark JupyterLab

  • DXJupyterLab Reference

  • Using DXJupyterLab

  • Full Documentation

  • DNAnexus Platform