Running a JupyterLab Notebook

Use Cases for a Single JupyterLab Instance

  • Running R, which must run in a regular (single-instance) notebook, and performing downstream analysis

  • If you are interacting directly with the database or dataset, it is recommended that you use Python, Spark, or both to extract the data relevant to the downstream analysis

General “Recipe” for Utilizing Single Instance JupyterLab Notebooks

  1. Create a DX JupyterLab Notebook so that it is automatically saved to the Trusted Research Environment. You can do so in either of two ways:

    a. Option 1 is from the Launcher:

    b. Option 2 is from the DNAnexus Tab:

  1. Start writing your JupyterLab Notebook. Select the kernel you are going to use (the options vary depending on the image you selected during setup).

  2. Download packages and save the software environment as a snapshot

    a. Download Packages

    pip install ___ #python
    install.packages() #R

    b. Save the Snapshot of the environment

  3. Start writing your code.

    a. Load Packages

    import ____ #python
    library() #R

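    b. Download data files to the JupyterLab environment, or access them in place

    %%bash
    # Option 1: dx download copies the file into the JupyterLab working directory
    dx download "PATH TO FILE"

    # Option 2: dxfuse; project files are mounted under /mnt/project and can be
    # read in place, so no download is needed. The read itself belongs in a
    # Python cell, not in this bash cell:
    #   data = pd.read_csv("/mnt/project/PATH.csv")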

    c. Import the data

    import ___ as pd  # e.g. pandas, which provides read_csv
    NAME = pd.read_csv("PATH.csv")

    d. Then, perform the analysis for your data

    e. Upload results back to Project Space

    %%bash 
    dx upload FILE --destination /your/path/for/results

  4. Save your DX JupyterLab Notebook
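
The data-access and upload steps of the recipe can also be wrapped in small Python helpers. The following is a minimal sketch, not a platform API: the function names (resolve_data_path, build_upload_command, upload_results) are hypothetical, and it assumes that project files are visible under /mnt/project and that the dx command-line client is on the PATH, as in the examples above.

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: project files are visible under this mount point, as in the
# /mnt/project/PATH.csv example in the recipe.
PROJECT_MOUNT = Path("/mnt/project")


def resolve_data_path(relative_path, local_dir=".", mount=PROJECT_MOUNT):
    """Return a readable path for a project file (hypothetical helper).

    Prefers a copy already fetched with `dx download` into local_dir and
    falls back to the same file under the project mount.
    """
    local_copy = Path(local_dir) / Path(relative_path).name
    if local_copy.is_file():
        return local_copy
    mounted = Path(mount) / str(relative_path).lstrip("/")
    if mounted.is_file():
        return mounted
    raise FileNotFoundError(
        f"{relative_path} not found in {local_dir} or under {mount}"
    )


def build_upload_command(result_file, destination):
    """Build the `dx upload FILE --destination PATH` call as an argument list."""
    return ["dx", "upload", str(result_file), "--destination", destination]


def upload_results(result_file, destination):
    """Run the upload; raises if the dx client is missing or the upload fails."""
    if shutil.which("dx") is None:
        raise RuntimeError("dx command-line client not found on PATH")
    subprocess.run(build_upload_command(result_file, destination), check=True)
```

The path returned by resolve_data_path can be passed directly to pd.read_csv, and upload_results("FILE", "/your/path/for/results") mirrors the %%bash upload cell. Building the command as a list avoids shell quoting problems when file names contain spaces.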

Opening Notebooks from Project Storage

  • Notebooks can also be directly opened from project storage

  • When you save in JupyterLab, the notebook is uploaded to the platform as a new file rather than overwriting the existing one, because files on the platform are immutable.

  • The old version of the notebook is moved to the .Notebook_archive/ folder in the project.
