# Introduction to JupyterLab

### New to JupyterLab?

If you have never used JupyterLab or Jupyter notebooks before, start with these resources:

* [Jupyter Notebook Documentation](https://docs.jupyter.org/en/latest/)
* [Try Jupyter](https://docs.jupyter.org/en/latest/start/index.html)
* [Jupyter Architecture](https://docs.jupyter.org/en/latest/projects/architecture/content-architecture.html)
* [Content Community](https://docs.jupyter.org/en/latest/community/content-community.html)

### Introduction

We can interact with the platform in several ways and install software packages in each environment, depending on which tools we need and how we plan to use them. As shown in the diagram below, this section covers JupyterLab with Python/R/Stata and Spark JupyterLab with Python/R:

<figure><img src="/files/0Ze2bC83vijJV0bMDm2d" alt=""><figcaption></figcaption></figure>

### Why JupyterLab?

Data scientists’ tasks are often interactive. JupyterLab supports interactive work such as:

* Notebook-based Analysis
* Exploratory Data Analysis (EDA)
* Data Preprocessing/Cleaning
* Implementing New Machine Learning (ML) Models
* Building Workflows

### Requesting an Instance

#### Use Single DXJupyter Instance If:

* The work can be done on a single machine instance
* Main use cases:
  * Python/R
  * Image Processing
  * ML
  * Stata

#### Use Spark Cluster DXJupyter If:

* Working with very large datasets that will not fit in memory on a single instance
* Using the Cohort Browser and querying a large ingested dataset
* Needing Spark-based tools such as dxdata, Hail, or Glow
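
The criteria above can be read as a simple rule: if any of the Spark conditions applies, request a Spark cluster; otherwise a single instance is enough. The following is a minimal sketch of that rule. The `Workload` class, its field names, and the returned labels are hypothetical names introduced for illustration and are not part of the platform.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical summary of an analysis, used only for this illustration."""
    fits_in_single_instance_memory: bool
    queries_large_ingested_dataset: bool = False
    needs_spark_tools: bool = False  # e.g. dxdata, Hail, or Glow


def choose_jupyterlab_flavor(w: Workload) -> str:
    """Return 'spark_cluster' if any Spark criterion applies, else 'single_instance'."""
    needs_spark = (
        not w.fits_in_single_instance_memory
        or w.queries_large_ingested_dataset
        or w.needs_spark_tools
    )
    return "spark_cluster" if needs_spark else "single_instance"


# A small Python/R analysis that fits on one machine.
print(choose_jupyterlab_flavor(Workload(fits_in_single_instance_memory=True)))
# The data fits in memory, but the analysis relies on a Spark-based tool.
print(choose_jupyterlab_flavor(Workload(True, needs_spark_tools=True)))
```

The first call prints `single_instance` and the second prints `spark_cluster`, because needing a Spark-based tool is sufficient on its own.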

### Starting a JupyterLab Job

1. Select JupyterLab with Python, R, Stata, ML, Image Processing or JupyterLab from Spark from the Tool Library, or select “Start Analysis” from the project space and choose JupyterLab from the tool list. Then press “Run Selected”.

<figure><img src="/files/UJ4BVRa7gkXhL7la5ORD" alt=""><figcaption></figcaption></figure>

2. Select the output location, and change the job name if desired.

<figure><img src="/files/Gl26mb4461pcRL1zLJuN" alt=""><figcaption></figcaption></figure>

3. Then, select the inputs you intend to use:
   1. Snapshot file (optional; see the Utilizing Snapshot section for how to create a snapshot)
   2. Input files (optional; files can also be loaded from within the notebook)
   3. Stata settings file (a license is required for Stata)
   4. Duration (update if desired)
   5. Commands to run in the JupyterLab environment (optional)
   6. Feature. For a full list of the packages in each feature, see the Preinstalled Packages List. The options are:
      * PYTHON\_R
      * ML
      * IMAGE\_PROCESSING
      * STATA
      * MONAI\_ML

<figure><img src="/files/IiKNgUKlCK5aqiUBmf3T" alt=""><figcaption></figcaption></figure>

4. Then, press “Start Analysis” in the far right corner.

<figure><img src="/files/rjw8wthU2auU2qII0HnI" alt=""><figcaption></figcaption></figure>

5. Next, confirm the following parameters:
   1. Job Name
   2. Output Folder
   3. Priority (defaults to normal, can be set to high)
   4. Spending Limit (optional)
   5. Instance Type (change the default value if needed)

<figure><img src="/files/Xheo3ESOJDNixEqTjTue" alt=""><figcaption></figcaption></figure>

6. Then, press “Launch Analysis”.
7. When redirected to the Monitor tab, select the job name.
8. This opens the details page of the JupyterLab job. Wait for the job to start running and for the worker URL to appear.
9. Press “Open Worker URL” and the JupyterLab home page will appear.

<figure><img src="/files/x2jVzye9aGwKIlPffBav" alt=""><figcaption></figcaption></figure>

Note: If the job is still initializing when you press “Open Worker URL”, the page may show a 502 error message. This is expected, and the page will load once the job has finished initializing.
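
The 502 behavior in the note above amounts to "retry until the server stops answering 502". The sketch below illustrates that retry pattern in Python. It is an illustration only: the function names, the default attempt count, and the delay are assumptions, and the worker URL may require a logged-in browser session, in which case simply refreshing the page after a short wait achieves the same thing.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is carried on the exception.
        return err.code


def wait_until_ready(url, fetch=http_status, attempts=20, delay_s=30.0, sleep=time.sleep):
    """Poll url until it stops answering 502.

    Returns True as soon as a non-502 status is seen, or False if every
    attempt returned 502. `fetch` and `sleep` are injectable for testing.
    """
    for attempt in range(attempts):
        if fetch(url) != 502:
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # give the job more time to finish initializing
    return False
```

A call such as `wait_until_ready(worker_url)` would block until the page is reachable or the attempts run out, where `worker_url` stands for the URL shown on the job details page.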

Running instances may take several minutes to load as the allocations become available.
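
The launch parameters described above (feature, job name, output folder, priority, instance type) could also be collected in a script. The sketch below only builds a `dx run` command line and prints it; it does not launch anything. The app name `dxjupyterlab`, the input name `feature`, and the example instance type are assumptions not stated on this page, so check them against the Tool Library or `dx run <app> -h` before use.

```python
import shlex
from typing import List, Optional


def build_dx_run_command(
    feature: str,
    job_name: str,
    destination: str,
    priority: str = "normal",
    instance_type: Optional[str] = None,
    app_name: str = "dxjupyterlab",  # assumed app name; confirm in the Tool Library
) -> List[str]:
    """Assemble a dx run argument list mirroring the UI launch parameters."""
    cmd = [
        "dx", "run", app_name,
        f"-ifeature={feature}",        # assumed input name for the Feature option
        "--name", job_name,            # Job Name
        "--destination", destination,  # Output Folder
        "--priority", priority,        # defaults to normal, can be set to high
        "--yes",                       # skip the interactive confirmation
    ]
    if instance_type is not None:
        # Only override the default instance type when one is given.
        cmd += ["--instance-type", instance_type]
    return cmd


cmd = build_dx_run_command(
    feature="ML",
    job_name="my-jupyterlab",       # hypothetical job name
    destination="/jupyter_output",  # hypothetical output folder
    instance_type="mem1_ssd1_v2_x4",  # example value; choose one suited to the work
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later passed to `subprocess.run`.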


