# JSON on the Platform

Be sure to install [jq](https://jqlang.github.io/jq/).

## Background <a href="#background" id="background"></a>

[JavaScript Object Notation](https://www.json.org/json-en.html) (JSON) is a data exchange format designed to be easy for humans and machines to read. You will encounter JSON in several places on the DNAnexus platform, such as when you create and edit native applets and workflows. As shown in Figure 1, JSON is used to communicate with the DNAnexus Application Programming Interface (API). Understanding the responses from the API will help you debug applets, find failed jobs, and relaunch analyses.

![](https://1979569080-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FPtCOm9rXoRi4P9rh1ET8%2Fuploads%2Fgit-blob-a45261762c6a91119f1941d686b09207e2b3832f%2FJSON_1.jpg?alt=media)

## JSON Examples <a href="#json_examples" id="json_examples"></a>

Here is an example of objects nested inside another object. It describes the output of the FastQC app, which creates two files: an HTML report and a text file containing statistics on the input FASTQ:

```
{
   "report_html": {
       "dnanexus_link": "file-G4x7GX80VBzQy64k4jzgjqgY"
   },
   "stats_txt": {
       "dnanexus_link": "file-G4x7GXQ0VBzZxFxz4fqV120B"
   }
}
```

In a later chapter, you will use a file called *dxapp.json* to build custom applets on DNAnexus. To see a full example from a working app, run **`dx get app-fastqc`** to download the source code for the FastQC app. This should create a *fastqc* directory that contains the file *dxapp.json*.

Following is a portion of this file showing a typical JSON document you'll encounter on DNAnexus:

```
{
    "name": "fastqc",
    "title": "FastQC Reads Quality Control",
    "summary": "Generates a QC report on reads data",
    "dxapi": "1.0.0",
    "openSource": true,
    "version": "3.0.3",
    "inputSpec": [
        {
            "name": "reads",
            "label": "Reads",
            "help": "A file containing the reads to be checked. Accepted formats are gzipped-FASTQ and BAM.",
            "class": "file",
            "patterns": [
                "*.fq.gz",
                "*.fastq.gz",
                "*.sam",
                "*.bam"
            ]
        },
    ...
}
```

* The root element of this JSON document is an object, as denoted by the curly brackets.
* The value of *inputSpec* is a list, as denoted by the square brackets.
* Each value in the list is another object.
* The first four values of this object (*name*, *label*, *help*, and *class*) are strings.
* The *patterns* value is a list of strings representing file globs that match the input file extensions.
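The structure described in these bullets can also be checked programmatically. The following is a minimal Python sketch using the standard `json` module on a trimmed copy of the excerpt; the variable names `text`, `doc`, and `first_input` are illustrative and not part of any DNAnexus tooling:

```python
import json

# A trimmed copy of the dxapp.json excerpt, embedded as a string for illustration.
text = """
{
    "name": "fastqc",
    "openSource": true,
    "inputSpec": [
        {
            "name": "reads",
            "class": "file",
            "patterns": ["*.fq.gz", "*.fastq.gz", "*.sam", "*.bam"]
        }
    ]
}
"""

doc = json.loads(text)
first_input = doc["inputSpec"][0]

print(type(doc).__name__)                  # JSON object maps to a Python dict
print(type(doc["inputSpec"]).__name__)     # JSON list (array) maps to a Python list
print(type(first_input).__name__)          # each element is another object
print(type(first_input["name"]).__name__)  # JSON string maps to a Python str
print(type(doc["openSource"]).__name__)    # JSON true maps to a Python bool
print(first_input["patterns"])
```

Python is used here only because its dictionaries and lists map closely onto JSON objects and lists; any language with a JSON parser would show the same nesting.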

The following links explain the *dxapp.json* file in greater detail:

* [Third Party App Style Guide](https://documentation.dnanexus.com/developer/apps/third-party-and-community-apps/third-party-app-style-guide#dxapp.json)
* [App Metadata](https://documentation.dnanexus.com/developer/apps/app-metadata)

## Validating JSON <a href="#validating_json" id="validating_json"></a>

JSON is a strict format that is easy to get wrong if you are manually editing a file. For this reason, we suggest you use text editors that understand JSON syntax, highlight data structures, and spot common mistakes. For instance, a JSON object looks very similar to a Python dictionary, but Python allows a trailing comma in a list. Open the `python3` REPL (read-evaluate-print-loop) and enter the following to verify:

```
>>> { 'patterns': [ '*.bam', '*.sam', ] }
{'patterns': ['*.bam', '*.sam']}
```

A similar trailing comma in JSON would make the document invalid. To see this, go to JSONlint.com, paste this into the input box, and press the "Validate JSON" button:

```
{ "patterns": [ "*.bam", "*.sam", ] }
```

The result should reformat the JSON onto three lines as follows:

```
{
    "patterns": ["*.bam", "*.sam", ]
}
```

The second line should be highlighted in red, and the "Results" below show that a JSON value is expected after the last comma and before the closing square bracket.

```
Error: Parse error on line 2:
... ["*.bam", "*.sam", ]}
-----------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '[', got ']'
```

Remove the offending comma and revalidate the document to see the "Results" change to "Valid JSON." You may also want to install a command-line tool like `jsonlint` that can show similar errors:

```
$ jsonlint dxapp.json
Error: Parse error on line 15:
...*.sam",            ],            "help
----------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '[', got ']'
```
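If neither the website nor `jsonlint` is at hand, a similar check can be sketched with Python's standard `json` module. This is only an illustration: the helper name `check_json` is hypothetical, and the exact wording of the error message depends on your Python version:

```python
import json

def check_json(text):
    """Return None if text is valid JSON, otherwise a short error message."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None

bad = '{ "patterns": [ "*.bam", "*.sam", ] }'
good = '{ "patterns": [ "*.bam", "*.sam" ] }'

print(check_json(bad))   # an error message locating the problem
print(check_json(good))  # None
```

Unlike the Python literal shown earlier, the strict JSON parser rejects the trailing comma, which is the same behavior you see from JSONlint.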

## Viewing JSON <a href="#viewing_json" id="viewing_json"></a>

JSON is not dependent on whitespace, so the FastQC output example shown earlier could be compressed to the following:

```
$ cat minified.json
{"report_html":{"dnanexus_link":"file-G4x7GX80VBzQy64k4jzgjqgY"},"stats_txt":
{"dnanexus_link":"file-G4x7GXQ0VBzZxFxz4fqV120B"}}
```

The `jq` program will format JSON into an indented data structure that is easier to read. In the following example, we execute `jq` with the *filter* `.` to indicate we wish to see the entire document; the file to read is the last argument. Depending on your terminal, the keys may be shown in one color and the values in a different color:

```
$ jq . minified.json
{
  "report_html": {
    "dnanexus_link": "file-G4x7GX80VBzQy64k4jzgjqgY"
  },
  "stats_txt": {
    "dnanexus_link": "file-G4x7GXQ0VBzZxFxz4fqV120B"
  }
}
```
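For comparison, roughly the same reformatting can be sketched in Python with `json.dumps` and its `indent` parameter. The minified string below is copied from the example above; the variable names are illustrative:

```python
import json

minified = (
    '{"report_html":{"dnanexus_link":"file-G4x7GX80VBzQy64k4jzgjqgY"},'
    '"stats_txt":{"dnanexus_link":"file-G4x7GXQ0VBzZxFxz4fqV120B"}}'
)

doc = json.loads(minified)           # parse the compact text
pretty = json.dumps(doc, indent=2)   # re-serialize with two-space indentation
print(pretty)
```

The parsed data is identical in both forms; only the whitespace between tokens changes.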

The power of `jq` lies in the filter argument, which allows you to extract and manipulate the contents of the document. Use the filter *.report\_html* to extract the value for key *report\_html* that lies at the root of the document:

```
$ jq .report_html example.json
{
  "dnanexus_link": "file-G4x7GX80VBzQy64k4jzgjqgY"
}
```

::: note If you request a key that does not exist, you will get the JSON value `null`, indicating no value is present: :::

```
$ jq .report_htm example.json
null
```

Filters may chain keys to search further into the document structure. In the following example, we can extract the file identifier by chaining *.report\_html.dnanexus\_link*:

```
$ jq .report_html.dnanexus_link example.json
"file-G4x7GX80VBzQy64k4jzgjqgY"
```
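The key lookups above can be loosely mirrored in Python with nested dictionary access. The helper `dig` below is hypothetical and only approximates `jq` behavior: it returns `None` whenever a key is missing, much as `jq` prints `null` for a missing key on an object:

```python
import json

doc = json.loads("""
{
  "report_html": {"dnanexus_link": "file-G4x7GX80VBzQy64k4jzgjqgY"},
  "stats_txt": {"dnanexus_link": "file-G4x7GXQ0VBzZxFxz4fqV120B"}
}
""")

def dig(value, *keys):
    """Follow a chain of keys, returning None as soon as one is missing."""
    for key in keys:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

print(dig(doc, "report_html"))                   # the nested object
print(dig(doc, "report_htm"))                    # None, because the key is misspelled
print(dig(doc, "report_html", "dnanexus_link"))  # the file identifier
```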

## Reading from Unix Pipes <a href="#reading_from_unix_pipes" id="reading_from_unix_pipes"></a>

Unix-type operating systems such as Linux and FreeBSD/macOS have three special filehandles:

* STDIN (standard in)
* STDOUT (standard out)
* STDERR (standard error)

STDOUT and STDERR carry the output of a program. STDOUT is the regular output, which usually goes to the console, while STDERR is a separate channel that keeps error messages apart from regular output. For instance, the STDOUT of `jq` can be redirected to a file using the `>` operator:

```
$ jq . minified.json > prettified.json
$ cat prettified.json
{
  "report_html": {
    "dnanexus_link": "file-G4x7GX80VBzQy64k4jzgjqgY"
  },
  "stats_txt": {
    "dnanexus_link": "file-G4x7GXQ0VBzZxFxz4fqV120B"
  }
}
```

STDIN is the input filehandle. A pipe (`|`) connects the STDOUT of one program to the STDIN of the next, as in the following example:

```
$ cat minified.json | jq .
{
  "report_html": {
    "dnanexus_link": "file-G4x7GX80VBzQy64k4jzgjqgY"
  },
  "stats_txt": {
    "dnanexus_link": "file-G4x7GXQ0VBzZxFxz4fqV120B"
  }
}
```

Alternatively, you can read from an input redirect using `<`:

```
$ jq . < example.json
{
  "report_html": {
    "dnanexus_link": "file-G4x7GX80VBzQy64k4jzgjqgY"
  },
  "stats_txt": {
    "dnanexus_link": "file-G4x7GXQ0VBzZxFxz4fqV120B"
  }
}
```
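A program that reads STDIN does not care whether the data arrives through a pipe or a redirect. The following Python sketch illustrates this with a hypothetical `prettify` function that accepts any file-like object and defaults to `sys.stdin`; an in-memory stream stands in for the pipe so the example runs on its own:

```python
import io
import json
import sys

def prettify(stream=sys.stdin):
    """Read one JSON document from a file-like object and return it indented."""
    return json.dumps(json.load(stream), indent=2)

# Demonstrate with an in-memory stream so the sketch runs without a pipe.
fake_stdin = io.StringIO(
    '{"stats_txt":{"dnanexus_link":"file-G4x7GXQ0VBzZxFxz4fqV120B"}}'
)
print(prettify(fake_stdin))
```

If this were saved in a script (say, a hypothetical *pretty.py* ending with `print(prettify())`), it could be used as `cat minified.json | python3 pretty.py` or `python3 pretty.py < minified.json`.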

## Using jq For DNAnexus Responses <a href="#using_jq_for_dnanexus_responses" id="using_jq_for_dnanexus_responses"></a>

Many `dx` commands can return JSON by appending the `--json` flag to them. For instance, **`dx describe app-fastqc`** will return a table of metadata about the FastQC app. In the following example, I will request the same data as JSON and will pipe it into the `head` program to see the first 10 lines:

```
$ dx describe app-fastqc --json | head
{
    "id": "app-G81jg5j9jP7qxb310vg2xQkX",
    "class": "app",
    "billTo": "org-dnanexus_apps",
    "created": 1644399511000,
    "modified": 1644401066806,
    "createdBy": "user-jkotrs",
    "name": "fastqc",
    "version": "3.0.3",
    "aliases": [
```

As with previous examples, the result is a JSON document with an object at the root level; therefore, I can pipe the output into `jq .id` to extract the app identifier:

```
$ dx describe app-fastqc --json | jq .id
"app-G81jg5j9jP7qxb310vg2xQkX"
```

I can use **`dx find projects --public`** to view a list of public projects. Using `head`, I can see the root of the JSON is a list:

```
$ dx find projects --public --json | head
[
    {
        "id": "project-F0yyz6j9Jz8YpxQV8B8Kk7Zy",
        "level": "VIEW",
        "permissionSources": [
            "PUBLIC"
        ],
        "public": true,
        "describe": {
            "id": "project-F0yyz6j9Jz8YpxQV8B8Kk7Zy",
```

The `jq` filter `.[]` will iterate over the values of a list at the root, so I can use `.[].id` in the following command to extract the project identifier of each. As this returns over 100 results, I'll use `head` to show the first few lines:

```
$ dx find projects --public --json | jq ".[].id" | head -n 3
"project-F0yyz6j9Jz8YpxQV8B8Kk7Zy"
"project-G4FX3QXKzJxqXxGpK2pJ7Z3K"
"project-FGX8gVQB9X7K5f1pKfPvz9yG"
```

You can also use pipes inside of the `jq` filter to extract the same data:

```
$ dx find projects --public --json | jq ".[] | .id" | head -n 3
"project-F0yyz6j9Jz8YpxQV8B8Kk7Zy"
"project-G4FX3QXKzJxqXxGpK2pJ7Z3K"
"project-FGX8gVQB9X7K5f1pKfPvz9yG"
```
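The `.[].id` filter amounts to looping over a list and picking one key from each element. Here is a minimal Python sketch of the same idea, assuming an abbreviated stand-in for the command output that keeps only the `id` field of the three projects shown above:

```python
import json

# Abbreviated stand-in for the output of `dx find projects --public --json`;
# only the "id" field of each entry is kept.
projects = json.loads("""
[
  {"id": "project-F0yyz6j9Jz8YpxQV8B8Kk7Zy"},
  {"id": "project-G4FX3QXKzJxqXxGpK2pJ7Z3K"},
  {"id": "project-FGX8gVQB9X7K5f1pKfPvz9yG"}
]
""")

# Equivalent in spirit to the jq filter ".[] | .id"
project_ids = [project["id"] for project in projects]
for project_id in project_ids:
    print(project_id)
```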

## Recipes for Using jq <a href="#recipes_for_using_jq" id="recipes_for_using_jq"></a>

### Editing Job Input and Rerunning <a href="#editing_job_input_and_rerunning" id="editing_job_input_and_rerunning"></a>

You may wish to re-run an analysis, possibly with slightly different inputs. For this example, I'll read the job description from the file *job.json* rather than piping it from a `dx` command:

```
$ jq .input job.json
{
  "reads": {
    "$dnanexus_link": "file-BQbXKk80fPFj4Jbfpxb6Ffv2"
  },
  "format": "auto",
  "kmer_size": 7,
  "nogroup": true
}
```

Redirect this to a file:

```
$ jq .input job.json > input.json
```

::: note If you had access to the original job ID, you would run the following: :::

```
$ dx describe job-G4x7G5j0B3K2FKzgP654ZqpK --json | jq .input > input.json
```

Edit the *input.json* file, perhaps to indicate a different *kmer\_size*, then re-run the app using the new input:

```
$ dx run app-G4YyQ9044b90F1vG8y9YkKk3 -f input.json
```
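Editing *input.json* by hand works well for a single job. If you need to script the change, the following Python sketch shows one possible approach. The job input is embedded so the example is self-contained, the new `kmer_size` of 5 is only an illustration, and the file is written to a temporary directory rather than overwriting anything:

```python
import json
import os
import tempfile

# The job input shown above, embedded so the sketch is self-contained.
job_input = json.loads("""
{
  "reads": {"$dnanexus_link": "file-BQbXKk80fPFj4Jbfpxb6Ffv2"},
  "format": "auto",
  "kmer_size": 7,
  "nogroup": true
}
""")

# Hypothetical change: the value 5 is chosen only for illustration.
job_input["kmer_size"] = 5

# Write to a temporary directory here; in practice this would be input.json.
out_path = os.path.join(tempfile.mkdtemp(), "input.json")
with open(out_path, "w") as handle:
    json.dump(job_input, handle, indent=2)

print(out_path)
```

Because the file is produced by a JSON serializer, it cannot contain mistakes such as a trailing comma.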

### Finding Failed Jobs <a href="#finding_failed_jobs" id="finding_failed_jobs"></a>

When processing large batches of data, I sometimes find that some jobs have failed. I can use **`dx find jobs --state failed`** to return a list of failed jobs, which I might see if the input files were corrupt or were especially large, causing the instances to run out of disk space or memory. First, I'll show you how to use more advanced filtering in `jq`. The file *rap-jobs.json* contains example output from **`dx find jobs --json`** that I'll use to count the jobs in each state:

```
$ jq ".[].state" rap-jobs.json | sort | uniq -c | sort -rn
  15 "failed"
   3 "done"
   2 "terminated"
```
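The `sort | uniq -c | sort -rn` idiom is a frequency count. As a rough Python equivalent, the sketch below uses `collections.Counter` on a small, made-up list of jobs; the job IDs are placeholders and do not come from *rap-jobs.json*:

```python
import json
from collections import Counter

# Hypothetical, abbreviated stand-in for rap-jobs.json; IDs are placeholders.
jobs = json.loads("""
[
  {"id": "job-AAAA", "state": "failed"},
  {"id": "job-BBBB", "state": "done"},
  {"id": "job-CCCC", "state": "failed"},
  {"id": "job-DDDD", "state": "terminated"},
  {"id": "job-EEEE", "state": "failed"}
]
""")

state_counts = Counter(job["state"] for job in jobs)

# most_common() orders by descending count, like `sort -rn`
for state, count in state_counts.most_common():
    print(count, state)
```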

The `select` function in `jq` keeps only the jobs whose state is "failed". Piping its result into a further filter extracts the job ID and the executable ID of each:

```
$ jq '.[] | select (.state | contains("failed")) | .id, .executable' rap-jobs.json | head
"job-G6jj9k8JPXfG42094KG5JFX4"
"applet-G6jj9b0JPXf5Q6ZF4G85K156"
"job-G6jj1zQJPXf34z8v4KqjZKP1"
"applet-G6jg9p8JPXf4Q9Pb4GgPK8Vp"
"job-G6jg9vQJPXfGbJb54GFkJ33Y"
"applet-G6jg9p8JPXf4Q9Pb4GgPK8Vp"
"job-G6jg7Y0JPXfG6q53G12vQZK8"
"applet-G6jg6pQJPXf7ypXq33B75Qq1"
"job-G6jg57QJPXf90Jjv4K8pgkG7"
"applet-G6jfg90JPXfGZkVb7PPxjpPY"
```

To be useful in a `bash` loop, I need the job and app IDs on the same line, so I can use `paste` for this:

```
$ jq '.[] | select (.state | contains("failed")) | .id, .executable' rap-jobs.json | paste - -
"job-G6jj9k8JPXfG42094KG5JFX4"  "applet-G6jj9b0JPXf5Q6ZF4G85K156"
"job-G6jj1zQJPXf34z8v4KqjZKP1"  "applet-G6jg9p8JPXf4Q9Pb4GgPK8Vp"
"job-G6jg9vQJPXfGbJb54GFkJ33Y"  "applet-G6jg9p8JPXf4Q9Pb4GgPK8Vp"
"job-G6jg7Y0JPXfG6q53G12vQZK8"  "applet-G6jg6pQJPXf7ypXq33B75Qq1"
"job-G6jg57QJPXf90Jjv4K8pgkG7"  "applet-G6jfg90JPXfGZkVb7PPxjpPY"
"job-G6jZk6jJPXf1q1Py5VKX6gJK"  "applet-G6jZjG0JPXf7ZxZP4G5v0X1k"
"job-G6jYY28JPXfFvFXY4GXB6jG2"  "applet-G6jYXq0JPXf5Q6ZF4G85JVgG"
"job-G6jY9FQJPXf3pj894GFJ02jy"  "applet-G6jY7zQJPXfG42094KG5Gkyy"
"job-G6jY858JPXfBKX1X0j434BY5"  "applet-G6jY7zQJPXfG42094KG5Gkyy"
"job-G6jY740JPXf7V2vJ4G2Gkfj7"  "applet-G6jY6zQJPXf81J984K6kfB3V"
"job-G6jY5v8JPXfPGQq15k77zPJ9"  "applet-G6jY5jjJPXf6Ffqg4GqF4KPg"
"job-G6jY4k0JPXfPGQq15k77zP9Q"  "applet-G6jY39jJPXfG42094KG5GkV9"
"job-G6jXPJQJPXfBbf694G3Fg07K"  "applet-G6jXJJjJPXf7V2vJ4G2GkFbF"
"job-G6jX7yQJPXfFjzffKJzpqfj7"  "applet-G6jX7JQJPXf3V99x4Gx7K09X"
"job-G6jVzJ0JPXf5Q6ZF4G85JG09"  "applet-G6jVxQQJPXfGZ0BF33KZfX5Y"
```
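The combination of `select` and `paste` can be sketched in Python as a filtered list of pairs. The data below is a made-up stand-in with placeholder IDs. Note that this sketch tests for exact equality with `"failed"`, whereas `contains("failed")` in `jq` is a substring match:

```python
import json

# Hypothetical, abbreviated stand-in for rap-jobs.json; IDs are placeholders.
jobs = json.loads("""
[
  {"id": "job-AAAA", "executable": "applet-XXXX", "state": "failed"},
  {"id": "job-BBBB", "executable": "applet-YYYY", "state": "done"},
  {"id": "job-CCCC", "executable": "applet-XXXX", "state": "failed"}
]
""")

# Keep (job ID, executable ID) pairs for failed jobs only.
failed = [
    (job["id"], job["executable"])
    for job in jobs
    if job["state"] == "failed"
]

# One tab-separated pair per line, like the output of `paste - -`
for job_id, executable in failed:
    print(job_id, executable, sep="\t")
```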

If I had access to the original executions and input files, I could use a `bash` loop to re-run these jobs. Since I don't, I'll `echo` the command that *should* be run:

```
jq '.[] | select (.state | contains("failed")) | .id, .executable' \
rap-jobs.json | paste - - | \
while read JOB_ID APP_ID; do echo dx run $APP_ID --clone $JOB_ID; done
```

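This produces the following output. The double quotes appear because `jq` prints strings as JSON by default; adding the `-r` (`--raw-output`) option to `jq` removes them, which you would likely want before running the commands for real: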

```
dx run "applet-G6jj9b0JPXf5Q6ZF4G85K156" --clone "job-G6jj9k8JPXfG42094KG5JFX4"
dx run "applet-G6jg9p8JPXf4Q9Pb4GgPK8Vp" --clone "job-G6jj1zQJPXf34z8v4KqjZKP1"
dx run "applet-G6jg9p8JPXf4Q9Pb4GgPK8Vp" --clone "job-G6jg9vQJPXfGbJb54GFkJ33Y"
dx run "applet-G6jg6pQJPXf7ypXq33B75Qq1" --clone "job-G6jg7Y0JPXfG6q53G12vQZK8"
dx run "applet-G6jfg90JPXfGZkVb7PPxjpPY" --clone "job-G6jg57QJPXf90Jjv4K8pgkG7"
dx run "applet-G6jZjG0JPXf7ZxZP4G5v0X1k" --clone "job-G6jZk6jJPXf1q1Py5VKX6gJK"
dx run "applet-G6jYXq0JPXf5Q6ZF4G85JVgG" --clone "job-G6jYY28JPXfFvFXY4GXB6jG2"
dx run "applet-G6jY7zQJPXfG42094KG5Gkyy" --clone "job-G6jY9FQJPXf3pj894GFJ02jy"
dx run "applet-G6jY7zQJPXfG42094KG5Gkyy" --clone "job-G6jY858JPXfBKX1X0j434BY5"
dx run "applet-G6jY6zQJPXf81J984K6kfB3V" --clone "job-G6jY740JPXf7V2vJ4G2Gkfj7"
dx run "applet-G6jY5jjJPXf6Ffqg4GqF4KPg" --clone "job-G6jY5v8JPXfPGQq15k77zPJ9"
dx run "applet-G6jY39jJPXfG42094KG5GkV9" --clone "job-G6jY4k0JPXfPGQq15k77zP9Q"
dx run "applet-G6jXJJjJPXf7V2vJ4G2GkFbF" --clone "job-G6jXPJQJPXfBbf694G3Fg07K"
dx run "applet-G6jX7JQJPXf3V99x4Gx7K09X" --clone "job-G6jX7yQJPXfFjzffKJzpqfj7"
dx run "applet-G6jVxQQJPXfGZ0BF33KZfX5Y" --clone "job-G6jVzJ0JPXf5Q6ZF4G85JG09"
```
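The same dry run could be sketched in Python, building each command as an argument list so that quoting is handled explicitly. The function name `rerun_command` is hypothetical, the two ID pairs are copied from the output above, and nothing is executed:

```python
import shlex

# Two (job ID, executable ID) pairs taken from the output above.
failed = [
    ("job-G6jj9k8JPXfG42094KG5JFX4", "applet-G6jj9b0JPXf5Q6ZF4G85K156"),
    ("job-G6jj1zQJPXf34z8v4KqjZKP1", "applet-G6jg9p8JPXf4Q9Pb4GgPK8Vp"),
]

def rerun_command(job_id, executable):
    """Build the argument list for re-running a job; nothing is executed."""
    return ["dx", "run", executable, "--clone", job_id]

# shlex.join renders each argument list as a shell-safe string (Python 3.8+).
commands = [shlex.join(rerun_command(job_id, exe)) for job_id, exe in failed]
for command in commands:
    print(command)
```

Keeping the command as a list until the last moment avoids the stray quote characters that appear when quoted `jq` output is interpolated into a shell command.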

If you were using **`dx find jobs`**, then the equivalent would be this:

```
dx find jobs --state failed --json | jq '.[] | .id, .executable' | paste - - | \
while read JOB_ID APP_ID; do echo dx run $APP_ID --clone $JOB_ID; done
```

## Review <a href="#review" id="review"></a>

You should now be able to:

* Describe how users interact with the DNAnexus Platform
* Explain the purpose of using JSON on the DNAnexus platform
* Articulate the basic elements of JSON
* Describe and read basic JSON structures on the platform
* Parse JSON responses from the platform using `jq` and pipes to other filters or Unix programs

### Helpful Tips

* Learn the [dxapp.json specification](https://documentation.dnanexus.com/developer/apps/3rd-party-and-community-apps/third-party-app-style-guide#dxapp.json)
* Use an editor like Visual Studio Code with the JSON Crack plugin
* Use JSON checking tools to make sure your JSON is well formed
  * <https://jsonlint.com/>
  * Run the file through `jq`
* Use **dx get** to get app code and **dxapp.json** for an existing app

### Resources

[Full Documentation](https://documentation.dnanexus.com/)

To create a support ticket if there are technical issues:

1. Go to the Help header (same section where Projects and Tools are) inside the platform
2. Select "Contact Support"
3. Fill in the Subject and Message to submit a support ticket.
