JSON on the Platform
Be sure to install .
JavaScript Object Notation (JSON) is a data exchange format designed to be easy for humans and machines to read. You will encounter JSON in several places on the DNAnexus Platform, such as when you create and edit native applets and workflows. As shown in Figure 1, JSON is used to communicate with the DNAnexus Application Programming Interface (API). Understanding the responses from the API will help you debug applets, find failed jobs, and relaunch analyses.
Here is an example of objects nested inside other objects. It describes the output of the FastQC app, which creates two files: an HTML report and a text file containing statistics on the input FASTQ:
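As a rough illustration of that nesting, the sketch below builds a comparable structure in Python and round-trips it through JSON. The key stats_txt and both file IDs are placeholders assumed for illustration; only report_html and dnanexus_link appear in this chapter.

```python
import json

# Hypothetical job output: two keys, each holding a nested object.
# "stats_txt" and the file IDs are placeholders, not real identifiers.
fastqc_output = {
    "report_html": {"dnanexus_link": "file-xxxx"},
    "stats_txt": {"dnanexus_link": "file-yyyy"},
}

# Serialize to JSON text, then parse it back to confirm the round trip.
text = json.dumps(fastqc_output, indent=2)
parsed = json.loads(text)
print(text)
```

The outer object has two keys, and the value of each key is itself an object with a single key.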
In a later chapter, you will use a file called dxapp.json to build custom applets on DNAnexus. To see a full example from a working app, run dx get app-fastqc to download the source code for the FastQC app. This should create a fastqc directory that contains the file dxapp.json.
The following is a portion of this file, showing a typical JSON document you'll encounter on DNAnexus:
The root element of this JSON document is an object, as denoted by the curly brackets.
The value of inputSpec is a list, as denoted by the square brackets.
Each value in the list is another object.
The first three values of this object are strings.
The patterns value is a list of strings representing file globs that match the input file extensions.
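The structure described in these points can be checked in Python. The fragment below is a hypothetical stand-in with the same shape; its field values are illustrative and are not copied from the real FastQC dxapp.json.

```python
import json

# Hypothetical fragment shaped like the description above.
document = json.loads("""
{
  "inputSpec": [
    {
      "name": "reads",
      "label": "Reads",
      "class": "file",
      "patterns": ["*.fq.gz", "*.fastq.gz"]
    }
  ]
}
""")

first_input = document["inputSpec"][0]
print(type(document).__name__)               # dict (a JSON object)
print(type(document["inputSpec"]).__name__)  # list
print(type(first_input).__name__)            # dict
print(first_input["patterns"])               # a list of glob strings
```

JSON objects become Python dictionaries and JSON lists become Python lists, so the curly and square brackets map directly onto the two types.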
The following links explain the dxapp.json file in greater detail:
JSON is a strict format that is easy to get wrong if you are manually editing a file. For this reason, we suggest you use text editors that understand JSON syntax, highlight data structures, and spot common mistakes. For instance, a JSON object looks very similar to a Python dictionary, but Python allows a trailing comma in a list. Open the python3 REPL (read-evaluate-print loop) and enter the following to verify:
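A minimal sketch of such a check, using made-up values:

```python
# A trailing comma after the last element is legal in a Python list.
values = [1, 2, 3,]
print(values)       # [1, 2, 3]
print(len(values))  # 3
```

Python ignores the final comma, so the list still has three elements.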
A similar trailing comma in JSON would make the document invalid. To see this, go to JSONlint.com, paste this into the input box, and press the "Validate JSON" button:
The result should reformat the JSON onto three lines as follows:
The second line should be highlighted in red, and the "Results" below show that a JSON value is expected after the last comma and before the closing square bracket.
Remove the offending comma and revalidate the document to see the "Results" change to "Valid JSON." You may also want to install a command-line tool like jsonlint that can show similar errors:
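If no linter is at hand, Python's standard json module can perform a comparable check. The helper name check_json below is hypothetical, and the exact error wording varies between Python versions.

```python
import json

def check_json(text):
    """Return (True, None) if text is valid JSON, else (False, message)."""
    try:
        json.loads(text)
        return True, None
    except json.JSONDecodeError as err:
        # lineno and colno point at the position where parsing failed.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"

print(check_json('{"values": [1, 2, 3,]}'))  # invalid: trailing comma
print(check_json('{"values": [1, 2, 3]}'))   # valid
```

The first call reports a failure at the position of the closing square bracket, and the second returns (True, None).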
JSON is not dependent on whitespace, so the previous example could be compressed to the following:
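The same point can be demonstrated in Python. This sketch uses made-up values and shows that an indented document and its compact form parse to identical data.

```python
import json

indented = """
{
  "values": [
    1,
    2,
    3
  ]
}
"""
compact = '{"values":[1,2,3]}'

# Both strings parse to the same data structure.
print(json.loads(indented) == json.loads(compact))  # True

# separators=(",", ":") drops the optional spaces when serializing.
recompressed = json.dumps(json.loads(indented), separators=(",", ":"))
print(recompressed)  # {"values":[1,2,3]}
```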
The jq program will format JSON into an indented data structure that is easier to read. In the following example, we execute jq with the filter . to indicate we wish to see the entire document; the document itself is the last argument. Depending on your terminal, the keys may be shown in one color and the values in a different color:
The power of jq lies in the filter argument, which allows you to extract and manipulate the contents of the document. Use the filter .report_html to extract the value for key report_html that lies at the root of the document:
::: note If you request a key that does not exist, you will get the JSON value null, indicating no value is present: :::
Filters may chain keys to search further into the document structure. In the following example, we can extract the file identifier by chaining .report_html.dnanexus_link:
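For comparison, the same lookups can be sketched in Python on a parsed document. The file ID here is a placeholder, and no_such_key is a deliberately missing key.

```python
# Hypothetical parsed document; the file ID is a placeholder.
document = {"report_html": {"dnanexus_link": "file-xxxx"}}

# Like the jq filter .report_html
print(document.get("report_html"))

# A missing key yields None, Python's counterpart to JSON null.
print(document.get("no_such_key"))  # None

# Like the jq filter .report_html.dnanexus_link
link = document["report_html"]["dnanexus_link"]
print(link)  # file-xxxx
```

Using dict.get mirrors the behavior of jq, which returns null for a missing key rather than raising an error.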
Unix-type operating systems such as Linux and FreeBSD/macOS have three special filehandles:
STDIN (standard in)
STDOUT (standard out)
STDERR (standard error)
STDOUT and STDERR carry the output of programs: the first usually goes to the console, and the second is a separate channel that keeps errors apart from regular output. For instance, the STDOUT of jq can be redirected to a file using the > operator:
STDIN is an input filehandle. In the following example, a pipe (|) connects the STDOUT of one program to the STDIN of the next:
Alternatively, you can read from an input redirect using <:
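To make the three filehandles concrete, here is a minimal Python sketch of a program that reads JSON from an input stream, writes indented JSON to an output stream, and reports problems on an error stream. The function name pretty_print is hypothetical, and an in-memory stream stands in for STDIN so the example runs on its own.

```python
import io
import json
import sys

def pretty_print(instream, outstream, errstream):
    """Read JSON from instream; write indented JSON or an error message."""
    try:
        document = json.load(instream)
    except json.JSONDecodeError as err:
        # Errors go to the error stream so they stay out of regular output.
        print(f"invalid JSON: {err}", file=errstream)
        return 1
    json.dump(document, outstream, indent=2)
    outstream.write("\n")
    return 0

# Demonstration with an in-memory stream standing in for STDIN.
sample = io.StringIO('{"report_html": {"dnanexus_link": "file-xxxx"}}')
pretty_print(sample, sys.stdout, sys.stderr)
```

In a real script you would pass sys.stdin as the first argument, so that a pipe or the < operator supplies the input and the > operator captures only the regular output.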
Many dx commands can return JSON by appending the --json flag to them. For instance, dx describe app-fastqc will return a table of metadata about the FastQC app. In the following example, I will request the same data as JSON and will pipe it into the head program to see the first 10 lines:
As with previous examples, the result is a JSON document with an object at the root level; therefore, I can pipe the output into jq .id to extract the app identifier:
I can use dx find projects --public to view a list of public projects. Using head, I can see the root of the JSON is a list:
The jq filter .[] will iterate over the values of a list at the root, so I can use .[].id in the following command to extract the project identifier of each. As this returns over 100 results, I'll use head to show the first few lines:
You can also use pipes inside of the jq filter to extract the same data:
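The iteration performed by .[] corresponds to a loop over a list. In this Python sketch, the project IDs are placeholders and the records are reduced to the single id key discussed above.

```python
import json

# Hypothetical output shaped like a list of project records.
projects = json.loads("""
[
  {"id": "project-aaaa"},
  {"id": "project-bbbb"},
  {"id": "project-cccc"}
]
""")

# Like the jq filters .[].id and .[] | .id
project_ids = [project["id"] for project in projects]

# Slicing plays the role of head by keeping only the first few results.
print(project_ids[:2])  # ['project-aaaa', 'project-bbbb']
```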
You may wish to re-run an analysis, possibly with slightly different inputs. For this example, I'll use the job.json file rather than using a pipe. Redirect the output to a file:
::: note If you had access to the original job ID, you would run the following: :::
Edit the input.json file, perhaps to indicate a different kmer_size, then re-run the app using the new input:
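The edit can also be scripted. The sketch below assumes the job description stores its inputs under a key named input and that the inputs include kmer_size; the reads field, the IDs, and the numeric values are placeholders.

```python
import json

# Hypothetical job description; a real one has many more fields.
# The "input" key and the "reads" field are assumptions for illustration.
job = {
    "id": "job-xxxx",
    "input": {"reads": {"dnanexus_link": "file-xxxx"}, "kmer_size": 7},
}

# Copy the inputs so the original record is left unchanged,
# then override a single parameter.
new_input = dict(job["input"])
new_input["kmer_size"] = 5

# This text is what would be saved as the new input file.
new_input_text = json.dumps(new_input, indent=2)
print(new_input_text)
```

Writing new_input_text to input.json would give you a file ready to pass to the app in place of a hand-edited copy.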
Sometimes I find that some jobs have failed when processing large batches of data. I can use dx find jobs --state failed to return a list of failed jobs, which I might see if the input files were corrupt or were especially large, causing the instances to run out of disk space or memory. First, I'll show you how to use more advanced filtering in jq. The file jobs.json shows example output from dx find jobs --json that I'll use to extract the state of the jobs:
A select statement in jq can find the "failed" jobs, and pipes chain it to further filters that extract the job IDs and the app IDs:
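The same selection can be sketched in Python with a list comprehension. The records are made up, and the key names id, state, and executable are assumptions about the shape of each job record.

```python
# Hypothetical job records; key names are assumed for illustration.
jobs = [
    {"id": "job-aaaa", "state": "done", "executable": "app-xxxx"},
    {"id": "job-bbbb", "state": "failed", "executable": "app-xxxx"},
    {"id": "job-cccc", "state": "failed", "executable": "app-yyyy"},
]

# Like select(.state == "failed"): keep only the matching records.
failed = [job for job in jobs if job["state"] == "failed"]

# Then extract the job ID and the app ID from each failed record.
for job in failed:
    print(job["id"])
    print(job["executable"])
```

As written, each ID is printed on its own line, so a job ID and its app ID end up on consecutive lines.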
To be useful in a bash loop, I need the job and app IDs on the same line, so I can use paste for this:
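Joining alternating lines into pairs can be sketched in Python with slicing and zip. The IDs are placeholders.

```python
# Alternating lines: each job ID is followed by its app ID.
lines = ["job-bbbb", "app-xxxx", "job-cccc", "app-yyyy"]

# Even-indexed items are job IDs, odd-indexed items are app IDs.
pairs = list(zip(lines[0::2], lines[1::2]))

# Print each pair on one tab-separated line.
for job_id, app_id in pairs:
    print(job_id, app_id, sep="\t")
```

Each output line now holds one job ID and one app ID, which is the form a loop needs in order to read both values at once.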
If I had access to the original executions and input files, I could use a bash loop to re-run these jobs. Since I don't, I'll echo the command that should be run:
This produces the following output:
If you were using dx find jobs, then the equivalent would be this:
You should now be able to:
Describe how users interact with the DNAnexus Platform
Explain the purpose of using JSON on the DNAnexus platform
Articulate the basic elements of JSON
Describe and read basic JSON structures on the platform
Parse JSON responses from the platform using jq and pipes to other filters or Unix programs
Use an editor like Visual Studio Code with the JSON Crack plugin
Use JSON checking tools to make sure your JSON is well formed
Run through jq
Use dx get to get app code and dxapp.json for an existing app
To create a support ticket if there are technical issues:
Go to the Help header (same section where Projects and Tools are) inside the platform
Select "Contact Support"
Fill in the Subject and Message to submit a support ticket.