In this example, you will translate the bash app from the previous chapter into Workflow Definition Language (WDL).
You will learn how to:
Use Java JAR files to validate and compile WDL
Use WDL to define an applet's inputs, outputs, and runtime specs
Compile a WDL task into an applet
Getting Started
You will not use a wizard to start this applet, so manually create a directory for your work. Create a file called fastq_trimmer.wdl with the following contents:
The input block defines the same inputs as the bash applet: a File called input_file and an Int (integer) called quality_score with a default value of 30.
A declaration outside the input block defines a variable called basename, which uses the basename function to get the filename of the input file.
The command block is executed at runtime. It uses the tilde/twiddle syntax (~{}) to dereference variables. The output is written to a file named after the basename of the input.
The output block defines a single File called output_file.
The runtime block specifies a Docker image from Biocontainers that contains the FASTX toolkit binaries.
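As a minimal sketch of a file matching the description above, the following shell snippet writes fastq_trimmer.wdl into the current directory. The input names, the default of 30, the basename variable, and the output name come from this chapter. The exact fastq_quality_trimmer flags, the .trimmed.fastq suffix, and the Docker image tag are assumptions for illustration, so substitute the values from your own bash applet.

```shell
# Run this inside the directory you created for this chapter.
# The quoted 'EOF' keeps the shell from expanding anything inside the WDL text.
cat > fastq_trimmer.wdl <<'EOF'
version 1.0

task fastq_trimmer {
    input {
        File input_file
        Int quality_score = 30
    }

    # Filename of the input without its directory path
    String basename = basename(input_file)

    # Assumed command line; adjust the flags to match your bash applet
    command <<<
        fastq_quality_trimmer -Q 33 -t ~{quality_score} -i ~{input_file} -o ~{basename}.trimmed.fastq
    >>>

    output {
        File output_file = "~{basename}.trimmed.fastq"
    }

    # Assumed Biocontainers image tag for the FASTX toolkit
    runtime {
        docker: "biocontainers/fastxtools:v0.0.14_cv2"
    }
}
EOF
```

Writing the file through a quoted heredoc is only a convenience for copy and paste. You can just as well create the file in a text editor.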
Checking and Compiling the WDL
To start, validate your WDL with WOMtool:
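The validation step might look like the following sketch. The JAR location is a hypothetical placeholder (WOMtool is distributed as a versioned JAR file, so point WOMTOOL_JAR at the file you actually downloaded), and validate_wdl is a helper name introduced here for illustration.

```shell
# Hypothetical location of the WOMtool JAR; override by exporting WOMTOOL_JAR.
WOMTOOL_JAR="${WOMTOOL_JAR:-$HOME/womtool.jar}"

# Validate one WDL file. WOMtool exits with a non-zero status and
# prints the problem when the file does not parse.
validate_wdl() {
    java -jar "$WOMTOOL_JAR" validate "$1"
}
```

With the function defined, run validate_wdl fastq_trimmer.wdl from the directory containing the file.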
Before compiling the WDL into an applet, use dx pwd to ensure you are in your desired project. If not, run dx select to select a different project, then use the following command to compile the applet:
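A sketch of the compile step follows, assuming the compiler is the dxCompiler JAR and that you pass the destination with its -project and -folder options. The JAR path and the compile_wdl helper are hypothetical names used only for illustration.

```shell
# Hypothetical location of the dxCompiler JAR; override by exporting DXCOMPILER_JAR.
DXCOMPILER_JAR="${DXCOMPILER_JAR:-$HOME/dxCompiler.jar}"

# Compile a WDL file into an applet.
#   $1 = WDL file
#   $2 = destination project ID (check the current one with: dx pwd)
#   $3 = destination folder inside that project, for example /
compile_wdl() {
    java -jar "$DXCOMPILER_JAR" compile "$1" -project "$2" -folder "$3"
}
```

For example, compile_wdl fastq_trimmer.wdl project-GJ2k24j0vx804FPyBbxqpQBk / would target the root folder of the project used in this chapter's job output. Replace the project ID with your own.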
As in the previous chapter, use dx run with the -h|--help option to verify that the usage looks identical to the bash version:
From the perspective of the user, there is no difference between native/bash applets and those written in WDL. You should use whichever syntax you find most convenient for the task at hand. For instance, this applet leverages an existing Docker container created by the Biocontainers Community rather than adding the binary as a resource.
You can run the applet using the command-line arguments as shown, or you can create a JSON file with the arguments as follows:
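The snippet below writes such a file as inputs.json, using the same values that appear in this chapter's job output. The file ID belongs to the example project, so replace it with the ID of your own FASTQ file.

```shell
# The quoted 'EOF' matters here: without it the shell would try to
# expand $dnanexus_link as a variable and corrupt the JSON key.
cat > inputs.json <<'EOF'
{
    "input_file": {
        "$dnanexus_link": "file-GJ2k2V80vx88z3zyJbVXZj3G"
    },
    "quality_score": 35
}
EOF
```

The $dnanexus_link wrapper tells the platform that the value is a reference to a file object rather than a plain string.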
You can run the applet and watch the job with the following command:
The output will look quite different from the bash app, but the basics are still the same. In this version, notice that you do not need to download the inputs or upload the outputs. Once the input files are in place, the command block is run and the input files and variables are dereferenced properly. When the job has completed, run dx describe to see the inputs and outputs:
Download the output file to ensure it looks like a correct result:
Documentation with Makefiles
You may find it useful to create a Makefile with all the steps documented in a runnable fashion:
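One possible Makefile is sketched below, generated with printf so that every recipe line starts with the tab character that make requires. The JAR paths, the project and applet placeholders, and the -project and -folder options are assumptions. The run target reuses the dx run command shown in this chapter.

```shell
# Write a Makefile with validate, compile, and run targets.
# Single quotes keep $(...) literal so that make, not the shell, expands it.
{
    printf 'WDL = fastq_trimmer.wdl\n'
    printf 'WOMTOOL = $(HOME)/womtool.jar\n'        # hypothetical path
    printf 'DXCOMPILER = $(HOME)/dxCompiler.jar\n'  # hypothetical path
    printf 'PROJECT = project-xxxx\n'               # placeholder project ID
    printf 'APPLET = applet-xxxx\n\n'               # placeholder applet ID
    printf '.PHONY: validate compile run\n\n'
    printf 'validate:\n'
    printf '\tjava -jar $(WOMTOOL) validate $(WDL)\n\n'
    printf 'compile: validate\n'
    printf '\tjava -jar $(DXCOMPILER) compile $(WDL) -project $(PROJECT) -folder /\n\n'
    printf 'run:\n'
    printf '\tdx run $(APPLET) -f inputs.json -y --watch\n'
} > Makefile
```

Because compile depends on validate, make compile checks the WDL first and stops before compiling if validation fails. Variables can be overridden per invocation, as in make compile PROJECT=project-GJ2k24j0vx804FPyBbxqpQBk.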
Now you can run make compile instead of typing out the long Java command.
Review
The WDL version of the FastQTrimmer applet is arguably simpler than the bash version. It uses just one file, fastq_trimmer.wdl, and about 20 lines of text, whereas the bash version requires at least dxapp.json, a bash script, and the resources tarball.
In this chapter, you learned how to:
Use a Biocontainers Docker image for the necessary binary executables from FASTX toolkit
Define the same inputs, outputs, and commands as the bash applet from Chapter 3
Use a Makefile to define project shortcuts to validate, compile, and run an applet
$ dx run applet-GJ2pgv80vx84zJ4XJF6GPXz7 -f inputs.json -y --watch
Using input JSON:
{
"input_file": {
"$dnanexus_link": "file-GJ2k2V80vx88z3zyJbVXZj3G"
},
"quality_score": 35
}
Calling applet-GJ2pgv80vx84zJ4XJF6GPXz7 with output destination
project-GJ2k24j0vx804FPyBbxqpQBk:/
Job ID: job-GJ2ppvQ0vx88k8bv9pvGyjGX
Job Log
-------
Watching job job-GJ2ppvQ0vx88k8bv9pvGyjGX. Press Ctrl+C to stop watching.
$ dx describe job-GJ2ppvQ0vx88k8bv9pvGyjGX
Result 1:
ID                  job-GJ2ppvQ0vx88k8bv9pvGyjGX
Class               job
Job name            fastq_trimmer
Executable name     fastq_trimmer
Project context     project-GJ2k24j0vx804FPyBbxqpQBk
Region              aws:us-east-1
Billed to           org-sos
Workspace           container-GJ2ppx80773k09b8F6qKGJBb
Applet              applet-GJ2pgv80vx84zJ4XJF6GPXz7
Instance Type       mem1_ssd1_v2_x2
Priority            high
State               done
Root execution      job-GJ2ppvQ0vx88k8bv9pvGyjGX
Origin job          job-GJ2ppvQ0vx88k8bv9pvGyjGX
Parent job          -
Function            main
Input               input_file = file-GJ2k2V80vx88z3zyJbVXZj3G
                    quality_score = 35
Output              output_file = file-GJ2pv300773ypy03Jg2vYZ9f
...