Example 2: Word Count (wc)
You can write the wc applet using Workflow Description Language (WDL), which is a high-level way to define and chain tasks. You will start by defining a single task, which compiles to an applet on the DNAnexus platform.
In this example, you will:
Write the wc applet using WDL
In the bash applet, the inputs, outputs, and runtime specifications are defined in the dxapp.json file, and the code that runs lives in a separate file. WDL combines all of this into a single file. Create a new directory for your work, and then add the following to a file called wc.wdl:
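The listing itself is not reproduced in this copy. As a minimal sketch consistent with the notes in this section, the shell snippet below writes one possible wc.wdl. The version number (1.0), the input and output names (infile, outfile), and the wc.txt file name are assumptions made for illustration. The task name wc_wdl matches the applet name that appears in the project later in this chapter.

```shell
# Write a minimal WDL task to wc.wdl (a sketch, not the original listing).
# Assumptions: WDL version 1.0, the names infile/outfile, the file name wc.txt.
cat > wc.wdl <<'EOF'
version 1.0

task wc_wdl {
    input {
        File infile
    }

    command <<<
        wc ~{infile} > wc.txt
    >>>

    output {
        File outfile = "wc.txt"
    }

    runtime {
        docker: "ubuntu:20.04"
    }
}
EOF
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the WDL text, so the ~{infile} placeholder reaches the file unchanged.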
A task in WDL will compile to an applet in DNAnexus.
The command block contains the bash code that will be executed at runtime.
The output block equates to the outputSpec from the previous chapter. As with inputs, each output must declare a type.
The runtime block equates to the runSpec from the previous chapter. Here, you define that the task will use a Docker image of Ubuntu Linux 20.04.
First, ensure you have a working Java installation and have downloaded the Java JAR files as described in Chapter 1. Use WOMtool to validate the WDL syntax:
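The validation step might look like the following sketch. The jar file name womtool.jar is a placeholder, since the actual file downloaded in Chapter 1 carries a version number; validate_wdl is a hypothetical helper name introduced here only for illustration.

```shell
# Placeholder path: point WOMTOOL_JAR at the WOMtool jar from Chapter 1.
WOMTOOL_JAR="${WOMTOOL_JAR:-womtool.jar}"

# Hypothetical helper: run WOMtool's "validate" subcommand on one WDL file.
validate_wdl() {
    java -jar "$WOMTOOL_JAR" validate "$1"
}

# Run the check only when the jar is actually present at that path.
if [ -f "$WOMTOOL_JAR" ]; then
    validate_wdl wc.wdl
fi
```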
If you installed the Python miniwdl program, you can also use it to check the syntax. The output on success is something like a parse tree:
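A sketch of the miniwdl check, assuming miniwdl is on your PATH and wc.wdl is in the current directory; check_wdl is a hypothetical helper name.

```shell
# Hypothetical helper: run miniwdl's "check" subcommand on one WDL file.
check_wdl() {
    miniwdl check "$1"
}

# Run the check only when miniwdl is installed and the file exists.
if command -v miniwdl >/dev/null 2>&1 && [ -f wc.wdl ]; then
    check_wdl wc.wdl
fi
```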
To demonstrate the output on error, I'll change the word File to Fiel:
Here is the equivalent error from WOMtool:
The two tools are written in different languages (Java and Python), parse with different degrees of strictness, and report errors differently. You may find it helpful to use both to track down errors.
First, use dx pwd to check if you are in your wc project; if not, use dx select to change. Now you can use the dxCompiler jar file you downloaded in Chapter 1 to compile the WDL into an applet:
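The compile step can be sketched as below. The jar file name dxCompiler.jar is a placeholder for the versioned file from Chapter 1, compile_wdl is a hypothetical helper name, and the choice of the project's root folder as the destination is an assumption.

```shell
# Placeholder path: point DXCOMPILER_JAR at the dxCompiler jar from Chapter 1.
DXCOMPILER_JAR="${DXCOMPILER_JAR:-dxCompiler.jar}"

# Hypothetical helper: compile one WDL file into the selected project.
# Assumption: -folder / places the applet in the project's root folder.
compile_wdl() {
    java -jar "$DXCOMPILER_JAR" compile "$1" -folder /
}

# Compile only when both the jar and the WDL file are present.
if [ -f "$DXCOMPILER_JAR" ] && [ -f wc.wdl ]; then
    compile_wdl wc.wdl
fi
```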
Run the new applet from the CLI with the help flag to inspect the usage:
Whether you use bash or WDL to write an applet, the compiled result works the same for the user.
If you look in the web interface, you should see a new wc_wdl object in the project as shown in Figure 1.
Click on the applet to launch the user interface as shown in Figure 2. Select an input file and launch the applet.
As with the bash version, you can launch the applet using command-line arguments:
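A sketch of a command-line launch follows. The input name infile is an assumption (use the name shown in the applet's help output), and run_wc is a hypothetical helper that takes the ID or path of a file already on the platform.

```shell
# Hypothetical helper: launch the wc_wdl applet on one platform file.
# -i supplies an input as name=value; -y skips the confirmation prompt.
# Assumption: the applet's input is named "infile".
run_wc() {
    dx run wc_wdl -i infile="$1" -y
}

# Example call (replace the argument with a real file ID or path):
#   run_wc file-xxxx
```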
The dx cat command allows you to quickly see the contents of the output file without having to download it to your computer:
This is the same output as from the previous chapter.
Depending on your comfort level with WDL, you may or may not find this version simpler than the bash version. The result is the same no matter how you write the applet, so which one you choose is a matter of taste.
In this chapter, you learned how to:
Write a WDL task
Use WOMtool and miniwdl to validate WDL syntax
Compile a WDL task into an applet
Use the JSON output from dx describe and jq to extract the outputs of a job
Use dx cat to see the contents of a file on the DNAnexus platform
To create a support ticket if there are technical issues:
Go to the Help header (same section where Projects and Tools are) inside the platform
Select "Contact Support"
Fill in the Subject and Message to submit a support ticket.
There are several versions of WDL, and the version statement at the top of the file indicates which one the file uses.
The input block equates to the inputSpec from the previous chapter. Each input value is declared with a type. Here the input is a File.
The output from the job will look different, but the result will be the same. You can use dx describe with the --json option to get a JSON document describing the entire job and pipe this to the jq tool to extract the output section:
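The pipeline described above can be sketched as follows, assuming both dx and jq are installed; job_output is a hypothetical helper that takes the ID of a finished job.

```shell
# Hypothetical helper: print only the "output" section of a job's description.
# dx describe --json emits the full job document; the jq filter .output
# selects its output field.
job_output() {
    dx describe "$1" --json | jq '.output'
}

# Example call (replace the argument with a real job ID):
#   job_output job-xxxx
```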