# Example 3: fastq\_trimmer

In this example, you will translate the `bash` app from the previous chapter into Workflow Definition Language (WDL).

You will learn how to:

* Use Java Jar files to validate and compile WDL
* Use WDL to define an applet's inputs, outputs, and runtime specs
* Compile a WDL task into an applet

## Getting Started

You will not use a wizard to start this applet, so manually create a directory for your work. Create a file called *fastq\_trimmer.wdl* with the following contents:

```
version 1.0 

task fastq_trimmer { 
    input { 
        File input_file
        Int quality_score = 30
    }

    String basename = basename(input_file) 

    command <<<
        fastq_quality_trimmer -Q 33 -t ~{quality_score} \
            -i ~{input_file} -o ~{basename}.filtered.fastq
    >>>

    output { 
        File output_file = "~{basename}.filtered.fastq"
    }

    runtime { 
        docker: "biocontainers/fastxtools:v0.0.14_cv2"
    }
}
```

* The `version 1.0` line indicates that the WDL follows the [1.0 specification](https://github.com/openwdl/wdl/blob/main/versions/1.0/SPEC.md).
* The `task` defines the body of the applet.
* The `input` block defines the same inputs as the `bash` applet: a `File` called *input\_file* and an `Int` (integer) called *quality\_score* with a default value of 30.
* The `String basename` line defines a variable called *basename*, which uses the [`basename`](https://github.com/openwdl/wdl/blob/main/versions/1.0/SPEC.md#string-basenamestring) function to get the filename of the input file without its directory path.
* The `command` block is executed at runtime. It uses the tilde syntax (`~{}`) to dereference variables. The output is written to a filename built from the `basename` of the input.
* The `output` block defines a single `File` called *output\_file*.
* The `runtime` block specifies a Biocontainers Docker image that contains the FASTX toolkit binaries.
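To make the placeholder substitution concrete, the following shell sketch mimics how the `command` block might look once the `~{...}` placeholders are filled in. The input path is hypothetical (on the platform, the executor chooses where the file is localized), and the sketch only builds and prints the command string rather than running `fastq_quality_trimmer`.

```shell
# Hypothetical local path, used only for illustration.
input_file="/tmp/inputs/small-celegans-sample.fastq"
quality_score=30

# Like the WDL basename() function, the shell's basename strips the
# directory but keeps the file extension.
base="$(basename "$input_file")"

# Assemble the command string the way the ~{...} placeholders would expand.
cmd="fastq_quality_trimmer -Q 33 -t ${quality_score} -i ${input_file} -o ${base}.filtered.fastq"
echo "$cmd"
```

Because the extension is kept, the output file ends in `.fastq.filtered.fastq`.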

## Checking and Compiling the WDL

To start, validate your WDL with WOMtool:

```
$ java -jar ~/womtool.jar validate fastq_trimmer.wdl
Success!
```

Before compiling the WDL into an applet, use **`dx pwd`** to ensure you are in your desired project. If not, run **`dx select`** to select a different project, then use the following command to compile the applet:

```
$ java -jar ~/dxCompiler.jar compile fastq_trimmer.wdl
[warning] Project is unspecified...using currently selected project project-GJ2k24j0vx804FPyBbxqpQBk
applet-GJ2pgv80vx84zJ4XJF6GPXz7
```
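If you want to reuse the new applet ID in later commands, you could capture it from the compiler output. The sketch below works on the sample output shown above rather than a live call, and it assumes the ID is printed on its own line beginning with `applet-`, as in that transcript.

```shell
# Sample compiler output copied from the transcript above (not a live call).
compile_output='[warning] Project is unspecified...using currently selected project project-GJ2k24j0vx804FPyBbxqpQBk
applet-GJ2pgv80vx84zJ4XJF6GPXz7'

# Keep only the last line that starts with "applet-".
applet_id="$(printf '%s\n' "$compile_output" | grep -E '^applet-' | tail -n 1)"
echo "$applet_id"
```

In a real session, you would replace the hard-coded text with the output of the `java -jar ~/dxCompiler.jar compile fastq_trimmer.wdl` command.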

Use `dx run` with the `-h|--help` option, as in the previous chapter, to see that the usage looks identical to the `bash` version:

```
usage: dx run applet-GJ2pgv80vx84zJ4XJF6GPXz7 [-iINPUT_NAME=VALUE ...]

Applet: fastq_trimmer

Inputs:
  input_file: -iinput_file=(file)

  quality_score: [-iquality_score=(int, default=30)]

 Reserved for dxCompiler
  overrides___: [-ioverrides___=(hash)]

  overrides______dxfiles: [-ioverrides______dxfiles=(file) [-ioverrides______dxfiles=... [...]]]

Outputs:
  output_file: output_file (file)
```

From the perspective of the user, there is no difference between native/`bash` applets and those written in WDL. You should use whichever syntax you find most convenient to the task at hand. For instance, this applet leverages an existing Docker container created by the [Biocontainers Community](https://biocontainers.pro) rather than adding the binary as a resource.

You can run the applet using the command-line arguments as shown, or you can create a JSON file with the arguments as follows:

```
$ cat inputs.json
{
    "input_file": {
        "$dnanexus_link": "file-GJ2k2V80vx88z3zyJbVXZj3G"
    },
    "quality_score": 35
}
```
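One possible way to produce this file is with a shell here-document, which lets you substitute the file ID and quality score from variables. This is only a sketch: the variable names are illustrative, and the values are the ones from the example above, so substitute your own.

```shell
# Values taken from the example above; replace them with your own.
file_id="file-GJ2k2V80vx88z3zyJbVXZj3G"
quality_score=35

# The backslash keeps "$dnanexus_link" literal while the other
# variables are expanded by the shell.
cat > inputs.json <<EOF
{
    "input_file": {
        "\$dnanexus_link": "${file_id}"
    },
    "quality_score": ${quality_score}
}
EOF

cat inputs.json
```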

You can run the applet and watch the job with the following command:

```
$ dx run applet-GJ2pgv80vx84zJ4XJF6GPXz7 -f inputs.json -y --watch

Using input JSON:
{
    "input_file": {
        "$dnanexus_link": "file-GJ2k2V80vx88z3zyJbVXZj3G"
    },
    "quality_score": 35
}

Calling applet-GJ2pgv80vx84zJ4XJF6GPXz7 with output destination
project-GJ2k24j0vx804FPyBbxqpQBk:/

Job ID: job-GJ2ppvQ0vx88k8bv9pvGyjGX

Job Log
-------
Watching job job-GJ2ppvQ0vx88k8bv9pvGyjGX. Press Ctrl+C to stop watching.
```

The output will look quite different from the `bash` app, but the basics are still the same. In this version, notice that you do not need to download the inputs or upload the outputs. Once the input files are in place, the `command` block is run and the input files and variables are dereferenced properly. When the job has completed, run `dx describe` to see the inputs and outputs:

```
$ dx describe job-GJ2ppvQ0vx88k8bv9pvGyjGX
Result 1:
ID                    job-GJ2ppvQ0vx88k8bv9pvGyjGX
Class                 job
Job name              fastq_trimmer
Executable name       fastq_trimmer
Project context       project-GJ2k24j0vx804FPyBbxqpQBk
Region                aws:us-east-1
Billed to             org-sos
Workspace             container-GJ2ppx80773k09b8F6qKGJBb
Applet                applet-GJ2pgv80vx84zJ4XJF6GPXz7
Instance Type         mem1_ssd1_v2_x2
Priority              high
State                 done
Root execution        job-GJ2ppvQ0vx88k8bv9pvGyjGX
Origin job            job-GJ2ppvQ0vx88k8bv9pvGyjGX
Parent job            -
Function              main
Input                 input_file = file-GJ2k2V80vx88z3zyJbVXZj3G
                      quality_score = 35
Output                output_file = file-GJ2pv300773ypy03Jg2vYZ9f
...
```
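To pull the output file ID out of this report without copying it by hand, you could filter the text with `awk`. The sketch below runs on an excerpt of the report shown above and assumes the `Output` line keeps the `name = id` layout seen there. For scripting, the `--json` option of `dx describe` is likely a more robust source than the human-readable report.

```shell
# Excerpt of the describe report shown above (not a live call).
describe_output='Input                 input_file = file-GJ2k2V80vx88z3zyJbVXZj3G
                      quality_score = 35
Output                output_file = file-GJ2pv300773ypy03Jg2vYZ9f'

# On the "Output" line the fields are: Output, output_file, =, <file ID>.
output_id="$(printf '%s\n' "$describe_output" \
    | awk '$1 == "Output" && $2 == "output_file" {print $4}')"
echo "$output_id"
```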

Download the output file to ensure it looks like a correct result:

```
$ dx download file-GJ2pv300773ypy03Jg2vYZ9f
[===========================================================>]
Completed 14,357,774 of 14,357,774 bytes (100%) ~/fastq_trimmer_wdl/small-celegans-sample.fastq.filtered.fastq
$ wc -l small-celegans-sample.fastq.filtered.fastq
   98624 small-celegans-sample.fastq.filtered.fastq
```
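As a quick sanity check, you can convert the line count into a read count. The sketch below assumes the common four-lines-per-record FASTQ layout (header, sequence, `+`, qualities) with no wrapped sequence lines, and it uses the line count reported above rather than reading the file.

```shell
# Line count reported by wc -l in the example above.
line_count=98624

# A well-formed four-line FASTQ has a line count divisible by 4.
if [ $((line_count % 4)) -ne 0 ]; then
    echo "Line count is not a multiple of 4; the file may be truncated" >&2
fi

read_count=$((line_count / 4))
echo "$read_count reads"
```

A line count that is not a multiple of four would suggest a truncated or malformed output file.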

## Documentation with Makefiles

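You may find it useful to create a *Makefile* with all the steps documented in a runnable fashion. Note that `make` requires each recipe line (the indented commands under a target) to begin with a tab character, not spaces: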

```
WDL = fastq_trimmer.wdl
PROJECT_ID = project-GJ2k24j0vx804FPyBbxqpQBk
DXCOMPILER = java -jar ~/dxCompiler.jar
CROMWELL = java -jar ~/cromwell.jar
WOMTOOL = java -jar ~/womtool.jar
WORKFLOW_ID = applet-GJ2pgv80vx84zJ4XJF6GPXz7

validate:
    $(WOMTOOL) validate $(WDL)

check:
    miniwdl check $(WDL)

compile:
    $(DXCOMPILER) compile $(WDL) \
        -archive \
        -folder /workflows \
        -project $(PROJECT_ID)

run:
    dx run $(WORKFLOW_ID) \
        -f inputs.json \
        --destination $(PROJECT_ID):/output \
        -y --watch
```

Now you can run **`make compile`** instead of typing out the long Java command.

## Review

The WDL version of the fastq\_trimmer applet is arguably simpler than the `bash` version. It uses just one file, *fastq\_trimmer.wdl*, and about 20 lines of text, whereas the `bash` version requires at least *dxapp.json*, a `bash` script, and the resources tarball.

In this chapter, you learned how to:

* Use a Biocontainers Docker image for the necessary binary executables from FASTX toolkit
* Define the same inputs, outputs, and commands as the `bash` applet from Chapter 3
* Use a *Makefile* to define project shortcuts to validate, compile, and run an applet

## Resources

[Full Documentation](https://documentation.dnanexus.com/)

To create a support ticket if there are technical issues:

1. Go to the Help header (same section where Projects and Tools are) inside the platform
2. Select "Contact Support"
3. Fill in the Subject and Message to submit a support ticket.

