Example 5: workflow
In this example, you will learn:
How to accept a BAM file as a workflow input
How to break the BAM into slices by chromosome
How to distribute the slices in parallel to count the number of alignments in each
Getting Started
To begin, create a new directory called view_and_count and a workflow.wdl file.
Add a workflow definition to workflow.wdl with the following features:
The name of this workflow is bam_chrom_counter.
The workflow accepts a single, required File input called bam, as it is expected to be a BAM file.
The first call is to the slice_bam task, which breaks the BAM into one file per chromosome. The input for this task is the workflow's BAM file.
The workflow defines two outputs: a BAM index file and an array of integer values representing the number of alignments in each of the BAM slices.
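The description above can be sketched as the following workflow block. This is a minimal sketch, not the original listing: the version 1.0 statement, the docker_img declaration and its illustrative image tag, the scatter variable slice, and the output names bai, slices, and count are all assumptions chosen to agree with the task descriptions and the review list in this lesson. The slice_bam and count_bam tasks it calls are assumed to be defined in the same file.

```wdl
version 1.0

workflow bam_chrom_counter {
    input {
        # Single required input, expected to be a BAM file
        File bam
    }

    # Non-input declaration shared by both tasks.
    # Assumption: any image that provides samtools would work; this tag is illustrative.
    String docker_img = "quay.io/biocontainers/samtools:1.12--hd5e65b6_0"

    # Break the BAM into one file per chromosome
    call slice_bam {
        input:
            bam = bam,
            docker_img = docker_img
    }

    # Count the alignments in each slice in parallel
    scatter (slice in slice_bam.slices) {
        call count_bam {
            input:
                bam = slice,
                docker_img = docker_img
        }
    }

    output {
        File bai = slice_bam.bai
        # Outputs gathered from a scatter become an array, one element per slice
        Array[Int] count = count_bam.count
    }
}
```

Because count_bam is called inside the scatter, its single Int output is collected into an Array[Int] at the workflow level, which is what the second workflow output describes.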
The inputs to the slice_bam task are the BAM file and the name of the Docker image.
The command block uses triple angle brackets (<<< >>>) because it must use the dollar sign ($) in shell code.
The $() syntax in bash runs the seq command to create a sequence of integer values up to 22, one for each of the human non-sex chromosomes.
The output of this task is the BAM index, which is the given BAM file plus the suffix .bai, and the sliced alignment files.
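A task matching this description might look like the sketch below, which belongs in the same workflow.wdl as the workflow block. It is an illustration under assumptions, not the original listing: the num_chrom input, the slices directory, and the output names are hypothetical, and the region names passed to samtools assume that the BAM header names its contigs 1 through 22 rather than chr1 through chr22.

```wdl
task slice_bam {
    input {
        File bam
        String docker_img
        # Assumption: number of chromosomes to slice, defaulting to the 22 autosomes
        Int num_chrom = 22
    }

    # Inside <<< >>>, WDL placeholders use the tilde-and-braces form,
    # so the dollar sign is left alone for the shell to interpret.
    command <<<
        set -ex
        samtools index ~{bam}
        mkdir slices
        for i in $(seq ~{num_chrom}); do
            samtools view -b -o slices/$i.bam ~{bam} $i
        done
    >>>

    runtime {
        docker: docker_img
    }

    output {
        # samtools index writes the index next to the input, with .bai appended
        File bai = "~{bam}.bai"
        Array[File] slices = glob("slices/*.bam")
    }
}
```

Keeping the chromosome count as an input with a default means the same task could be reused for other organisms without editing the command block.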
The count_bam task is written to handle just one BAM slice.
This BAM input will be a slice of alignments for a given region. Naming this bam does not interfere with the bam variable in the workflow or any other task.
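As a hedged sketch of such a task, assuming samtools is available in the image and using a hypothetical output name count:

```wdl
task count_bam {
    input {
        # Scoped to this task only; unrelated to the workflow-level bam
        File bam
        String docker_img
    }

    command <<<
        samtools view -c ~{bam}
    >>>

    runtime {
        docker: docker_img
    }

    output {
        # samtools view -c prints a single integer to standard output
        Int count = read_int(stdout())
    }
}
```

The task knows nothing about scattering. It counts one file, and the workflow decides how many times to call it.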
At this point, I like to use miniwdl to check the syntax, for example by running miniwdl check workflow.wdl.
As no errors are reported, I will compile this onto the DNAnexus platform with dxCompiler.
Finally, I will run this workflow using a sample BAM file.
Return to the DNAnexus website to monitor the progress of the analysis.
Placing Task Definitions in Files
As the number of tasks increases, workflow definitions can get quite long. You can shorten workflow.wdl by placing each task in a separate file, which also makes it easier to reuse a task in another workflow. To do this, create a subdirectory called tasks, and then create a file called tasks/slice_bam.wdl that contains the slice_bam task definition.
Also create the file tasks/count_bam.wdl containing the count_bam task definition.
Both of the preceding tasks are identical to the original definitions, but note that each file includes a version statement that matches the version of the workflow. Change workflow.wdl to import and call the tasks as follows:
Call task_slice_bam.slice_bam from the imported file, using as to give it the same name as in the original workflow.
Do the same with task_count_bam.count_bam.
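Put together, the revised workflow.wdl might look like the sketch below. The import namespaces and call aliases follow the text above. The version 1.0 statement, the relative import paths, the docker_img declaration with its illustrative image tag, the scatter block, and the output names are assumptions carried over from a single-file version of the workflow. Each imported file is assumed to begin with the same version statement.

```wdl
version 1.0

# Each import gets a namespace that prefixes the task name in call statements
import "tasks/slice_bam.wdl" as task_slice_bam
import "tasks/count_bam.wdl" as task_count_bam

workflow bam_chrom_counter {
    input {
        File bam
    }

    # Assumption: illustrative image tag; any image providing samtools would work
    String docker_img = "quay.io/biocontainers/samtools:1.12--hd5e65b6_0"

    # The alias keeps the call name used in the single-file workflow,
    # so references such as slice_bam.slices stay unchanged
    call task_slice_bam.slice_bam as slice_bam {
        input:
            bam = bam,
            docker_img = docker_img
    }

    scatter (slice in slice_bam.slices) {
        call task_count_bam.count_bam as count_bam {
            input:
                bam = slice,
                docker_img = docker_img
        }
    }

    output {
        File bai = slice_bam.bai
        Array[Int] count = count_bam.count
    }
}
```

Apart from the two import lines and the namespaced call targets, the workflow body is the same as before, which is why the outputs need no changes.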
Use miniwdl to check your syntax, then use dxCompiler to create an app.
Review
In this lesson, you learned how to:
Accept a file as a workflow input
Define a non-input declaration
Use scatter to run tasks in parallel
Use the output from one task as the input to another task
Mix ~ and $ in command blocks to dereference WDL and shell variables
Import WDL from external sources such as local files or remote URIs
Resources
To create a support ticket if there are technical issues:
Go to the Help header (same section where Projects and Tools are) inside the platform
Select "Contact Support"
Fill in the Subject and Message to submit a support ticket.