Example 4: cnvkit
There is an existing public Docker image available for CNVkit ("etal/cnvkit:latest"), so another option is to build a WDL version that will download and use this image at runtime rather than installing the Python and R modules ourselves.
In this example, you will:
Use WDL and Docker to build the CNVkit applet
Getting Started
To start, create a new directory called cnvkit_wdl parallel to the bash directory. Inside this new directory, create the file workflow.wdl with the following contents:
version 1.0
task cnvkit_wdl_kyc {
    input {
        Array[File] bam_tumor
        File reference
    }
    command <<<
        cnvkit.py batch \
            ~{sep=" " bam_tumor} \
            -r ~{reference} \
            -p $(expr $(nproc) - 1) \
            -d output/ \
            --scatter
    >>>
    runtime {
        docker: "etal/cnvkit:latest"
        cpu: 16
    }
    output {
        Array[File]+ cns = glob("output/[!.call]*.cns")
        Array[File]+ cns_filtered = glob("output/*.call.cns")
        Array[File]+ plot = glob("output/*-scatter.png")
    }
}
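The -p argument in the command block asks CNVkit to use one fewer process than the number of available cores. As a minimal sketch of that calculation (not part of the original task), the snippet below uses shell arithmetic expansion and adds a floor of one, which is an assumption made here for small instances. Note that expr expects the operator and operands as separate arguments (for example, expr 16 - 1), whereas arithmetic expansion is not sensitive to that spacing. The variable names cores and threads are illustrative only.

```shell
# Number of processing units available to this shell (GNU coreutils).
cores=$(nproc)

# Leave one core free for the rest of the system.
threads=$(( cores - 1 ))

# Assumption for this sketch: never request fewer than one process.
if [ "$threads" -lt 1 ]; then
  threads=1
fi

echo "cnvkit.py batch would receive: -p $threads"
```

Inside a WDL command <<< >>> section, the same $(( ... )) syntax can be used as-is, because WDL placeholders there are written as ~{...}.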
Next, ensure you have a working Java runtime and then download the latest dxCompiler JAR file. You can use the following command to place the 2.10.3 release into your home directory:
$ cd && wget https://github.com/dnanexus/dxCompiler/releases/download/2.10.3/dxCompiler-2.10.3.jar
Use dxCompiler to turn workflow.wdl into an applet equivalent to the bash version. The following command places the workflow and all related applets into a workflows folder in the given project to keep everything neatly contained. The project ID project-GFf2Bq8054J0v8kY8zJ1FGQF is the caris_cnvkit project, so change it if you wish to use a different project. Note the -archive option, which archives any existing version of the applet so that the new version takes precedence, and the -reorg option, which reorganizes the output files. As shown in the following command, successful compilation prints the ID of the new applet:
$ java -jar ~/dxCompiler-2.10.3.jar compile workflow.wdl \
-archive \
-reorg \
-folder /workflows \
-project project-GFf2Bq8054J0v8kY8zJ1FGQF
applet-GFyVxpQ0VGFgGQBy4vJ0kxK2
Run the new workflow with the -h|--help flag to verify the inputs:
$ dx run applet-GFyVxpQ0VGFgGQBy4vJ0kxK2 -h
usage: dx run applet-GFyVxpQ0VGFgGQBy4vJ0kxK2 [-iINPUT_NAME=VALUE ...]
Applet: cnvkit_wdl_kyc
Inputs:
bam_tumor: [-ibam_tumor=(file) [-ibam_tumor=... [...]]]
reference: -ireference=(file)
Reserved for dxCompiler
overrides___: [-ioverrides___=(hash)]
overrides______dxfiles: [-ioverrides______dxfiles=(file) [-ioverrides______dx>
Outputs:
cns: cns (array:file)
cns_filtered: cns_filtered (array:file)
plot: plot (array:file)
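The three outputs listed above are collected by the glob patterns in the task's output section. The following sketch shows how Bash interprets those patterns, assuming the execution engine follows Bash globbing rules. The file names are hypothetical stand-ins for CNVkit results. In Bash, [!.call] is a bracket expression that matches one character that is not ".", "c", "a", or "l"; it does not exclude the ".call" suffix.

```shell
# Hypothetical result files in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/output"
touch "$demo/output/sample.cns" \
      "$demo/output/sample.call.cns" \
      "$demo/output/control.cns"
cd "$demo"

# Pattern used for the "cns" output: the first character of the file
# name must not be one of . c a l; the rest of the name is unconstrained.
cns_matches=$(ls output/[!.call]*.cns)

# Pattern used for the "cns_filtered" output.
call_matches=$(ls output/*.call.cns)

echo "cns pattern matched:"
echo "$cns_matches"
echo "cns_filtered pattern matched:"
echo "$call_matches"
```

Under these assumptions, the first pattern also picks up sample.call.cns and skips control.cns, so it may be worth inspecting the cns output of a real run to confirm it contains the files you expect.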
As with the bash version, you can launch the workflow from the CLI as follows:
$ dx run -y --watch applet-GFyVxpQ0VGFgGQBy4vJ0kxK2 \
-ibam_tumor=file-GFxXjV006kZVQPb20G85VXBp \
-ireference=file-GFxXvpj06kZfP0QVKq2p2FGF \
--destination project-GFyPxb00VGFz5JZQ4f5x424q:/users/kyclark
The output of that command includes the input JSON, which you can save to a file (here, inputs.json) and use to launch the job instead:
$ cat inputs.json
{
"bam_tumor": [
{
"$dnanexus_link": "file-GFxXjV006kZVQPb20G85VXBp"
}
],
"reference": {
"$dnanexus_link": "file-GFxXvpj06kZfP0QVKq2p2FGF"
}
}
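If you prefer to create the file by hand, one possible approach is a quoted heredoc, sketched below. Quoting the EOF delimiter keeps the shell from trying to expand $dnanexus_link as a variable. The link_count check is only an illustrative sanity test, not something dx requires.

```shell
# Write the input JSON; the quoted 'EOF' keeps $dnanexus_link literal.
cat > inputs.json <<'EOF'
{
  "bam_tumor": [
    { "$dnanexus_link": "file-GFxXjV006kZVQPb20G85VXBp" }
  ],
  "reference": { "$dnanexus_link": "file-GFxXvpj06kZfP0QVKq2p2FGF" }
}
EOF

# Count the lines that contain a link, as a quick check of the file.
link_count=$(grep -c 'dnanexus_link' inputs.json)
echo "links written: $link_count"
```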
Following is the command you can use to launch the workflow from the CLI with the JSON file:
$ dx run -y --watch applet-GFyVxpQ0VGFgGQBy4vJ0kxK2 -f inputs.json \
--destination project-GFyPxb00VGFz5JZQ4f5x424q:/users/kyclark
As before, you can use the web interface to monitor the progress of the workflow and inspect the outputs.
Saving a Docker Image
Run the following command to start a new cloud workstation:
$ dx run -imax_session_length="1d" app-cloud_workstation --ssh -y
From the cloud workstation, pull the CNVkit Docker image:
$ docker pull etal/cnvkit:latest
Save and compress the image to a file:
$ docker save etal/cnvkit:latest | gzip - > cnvkit.tar.gz
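Before uploading a large archive, it can be worth confirming that the compressed file is intact. The sketch below illustrates the check with a small stand-in tarball so that it runs without Docker; the file names layer.txt and image.tar.gz are hypothetical. For the real image, the same gzip -t and tar -tzf commands would be pointed at cnvkit.tar.gz.

```shell
# Stand-in for the output of `docker save`, so the sketch runs anywhere.
workdir=$(mktemp -d)
echo "placeholder layer" > "$workdir/layer.txt"
tar -cf - -C "$workdir" layer.txt | gzip - > "$workdir/image.tar.gz"

# gzip -t exits non-zero if the archive is truncated or corrupt.
if gzip -t "$workdir/image.tar.gz"; then
  status="ok"
else
  status="corrupt"
fi
echo "archive check: $status"

# Listing the members confirms that the tar stream inside is readable.
members=$(tar -tzf "$workdir/image.tar.gz")
echo "$members"
```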
Add the tarball to the project:
$ dx upload cnvkit.tar.gz --path project-GFyPxb00VGFz5JZQ4f5x424q:/
[===========================================================>]
Uploaded 503,092,072 of 503,092,072 bytes (100%) cnvkit.tar.gz
ID file-GFyq05j0VGFqJqq54q98pbBK
Class file
Project project-GFyPxb00VGFz5JZQ4f5x424q
Folder /
Name cnvkit.tar.gz
State closing
Visibility visible
Types -
Properties -
Tags -
Outgoing links -
Created Thu Aug 18 03:20:55 2022
Created by kyclark
via the job job-GFypx3Q0VGFgb71g4gYY3GF3
Last modified Thu Aug 18 03:20:57 2022
Media type
archivalState "live"
cloudAccount "cloudaccount-dnanexus"
Update the WDL to use the tarball:
version 1.0
task cnvkit_wdl_tarball {
    input {
        Array[File] bam_tumor
        File reference
    }
    command <<<
        cnvkit.py batch \
            ~{sep=" " bam_tumor} \
            -r ~{reference} \
            -p $(expr $(nproc) - 1) \
            -d output/ \
            --scatter
    >>>
    runtime {
        docker: "dx://file-GFyq05j0VGFqJqq54q98pbBK"
        cpu: 16
    }
    output {
        Array[File]+ cns = glob("output/[!.call]*.cns")
        Array[File]+ cns_filtered = glob("output/*.call.cns")
        Array[File]+ plot = glob("output/*-scatter.png")
    }
}
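Apart from the task name, the only difference from the first version is the docker value in the runtime section, which now points at the uploaded file with a dx:// prefix. If you would rather script that change than edit the file by hand, a sed substitution is one option. The sketch below works on a scratch copy of the runtime section only; the paths runtime.wdl and runtime_tarball.wdl are hypothetical.

```shell
# File ID reported by `dx upload` for cnvkit.tar.gz.
file_id="file-GFyq05j0VGFqJqq54q98pbBK"

# Scratch copy of the original runtime section.
work=$(mktemp -d)
cat > "$work/runtime.wdl" <<'EOF'
runtime {
    docker: "etal/cnvkit:latest"
    cpu: 16
}
EOF

# Replace whatever image is quoted after "docker:" with the dx:// link.
sed -E "s|docker: \".*\"|docker: \"dx://${file_id}\"|" \
    "$work/runtime.wdl" > "$work/runtime_tarball.wdl"

new_docker_line=$(grep 'docker:' "$work/runtime_tarball.wdl")
echo "$new_docker_line"
```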
Compile this version with dxCompiler as before, and then run the resulting applet.
Review
In this chapter, you learned another strategy for packaging an applet's dependencies using Docker and then running the applet's code inside the Docker image using WDL.
Resources
To create a support ticket if there are technical issues:
Go to the Help header (same section where Projects and Tools are) inside the platform
Select "Contact Support"
Fill in the Subject and Message to submit a support ticket.