# Example 4: cnvkit

A public Docker image is available for CNVkit (`etal/cnvkit:latest`), so another option is to write a WDL version that downloads and uses this image at runtime rather than installing the Python and R modules ourselves.

In this example, you will:

* Use WDL and Docker to build a CNVkit applet

## Getting Started

To start, create a new directory called *cnvkit\_wdl* parallel to the bash directory. Inside this new directory, create the file *workflow\.wdl* with the following contents:

```
version 1.0

task cnvkit_wdl_kyc {
    input {
        Array[File] bam_tumor
        File reference
    }

    command <<<
        cnvkit.py batch \
            ~{sep=" " bam_tumor} \
            -r ~{reference} \
            -p $(expr $(nproc) - 1) \
            -d output/ \
            --scatter
    >>>

    runtime {
        docker: "etal/cnvkit:latest"
        cpu: 16
    }

    output {
        Array[File]+ cns = glob("output/[!.call]*.cns")
        Array[File]+ cns_filtered = glob("output/*.call.cns")
        Array[File]+ plot = glob("output/*-scatter.png")
    }
}
```
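The `cns` pattern in the `output` section deserves a closer look. Assuming the WDL `glob` function follows shell pattern rules, `[!.call]` is a bracket expression that rejects a single leading character (`.`, `c`, `a`, or `l`), not the substring `.call`. The following minimal sketch uses a hypothetical helper, `matches_cns`, and made-up file names to show which names the pattern accepts:

```shell
#!/bin/sh
# Hypothetical helper (not part of CNVkit or dx): succeed when the
# name matches the same pattern used for the `cns` output above.
matches_cns() {
    case "$1" in
        [!.call]*.cns) return 0 ;;
        *) return 1 ;;
    esac
}

# Made-up file names, for illustration only.
for name in tumor.cns tumor.call.cns control.cns; do
    if matches_cns "$name"; then
        echo "$name: match"
    else
        echo "$name: no match"
    fi
done
```

In a shell, both `tumor.cns` and `tumor.call.cns` match, while `control.cns` does not because it starts with `c`. If the platform's glob behaves the same way, the `cns` output may also include the `*.call.cns` files, so inspect the outputs of a real run before relying on the split.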

Next, ensure you have a working Java runtime and then download the dxCompiler JAR file. You can use the following command to place the 2.10.3 release into your home directory:

```
$ cd && wget https://github.com/dnanexus/dxCompiler/releases/download/2.10.3/dxCompiler-2.10.3.jar
```

Use dxCompiler to turn *workflow\.wdl* into an applet equivalent to the bash version. In the following command, the applet will be placed into a *workflows* directory in the given project to keep everything neatly contained. The project ID `project-GFf2Bq8054J0v8kY8zJ1FGQF` is the *caris\_cnvkit* project, so change this if you wish to place the applet into a different project. Note the use of the `-archive` option to archive any existing version of the applet and allow the new version to take precedence, and the `-reorg` option to reorganize the output files. As shown in the following command, successful compilation prints the new applet's ID:

```
$ java -jar ~/dxCompiler-2.10.3.jar compile workflow.wdl \
        -archive \
        -reorg \
        -folder /workflows \
        -project project-GFf2Bq8054J0v8kY8zJ1FGQF
applet-GFyVxpQ0VGFgGQBy4vJ0kxK2
```

Run the new applet with the `-h|--help` flag to verify the inputs:

```
$ dx run applet-GFyVxpQ0VGFgGQBy4vJ0kxK2 -h
usage: dx run applet-GFyVxpQ0VGFgGQBy4vJ0kxK2 [-iINPUT_NAME=VALUE ...]

Applet: cnvkit_wdl_kyc

Inputs:
  bam_tumor: [-ibam_tumor=(file) [-ibam_tumor=... [...]]]

  reference: -ireference=(file)

 Reserved for dxCompiler
  overrides___: [-ioverrides___=(hash)]

  overrides______dxfiles: [-ioverrides______dxfiles=(file) [-ioverrides______dx>

Outputs:
  cns: cns (array:file)

  cns_filtered: cns_filtered (array:file)

  plot: plot (array:file)
```

As with the bash version, you can launch the workflow from the CLI as follows:

```
$ dx run -y --watch applet-GFyVxpQ0VGFgGQBy4vJ0kxK2 \
            -ibam_tumor=file-GFxXjV006kZVQPb20G85VXBp \
            -ireference=file-GFxXvpj06kZfP0QVKq2p2FGF \
            --destination project-GFyPxb00VGFz5JZQ4f5x424q:/users/kyclark
```

The output of the run includes the input JSON, which you can save to a file such as *inputs.json* and use as an alternative way to launch the job:

```
$ cat inputs.json
{
    "bam_tumor": [
        {
            "$dnanexus_link": "file-GFxXjV006kZVQPb20G85VXBp"
        }
    ],
    "reference": {
        "$dnanexus_link": "file-GFxXvpj06kZfP0QVKq2p2FGF"
    }
}
```
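If you prefer to generate this file rather than copy it from the run output, a minimal sketch follows. The helper name `write_inputs_json` is hypothetical, and it assumes exactly one tumor BAM and one reference; the file IDs are the ones used in this example.

```shell
#!/bin/sh
# Hypothetical helper: print a dx input JSON for one tumor BAM and one reference.
# $1 = file ID of the tumor BAM, $2 = file ID of the reference.
write_inputs_json() {
    bam_id="$1"
    ref_id="$2"
    # \$ keeps the dollar sign literal inside the unquoted here-document.
    cat <<EOF
{
    "bam_tumor": [
        {
            "\$dnanexus_link": "$bam_id"
        }
    ],
    "reference": {
        "\$dnanexus_link": "$ref_id"
    }
}
EOF
}

write_inputs_json file-GFxXjV006kZVQPb20G85VXBp \
    file-GFxXvpj06kZfP0QVKq2p2FGF > inputs.json
```

Because `bam_tumor` is declared as an array in the WDL, its value stays a JSON list even when it holds a single file.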

Use the following command to launch the workflow from the CLI with the JSON file:

```
$ dx run -y --watch applet-GFyVxpQ0VGFgGQBy4vJ0kxK2 -f inputs.json \
            --destination project-GFyPxb00VGFz5JZQ4f5x424q:/users/kyclark
```

As before, you can use the web interface to monitor the progress of the workflow and inspect the outputs.

## Saving a Docker Image

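You can also save the Docker image as a file in your project and have the applet load it from there rather than pulling it from the public registry at runtime. Run the following command to start a new cloud workstation: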

```
$ dx run -imax_session_length="1d" app-cloud_workstation --ssh -y
```

From the cloud workstation, pull the CNVkit Docker image:

```
$ docker pull etal/cnvkit:latest
```

Save and compress the image to a file:

```
$ docker save etal/cnvkit:latest | gzip - > cnvkit.tar.gz
```
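Since the archive is several hundred megabytes, an optional sanity check before uploading can save a round trip. The following sketch defines a hypothetical helper, `check_image_tarball`, that uses only standard `gzip` and `tar` options to confirm the file decompresses cleanly and lists as a tar archive:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file passes the gzip
# integrity test and can be listed as a tar archive.
check_image_tarball() {
    gzip -t "$1" && tar -tzf "$1" > /dev/null
}

# Run the check only when the tarball from the previous step exists.
if [ -f cnvkit.tar.gz ]; then
    if check_image_tarball cnvkit.tar.gz; then
        echo "cnvkit.tar.gz looks intact"
    else
        echo "cnvkit.tar.gz appears to be corrupt" >&2
    fi
fi
```

This only verifies the archive format; it does not confirm that the image itself runs correctly.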

Add the tarball to the project:

```
$ dx upload cnvkit.tar.gz --path project-GFyPxb00VGFz5JZQ4f5x424q:/
[===========================================================>]
Uploaded 503,092,072 of 503,092,072 bytes (100%) cnvkit.tar.gz
ID                    file-GFyq05j0VGFqJqq54q98pbBK
Class                 file
Project               project-GFyPxb00VGFz5JZQ4f5x424q
Folder                /
Name                  cnvkit.tar.gz
State                 closing
Visibility            visible
Types                 -
Properties            -
Tags                  -
Outgoing links        -
Created               Thu Aug 18 03:20:55 2022
Created by            kyclark
 via the job          job-GFypx3Q0VGFgb71g4gYY3GF3
Last modified         Thu Aug 18 03:20:57 2022
Media type
archivalState         "live"
cloudAccount          "cloudaccount-dnanexus"
```

Update the WDL to use the tarball by pointing the `docker` runtime attribute at the uploaded file's ID with a `dx://` prefix:

```
version 1.0

task cnvkit_wdl_tarball {
    input {
        Array[File] bam_tumor
        File reference
    }

    command <<<
        cnvkit.py batch \
            ~{sep=" " bam_tumor} \
            -r ~{reference} \
            -p $(expr $(nproc) - 1) \
            -d output/ \
            --scatter
    >>>

    runtime {
        docker: "dx://file-GFyq05j0VGFqJqq54q98pbBK"
        cpu: 16
    }

    output {
        Array[File]+ cns = glob("output/[!.call]*.cns")
        Array[File]+ cns_filtered = glob("output/*.call.cns")
        Array[File]+ plot = glob("output/*-scatter.png")
    }
}
```

Compile the updated WDL with dxCompiler and run the resulting applet as before.

## Review

In this chapter, you learned another strategy for packaging an applet's dependencies using Docker and then running the applet's code inside the Docker image using WDL.

## Resources

[Full Documentation](https://documentation.dnanexus.com/)

To create a support ticket if there are technical issues:

1. Go to the Help header (same section where Projects and Tools are) inside the platform
2. Select "Contact Support"
3. Fill in the Subject and Message to submit a support ticket.

