Example 3: cnvkit

This example will build on the asset you created in the bash version. You will:

  • Learn how to download inputs of type array:file

  • Use regular expressions to classify output files

Getting Started

We'll call our new applet python_cnvkit. If you want to start from dx-app-wizard, use the following specs for the inputs and outputs:

Input Name   Type         Optional   Default Value
bam_tumor    array:file   No         NA
reference    file         No         NA

The output specs are as follows:

Output Name    Type
cns            array:file
cns_filtered   array:file
plot           array:file

You can also copy the bash applet directory and update the runSpec in dxapp.json to run a Python script and use the CNVKit asset from before:

    "runSpec": {
        "timeoutPolicy": {
            "*": {
                "hours": 48
            }
        },
        "interpreter": "python3",
        "file": "src/python_cnvkit.py",
        "distribution": "Ubuntu",
        "release": "20.04",
        "version": "0",
        "assetDepends": [{"id": "record-GgP33b00BppJKpyyFxGpZJYf"}]
    }

Here is the input.json:

Python Code

Update src/python_cnvkit.py to the following:

  1. Use a Python list comprehension to generate a list of file IDs for the tumor BAM files.

  2. Download the reference file.

  3. Initialize a list to hold the downloaded BAM paths.

  4. Download each BAM file into a directory and append the path to the bam_files list.

  5. Create, print, and run the command to execute CNVkit.

  6. Find all the files created in the output directory. The os.listdir function only returns the filenames, so append the directory name.

  7. For each of the output file categories, filter the output files and upload the output files matching the expected extension.

  8. Compile the given regular expression.

  9. Create a DX file ID link for each uploaded file.

  10. Filter the given files for those matching the regex.
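Steps 1 through 6 can be sketched roughly as follows. This is a minimal illustration, not the applet's actual source: the helper names (link_to_id, download_inputs, build_command, list_outputs) and the directory names bams and out are hypothetical, and the CNVkit flags assume that cnvkit.py from the asset is on the PATH and that the reference input is a file accepted by `cnvkit.py batch -r`. The dxpy import is deferred so the pure helpers can be tried outside a DNAnexus worker.

```python
import os


def link_to_id(item):
    """Return the file ID from a DNAnexus link dict or a plain ID string."""
    if isinstance(item, dict):
        target = item["$dnanexus_link"]
        # A link holds either a bare ID or a {"project": ..., "id": ...} mapping.
        return target["id"] if isinstance(target, dict) else target
    return item


def download_inputs(bam_tumor, reference, bam_dir="bams"):
    """Steps 1-4: download the reference and every tumor BAM (needs dxpy)."""
    import dxpy  # deferred import; assumed to be available on the worker

    bam_ids = [link_to_id(item) for item in bam_tumor]  # step 1

    ref_id = link_to_id(reference)
    ref_path = dxpy.DXFile(ref_id).describe()["name"]
    dxpy.download_dxfile(ref_id, ref_path)  # step 2

    os.makedirs(bam_dir, exist_ok=True)
    bam_files = []  # step 3
    for file_id in bam_ids:  # step 4
        name = dxpy.DXFile(file_id).describe()["name"]
        path = os.path.join(bam_dir, name)
        dxpy.download_dxfile(file_id, path)
        bam_files.append(path)
    return ref_path, bam_files


def build_command(bam_files, ref_path, out_dir="out"):
    """Step 5: assemble the CNVkit call as an argument list (flags assumed)."""
    return ["cnvkit.py", "batch", *bam_files,
            "-r", ref_path, "-d", out_dir, "--scatter", "--diagram"]


def list_outputs(out_dir):
    """Step 6: os.listdir returns bare names, so prepend the directory."""
    return sorted(os.path.join(out_dir, name) for name in os.listdir(out_dir))
```

On a worker, the command list could be printed with `print(" ".join(cmd))` and run with `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting problems with unusual file names.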

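NOTE: The regex (?<!.call).cns$ uses a negative lookbehind to ensure that .cns is not preceded by .call, so files ending in .call.cns are excluded from the cns output. The dots are unescaped and therefore match any character; write them as \. for a strict match.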
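To see how the lookbehind separates the output categories, here is a small self-contained sketch. Only the cns pattern comes from the note above; the patterns for cns_filtered and plot, the function names, and the sample file names are assumptions made for illustration.

```python
import re

# "cns" is the pattern quoted in the note; the other two are assumed.
PATTERNS = {
    "cns": r"(?<!.call).cns$",
    "cns_filtered": r"\.call\.cns$",
    "plot": r"\.(pdf|png)$",
}


def filter_files(regex, files):
    """Compile the regex and keep only the paths it matches."""
    pattern = re.compile(regex)
    return [path for path in files if pattern.search(path)]


def classify(files):
    """Map each output name to the files matching its pattern."""
    return {name: filter_files(regex, files) for name, regex in PATTERNS.items()}


if __name__ == "__main__":
    sample = ["out/s1.cns", "out/s1.call.cns", "out/s1-scatter.png"]
    for name, matched in classify(sample).items():
        print(name, matched)
```

Inside the applet, each matched path would then be uploaded and wrapped in a link, for example with `dxpy.dxlink(dxpy.upload_local_file(path))`, before being returned under the corresponding output name.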

Here is the output from the job:

Review

  • You used a for loop to download multiple input BAM files into a local directory.

  • You used regular expressions to classify the output files into the three output labels.

Resources

Full Documentation

To create a support ticket if there are technical issues:

  1. Go to the Help menu in the platform header (the same section as Projects and Tools).

  2. Select "Contact Support".

  3. Fill in the Subject and Message to submit a support ticket.
