Example 2: fastq_quality_trimmer
In this chapter, you'll learn to create an applet that uses the fastq_quality_trimmer executable from the FASTX-Toolkit, a collection of command-line tools for processing short-read FASTA and FASTQ files. You'll use the applet to run fastq_quality_trimmer on a FASTQ file, creating a trimmed reads file that you can then use for further analysis.
You will learn the following:
How to accept an optional integer argument from the user
How to add resource files, such as a binary executable, to an applet so that your applet code can use them
Starting the Applet
Run dx-app-wizard mytrimmer to create the mytrimmer applet. Because you supplied the app name on the command line, you can press Enter when prompted for it. You can add a title, summary, and version if you would like.
Start the input specification with the input FASTQ:
Input Specification
You will now be prompted for each input parameter to your app. Each parameter
should have a unique name that uses only the underscore "_" and alphanumeric
characters, and does not start with a number.
1st input name (<ENTER> to finish): input_file
Label (optional human-readable name) []: Input file
Your input parameter must be of one of the following classes:
applet array:file array:record file int
array:applet array:float array:string float record
array:boolean array:int boolean hash string
Choose a class (<TAB> twice for choices): file
This is an optional parameter [y/n]: n
Next, indicate an optional integer for the quality score:
Press Enter to skip a third input and move to the output specification, which should define a single output file:
Press Enter to exit the output section.
Set a timeout policy if you would like.
Answer the remaining questions to create a bash applet. The applet does not need access to the internet or parent project, and you can choose the default instance type.
Open the mytrimmer/dxapp.json in a text editor to view the inputSpec:
To make input file selection more convenient for the user, edit the patterns for the file extensions of the input_file to be those commonly used for FASTQ files:
These patterns are used in the web interface to filter files for the user, but it's not a requirement that the input files match these patterns. The file filter can be turned off by the user, so these patterns are merely suggestions.
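The patterns are shell-style globs matched against file names. As a rough local illustration (not the platform's own matching code), the following sketch checks a few names against two patterns assumed here, *.fastq and *.fq. The function name is hypothetical.

```shell
# Hypothetical helper: succeed if a filename matches one of the
# assumed FASTQ patterns (*.fastq, *.fq).
matches_fastq_pattern() {
    case "$1" in
        *.fastq|*.fq) return 0 ;;
        *)            return 1 ;;
    esac
}

for name in small-celegans-sample.fastq reads.fq notes.txt; do
    if matches_fastq_pattern "$name"; then
        echo "$name: shown by the filter"
    else
        echo "$name: hidden by the filter (still selectable if the filter is off)"
    fi
done
```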
Adding a Binary Resource
Next, you will add a binary executable file from the FASTX toolkit. Download and unpack the FASTX toolkit binaries:
Then build the executable with make, using the included makefile.
The files are also available here for you to download and unpack:
Create the directory resources/usr/bin inside the mytrimmer directory:
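Assuming you run it from the directory that contains mytrimmer, a single mkdir -p call creates the nested path:

```shell
# -p creates any missing parent directories (resources, usr, bin) in one step
mkdir -p mytrimmer/resources/usr/bin
```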
When the app is bundled, the directory structure under resources is compressed and then unpacked as-is at the root of the instance's filesystem, so you should create a directory that is in the standard $PATH, such as /usr/bin or /usr/local/bin.
This applet only requires the fastq_quality_trimmer binary, so copy it to the preceding directory:
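One way to do the copy, assuming the binary sits at a path you supply. The helper name and argument order below are illustrative and not part of the dx toolkit.

```shell
# Hypothetical helper: copy an executable into an applet's resources/usr/bin
# and print the path where it should appear on the worker.
# $1: path to the binary (wherever you built or unpacked it)
# $2: applet directory (for example, mytrimmer)
add_resource_binary() {
    dest="$2/resources/usr/bin"
    mkdir -p "$dest"
    cp "$1" "$dest/"
    name=$(basename "$1")
    chmod +x "$dest/$name"
    echo "/usr/bin/$name"
}
```

For example, `add_resource_binary path/to/fastq_quality_trimmer mytrimmer` prints /usr/bin/fastq_quality_trimmer, the location the binary should have once the resources directory is unpacked on the instance. The source path here is a placeholder.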
You should remove the downloaded binary artefacts as they are no longer needed.
Writing the Applet
Update mytrimmer/src/mytrimmer.sh with the following code:
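The original listing is not reproduced here. The following is a minimal sketch that matches the steps described below, assuming the output field in dxapp.json is named output_file and that the quality threshold is passed with the -t flag. Adjust both to your own configuration.

```shell
#!/bin/bash
# Sketch of mytrimmer/src/mytrimmer.sh. The platform calls main() with
# $input_file, $input_file_name, $input_file_prefix, and $quality_score set.

main() {
    echo "Value of input_file: '$input_file'"
    echo "Value of quality_score: '$quality_score'"

    # Download the input to the worker under its original filename
    dx download "$input_file" -o "$input_file_name"

    # Build the output filename from the extension-less prefix
    outfile="${input_file_prefix}.filtered.fastq"

    # Trim with the requested threshold; -Q 33 selects Phred+33 scores
    fastq_quality_trimmer -Q 33 -t "$quality_score" \
        -i "$input_file_name" -o "$outfile"

    # Upload the result; --brief prints only the new file ID
    outfile_id=$(dx upload "$outfile" --brief)

    # Register the file as a job output (output_file is an assumed name)
    dx-jobutil-add-output output_file "$outfile_id" --class=file
}
```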
The variables $input_file and $input_file_name are based on the inputSpec name input_file. The first is a record-like string {"$dnanexus_link": "file-GJ2k2V80vx88z3zyJbVXZj3G"}, while the latter is the filename small-celegans-sample.fastq.
The variable $input_file_prefix is the name of the input file without the file extension, so small-celegans-sample, which is used to create the output filename small-celegans-sample.filtered.fastq. See the documentation.
Run fastq_quality_trimmer using the given $quality_score and write to the output filename. The -Q option is an undocumented option to indicate that the scores are in Phred+33 encoding.
Upload the output file, which returns another record-like string describing the newly created file.
Add the newly uploaded record as a file output of the job.
You don't need to indicate the full path to fastq_quality_trimmer because it will exist in the directory /usr/bin (unpacked from resources/usr/bin), which is in the standard $PATH.
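The platform sets the input file variables for you. To see how the prefix and the output name relate for this particular filename, the following local sketch mimics the effect with shell parameter expansion. It is an approximation, not the platform's own logic.

```shell
# Stand-in for the value the platform would provide
input_file_name="small-celegans-sample.fastq"

# Strip the last extension to approximate $input_file_prefix
input_file_prefix="${input_file_name%.*}"

# Build the output filename from the prefix
outfile="${input_file_prefix}.filtered.fastq"

echo "$input_file_prefix"
echo "$outfile"
```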
Creating a Project for the Data and Applet
Add the sample FASTQ file to the project either by using the URL importer as shown in Figure 6, or by downloading the file to your computer and uploading it through the web interface or with dx upload:
Use dx build to build the applet:
Run the applet with the -h|--help flag from the CLI to see the usage:
Run the applet using the file ID of the FASTQ file you uploaded:
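A sketch of the invocation, wrapped in a small hypothetical helper so that the optional quality score can be left out. The file ID is whatever the platform reported for your uploaded file. The -y flag skips the confirmation prompt and --watch streams the job log.

```shell
# Hypothetical helper around dx run.
# $1: file ID of the uploaded FASTQ file
# $2: optional quality score; when omitted, the applet's default applies
run_trimmer() {
    if [ -n "${2:-}" ]; then
        dx run mytrimmer -iinput_file="$1" -iquality_score="$2" -y --watch
    else
        dx run mytrimmer -iinput_file="$1" -y --watch
    fi
}
```

Calling `run_trimmer file-xxxx` uses the default score, while `run_trimmer file-xxxx 35` overrides it. Both file-xxxx and 35 are placeholders.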
The job's output should end with something like the following:
You can select the output file and view the results.
You can download the output file and check that the filtering actually removed some of the input sequences by using wc to count the lines in the original file and in the result:
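Assuming each FASTQ record occupies exactly four lines (no wrapped sequence lines), the read count is the line count divided by four. The helper name below is illustrative.

```shell
# Hypothetical helper: number of reads in an uncompressed FASTQ file
# that uses four lines per record.
count_reads() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}
```

Comparing `count_reads small-celegans-sample.fastq` with `count_reads small-celegans-sample.filtered.fastq` shows how many reads were dropped.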
Run the applet with a higher quality score and verify that the result includes even fewer sequences.
Review
In this chapter, you learned how to do the following:
Indicate an optional argument with a default value
Add a binary executable to a project in the resources directory and use that binary in your applet
Use variations on the input file variables to get the full filename or the filename prefix without the extension
Resources
To create a support ticket if there are technical issues:
Go to the Help header (same section where Projects and Tools are) inside the platform
Select "Contact Support"
Fill in the Subject and Message to submit a support ticket.