Working Examples

These working examples demonstrate preferred ways to solve common problems encountered when developing WDLs. They are aimed at WDL developers and are unlikely to be useful to anyone else.

Warning

Remember to activate the JAWS environment before running any example. See Set up JAWS Environment.

How to Run an Example

To run the following examples, click on an example link and clone the repo. For the subworkflow example, you would do:

git clone https://code.jgi.doe.gov/official-jgi-workflows/wdl-specific-repositories/jaws-tutorial-examples.git

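# git clone creates a jaws-tutorial-examples directory; each example lives in its own subdirectory.
# If you clicked on the subworkflow link below, the directory name is subworkflows_and_conditionals
cd jaws-tutorial-examples/subworkflows_and_conditionals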

Then follow the commands in that example's README.md, for instance:

jaws submit main.wdl inputs.json dori

Examples

Subworkflows

Simplest example

alignment example

Conditionals determining which subworkflow to run

subworkflows_and_conditionals
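
As a rough illustration of the idea (not the code from the linked example), the sketch below uses two `if` blocks to choose between two subworkflows. The file names `align_short.wdl` and `align_long.wdl`, the workflow names inside them, and their `reads` input and `bam` output are all hypothetical.

```wdl
version 1.0

# Hypothetical subworkflow files sitting next to main.wdl.
import "align_short.wdl" as short_reads
import "align_long.wdl" as long_reads

workflow main {
  input {
    File reads
    Boolean is_long_read = false
  }

  # Only one of the two blocks runs for a given submission.
  if (is_long_read) {
    call long_reads.align_long { input: reads = reads }
  }
  if (!is_long_read) {
    call short_reads.align_short { input: reads = reads }
  }

  output {
    # Outputs of calls made inside an if block are optional (File?);
    # select_first returns the one that was actually produced.
    File bam = select_first([align_long.bam, align_short.bam])
  }
}
```

WDL 1.0 has no `else`, which is why the sketch pairs `if (is_long_read)` with `if (!is_long_read)`.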

Importing subworkflows by URL

subworkflow_remote_subs
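
A minimal sketch of a URL import might look like the following. The URL is a placeholder rather than a real file, and the `say_hello` workflow with its `name` input and `greeting` output is assumed purely for illustration.

```wdl
version 1.0

# Placeholder URL: replace it with the raw URL of a real WDL file.
import "https://example.org/workflows/say_hello.wdl" as remote

workflow main {
  input {
    String name
  }

  # The imported workflow is called through its import alias.
  call remote.say_hello { input: name = name }

  output {
    String greeting = say_hello.greeting
  }
}
```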

Scattering

Scattering is the preferred way to run jobs in parallel. “Scatter-gathering” refers to parallelizing jobs and then combining their results into one array for downstream use.

scattering
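
The scatter-gather pattern can be sketched as below. This is an illustrative example, not the code from the linked repository: the task names and the `ubuntu:22.04` image are assumptions chosen to keep the sketch small.

```wdl
version 1.0

workflow scatter_gather {
  input {
    Array[File] samples
  }

  # Scatter: one count_lines call per input file, run in parallel.
  scatter (sample in samples) {
    call count_lines { input: infile = sample }
  }

  # Gather: outside the scatter block, count_lines.n is an Array[Int].
  call sum_counts { input: counts = count_lines.n }

  output {
    Int total = sum_counts.total
  }
}

task count_lines {
  input {
    File infile
  }
  command <<<
    wc -l < ~{infile}
  >>>
  output {
    Int n = read_int(stdout())
  }
  runtime {
    docker: "ubuntu:22.04"  # assumed image; any image with coreutils works
  }
}

task sum_counts {
  input {
    Array[Int] counts
  }
  command <<<
    # The array is joined with " + " and evaluated by the shell.
    echo $(( ~{sep=" + " counts} ))
  >>>
  output {
    Int total = read_int(stdout())
  }
  runtime {
    docker: "ubuntu:22.04"
  }
}
```

No explicit gather step is needed: referring to a scattered call's output from outside the `scatter` block yields the array automatically.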

Sharding

“Sharding” is the term used for splitting a file (like fasta, fastq, txt, etc.) into pieces that can then be processed in parallel.

sharding input files
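
One possible way to shard a FASTA file is sketched here; the linked example may do it differently. The sketch assumes a well-formed FASTA whose first line is a header, and the task names, the shard file naming scheme, and the `ubuntu:22.04` image are hypothetical.

```wdl
version 1.0

workflow shard_fasta {
  input {
    File fasta
    Int records_per_shard = 1000
  }

  call split_fasta {
    input:
      fasta = fasta,
      records_per_shard = records_per_shard
  }

  # Each shard is processed in parallel.
  scatter (shard in split_fasta.shards) {
    call count_records { input: shard = shard }
  }

  output {
    Array[Int] records = count_records.n
  }
}

task split_fasta {
  input {
    File fasta
    Int records_per_shard
  }
  command <<<
    # Start a new output file every n header lines (lines beginning with ">").
    awk -v n=~{records_per_shard} '
      /^>/ {
        if (count % n == 0) {
          if (out) close(out)
          out = sprintf("shard_%04d.fasta", ++shard)
        }
        count++
      }
      { print > out }
    ' ~{fasta}
  >>>
  output {
    Array[File] shards = glob("shard_*.fasta")
  }
  runtime {
    docker: "ubuntu:22.04"  # assumed image; it only needs awk
  }
}

task count_records {
  input {
    File shard
  }
  command <<<
    grep -c '^>' ~{shard}
  >>>
  output {
    Int n = read_int(stdout())
  }
  runtime {
    docker: "ubuntu:22.04"
  }
}
```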

Using Refdata

Here is an example of how to use a database (e.g. for BLAST) within your WDL so that you don’t have to copy it every time you submit the WDL. Put your reference files in a dedicated directory under /global/dna/shared/databases/<your-username> on Perlmutter. When you run a WDL, the path /refdata can be used within the command section to access your refdata files. This works because /refdata is mounted into the Docker container and points to your refdata files no matter which compute site you are using, since the files are synced nightly.

using reference databases
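
A sketch of how a task might read a database through /refdata is shown below. The database location is passed as a `String` rather than a `File` so that the workflow engine does not try to copy it into the task directory. The exact directory layout beneath /refdata, the input names, and the `ncbi/blast` image are assumptions, not details taken from the linked example.

```wdl
version 1.0

workflow blast_refdata {
  input {
    File query
    # A path beneath /refdata, supplied in inputs.json.
    # The layout below the mount point is not specified here.
    String db
  }

  call blastn { input: query = query, db = db }

  output {
    File hits = blastn.hits
  }
}

task blastn {
  input {
    File query
    String db  # String, not File: the database is read in place
  }
  command <<<
    blastn -query ~{query} -db ~{db} -outfmt 6 -out hits.tsv
  >>>
  output {
    File hits = "hits.tsv"
  }
  runtime {
    docker: "ncbi/blast:latest"  # assumed image that provides blastn
  }
}
```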

Custom Data Structure

Here is an example of creating your own data structure (a struct) instead of using only the predefined types such as String, Int, Array, and Map. This can be useful for inputs that you want organized in a particular way. For more information, see the official WDL 1.0 specification.

using datastructure for reference data
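
A minimal sketch of a struct in WDL 1.0 follows. The `RefData` struct, its fields, and the task are invented for illustration and are not taken from the linked example.

```wdl
version 1.0

# A user-defined type grouping related values.
struct RefData {
  String name
  String db_path
  Int threads
}

workflow use_struct {
  input {
    RefData ref
  }

  call describe { input: ref = ref }

  output {
    String summary = describe.summary
  }
}

task describe {
  input {
    RefData ref
  }
  command <<<
    # Struct members are accessed with dot notation.
    echo "~{ref.name} at ~{ref.db_path} using ~{ref.threads} threads"
  >>>
  output {
    String summary = read_string(stdout())
  }
  runtime {
    docker: "ubuntu:22.04"  # assumed image
  }
}
```

In inputs.json the struct is written as a JSON object keyed by member name, for example a `use_struct.ref` entry containing `name`, `db_path`, and `threads`.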