JAWS Quickstart

Summary

To start running a pipeline in JAWS, please follow the setup instructions provided here.

Run an Example WDL in JAWS

  1. Activate the environment you set up previously:

module load jaws
  2. Clone the example code:

git clone https://code.jgi.doe.gov/official-jgi-workflows/wdl-specific-repositories/jaws-tutorial-examples.git
cd jaws-tutorial-examples/quickstart
  3. List all the sites available to JAWS:

jaws list-sites
  4. Submit a workflow using JAWS:

Currently JAWS can run on the following resources:

  • DORI (at LBNL)

  • PERLMUTTER (at NERSC)

  • JGI (at LBNL)

  • TAHOMA (at PNNL)

  • NMDC (at NERSC)

  • NMDC_TAHOMA (at PNNL)

When submitting a JAWS run, you must specify the site to use as the final argument (e.g. dori):

jaws submit --tag metagenome_alignment align.wdl inputs.json dori

# you should see something like this
100%|███████████████████████████████████| 2929/2929 [00:00<00:00, 1081055.65it/s]
Copied 2929 bytes in 0.0 seconds.
100%|███████████████████████████████████| 792/792 [00:00<00:00, 349231.37it/s]
Copied 792 bytes in 0.0 seconds.
{
"run_id": 35970
}
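
If you want to reuse the run ID in a script rather than copy it by hand, a minimal sketch along these lines may help. It works on sample text copied from the output above and does not call JAWS. The function name extract_run_id is hypothetical, and since this page does not say whether the progress bars and the JSON are printed to the same stream, the sketch simply searches whatever text it is given.

```shell
# Sample text standing in for what `jaws submit` printed (copied from the example above).
submit_output='Copied 792 bytes in 0.0 seconds.
{
"run_id": 35970
}'

# Hypothetical helper: print the first integer that follows "run_id" in the given text.
extract_run_id() {
  printf '%s\n' "$1" | sed -n 's/.*"run_id"[^0-9]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

run_id=$(extract_run_id "$submit_output")
echo "run_id=$run_id"
```

In a real session you would capture the output of your own jaws submit command into submit_output instead of using the sample text.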

Monitoring the Job

From the output above, we see that the run_id was 35970.

# if you don't remember the run ID, either of the following commands will show it
jaws queue
# or
jaws history

# check status of the run
jaws log 35970
# and
jaws status 35970

# check the status of each task/shard
jaws tasks 35970
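
Rather than re-running jaws status by hand, you could poll it in a loop. The sketch below is only an illustration: wait_for_run is a hypothetical helper, the polling interval and attempt limit are arbitrary values, and it assumes that the target status text (here "download complete") appears verbatim somewhere in the output of jaws status. Check the actual output format at your site before relying on it.

```shell
# Hypothetical helper: poll `jaws status` until its output mentions the target text.
# Arguments: run ID, target text, seconds between polls, maximum number of polls.
# Assumption: the target text appears verbatim in the `jaws status` output.
wait_for_run() {
  _run_id=$1
  _target=${2:-"download complete"}
  _interval=${3:-60}
  _max_tries=${4:-120}
  _tries=0
  while [ "$_tries" -lt "$_max_tries" ]; do
    if jaws status "$_run_id" | grep -q "$_target"; then
      echo "run $_run_id reached: $_target"
      return 0
    fi
    _tries=$((_tries + 1))
    sleep "$_interval"
  done
  # A failed run never reaches the target text, so the attempt limit ends the loop.
  echo "gave up waiting for run $_run_id" >&2
  return 1
}

# Example call (not executed here): wait_for_run 35970
```

The attempt limit matters because a run that fails will never report the target status; without it the loop would not end.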

Get the results

Once the run status has changed to download complete, the files listed in the output{} workflow section will be moved to your team’s directory.

jaws status <RUN_ID> displays the output_dir for a specific run.

The output directory follows this file tree structure:

/<TEAM PATH>/<USER_ID>/<RUN_ID>/<Cromwell_ID>
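
Because the Cromwell ID is not known until the run exists, one way to find the final directory is to list what sits under the first three path components. This is a minimal sketch assuming the tree shown above; list_run_outputs is a hypothetical helper, and the team path, user ID, and run ID are values you supply yourself.

```shell
# Hypothetical helper: print the directories found under <team_path>/<user_id>/<run_id>.
# Each one is expected to be named after a Cromwell ID, per the tree shown above.
list_run_outputs() {
  _base="$1/$2/$3"
  if [ ! -d "$_base" ]; then
    echo "no output directory at $_base" >&2
    return 1
  fi
  for _dir in "$_base"/*/; do
    # Skip the unexpanded pattern when the run folder has no subdirectories.
    if [ -d "$_dir" ]; then
      echo "${_dir%/}"
    fi
  done
  return 0
}

# Example call (not executed here): list_run_outputs "$TEAM_PATH" "$USER_ID" 35970
```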

Additionally, if a run doesn’t succeed, you can download the entire Cromwell execution folder for each task that didn’t complete to the team’s directory. Use the command jaws download <RUN_ID> to achieve this.

The Output Directory Explained

Cromwell will create a directory structure similar to this…

[Figure: Cromwell output directory structure (jaws_cromwell.svg)]

Each task of your workflow runs inside its own execution directory, so that is where you can find its output files, including the stderr, stdout, and script files.

So for our example submission:

jaws submit align.wdl inputs.json dori

We should see an output folder that looks like this:

[Figure: output folder for the example submission (jaws_cromwell_1.svg)]

Further Debugging Ideas

  1. The errors.json file shows the contents of the stderr, stdout, and the metadata output created by Cromwell. It should only show content when the task results in an error.

  2. The script file is generated by Cromwell to run your task’s code (from command{}). The script contains a lot of boilerplate code, but it shows exactly what Cromwell was trying to run.

Description of Cromwell Generated Files

This is what happens on the backend:

[Figure: backend execution flow (HTCondor_Slurm.svg)]

These are the files you might find in the execution directory:

script.submit  Cromwell runs this. It creates and passes necessary files to Condor.
stdout.submit  Stdout from "script.submit"
stderr.submit  Stderr from "script.submit"
submitFile     Contains resource specifications on how Condor is to run the job.
execution.log  Log produced by Condor containing running resources.
dockerScript   Defines the shifter or singularity command that will run "script".
script         Code representing your "command{}" that is to be run on the Slurm node.
stderr         Stderr from "script"
stdout         Stdout from "script"
rc             The return code from "script"
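
When a run has many tasks or shards, checking each rc file by hand gets tedious. The sketch below scans a run folder for rc files holding a non-zero code and prints the end of the matching stderr. It relies only on the rc and stderr files described in the table above; report_failed_tasks is a hypothetical helper, and showing five lines of stderr is an arbitrary choice.

```shell
# Hypothetical helper: report every execution directory under a run folder
# whose rc file holds a non-zero return code.
report_failed_tasks() {
  _run_dir=$1
  find "$_run_dir" -type f -name rc | sort | while IFS= read -r _rc_file; do
    # Strip whitespace so a trailing newline does not affect the comparison.
    _code=$(tr -d '[:space:]' < "$_rc_file")
    if [ "$_code" != "0" ]; then
      _exec_dir=$(dirname "$_rc_file")
      echo "FAILED rc=$_code $_exec_dir"
      # Show the last lines of the task's stderr, indented, if the file exists.
      if [ -f "$_exec_dir/stderr" ]; then
        tail -n 5 "$_exec_dir/stderr" | sed 's/^/    /'
      fi
    fi
  done
}

# Example call (not executed here): report_failed_tasks /path/to/your/run/folder
```

A task that never started may have no rc file at all, so an empty report does not by itself prove that every task succeeded.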