nextflow-pipeline-debugging

Guide for analyzing pipeline output and debugging Nextflow workflows. Use this when you need to inspect channel contents, trace process execution, or analyze intermediate files.

By FrancisCrickInstitute, updated 2/27/2026

Analyzing and Debugging Pipeline Output

This guide covers techniques for analyzing pipeline output and debugging Nextflow workflows effectively.

Prerequisites

  • Nextflow pipeline execution (at least one run completed or in progress)
  • Access to the pipeline working directory and results

Table of Contents

  1. Inspecting Channel Contents
  2. Using Workflow Trace Files
  3. Analyzing the Results Folder
  4. Working with the Work Directory
  5. Common Debugging Strategies

Inspecting Channel Contents

Using .view() to Debug Channels

The .view() operator is the simplest way to inspect what's flowing through your channels during pipeline execution.

Basic usage:

// View all channel contents
ch_data.view()

// View with a custom label
ch_data.view { "Processing: $it" }

// View with structured output
ch_data.view { meta, file -> 
    "Sample: ${meta.id}, File: ${file.name}" 
}

When to use .view():

  • Debugging data structure issues (meta maps, file paths)
  • Verifying channel emissions after operators
  • Checking data flow between processes
  • Confirming multiplicity (how many items are emitted)

Example debugging scenario:

// Problem: Not sure what structure the channel has
ch_input
    .view { "Before map: $it" }  // Debug original structure
    .map { meta, bam, bai -> [meta, bam] }
    .view { "After map: $it" }   // Debug transformed structure
    .set { ch_processed }

Using Workflow Trace Files

Understanding Execution Traces

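Nextflow writes a trace file with detailed information about each task execution when tracing is enabled, either with the -with-trace command-line option or with trace.enabled = true in the configuration. nf-core pipelines enable it by default.

Typical location (nf-core convention; plain Nextflow writes the trace to the launch directory unless configured otherwise):

results/pipeline_info/execution_trace_YYYY-MM-DD_HH-MM-SS.txt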

Reading Trace Files

The trace file is tab-delimited. The exact set and order of columns depend on the trace.fields setting; available columns include:

  • task_id: Unique task identifier
  • hash: Shortened work directory hash (a prefix of work/XX/YYYYYY...)
  • name: Process name
  • status: COMPLETED, FAILED, CACHED, etc.
  • exit: Exit code (0 = success)
  • submit, start, complete: Timestamps
  • duration, realtime: Execution times
  • %cpu, %mem: Resource usage
  • rss, vmem, peak_rss, peak_vmem: Memory metrics
  • rchar, wchar: I/O metrics
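Because the column order depends on how the trace is configured, looking columns up by header name is safer than hard-coding positions. The following is a minimal sketch, assuming a tab-delimited trace with a header row; the function name trace_cols and the sample file contents are hypothetical and only serve as illustration.

```shell
#!/usr/bin/env bash
# trace_cols FILE COL [COL...]
# Print the named columns of a tab-delimited trace file, resolving each
# column position from the header row instead of hard-coding it.
trace_cols() {
    local file=$1
    shift
    awk -F'\t' -v want="$*" '
        BEGIN { n = split(want, names, " ") }
        NR == 1 {
            # Map each header name to its 1-based position
            for (i = 1; i <= NF; i++) pos[$i] = i
            for (k = 1; k <= n; k++) {
                if (!(names[k] in pos)) {
                    print "unknown column: " names[k] > "/dev/stderr"
                    exit 1
                }
            }
            next
        }
        {
            line = ""
            for (k = 1; k <= n; k++) {
                sep = (k > 1 ? "\t" : "")
                line = line sep $(pos[names[k]])
            }
            print line
        }
    ' "$file"
}

# Hypothetical sample trace, for illustration only
sample_trace=$(mktemp)
printf 'task_id\thash\tname\tstatus\texit\n' > "$sample_trace"
printf '1\tab/12cd34\tFASTQC (s1)\tCOMPLETED\t0\n' >> "$sample_trace"
printf '2\tef/56ab78\tALIGN (s2)\tFAILED\t137\n' >> "$sample_trace"

# Print hash, status, and exit code for every task
trace_cols "$sample_trace" hash status exit
```

With a real run, the same call would take the path of the trace file in place of the temporary sample.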

Finding Failed Tasks

Nextflow's terminal output reports any failed tasks together with their work directory hash. The trace file provides more detail about those tasks.

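# Find tasks that did not finish successfully (FAILED, ABORTED, ...);
# status is column 5 in the default column layout
awk -F'\t' 'FNR > 1 && $5 != "COMPLETED" && $5 != "CACHED"' results/pipeline_info/execution_trace_*.txt

# Find tasks with non-zero exit codes (exit is column 6 in the default layout);
# FNR skips the header row of every matched file
awk -F'\t' 'FNR > 1 && $6 != 0 {print $2, $3, $4, $6}' results/pipeline_info/execution_trace_*.txt

# Find the shortened work directory hash for a specific process
grep "PROCESS_NAME" results/pipeline_info/execution_trace_*.txt | awk -F'\t' '{print $2}'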
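Before reading individual rows, a quick tally of task statuses can show whether a run has any failures at all. This is a sketch under the assumption that every trace file has a header row containing a status column; the function name status_tally and the sample data are hypothetical.

```shell
#!/usr/bin/env bash
# status_tally FILE...
# Count tasks per status across one or more trace files. The status column
# is located from each file's own header row, so files with different
# column layouts can be mixed.
status_tally() {
    awk -F'\t' '
        FNR == 1 {
            # New file: find the status column in its header
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "status") col = i
            next
        }
        col > 0 { count[$col]++ }
        END { for (s in count) print s, count[s] }
    ' "$@" | sort
}

# Hypothetical sample traces with different column layouts
trace_a=$(mktemp)
printf 'task_id\thash\tnative_id\tname\tstatus\texit\n' > "$trace_a"
printf '1\tab/12cd34\t1001\tFASTQC (s1)\tCOMPLETED\t0\n' >> "$trace_a"
printf '2\tef/56ab78\t1002\tALIGN (s2)\tFAILED\t137\n' >> "$trace_a"

trace_b=$(mktemp)
printf 'task_id\tstatus\tname\n' > "$trace_b"
printf '1\tCOMPLETED\tFASTQC (s3)\n' >> "$trace_b"

status_tally "$trace_a" "$trace_b"
```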

Analyzing the Results Folder

Published Outputs

The results folder contains outputs that have been explicitly published. Use results folders to verify a tool is producing expected outputs, check for anomalies, and compare across samples.

Typical structure:

results/
├── pipeline_info/          # Trace, timeline, DAG, reports
├── [process_name]/         # Process-specific outputs
│   ├── sample1_output.txt
│   └── sample2_output.txt
└── multiqc/               # Quality control reports (if applicable)

Working with the Work Directory

Understanding the Work Directory

Each process execution creates a unique subdirectory in work/ containing:

  • Staging area: Input files (symlinks or copies)
  • Output files: All files generated by the process
  • .command.* files: Execution metadata and logs

Work directory structure:

work/
└── XX/
    └── YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY/
        ├── input_file.txt -> /path/to/actual/file
        ├── output_file.txt
        ├── .command.sh        # The actual script executed
        ├── .command.run       # Wrapper script (with container/env)
        ├── .command.out       # stdout
        ├── .command.err       # stderr
        ├── .command.log       # Combined log
        ├── .command.begin     # Marker file created when the task starts
        └── .exitcode          # Exit code

Finding the Work Directory for a Process

Method 1: Using execution trace

# Get the hash for a specific process/sample
grep "PROCESS_NAME.*sample_id" results/pipeline_info/execution_trace_*.txt | \
    awk -F'\t' '{print $2}'

# Navigate to the work directory; the trace hash (XX/YYYYYY) is only a prefix
# of the full directory name, so complete it with a glob
cd work/XX/YYYYYY*
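Since the hash recorded in the trace is a shortened prefix of the directory name, it can help to expand it explicitly and fail loudly when the prefix is ambiguous or missing. A minimal sketch, assuming the usual work/XX/YYYY... layout; the function name resolve_workdir and the demo directory are hypothetical.

```shell
#!/usr/bin/env bash
# resolve_workdir HASH [WORK_ROOT]
# Expand a shortened trace hash (e.g. ab/12cd34) to the full task directory.
# Fails unless exactly one directory matches the prefix.
resolve_workdir() {
    local hash=$1 root=${2:-work}
    local matches=()
    local dir
    for dir in "$root"/"$hash"*/; do
        if [ -d "$dir" ]; then
            matches+=("${dir%/}")
        fi
    done
    if [ "${#matches[@]}" -ne 1 ]; then
        echo "expected exactly one match for $hash, found ${#matches[@]}" >&2
        return 1
    fi
    printf '%s\n' "${matches[0]}"
}

# Hypothetical work directory, for illustration only
demo_root=$(mktemp -d)
mkdir -p "$demo_root/ab/12cd34ef567890aabbccddeeff001122"

resolve_workdir ab/12cd34 "$demo_root"
```

In a real pipeline directory the second argument can be omitted, in which case the sketch looks under work/.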

Method 2: Using Nextflow CLI

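  • Nextflow prints the shortened work directory hash of each task in its terminal output, and the error report for a failed task includes the full work directory path. For a finished run, nextflow log <run_name> -f name,status,exit,workdir lists the same information.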

Debugging with Work Directory Files

Inspect what command was run:

cat .command.sh              # The actual command
cat .command.run             # Full execution wrapper (with container)

Check outputs and errors:

cat .command.out             # Standard output
cat .command.err             # Standard error
cat .command.log             # Combined log
cat .exitcode                # Exit code (0 = success)
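The individual checks above can be bundled into one small helper that prints the exit code, the script, and the end of the error log for a task directory. This is only a sketch; the function name task_report and the demo files (including the tool name mytool) are hypothetical.

```shell
#!/usr/bin/env bash
# task_report DIR [LINES]
# Summarize one task directory: exit code, the script that was run, and the
# last LINES lines of stderr (default 20).
task_report() {
    local dir=$1 lines=${2:-20}
    if [ -f "$dir/.exitcode" ]; then
        echo "exit code: $(cat "$dir/.exitcode")"
    else
        echo "exit code: (no .exitcode file; the task may not have finished)"
    fi
    echo "--- .command.sh ---"
    cat "$dir/.command.sh"
    echo "--- last $lines lines of .command.err ---"
    tail -n "$lines" "$dir/.command.err"
}

# Hypothetical task directory, for illustration only
demo_task=$(mktemp -d)
printf '127' > "$demo_task/.exitcode"
printf '#!/bin/bash -ue\nmytool --input reads.fq\n' > "$demo_task/.command.sh"
printf '.command.sh: line 2: mytool: command not found\n' > "$demo_task/.command.err"

task_report "$demo_task" 5
```

A common next step is to re-run the task in place with bash .command.run from inside the task directory, which repeats the execution with the same wrapper and staged inputs.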

Common Debugging Strategies

  1. Start with the CLI output: Look for any error messages or failed tasks indicated in the terminal output during execution.
  2. Use .view() to inspect channels: Add .view() operators at key points in your workflow to check the structure and contents of channels.
  3. Check the execution trace: Use the trace files to find failed tasks, their work directory hashes, and resource usage patterns.
  4. Inspect the work directory: For failed tasks, navigate to the corresponding work directory and check the command scripts, outputs, and logs for clues about what went wrong.
  5. Compare outputs: If some samples succeed and others fail, compare the outputs and logs between them to identify differences that may indicate the issue.
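When no trace file is available, the work directory itself can be scanned for failed tasks. The sketch below assumes the work/XX/YYYY... layout described earlier and a find implementation that supports -mindepth and -maxdepth (GNU and BSD find both do); the function name failed_workdirs and the demo directories are hypothetical.

```shell
#!/usr/bin/env bash
# failed_workdirs [WORK_ROOT]
# List task directories whose .exitcode file holds a non-zero value,
# one per line as "<exit code><TAB><directory>".
failed_workdirs() {
    local root=${1:-work}
    local f code
    find "$root" -mindepth 3 -maxdepth 3 -name .exitcode -type f | sort |
    while IFS= read -r f; do
        code=$(cat "$f")
        if [ "$code" != "0" ]; then
            printf '%s\t%s\n' "$code" "$(dirname "$f")"
        fi
    done
}

# Hypothetical work directory with one successful and one failed task
demo_work=$(mktemp -d)
mkdir -p "$demo_work/ab/12cd34ef" "$demo_work/cd/56ab78ef"
printf '0' > "$demo_work/ab/12cd34ef/.exitcode"
printf '1' > "$demo_work/cd/56ab78ef/.exitcode"

failed_workdirs "$demo_work"
```

Tasks that were killed before writing .exitcode will not appear in this listing, so the trace file or the terminal output remains the more complete source.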