📚 Understanding nf-test: A Complete Guide

 
  
An example written with Claude Code and used in my pipeline CHARLA.


  Overview of the Test Structure

  Here's the complete test file broken down:

  nextflow_process {  // This declares a Nextflow process test

      // Test metadata - helps organize and filter tests
      name "Test Process SIMPLIFY_HEADERS"
      script "../simplify_headers.nf"      // Path to the module being tested
      process "SIMPLIFY_HEADERS"            // Name of the process in the module
      tag "modules"                         // Tags for filtering (e.g., nf-test test --tag modules)
      tag "modules_local"
      tag "simplify_headers"

      // First test case
      test("Should simplify FASTA headers") {

          when {  // Setup: Define what inputs to provide
              process {
                  """
                  input[0] = [
                      [ id: 'test_sample' ],                          // Meta map
                      file("\${projectDir}/tests/data/test_sample.fasta")  // Input file
                  ]
                  """
              }
          }

          then {  // Assertions: Check the results
              assertAll(
                  { assert process.success },                    // Did the process complete successfully?
                  { assert snapshot(process.out).match() },      // Does output match saved snapshot?
                  { assert process.out.fasta.size() == 1 },     // Is there exactly 1 output?
                  { assert process.out.fasta[0][1].toString().endsWith('.simpleheader.fa') },  // Correct filename?
                  { assert process.out.versions }                // Does versions file exist?
              )
          }
      }
  }

  ---
  ๐Ÿ” Breaking Down Each Component

  1️⃣ Test Data (tests/data/test_sample.fasta)

  >read1_test_sequence with some metadata and long header
  ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
  >read2_another_test_sequence extra=data quality=high
  GCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTA

  Why this data?
  - Headers carry extra metadata (e.g. with some metadata and long header) to exercise header simplification
  - Multiple sequences verify that the process handles more than one record correctly
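The exact simplification rule lives in simplify_headers.nf; assuming it truncates each header to its first whitespace-delimited token (a common awk one-liner for this job), the transformation can be sketched in Python. The function name and rule here are hypothetical stand-ins for the module's real logic:

```python
def simplify_headers(fasta_text: str) -> str:
    """Truncate each FASTA header to its first whitespace-delimited token.

    Hypothetical mirror of the awk logic in SIMPLIFY_HEADERS; the real
    rule is defined by the module itself.
    """
    out_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out_lines.append(line.split()[0])  # keep '>read1_test_sequence', drop metadata
        else:
            out_lines.append(line)             # sequence lines pass through unchanged
    return "\n".join(out_lines) + "\n"

fasta = (
    ">read1_test_sequence with some metadata and long header\n"
    "ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT\n"
)
print(simplify_headers(fasta))
```

Note that the rule is idempotent: running it on already-simple headers changes nothing, which is exactly what the edge-case test later in this guide relies on.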

  ---
  2️⃣ Test Input Structure

  input[0] = [
      [ id: 'test_sample' ],          // META MAP: Nextflow's way to track sample metadata
      file("${projectDir}/tests/data/test_sample.fasta")  // FILE: The actual data
  ]

  Key concepts:
  - input[0] = First input channel of the process
  - [ id: 'test_sample' ] = Meta map (always first in tuple) - carries sample ID through the pipeline
  - ${projectDir} = Critical! Points to project root so nf-test can find files in its isolated work directory
  - This matches the process signature: tuple val(meta), path(fasta)

  ---
  3️⃣ Assertion Types

  ✅ Basic Success Check

  { assert process.success }
  - Did the process exit with code 0?
  - Most fundamental check - if this fails, something is seriously wrong

  📸 Snapshot Testing

  { assert snapshot(process.out).match() }
  - Compares ALL outputs to a saved "snapshot" (the .snap file)
  - First run: Creates snapshot
  - Future runs: Compares against snapshot
  - Detects ANY change in outputs (filenames, MD5 checksums, structure)

  Snapshot file structure:
  {
      "Should simplify FASTA headers": {
          "content": [
              {
                  "fasta": [
                      [
                          { "id": "test_sample" },
                          "test_sample.simpleheader.fa:md5,91e1215e830a905d787eebb0fca483c6"
                      ]
                  ],
                  "versions": ["versions.yml:md5,4c42e069dfb4e1df8b6aadb045ab7b86"]
              }
          ]
      }
  }

  The MD5 hash 91e1215e830a905d787eebb0fca483c6 is a fingerprint of the file contents. If the output changes even slightly, the hash changes and the test fails.
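The fingerprinting idea can be reproduced with Python's standard hashlib: hash the file bytes, and any change, however small, flips the digest. The file contents below are made up for illustration:

```python
import hashlib

def md5_fingerprint(data: bytes) -> str:
    """Return the hex MD5 digest, as nf-test records it in the snapshot."""
    return hashlib.md5(data).hexdigest()

original = b">read1\nACGT\n"
modified = b">read1\nACGA\n"  # a single base changed

print(md5_fingerprint(original))
print(md5_fingerprint(modified))
# The two digests differ, so a snapshot comparison would fail.
```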

  🔢 Output Count Verification

  { assert process.out.fasta.size() == 1 }
  - Checks that exactly 1 output was produced
  - process.out.fasta = The named output channel from the module
  - Prevents accidental duplicate outputs

  ๐Ÿ“ Output Naming Check

  { assert process.out.fasta[0][1].toString().endsWith('.simpleheader.fa') }
  Breaking it down:
  - process.out.fasta = The output channel
  - [0] = First (and only) item in the channel
  - [1] = Second element of the tuple (the file; [0] would be the meta map)
  - .toString() = Convert file path to string
  - .endsWith('.simpleheader.fa') = Verify correct suffix
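The channel contents can be pictured as a nested list; this sketch mirrors the `[0][1]` indexing with plain Python lists (the work-directory path shown is invented for illustration):

```python
# Hypothetical mirror of process.out.fasta: a list of [meta, path] pairs
fasta_out = [
    [{"id": "test_sample"}, "/work/ab/12/test_sample.simpleheader.fa"],
]

item = fasta_out[0]  # first (and only) emission from the channel
meta, path = item    # index [0] is the meta map, index [1] is the file
assert meta["id"] == "test_sample"
assert str(path).endswith(".simpleheader.fa")
```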

  ✔️ Output Exists Check

  { assert process.out.versions }
  - Simple existence check
  - Passes when the channel emitted anything (an empty channel is falsy in Groovy)

  ---
  🧪 Different Test Types Explained

  Test #1: Basic Happy Path

  test("Should simplify FASTA headers") {
      when {
          process {
              """
              input[0] = [
                  [ id: 'test_sample' ],
                  file("\${projectDir}/tests/data/test_sample.fasta")
              ]
              """
          }
      }
      then {
          assertAll(
              { assert process.success },
              { assert snapshot(process.out).match() },
              { assert process.out.fasta.size() == 1 },
              { assert process.out.fasta[0][1].toString().endsWith('.simpleheader.fa') },
              { assert process.out.versions }
          )
      }
  }
  Purpose: Test the primary use case with real data

  ---
  Test #2: Dynamic Test Data

  test("Should handle already simplified headers") {
      when {
          process {
              """
              // Create test data on-the-fly
              def simple_fasta = file('test_simple.fa')
              simple_fasta.text = '>read1\\nACGTACGT\\n>read2\\nGCTAGCTA\\n'
              
              input[0] = [
                  [ id: 'test_simple' ],
                  simple_fasta
              ]
              """
          }
      }
      then { ... }
  }
  Purpose: Test edge case (already simple headers) without creating extra test files

  Why create data in the test?
  - Keeps test data close to the test logic
  - Good for edge cases that don't need permanent test files
  - Headers are already simple (>read1, >read2), so nothing should change

  ---
  Test #3: Negative Testing

  test("Should fail on invalid FASTA input") {
      when {
          process {
              """
              // Create INVALID test data
              def invalid_fasta = file('test_invalid.fa')
              invalid_fasta.text = 'This is not a FASTA file\\nNo headers here\\n'
              
              input[0] = [
                  [ id: 'test_invalid' ],
                  invalid_fasta
              ]
              """
          }
      }
      then {
          assertAll(
              { assert process.failed }  // We EXPECT failure!
          )
      }
  }
  Purpose: Verify the process correctly handles bad inputs

  Key point: process.failed instead of process.success
  - This is a negative test - we expect it to fail
  - Tests that error handling works correctly
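What counts as "invalid" depends on the tool inside the process. A minimal validity check, assuming a file is only accepted when its first non-empty line is a `>` header, might look like this (a hypothetical stand-in for whatever makes SIMPLIFY_HEADERS exit non-zero):

```python
def looks_like_fasta(text: str) -> bool:
    """Minimal check: the first non-empty line must be a '>' header.

    Hypothetical stand-in for the validation that makes the process
    fail on bad input; the real check is whatever the module's tool does.
    """
    for line in text.splitlines():
        if line.strip():
            return line.startswith(">")
    return False  # empty file: no records at all

assert looks_like_fasta(">read1\nACGT\n")
assert not looks_like_fasta("This is not a FASTA file\nNo headers here\n")
```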

  ---
  Test #4: Stub Run

  test("Should use stub run") {
      options "-stub"  // Special flag: use stub instead of real code

      when {
          process {
              """
              input[0] = [
                  [ id: 'test_stub' ],
                  file("\${projectDir}/tests/data/test_sample.fasta")
              ]
              """
          }
      }
      then {
          assertAll(
              { assert process.success },
              { assert snapshot(process.out).match() }
          )
      }
  }

  What is stub mode?
  - In the module, there's a stub: section that creates empty placeholder files
  - Used for testing pipeline logic WITHOUT running heavy computations
  - Fast! Just creates dummy outputs

  The stub section in the module:
  stub:
  """
  touch ${meta.id}.simpleheader.fa
  cat <<-END_VERSIONS > versions.yml
  "SIMPLIFY_HEADERS":
      awk: \$(awk --version | head -n1 | sed 's/^.*awk //; s/ .*$//')
  END_VERSIONS
  """

  Notice the stub snapshot has MD5 d41d8cd98f00b204e9800998ecf8427e - that's the hash of an empty file!
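You can confirm that digest yourself: it is the MD5 of zero bytes, exactly what `touch` produces in the stub.

```python
import hashlib

# MD5 of zero bytes -- the fingerprint of any file created by `touch`
print(hashlib.md5(b"").hexdigest())  # d41d8cd98f00b204e9800998ecf8427e
```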

  ---
  🎯 Running the Tests

  # Run all tests for this module
  nf-test test modules/local/tests/simplify_headers.nf.test --profile local

  # Run only tests with a specific tag
  nf-test test --tag simplify_headers

  # Run and update snapshots (use when outputs intentionally change)
  nf-test test modules/local/tests/simplify_headers.nf.test --update-snapshot

  # Run in verbose mode
  nf-test test modules/local/tests/simplify_headers.nf.test --verbose

  ---
  🚨 Common Patterns & Best Practices

  ✅ DO:

  1. Use ${projectDir} for all file paths
  2. Test both success AND failure cases
  3. Use snapshots for complex outputs
  4. Include stub tests for fast CI/CD
  5. Use descriptive test names

  ❌ DON'T:

  1. Use relative paths like 'tests/data/file.fa' (will break in nf-test's work dir)
  2. Only test the happy path (test errors too!)
  3. Forget to add checkIfExists: true if you want early file validation
  4. Make tests dependent on each other (each test should be independent)

  ---
  📊 What Happens When You Run This Test?

  1. nf-test creates an isolated work directory: .nf-test/tests/<hash>/
  2. Copies your test script there
  3. Runs Nextflow with your test inputs
  4. Collects all outputs
  5. Compares against assertions and snapshots
  6. Reports PASS/FAIL

  Output example:
  Test Process SIMPLIFY_HEADERS

    Test [e38ca34a] 'Should simplify FASTA headers' PASSED (6.478s)
    Test [606ff8a3] 'Should handle already simplified headers' PASSED (6.116s)
    Test [ced5c39d] 'Should preserve sequence count' PASSED (6.003s)
    Test [e6afcfc8] 'Should fail on invalid FASTA input' PASSED (5.805s)
    Test [def6a8a0] 'Should use stub run' PASSED (6.089s)

  SUCCESS: Executed 5 tests in 30.51s

  ---
  🎓 Key Takeaways

  1. Structure: when { } defines inputs, then { } defines expectations
  2. Meta maps: Always first in tuples, carry sample metadata through pipeline
  3. Snapshots: Automatic regression testing - detects any output changes
  4. Negative tests: Test failure modes too!
  5. Stub tests: Fast testing without heavy computation
  6. ${projectDir}: Critical for file paths in nf-test

  This test suite ensures your SIMPLIFY_HEADERS module works correctly and continues to work as you make changes!
