📚 Understanding nf-test: A Complete Guide

 
  
An example written with Claude Code and used in my pipeline CHARLA.


  Overview of the Test Structure

  Here's the complete test file broken down:

  nextflow_process {  // This declares a Nextflow process test

      // Test metadata - helps organize and filter tests
      name "Test Process SIMPLIFY_HEADERS"
      script "../simplify_headers.nf"      // Path to the module being tested
      process "SIMPLIFY_HEADERS"            // Name of the process in the module
      tag "modules"                         // Tags for filtering (e.g., nf-test test --tag modules)
      tag "modules_local"
      tag "simplify_headers"

      // First test case
      test("Should simplify FASTA headers") {

          when {  // Setup: Define what inputs to provide
              process {
                  """
                  input[0] = [
                      [ id: 'test_sample' ],                          // Meta map
                      file("\${projectDir}/tests/data/test_sample.fasta")  // Input file
                  ]
                  """
              }
          }

          then {  // Assertions: Check the results
              assertAll(
                  { assert process.success },                    // Did the process complete successfully?
                  { assert snapshot(process.out).match() },      // Does output match saved snapshot?
                  { assert process.out.fasta.size() == 1 },     // Is there exactly 1 output?
                  { assert process.out.fasta[0][1].toString().endsWith('.simpleheader.fa') },  // Correct filename?
                  { assert process.out.versions }                // Does versions file exist?
              )
          }
      }
  }

  ---
  ๐Ÿ” Breaking Down Each Component

  1️⃣ Test Data (tests/data/test_sample.fasta)

  >read1_test_sequence with some metadata and long header
  ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
  >read2_another_test_sequence extra=data quality=high
  GCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTA

  Why this data?
  - Headers carry extra metadata (e.g. with some metadata and long header) to exercise header simplification
  - Multiple sequences verify that the process handles more than one record correctly
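The exact simplification rule lives in simplify_headers.nf; assuming it truncates each header to its first whitespace-delimited token (a common awk one-liner for this job), the transformation can be sketched in Python. The function name and rule here are hypothetical stand-ins for the module's real logic:

```python
def simplify_headers(fasta_text: str) -> str:
    """Truncate each FASTA header to its first whitespace-delimited token.

    Hypothetical mirror of the awk logic in SIMPLIFY_HEADERS; the real
    rule is defined by the module itself.
    """
    out_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out_lines.append(line.split()[0])  # keep '>read1_test_sequence', drop metadata
        else:
            out_lines.append(line)             # sequence lines pass through unchanged
    return "\n".join(out_lines) + "\n"

fasta = (
    ">read1_test_sequence with some metadata and long header\n"
    "ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT\n"
)
print(simplify_headers(fasta))
```

Note that the rule is idempotent: running it on already-simple headers changes nothing, which is exactly what the edge-case test later in this guide relies on.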

  ---
  2️⃣ Test Input Structure

  input[0] = [
      [ id: 'test_sample' ],          // META MAP: Nextflow's way to track sample metadata
      file("${projectDir}/tests/data/test_sample.fasta")  // FILE: The actual data
  ]

  Key concepts:
  - input[0] = First input channel of the process
  - [ id: 'test_sample' ] = Meta map (always first in tuple) - carries sample ID through the pipeline
  - ${projectDir} = Critical! Points to project root so nf-test can find files in its isolated work directory
  - This matches the process signature: tuple val(meta), path(fasta)

  ---
  3️⃣ Assertion Types

  ✅ Basic Success Check

  { assert process.success }
  - Did the process exit with code 0?
  - Most fundamental check - if this fails, something is seriously wrong

  📸 Snapshot Testing

  { assert snapshot(process.out).match() }
  - Compares ALL outputs to a saved "snapshot" (the .snap file)
  - First run: Creates snapshot
  - Future runs: Compares against snapshot
  - Detects ANY change in outputs (filenames, MD5 checksums, structure)

  Snapshot file structure:
  {
      "Should simplify FASTA headers": {
          "content": [
              {
                  "fasta": [
                      [
                          { "id": "test_sample" },
                          "test_sample.simpleheader.fa:md5,91e1215e830a905d787eebb0fca483c6"
                      ]
                  ],
                  "versions": ["versions.yml:md5,4c42e069dfb4e1df8b6aadb045ab7b86"]
              }
          ]
      }
  }

  The MD5 hash 91e1215e830a905d787eebb0fca483c6 is a fingerprint of the file contents. If the output changes even slightly, the hash changes and the test fails.
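The fingerprinting idea can be reproduced with Python's standard hashlib: hash the file bytes, and any change, however small, flips the digest. The file contents below are made up for illustration:

```python
import hashlib

def md5_fingerprint(data: bytes) -> str:
    """Return the hex MD5 digest, as nf-test records it in the snapshot."""
    return hashlib.md5(data).hexdigest()

original = b">read1\nACGT\n"
modified = b">read1\nACGA\n"  # a single base changed

print(md5_fingerprint(original))
print(md5_fingerprint(modified))
# The two digests differ, so a snapshot comparison would fail.
```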

  🔢 Output Count Verification

  { assert process.out.fasta.size() == 1 }
  - Checks that exactly 1 output was produced
  - process.out.fasta = The named output channel from the module
  - Prevents accidental duplicate outputs

  ๐Ÿ“ Output Naming Check

  { assert process.out.fasta[0][1].toString().endsWith('.simpleheader.fa') }
  Breaking it down:
  - process.out.fasta = The output channel
  - [0] = First (and only) item in the channel
  - [1] = Second element of the tuple (the file; [0] would be the meta map)
  - .toString() = Convert file path to string
  - .endsWith('.simpleheader.fa') = Verify correct suffix
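The channel contents can be pictured as a nested list; this sketch mirrors the `[0][1]` indexing with plain Python lists (the work-directory path shown is invented for illustration):

```python
# Hypothetical mirror of process.out.fasta: a list of [meta, path] pairs
fasta_out = [
    [{"id": "test_sample"}, "/work/ab/12/test_sample.simpleheader.fa"],
]

item = fasta_out[0]  # first (and only) emission from the channel
meta, path = item    # index [0] is the meta map, index [1] is the file
assert meta["id"] == "test_sample"
assert str(path).endswith(".simpleheader.fa")
```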

  ✔️ Output Exists Check

  { assert process.out.versions }
  - Simple existence check
  - Passes when the channel emitted anything (an empty channel is falsy in Groovy)

  ---
  🧪 Different Test Types Explained

  Test #1: Basic Happy Path

  test("Should simplify FASTA headers") {
      when {
          process {
              """
              input[0] = [
                  [ id: 'test_sample' ],
                  file("\${projectDir}/tests/data/test_sample.fasta")
              ]
              """
          }
      }
      then {
          assertAll(
              { assert process.success },
              { assert snapshot(process.out).match() },
              { assert process.out.fasta.size() == 1 },
              { assert process.out.fasta[0][1].toString().endsWith('.simpleheader.fa') },
              { assert process.out.versions }
          )
      }
  }
  Purpose: Test the primary use case with real data

  ---
  Test #2: Dynamic Test Data

  test("Should handle already simplified headers") {
      when {
          process {
              """
              // Create test data on-the-fly
              def simple_fasta = file('test_simple.fa')
              simple_fasta.text = '>read1\\nACGTACGT\\n>read2\\nGCTAGCTA\\n'
              
              input[0] = [
                  [ id: 'test_simple' ],
                  simple_fasta
              ]
              """
          }
      }
      then { ... }
  }
  Purpose: Test edge case (already simple headers) without creating extra test files

  Why create data in the test?
  - Keeps test data close to the test logic
  - Good for edge cases that don't need permanent test files
  - Headers are already simple (>read1, >read2), so nothing should change

  ---
  Test #3: Negative Testing

  test("Should fail on invalid FASTA input") {
      when {
          process {
              """
              // Create INVALID test data
              def invalid_fasta = file('test_invalid.fa')
              invalid_fasta.text = 'This is not a FASTA file\\nNo headers here\\n'
              
              input[0] = [
                  [ id: 'test_invalid' ],
                  invalid_fasta
              ]
              """
          }
      }
      then {
          assertAll(
              { assert process.failed }  // We EXPECT failure!
          )
      }
  }
  Purpose: Verify the process correctly handles bad inputs

  Key point: process.failed instead of process.success
  - This is a negative test - we expect it to fail
  - Tests that error handling works correctly
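What counts as "invalid" depends on the tool inside the process. A minimal validity check, assuming a file is only accepted when its first non-empty line is a `>` header, might look like this (a hypothetical stand-in for whatever makes SIMPLIFY_HEADERS exit non-zero):

```python
def looks_like_fasta(text: str) -> bool:
    """Minimal check: the first non-empty line must be a '>' header.

    Hypothetical stand-in for the validation that makes the process
    fail on bad input; the real check is whatever the module's tool does.
    """
    for line in text.splitlines():
        if line.strip():
            return line.startswith(">")
    return False  # empty file: no records at all

assert looks_like_fasta(">read1\nACGT\n")
assert not looks_like_fasta("This is not a FASTA file\nNo headers here\n")
```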

  ---
  Test #4: Stub Run

  test("Should use stub run") {
      options "-stub"  // Special flag: use stub instead of real code

      when {
          process {
              """
              input[0] = [
                  [ id: 'test_stub' ],
                  file("\${projectDir}/tests/data/test_sample.fasta")
              ]
              """
          }
      }
      then {
          assertAll(
              { assert process.success },
              { assert snapshot(process.out).match() }
          )
      }
  }

  What is stub mode?
  - In the module, there's a stub: section that creates empty placeholder files
  - Used for testing pipeline logic WITHOUT running heavy computations
  - Fast! Just creates dummy outputs

  The stub section in the module:
  stub:
  """
  touch ${meta.id}.simpleheader.fa
  cat <<-END_VERSIONS > versions.yml
  "SIMPLIFY_HEADERS":
      awk: \$(awk --version | head -n1 | sed 's/^.*awk //; s/ .*$//')
  END_VERSIONS
  """

  Notice the stub snapshot has MD5 d41d8cd98f00b204e9800998ecf8427e - that's the hash of an empty file!
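You can confirm that digest yourself: it is the MD5 of zero bytes, exactly what `touch` produces in the stub.

```python
import hashlib

# MD5 of zero bytes -- the fingerprint of any file created by `touch`
print(hashlib.md5(b"").hexdigest())  # d41d8cd98f00b204e9800998ecf8427e
```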

  ---
  🎯 Running the Tests

  # Run all tests for this module
  nf-test test modules/local/tests/simplify_headers.nf.test --profile local

  # Run only tests with a specific tag
  nf-test test --tag simplify_headers

  # Run and update snapshots (use when outputs intentionally change)
  nf-test test modules/local/tests/simplify_headers.nf.test --update-snapshot

  # Run in verbose mode
  nf-test test modules/local/tests/simplify_headers.nf.test --verbose

  ---
  🚨 Common Patterns & Best Practices

  ✅ DO:

  1. Use ${projectDir} for all file paths
  2. Test both success AND failure cases
  3. Use snapshots for complex outputs
  4. Include stub tests for fast CI/CD
  5. Use descriptive test names

  ❌ DON'T:

  1. Use relative paths like 'tests/data/file.fa' (will break in nf-test's work dir)
  2. Only test the happy path (test errors too!)
  3. Forget to add checkIfExists: true if you want early file validation
  4. Make tests dependent on each other (each test should be independent)

  ---
  📊 What Happens When You Run This Test?

  1. nf-test creates an isolated work directory: .nf-test/tests/<hash>/
  2. Copies your test script there
  3. Runs Nextflow with your test inputs
  4. Collects all outputs
  5. Compares against assertions and snapshots
  6. Reports PASS/FAIL

  Output example:
  Test Process SIMPLIFY_HEADERS

    Test [e38ca34a] 'Should simplify FASTA headers' PASSED (6.478s)
    Test [606ff8a3] 'Should handle already simplified headers' PASSED (6.116s)
    Test [ced5c39d] 'Should preserve sequence count' PASSED (6.003s)
    Test [e6afcfc8] 'Should fail on invalid FASTA input' PASSED (5.805s)
    Test [def6a8a0] 'Should use stub run' PASSED (6.089s)

  SUCCESS: Executed 5 tests in 30.51s

  ---
  🎓 Key Takeaways

  1. Structure: when { } defines inputs, then { } defines expectations
  2. Meta maps: Always first in tuples, carry sample metadata through pipeline
  3. Snapshots: Automatic regression testing - detects any output changes
  4. Negative tests: Test failure modes too!
  5. Stub tests: Fast testing without heavy computation
  6. ${projectDir}: Critical for file paths in nf-test

  This test suite ensures your SIMPLIFY_HEADERS module works correctly and continues to work as you make changes!
