| 1 | +# Process Entry Execution - Implementation Summary |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document describes the implementation of the process entry execution feature in Nextflow, which allows users to execute individual processes directly from the command line without writing explicit workflow definitions. |
| 6 | + |
| 7 | +## Features Implemented |
| 8 | + |
| 9 | +### 1. Automatic Single Process Execution |
| 10 | +- **Usage**: `nextflow run script.nf --param value` |
| 11 | +- **Behavior**: A script that contains exactly one process and no workflow definition runs that process automatically |
| 12 | +- **Implementation**: `createSingleProcessWorkflow()` in `BaseScript.groovy` |
| 13 | + |
| 14 | +### 2. Multi-Process Entry Selection |
| 15 | +- **Usage**: `nextflow run script.nf -entry process:NAME --param value` |
| 16 | +- **Behavior**: A script that contains multiple processes and no workflow definition requires an explicit process selection |
| 17 | +- **Implementation**: Enhanced `-entry` option parsing in the `run0()` method |
| 18 | + |
| 19 | +### 3. Command-Line Parameter Mapping |
| 20 | +- **Feature**: Automatic mapping of `--param value` arguments to process input channels |
| 21 | +- **Supported Types**: `val`, `path`, `env`, `tuple`, `each` |
| 22 | +- **Implementation**: `createProcessInputChannelsWithMapping()` and related methods |
| 23 | + |
| 24 | +## Architecture |
| 25 | + |
| 26 | +### Core Components |
| 27 | + |
| 28 | +#### 1. Process Entry Detection (`BaseScript.groovy:180-205`) |
| 29 | +```groovy |
| 30 | +// Enhanced entry parsing to support process:NAME syntax |
| 31 | +if( binding.entryName.startsWith('process:') ) { |
| 32 | +    final processName = binding.entryName.substring(8) |
| 33 | +    final processDef = meta.getProcess(processName) |
| 34 | +    entryFlow = createProcessEntryWorkflow(processDef) |
| 35 | +} |
| 36 | +``` |
| 37 | + |
| 38 | +#### 2. Parameter Mapping Pipeline |
| 39 | +The parameter mapping follows a three-step pipeline: |
| 40 | + |
| 41 | +**Step 1: Input Definition Extraction** (`extractProcessInputDefinitions()`) |
| 42 | +- Clones the process body closure to avoid side effects |
| 43 | +- Uses `ProcessInputExtractionDelegate` to intercept internal Nextflow DSL calls |
| 44 | +- Captures `_in_val(TokenVar(name))` calls generated by the compiler |
| 45 | +- Returns structured input specifications: `[type: 'val', name: 'paramName']` |
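The interception idea in Step 1 can be illustrated in plain Groovy, without Nextflow on the classpath. This is a minimal sketch, not the code in `BaseScript.groovy`: `FakeTokenVar`, `InputCaptureDelegate`, and `extractInputs` are hypothetical names, and `methodMissing` is assumed here as the capture mechanism, whereas the actual delegate may declare `_in_val()`, `_in_path()`, and the other methods explicitly.

```groovy
// Hypothetical stand-in for the TokenVar objects emitted by the compiler
class FakeTokenVar {
    String name
    FakeTokenVar(String name) { this.name = name }
}

// Hypothetical delegate: records every `_in_<type>(...)` call instead of executing it
class InputCaptureDelegate {
    final List<Map> inputs = []

    def methodMissing(String name, args) {
        final argList = args as Object[]
        if( name.startsWith('_in_') && argList.length > 0 ) {
            final arg = argList[0]
            final inputName = arg instanceof FakeTokenVar ? arg.name : arg.toString()
            inputs << [type: name.substring(4), name: inputName]
        }
        return null
    }
}

List<Map> extractInputs(Closure body) {
    // Work on a clone so the original closure keeps its own delegate
    final copy = (Closure) body.clone()
    final capture = new InputCaptureDelegate()
    copy.delegate = capture
    copy.resolveStrategy = Closure.DELEGATE_FIRST
    copy.call()
    return capture.inputs
}

// A hand-written stand-in for a compiled process body
def body = { ->
    _in_val(new FakeTokenVar('sampleName'))
    _in_path(new FakeTokenVar('inputFile'))
}
def specs = extractInputs(body)
println specs
```

Because the body runs against a clone with its own delegate, the original closure is left untouched, which is the side-effect isolation the first bullet describes.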
| 46 | + |
| 47 | +**Step 2: Parameter-to-Channel Mapping** (`mapParametersToChannels()`) |
| 48 | +- Iterates through input specifications using traditional for loops |
| 49 | +- Looks up parameter values in `session.params` (populated from command-line) |
| 50 | +- Creates appropriate Nextflow channels based on input type |
| 51 | + |
| 52 | +**Step 3: Channel Creation** (`createChannelForInputType()`) |
| 53 | +- Converts parameter values to typed Nextflow channels |
| 54 | +- Handles type-specific logic (path validation, collection handling, etc.) |
| 55 | +- Provides fallback behavior for missing parameters |
| 56 | + |
| 57 | +#### 3. Synthetic Workflow Generation |
| 58 | +Both single and multi-process execution create synthetic workflows: |
| 59 | +```groovy |
| 60 | +def workflowLogic = { -> |
| 61 | +    def inputChannels = createProcessInputChannelsWithMapping(processDef) |
| 62 | +    this.invokeMethod(processName, inputChannels) |
| 63 | +} |
| 64 | +``` |
| 65 | + |
| 66 | +### Key Classes |
| 67 | + |
| 68 | +#### ProcessInputExtractionDelegate |
| 69 | +- **Purpose**: Intercepts compiled process body execution to extract input definitions |
| 70 | +- **Key Methods**: `_in_val()`, `_in_path()`, `_in_env()`, etc. |
| 71 | +- **Design**: Uses method interception to capture TokenVar objects from compiled DSL |
| 72 | + |
| 73 | +#### TraditionalInputParsingDelegate |
| 74 | +- **Purpose**: Handles explicit `input { }` block declarations (legacy support) |
| 75 | +- **Usage**: Less common in modern Nextflow code |
| 76 | +- **Design**: Direct method mapping for input type declarations |
| 77 | + |
| 78 | +## Technical Details |
| 79 | + |
| 80 | +### Parameter Type Handling |
| 81 | + |
| 82 | +| Input Type | Command-Line Example | Channel Creation | Special Handling | |
| 83 | +|------------|---------------------|------------------|------------------| |
| 84 | +| `val name` | `--name "value"` | `Channel.of(paramValue)` | Direct value wrapping | |
| 85 | +| `path file` | `--file "input.txt"` | `Channel.of(Paths.get(paramValue))` | Path conversion + existence warning | |
| 86 | +| `env var` | `--var "VALUE"` | `Channel.of(paramValue)` | Optional parameter support | |
| 87 | +| `tuple items` | `--items "a,b,c"` | `Channel.of([paramValue])` | Collection wrapping | |
| 88 | +| `each item` | `--item "a,b,c"` | `Channel.fromIterable(split(','))` | Comma-separated parsing | |
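The table can be read as a single dispatch on the input type. The sketch below mirrors it with plain lists in place of Nextflow channels, so it runs without Nextflow. `valuesForInput` and `toPath` are hypothetical helpers, the warning goes to standard error rather than the Nextflow logger, and the missing-parameter message only approximates the ones listed under Error Handling.

```groovy
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths

// Convert a raw parameter to a Path, warning (not failing) if it does not exist
Path toPath(String raw) {
    final Path path = Paths.get(raw)
    if( !Files.exists(path) )
        System.err.println("WARN: path does not exist: ${path}")
    return path
}

// Returns the items a channel would emit for one process input.
// A List stands in for the channel; this is an assumption made for illustration.
List valuesForInput(String type, String name, Map params) {
    final value = params[name]
    if( value == null )
        throw new IllegalArgumentException("Missing required ${type} parameter: --${name}")
    switch( type ) {
        case 'path':
            return [toPath(value.toString())]
        case 'each':
            // comma-separated parsing: one emitted item per element
            return value.toString().split(',').collect { it.trim() }
        case 'tuple':
            // collection wrapping: the whole value is emitted as a single item
            return [[value]]
        default:
            // 'val' and 'env': direct value wrapping
            return [value]
    }
}

println valuesForInput('each', 'item', [item: 'a,b,c'])
```

Note the difference between the last two table rows: `each` splits the value into separate items, while `tuple` wraps the unsplit value once.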
| 89 | + |
| 90 | +### Error Handling |
| 91 | + |
| 92 | +#### Process Selection Errors |
| 93 | +```groovy |
| 94 | +// Unknown process name with suggestions |
| 95 | +def guess = allProcessNames.closest(processName) |
| 96 | +throw new IllegalArgumentException("Unknown process entry name: ${processName} -- Did you mean?\n${guess}") |
| 97 | +``` |
| 98 | + |
| 99 | +#### Missing Parameter Errors |
| 100 | +```groovy |
| 101 | +// Type-specific error messages |
| 102 | +throw new IllegalArgumentException("Missing required value parameter: --${paramName}") |
| 103 | +``` |
| 104 | + |
| 105 | +#### Multiple Process Detection |
| 106 | +```groovy |
| 107 | +// Clear guidance for multi-process scripts |
| 108 | +throw new AbortOperationException("Multiple processes found (${processNames}). Use -entry process:NAME to specify which process to execute.") |
| 109 | +``` |
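Taken together, the selection rules and their error cases amount to a small decision function. The following is a hedged sketch rather than the logic in `run0()`: `selectProcess` is a hypothetical name, the script is assumed to define no workflows, `IllegalStateException` replaces `AbortOperationException` so the example stays self-contained, and the "Did you mean?" suggestion is omitted.

```groovy
// Decide which process to run from the declared process names and the -entry value
String selectProcess(List<String> processNames, String entryName) {
    final prefix = 'process:'
    if( entryName?.startsWith(prefix) ) {
        final name = entryName.substring(prefix.length())
        if( !(name in processNames) )
            throw new IllegalArgumentException("Unknown process entry name: ${name}")
        return name
    }
    // No explicit entry: only unambiguous when exactly one process exists
    if( processNames.size() == 1 )
        return processNames[0]
    throw new IllegalStateException(
        "Multiple processes found (${processNames.join(', ')}). " +
        "Use -entry process:NAME to specify which process to execute.")
}

println selectProcess(['processA', 'processB'], 'process:processA')
```

Deriving the offset from `prefix.length()` keeps the substring index in sync with the prefix if the syntax ever changes.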
| 110 | + |
| 111 | +### Performance Optimizations |
| 112 | + |
| 113 | +1. **Traditional for loops** instead of iterators for parameter processing |
| 114 | +2. **Process body cloning** to avoid side effects during input extraction |
| 115 | +3. **Lazy evaluation** of parameter mapping (only when needed) |
| 116 | +4. **Minimal object allocation** in hot paths |
| 117 | + |
| 118 | +## Usage Examples |
| 119 | + |
| 120 | +### Single Process Script |
| 121 | +```groovy |
| 122 | +#!/usr/bin/env nextflow |
| 123 | + |
| 124 | +process analyzeData { |
| 125 | +    input: |
| 126 | +    val sampleName |
| 127 | +    path inputFile |
| 128 | + |
| 129 | +    script: |
| 130 | +    """ |
| 131 | +    echo "Analyzing ${sampleName} from ${inputFile}" |
| 132 | +    """ |
| 133 | +} |
| 134 | +``` |
| 135 | + |
| 136 | +**Execution**: `nextflow run analyze.nf --sampleName "sample1" --inputFile "data.txt"` |
| 137 | + |
| 138 | +### Multi-Process Script |
| 139 | +```groovy |
| 140 | +#!/usr/bin/env nextflow |
| 141 | + |
| 142 | +process processA { |
| 143 | +    input: val name |
| 144 | +    script: "echo Processing A: ${name}" |
| 145 | +} |
| 146 | + |
| 147 | +process processB { |
| 148 | +    input: val name |
| 149 | +    script: "echo Processing B: ${name}" |
| 150 | +} |
| 151 | +``` |
| 152 | + |
| 153 | +**Execution**: `nextflow run multi.nf -entry process:processA --name "test"` |
| 154 | + |
| 155 | +## Integration Points |
| 156 | + |
| 157 | +### ScriptMeta.groovy Enhancements |
| 158 | +- `hasSingleExecutableProcess()`: Detects single process + no workflows |
| 159 | +- `hasMultipleExecutableProcesses()`: Detects multiple processes + no workflows |
| 160 | + |
| 161 | +### BaseScript.groovy Structure |
| 162 | +``` |
| 163 | +run0() - Entry point detection and workflow selection |
| 164 | +├── createSingleProcessWorkflow() - Single process execution |
| 165 | +├── createProcessEntryWorkflow() - Multi-process execution |
| 166 | +└── createProcessInputChannelsWithMapping() - Parameter mapping pipeline |
| 167 | +    ├── extractProcessInputDefinitions() - Input extraction |
| 168 | +    ├── mapParametersToChannels() - Parameter-to-channel mapping |
| 169 | +    ├── createChannelForInputType() - Channel creation |
| 170 | +    └── createDefaultChannelForInputType() - Error handling |
| 171 | +``` |
| 172 | + |
| 173 | +## Testing |
| 174 | + |
| 175 | +The implementation is covered by the following tests: |
| 176 | + |
| 177 | +1. **Single Process Tests**: `single_process_test.nf` |
| 178 | +2. **Multi-Process Tests**: `complete_param_mapping_test.nf` |
| 179 | +3. **Parameter Type Tests**: Various input types (val, path, tuple, each) |
| 180 | +4. **Error Condition Tests**: Missing parameters, invalid process names |
| 181 | +5. **Edge Case Tests**: Empty processes, complex parameter combinations |
| 182 | + |
| 183 | +## Future Enhancements |
| 184 | + |
| 185 | +### Potential Improvements |
| 186 | +1. **Complex tuple support**: Multi-element tuples with mixed types |
| 187 | +2. **Parameter validation**: Type checking and constraints |
| 188 | +3. **Default parameter values**: Support for optional parameters with defaults |
| 189 | +4. **Configuration integration**: Process-specific parameter configuration |
| 190 | +5. **Interactive mode**: Parameter prompting for missing values |
| 191 | + |
| 192 | +### Performance Considerations |
| 193 | +1. **Input caching**: Cache extracted input definitions for repeated use |
| 194 | +2. **Compilation optimization**: Pre-compile parameter mapping logic |
| 195 | +3. **Memory efficiency**: Reduce object allocation in parameter processing |
| 196 | + |
| 197 | +This implementation lets individual Nextflow processes be run directly from the command line, in the manner of standalone tools, while remaining compatible with existing workflow-based scripts. |