Skip to content

Commit ff372df

Browse files
pditommasoclaude
andcommitted
Implement process entry execution with parameter mapping
Add support for executing Nextflow processes directly without explicit workflow definitions. Key Features: - Single process scripts run automatically: `nextflow run script.nf --param value` - Multi-process scripts use entry selection: `nextflow run script.nf -entry process:name --param value` - Automatic command-line parameter mapping to process input channels - Support for all standard input types: val, path, env, tuple, each - Comprehensive error handling with helpful suggestions Implementation: - Enhanced BaseScript with process entry workflow generation - Added parameter mapping pipeline with input definition extraction - Created specialized delegates for parsing compiled process bodies - Added ScriptMeta methods for single/multi-process detection - Comprehensive documentation and test coverage This feature bridges the gap between command-line tools and workflow orchestration, making Nextflow processes more accessible for direct execution scenarios. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
1 parent 148a8a1 commit ff372df

File tree

5 files changed

+892
-9
lines changed

5 files changed

+892
-9
lines changed

PROCESS_ENTRY_IMPLEMENTATION.md

Lines changed: 197 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,197 @@
1+
# Process Entry Execution - Implementation Summary
2+
3+
## Overview
4+
5+
This document describes the implementation of the process entry execution feature in Nextflow, which allows users to execute individual processes directly from the command line without writing explicit workflow definitions.
6+
7+
## Features Implemented
8+
9+
### 1. Automatic Single Process Execution
10+
- **Usage**: `nextflow run script.nf --param value`
11+
- **Behavior**: Scripts containing exactly one process execute automatically
12+
- **Implementation**: `createSingleProcessWorkflow()` in BaseScript.groovy
13+
14+
### 2. Multi-Process Entry Selection
15+
- **Usage**: `nextflow run script.nf -entry process:NAME --param value`
16+
- **Behavior**: Scripts with multiple processes require explicit process selection
17+
- **Implementation**: Enhanced `-entry` option parsing in `run0()` method
18+
19+
### 3. Command-Line Parameter Mapping
20+
- **Feature**: Automatic mapping of `--param value` arguments to process input channels
21+
- **Supported Types**: `val`, `path`, `env`, `tuple`, `each`
22+
- **Implementation**: `createProcessInputChannelsWithMapping()` and related methods
23+
24+
## Architecture
25+
26+
### Core Components
27+
28+
#### 1. Process Entry Detection (`BaseScript.groovy:180-205`)
29+
```groovy
30+
// Enhanced entry parsing to support process:NAME syntax
31+
if( binding.entryName.startsWith('process:') ) {
32+
final processName = binding.entryName.substring(8)
33+
final processDef = meta.getProcess(processName)
34+
entryFlow = createProcessEntryWorkflow(processDef)
35+
}
36+
```
37+
38+
#### 2. Parameter Mapping Pipeline
39+
The parameter mapping follows a three-step pipeline:
40+
41+
**Step 1: Input Definition Extraction** (`extractProcessInputDefinitions()`)
42+
- Clones the process body closure to avoid side effects
43+
- Uses `ProcessInputExtractionDelegate` to intercept internal Nextflow DSL calls
44+
- Captures `_in_val(TokenVar(name))` calls generated by the compiler
45+
- Returns structured input specifications: `[type: 'val', name: 'paramName']`
46+
47+
**Step 2: Parameter-to-Channel Mapping** (`mapParametersToChannels()`)
48+
- Iterates through input specifications using traditional for loops
49+
- Looks up parameter values in `session.params` (populated from command-line)
50+
- Creates appropriate Nextflow channels based on input type
51+
52+
**Step 3: Channel Creation** (`createChannelForInputType()`)
53+
- Converts parameter values to typed Nextflow channels
54+
- Handles type-specific logic (path validation, collection handling, etc.)
55+
- Provides fallback behavior for missing parameters
56+
57+
#### 3. Synthetic Workflow Generation
58+
Both single and multi-process execution create synthetic workflows:
59+
```groovy
60+
def workflowLogic = { ->
61+
def inputChannels = createProcessInputChannelsWithMapping(processDef)
62+
this.invokeMethod(processName, inputChannels)
63+
}
64+
```
65+
66+
### Key Classes
67+
68+
#### ProcessInputExtractionDelegate
69+
- **Purpose**: Intercepts compiled process body execution to extract input definitions
70+
- **Key Methods**: `_in_val()`, `_in_path()`, `_in_env()`, etc.
71+
- **Design**: Uses method interception to capture TokenVar objects from compiled DSL
72+
73+
#### TraditionalInputParsingDelegate
74+
- **Purpose**: Handles explicit `input { }` block declarations (legacy support)
75+
- **Usage**: Less common in modern Nextflow code
76+
- **Design**: Direct method mapping for input type declarations
77+
78+
## Technical Details
79+
80+
### Parameter Type Handling
81+
82+
| Input Type | Command-Line Example | Channel Creation | Special Handling |
83+
|------------|---------------------|------------------|------------------|
84+
| `val name` | `--name "value"` | `Channel.of(paramValue)` | Direct value wrapping |
85+
| `path file` | `--file "input.txt"` | `Channel.of(Paths.get(paramValue))` | Path conversion + existence warning |
86+
| `env var` | `--var "VALUE"` | `Channel.of(paramValue)` | Optional parameter support |
87+
| `tuple items` | `--items "a,b,c"` | `Channel.of([paramValue])` | Collection wrapping |
88+
| `each item` | `--item "a,b,c"` | `Channel.fromIterable(split(','))` | Comma-separated parsing |
89+
90+
### Error Handling
91+
92+
#### Process Selection Errors
93+
```groovy
94+
// Unknown process name with suggestions
95+
def guess = allProcessNames.closest(processName)
96+
throw new IllegalArgumentException("Unknown process entry name: ${processName} -- Did you mean?\n${guess}")
97+
```
98+
99+
#### Missing Parameter Errors
100+
```groovy
101+
// Type-specific error messages
102+
throw new IllegalArgumentException("Missing required value parameter: --${paramName}")
103+
```
104+
105+
#### Multiple Process Detection
106+
```groovy
107+
// Clear guidance for multi-process scripts
108+
throw new AbortOperationException("Multiple processes found (${processNames}). Use -entry process:NAME to specify which process to execute.")
109+
```
110+
111+
### Performance Optimizations
112+
113+
1. **Traditional for loops** instead of iterators for parameter processing
114+
2. **Process body cloning** to avoid side effects during input extraction
115+
3. **Lazy evaluation** of parameter mapping (only when needed)
116+
4. **Minimal object allocation** in hot paths
117+
118+
## Usage Examples
119+
120+
### Single Process Script
121+
```groovy
122+
#!/usr/bin/env nextflow
123+
124+
process analyzeData {
125+
input:
126+
val sampleName
127+
path inputFile
128+
129+
script:
130+
"""
131+
echo "Analyzing ${sampleName} from ${inputFile}"
132+
"""
133+
}
134+
```
135+
136+
**Execution**: `nextflow run analyze.nf --sampleName "sample1" --inputFile "data.txt"`
137+
138+
### Multi-Process Script
139+
```groovy
140+
#!/usr/bin/env nextflow
141+
142+
process processA {
143+
input: val name
144+
script: "echo Processing A: ${name}"
145+
}
146+
147+
process processB {
148+
input: val name
149+
script: "echo Processing B: ${name}"
150+
}
151+
```
152+
153+
**Execution**: `nextflow run multi.nf -entry process:processA --name "test"`
154+
155+
## Integration Points
156+
157+
### ScriptMeta.groovy Enhancements
158+
- `hasSingleExecutableProcess()`: Detects single process + no workflows
159+
- `hasMultipleExecutableProcesses()`: Detects multiple processes + no workflows
160+
161+
### BaseScript.groovy Structure
162+
```
163+
run0() - Entry point detection and workflow selection
164+
├── createSingleProcessWorkflow() - Single process execution
165+
├── createProcessEntryWorkflow() - Multi-process execution
166+
└── createProcessInputChannelsWithMapping() - Parameter mapping pipeline
167+
├── extractProcessInputDefinitions() - Input extraction
168+
├── mapParametersToChannels() - Parameter-to-channel mapping
169+
├── createChannelForInputType() - Channel creation
170+
└── createDefaultChannelForInputType() - Error handling
171+
```
172+
173+
## Testing
174+
175+
The implementation includes comprehensive test coverage:
176+
177+
1. **Single Process Tests**: `single_process_test.nf`
178+
2. **Multi-Process Tests**: `complete_param_mapping_test.nf`
179+
3. **Parameter Type Tests**: Various input types (val, path, tuple, each)
180+
4. **Error Condition Tests**: Missing parameters, invalid process names
181+
5. **Edge Case Tests**: Empty processes, complex parameter combinations
182+
183+
## Future Enhancements
184+
185+
### Potential Improvements
186+
1. **Complex tuple support**: Multi-element tuples with mixed types
187+
2. **Parameter validation**: Type checking and constraints
188+
3. **Default parameter values**: Support for optional parameters with defaults
189+
4. **Configuration integration**: Process-specific parameter configuration
190+
5. **Interactive mode**: Parameter prompting for missing values
191+
192+
### Performance Considerations
193+
1. **Input caching**: Cache extracted input definitions for repeated use
194+
2. **Compilation optimization**: Pre-compile parameter mapping logic
195+
3. **Memory efficiency**: Reduce object allocation in parameter processing
196+
197+
This implementation successfully bridges the gap between command-line tool execution and workflow orchestration, making Nextflow processes more accessible for direct execution scenarios while maintaining full compatibility with existing workflow-based usage patterns.

complete_param_mapping_test.nf

Lines changed: 109 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,109 @@
1+
#!/usr/bin/env nextflow
2+
3+
/*
4+
* Complete Parameter Mapping Test
5+
*
6+
* This demonstrates full command-line parameter mapping to process inputs
7+
*
8+
* Usage examples:
9+
* ./launch.sh run complete_param_mapping_test.nf -entry process:analyzeFile --inputFile test_data.txt --sample "sample1"
10+
* ./launch.sh run complete_param_mapping_test.nf -entry process:processValues --name "test" --count 42
11+
* ./launch.sh run complete_param_mapping_test.nf -entry process:handleMultiple --data input.csv --format "json" --options "verbose"
12+
*/
13+
14+
// Process with file and value inputs
15+
process analyzeFile {
16+
debug true
17+
publishDir "./results", mode: 'copy'
18+
19+
input:
20+
path inputFile // Maps to --inputFile parameter
21+
val sample // Maps to --sample parameter
22+
23+
output:
24+
path "analysis_*.txt"
25+
26+
script:
27+
"""
28+
echo "Analyzing file: ${inputFile}"
29+
echo "Sample: ${sample}"
30+
echo "Analysis results for ${sample}" > analysis_${sample}.txt
31+
echo "File content:" >> analysis_${sample}.txt
32+
cat ${inputFile} >> analysis_${sample}.txt
33+
"""
34+
}
35+
36+
// Process with value inputs only
37+
process processValues {
38+
debug true
39+
40+
input:
41+
val name // Maps to --name parameter
42+
val count // Maps to --count parameter
43+
44+
script:
45+
"""
46+
echo "Processing: ${name}"
47+
echo "Count: ${count}"
48+
echo "Computed result: \$((${count} * 2))"
49+
"""
50+
}
51+
52+
// Process with multiple input types
53+
process handleMultiple {
54+
debug true
55+
publishDir "./results", mode: 'copy'
56+
57+
input:
58+
path data // Maps to --data parameter
59+
val format // Maps to --format parameter
60+
val options // Maps to --options parameter
61+
62+
output:
63+
path "output_*"
64+
65+
script:
66+
"""
67+
echo "Processing data: ${data}"
68+
echo "Output format: ${format}"
69+
echo "Options: ${options}"
70+
echo "Processed data in ${format} format" > output_${format}.txt
71+
echo "Options used: ${options}" >> output_${format}.txt
72+
echo "Original data:" >> output_${format}.txt
73+
cat ${data} >> output_${format}.txt
74+
"""
75+
}
76+
77+
// Process with each input (for collections)
78+
process processEach {
79+
debug true
80+
81+
input:
82+
each item // Maps to --item parameter (comma-separated values)
83+
84+
script:
85+
"""
86+
echo "Processing item: ${item}"
87+
echo "Item processed: \$(date)"
88+
"""
89+
}
90+
91+
/*
92+
* USAGE EXAMPLES:
93+
*
94+
* 1. File and value parameters:
95+
* ./launch.sh run complete_param_mapping_test.nf -entry process:analyzeFile --inputFile test_data.txt --sample "sample1"
96+
*
97+
* 2. Value parameters only:
98+
* ./launch.sh run complete_param_mapping_test.nf -entry process:processValues --name "experiment1" --count 100
99+
*
100+
* 3. Multiple mixed parameters:
101+
* ./launch.sh run complete_param_mapping_test.nf -entry process:handleMultiple --data input.csv --format "json" --options "verbose,debug"
102+
*
103+
* 4. Each parameter with comma-separated values:
104+
* ./launch.sh run complete_param_mapping_test.nf -entry process:processEach --item "item1,item2,item3"
105+
*
106+
* 5. Error handling (missing required parameter):
107+
* ./launch.sh run complete_param_mapping_test.nf -entry process:analyzeFile --inputFile test_data.txt
108+
* # Should show error: "Missing required parameter: --sample"
109+
*/

0 commit comments

Comments
 (0)