Implement Process Entry Execution with Command-Line Parameter Mapping #6381
Add support for executing Nextflow processes directly without explicit workflow definitions.

Key Features:
- Single process scripts run automatically: `nextflow run script.nf --param value`
- Multi-process scripts use entry selection: `nextflow run script.nf -entry process:name --param value`
- Automatic command-line parameter mapping to process input channels
- Support for all standard input types: val, path, env, tuple, each
- Comprehensive error handling with helpful suggestions

Implementation:
- Enhanced BaseScript with process entry workflow generation
- Added parameter mapping pipeline with input definition extraction
- Created specialized delegates for parsing compiled process bodies
- Added ScriptMeta methods for single/multi-process detection
- Comprehensive documentation and test coverage

This feature bridges the gap between command-line tools and workflow orchestration, making Nextflow processes more accessible for direct execution scenarios.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
ff372df to fb7e788
Signed-off-by: Paolo Di Tommaso <[email protected]>
- Rename ProcessEntryHelper.groovy to ProcessEntryHandler.groovy for clearer naming
- Update all class references in BaseScript.groovy to use ProcessEntryHandler
- Clean separation of process entry execution feature from main BaseScript class
- All functionality preserved: single process auto-execution and multi-process entry selection

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
modules/nextflow/src/main/groovy/nextflow/script/ProcessEntryHandler.groovy
Signed-off-by: Paolo Di Tommaso <[email protected]>
--------- Signed-off-by: Rob Syme <[email protected]> Signed-off-by: Ben Sherman <[email protected]> Co-authored-by: Ben Sherman <[email protected]>
Add defensive null checking for sender thread and workflowId to prevent NullPointerException when onFlowComplete is called after initialization failures in onFlowCreate or onFlowBegin methods. Signed-off-by: Rob Syme <[email protected]> Co-authored-by: Ben Sherman <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
I've cleaned up the implementation, added some unit tests and, above all, added the mapping of params to process inputs.
@claude review
Claude finished @pditommaso's task

Code Review for Process Entry Execution Implementation

I've completed a comprehensive review of this PR implementing direct process execution without explicit workflows. Here's my detailed analysis:

Overall Assessment

This is a well-implemented feature that significantly enhances Nextflow's usability for simple process execution scenarios. The implementation is clean, follows existing patterns, and includes comprehensive test coverage.

✅ Strengths

Architecture & Design
Implementation Quality
Test Coverage
🔍 Areas for Improvement

1. Process Body Parsing Complexity (ProcessEntryHandler.groovy:155-310)

    // Current approach intercepts _in_val, _in_path, etc.
    def _in_val(tokenVar) {
        def varName = extractVariableName(tokenVar)
        if( varName ) inputStructures.add([type: 'val', name: varName])
    }

Suggestion: As @bentsherman mentioned in review comments, leveraging the ProcessDef's InputsList/OutputsList when available would simplify this significantly and make it more maintainable.

2. Token Variable Name Extraction (ProcessEntryHandler.groovy:181-252)

The variable name extraction logic is fragile with multiple fallback strategies:

    def extractVariableNameFromToken(token) {
        try {
            if( token.hasProperty('variable') && token.variable?.hasProperty('name') ) {
                return token.variable.name.toString()
            }
            // Multiple fallbacks...
        } catch( Exception e ) {
            log.debug "Error extracting variable name from ${token}: ${e.message}"
            return null
        }
    }

Risk: This could break with internal AST changes. Consider adding integration tests that verify this parsing works correctly.

3. Error Handling in Process Execution (ProcessEntryHandler.groovy:298-307)

    try {
        bodyClone.call()
    } catch (Exception e) {
        // Ignore exceptions during parsing - we only want to capture input structures
    }

Concern: Silently ignoring all exceptions during parsing could mask important errors. Consider logging warnings for unexpected exceptions.

4. Parameter Type Validation

The current implementation doesn't validate parameter types beyond basic conversion:

    case 'path':
    case 'file':
        if( paramValue instanceof String ) {
            return Nextflow.file(paramValue)
        }
        return paramValue

Enhancement: Could validate file existence, path accessibility, or other constraints before execution.

🚨 Potential Issues

1. Security Considerations
2. Memory Usage
3. Thread Safety
📝 Minor Improvements

1. Code Documentation

2. Logging Consistency

    // Current: Mix of log levels
    log.debug "Getting input arguments for process: ${processDef.name}"
    log.warn "Could not parse tuple element: ${item} of class ${className}"

Suggestion: Consider consistent logging levels and more structured log messages.

3. Test Coverage Gaps

🎯 Integration with Existing Features

BaseScript Integration (BaseScript.groovy:193-198)

The integration is clean and follows existing patterns:

    if( meta.hasExecutableProcesses() ) {
        def handler = new ProcessEntryHandler(this, session, meta)
        entryFlow = handler.createAutoProcessWorkflow()
    }

ScriptMeta Extensions (ScriptMeta.groovy:310-320)

🏆 Recommendation

APPROVE - This is a high-quality implementation that adds significant value to Nextflow. The code is well-structured, thoroughly tested, and follows existing patterns. While there are areas for improvement (particularly around process body parsing), the current implementation is solid and the identified issues are not blockers.

The feature successfully achieves its goals and maintains backward compatibility while providing a much-improved user experience for simple process execution scenarios.
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
I'm still not 100% sure about this capability; however, I'd like to try it on the road. I'm going to merge it as an undocumented feature, then we can decide to revert or improve it as suggested by #6391.
I'm not sure why you need to merge it to master to try it out. You can try it out just fine from a PR build.
By "on the road" I mean testing with other tools and gathering feedback from (internal) users.
Summary
This PR implements a new feature that allows direct execution of individual Nextflow processes without requiring explicit workflow definitions. Users can now run processes directly from the command line with automatic parameter mapping.
Key Features
- Automatic mapping of `--param value` arguments to process inputs
- Support for `val`, `path`, `env`, `tuple`, and `each` input types
Usage Examples
Single Process Script
Create a script with one process (auto-execution):
Execute with:
Multi-Process Script (Auto-runs First Process)
Scripts with multiple processes automatically execute the first process defined:
Execute with:
Comprehensive Input Types Example
Execute with:
Parameter Type Handling
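| Input declaration | Command-line argument | Channel created |
| --- | --- | --- |
| `val name` | `--name "value"` | `Channel.of(paramValue)` |
| `path file` | `--file "input.txt"` | `Channel.of(Paths.get(paramValue))` |
| `env var` | `--var "VALUE"` | `Channel.of(paramValue)` |
| `tuple items` | `--items "a,b,c"` | `Channel.of([paramValue])` |
| `each item` | `--item "a,b,c"` | `Channel.fromIterable(split(','))` |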
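The mapping above can be sketched in plain Groovy. This is a minimal illustration and not the code in `ProcessEntryHandler`: the `toChannelValues` helper is hypothetical, and ordinary lists stand in for Nextflow channels so that the snippet runs without the Nextflow runtime.

```groovy
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper (not part of Nextflow): returns the values that would be
// emitted on the input channel for one declared input. Plain lists stand in
// for Nextflow channels.
List toChannelValues(String inputType, String paramValue) {
    switch (inputType) {
        case 'val':
        case 'env':
            return [paramValue]                     // Channel.of(paramValue)
        case 'path':
            return [Paths.get(paramValue)]          // Channel.of(Paths.get(paramValue))
        case 'tuple':
            return [[paramValue]]                   // Channel.of([paramValue])
        case 'each':
            return paramValue.split(',').toList()   // Channel.fromIterable(split(','))
        default:
            throw new IllegalArgumentException("Unsupported input type: ${inputType}")
    }
}

assert toChannelValues('val', 'value') == ['value']
assert toChannelValues('path', 'input.txt')[0] instanceof Path
assert toChannelValues('each', 'a,b,c') == ['a', 'b', 'c']
```

Note that, following the mapping as written, a `tuple` input receives the raw string as a single element, while an `each` input is split on commas and emits one value per item.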
Error Handling
Missing Required Parameters
nextflow run script.nf # Error: Missing required parameter: --sampleName
No Processes Found
nextflow run empty.nf # Error: No processes found for auto-execution
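The missing-parameter check can be pictured with a short Groovy sketch. It assumes the handler knows the declared input names and the parsed command-line params. The `checkRequiredParams` helper and the use of `IllegalArgumentException` are illustrative assumptions, not the PR's actual code; only the message text comes from the example above.

```groovy
// Hypothetical helper (not part of Nextflow): fails with the message shown
// above when a declared process input has no matching --name argument.
void checkRequiredParams(List<String> inputNames, Map<String, Object> params) {
    def missing = inputNames.find { !params.containsKey(it) }
    if (missing != null) {
        // Exception type is an assumption made for this sketch
        throw new IllegalArgumentException("Missing required parameter: --${missing}")
    }
}

// All declared inputs are supplied: no error
checkRequiredParams(['sampleName'], [sampleName: 'S1'])

// A declared input is missing: the error names the first missing parameter
try {
    checkRequiredParams(['sampleName'], [:])
    assert false : 'expected an exception'
} catch (IllegalArgumentException e) {
    assert e.message == 'Missing required parameter: --sampleName'
}
```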
Testing Instructions
Test Single Process Auto-Execution
Create `test_single.nf`:
Run:
nextflow run test_single.nf --message "Hello World"
Test Multi-Process Auto-Execution (First Process)
Create `test_multi.nf`:
Run:
Test Different Input Types
Create `test_types.nf`:
Run:
Test running an nf-core module
nextflow.config
Implementation Details
Architecture
The implementation extends `BaseScript.groovy` with a comprehensive process entry execution system:
Key Components
- `createAutoProcessWorkflow()`: Auto-execution for process-only scripts
- `parseProcessInputStructures()`: Extract input definitions by intercepting `_in_val`, `_in_path`, etc. calls
- `getProcessInputArguments()`: Map command-line parameters to process inputs
- `parseComplexParameters()`: Handle dot notation for complex parameters
BaseScript Integration
Enhanced `BaseScript.groovy` to detect standalone processes:
ScriptMeta Extensions
- `hasExecutableProcesses()`: Detect processes without explicit workflows
- `getLocalProcessNames()`: Get list of defined processes
Backward Compatibility
This feature is fully backward compatible:
- `-entry workflowName` syntax remains supported
Performance Considerations
🤖 Generated with Claude Code