-
Couldn't load subscription status.
- Fork 199
feat: Add Alignoth wrapper #4676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughAdds an Alignoth Snakemake wrapper, conda environment manifests (including a pinned linux-64 file), wrapper metadata, test rules and test data, plus a unit test exercising JSON, HTML and directory output modes via the wrapper. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Test as test_alignoth()
participant SM as Snakemake
participant W as wrapper.py
participant A as Alignoth CLI
Test->>SM: invoke rules for json/html/dir
SM->>W: call wrapper with inputs, params, outputs
W->>W: validate required inputs (bam, reference)
W->>W: build command (base + optional vcf/bed + extra)
alt output ends with .json
W->>A: run command (capture stdout -> json file)
A-->>W: stdout JSON
else output ends with .html
W->>A: run command with --html (redirect -> html file)
A-->>W: HTML produced
else output is directory
W->>W: ensure directory exists
W->>A: run command with -o <dir>
A-->>W: files written to dir
else
W->>SM: raise ValueError (invalid output)
end
W->>SM: return / write logs
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (2)**/*.py⚙️ CodeRabbit configuration file
Files:
**/wrapper.py⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (3)📚 Learning: 2024-11-26T14:59:03.678ZApplied to files:
📚 Learning: 2024-11-15T18:31:15.447ZApplied to files:
📚 Learning: 2024-11-21T10:23:03.427ZApplied to files:
🪛 Ruff (0.14.1)bio/alignoth/wrapper.py9-9: Undefined name (F821) 10-10: Undefined name (F821) 11-11: Undefined name (F821) 16-16: Undefined name (F821) 17-17: Undefined name (F821) 19-19: Avoid specifying long messages outside the exception class (TRY003) 22-22: Undefined name (F821) 23-23: Undefined name (F821) 25-25: Avoid specifying long messages outside the exception class (TRY003) 28-28: Undefined name (F821) 29-29: Undefined name (F821) 32-32: Undefined name (F821) 33-33: Undefined name (F821) ⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Add test_alignoth to verify output-dir target and update Snakefile to use output-dir instead of out/ for alignoth_output_dir rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (3)
bio/alignoth/environment.yaml (1)
6-6: Minor: Consider removing space in version specification.The dependency specification uses
alignoth =1.1.2with a space before the equals sign. While this syntax is valid, the standard conda convention is to omit the space:alignoth=1.1.2.Apply this diff to align with conda conventions:
- - alignoth =1.1.2 + - alignoth=1.1.2bio/alignoth/test/Snakefile (1)
17-31: Consider renaming rule for clarity.The rule
alignoth_tsvoutputs an HTML file (html="out/plot.html"), which creates a naming mismatch. Consider renaming the rule toalignoth_htmlto better reflect its output type.Apply this diff to improve clarity:
-rule alignoth_tsv: +rule alignoth_html: input:bio/alignoth/wrapper.py (1)
42-45: Validate mutual exclusivity ofregionandaroundparameters.The if/elif structure silently ignores
aroundwhen both parameters are set. Consider explicitly validating that only one is provided to avoid confusion.Apply this diff to add validation:
+if region and around: + raise ValueError("Parameters 'region' and 'around' are mutually exclusive") + if region: cmd.append(f"-g {region}") elif around: cmd.append(f"-a {around}")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
bio/alignoth/test/1257A.vcf.gzis excluded by!**/*.gz
📒 Files selected for processing (9)
bio/alignoth/environment.linux-64.pin.txt(1 hunks)bio/alignoth/environment.yaml(1 hunks)bio/alignoth/meta.yaml(1 hunks)bio/alignoth/test/Snakefile(1 hunks)bio/alignoth/test/ref.fa(1 hunks)bio/alignoth/test/ref.fa.fai(1 hunks)bio/alignoth/test/test.bed(1 hunks)bio/alignoth/wrapper.py(1 hunks)test_wrappers.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
bio/alignoth/wrapper.pytest_wrappers.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
bio/alignoth/wrapper.py
🪛 Ruff (0.14.1)
bio/alignoth/wrapper.py
8-8: Undefined name snakemake
(F821)
9-9: Undefined name snakemake
(F821)
14-14: Undefined name snakemake
(F821)
15-15: Undefined name snakemake
(F821)
17-17: Avoid specifying long messages outside the exception class
(TRY003)
20-20: Undefined name snakemake
(F821)
21-21: Undefined name snakemake
(F821)
23-23: Avoid specifying long messages outside the exception class
(TRY003)
26-26: Undefined name snakemake
(F821)
27-27: Undefined name snakemake
(F821)
30-30: Undefined name snakemake
(F821)
31-31: Undefined name snakemake
(F821)
33-33: Undefined name snakemake
(F821)
34-34: Undefined name snakemake
(F821)
35-35: Undefined name snakemake
(F821)
36-36: Undefined name snakemake
(F821)
37-37: Undefined name snakemake
(F821)
38-38: Undefined name snakemake
(F821)
39-39: Undefined name snakemake
(F821)
40-40: Undefined name snakemake
(F821)
62-62: Undefined name snakemake
(F821)
63-63: Undefined name snakemake
(F821)
65-67: Avoid specifying long messages outside the exception class
(TRY003)
69-69: Undefined name snakemake
(F821)
70-70: Undefined name snakemake
(F821)
75-75: Undefined name snakemake
(F821)
76-76: Undefined name snakemake
(F821)
78-80: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: testing
- GitHub Check: docs
- GitHub Check: Summary
🔇 Additional comments (11)
bio/alignoth/test/ref.fa.fai (1)
1-1: LGTM!The FAI index format is correct and consistent with the reference FASTA sequence.
bio/alignoth/test/test.bed (1)
1-3: LGTM!The BED format is correct, and the chromosome name and coordinates are consistent with the test reference sequence.
test_wrappers.py (1)
253-267: LGTM!The test function correctly exercises the wrapper for all three output modes (JSON, HTML, and directory) and follows the established testing pattern used throughout this file.
bio/alignoth/test/ref.fa (1)
1-2: LGTM!The FASTA format is valid, and the sequence length matches the FAI index entry (2411 bp).
bio/alignoth/environment.linux-64.pin.txt (1)
1-31: LGTM!The explicit environment file is well-formed and provides reproducible dependency resolution for the linux-64 platform. The pinned packages align with the alignoth=1.1.2 dependency specified in environment.yaml.
bio/alignoth/meta.yaml (1)
1-29: LGTM!The metadata manifest is well-structured and clearly documents the wrapper's inputs, outputs, and parameters. The note about mutually exclusive outputs is helpful for users.
bio/alignoth/test/Snakefile (1)
1-47: Test rules effectively demonstrate wrapper capabilities.The three rules clearly demonstrate the mutually exclusive output modes (JSON, HTML, directory) documented in the wrapper's metadata. The consistent input and parameter structure across rules makes the differences in output handling easy to understand.
bio/alignoth/wrapper.py (4)
11-23: LGTM!The required input validation is correctly implemented using
hasattrwith clear error messages for missing inputs.
25-31: LGTM!Optional input handling is correct and consistent with the required input validation pattern.
52-56: LGTM!The parameter handling for
max_read_depthandno_embed_jsis correct.
62-62: Potential AttributeError when checking output attributes.Direct attribute access with
snakemake.output.htmlwill raiseAttributeErrorif the attribute doesn't exist. Usehasattr()orgetattr()for safe checking, consistent with the input validation pattern.Apply this diff to fix the issue:
# HTML output if html: cmd.append("--html") # Redirect output to file if output specified - if snakemake.output.html: + if hasattr(snakemake.output, "html"): shell(f"{' '.join(cmd)} > {snakemake.output.html} {log}") else: raise ValueError( "HTML output requires a file path defined in output named 'html'" ) # Directory output -elif snakemake.output.dir: +elif hasattr(snakemake.output, "dir"): cmd.append(f"-o {snakemake.output.dir}") if format: cmd.append(f"-f {format}") shell(f"{' '.join(cmd)} {log}") # Default: JSON output to single file via stdout -elif snakemake.output.json: +elif hasattr(snakemake.output, "json"): shell(f"{' '.join(cmd)} > {snakemake.output.json} {log}")Likely an incorrect or invalid review comment.
bio/alignoth/wrapper.py
Outdated
| if highlight: | ||
| for h in highlight: | ||
| cmd.append(f"-h {h}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle highlight parameter type safely.
If highlight is a single string instead of a list, the for loop will iterate over individual characters, creating incorrect -h flags for each character.
Apply this diff to handle both single values and lists:
if highlight:
+ if isinstance(highlight, str):
+ highlight = [highlight]
for h in highlight:
cmd.append(f"-h {h}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if highlight: | |
| for h in highlight: | |
| cmd.append(f"-h {h}") | |
| if highlight: | |
| if isinstance(highlight, str): | |
| highlight = [highlight] | |
| for h in highlight: | |
| cmd.append(f"-h {h}") |
🤖 Prompt for AI Agents
In bio/alignoth/wrapper.py around lines 47 to 49, the code assumes highlight is
iterable and will iterate characters if a single string is passed; change the
handling to treat a single string as one highlight by normalizing highlight into
a sequence (e.g., if highlight is None do nothing; elif isinstance(highlight,
(list, tuple)) use it as-is; else wrap it in a single-item list) and then
iterate that normalized list to append the "-h <value>" flags so individual
characters are not treated as separate highlights.
Co-authored-by: Filipe G. Vieira <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
bio/alignoth/wrapper.py (1)
16-33: Consider using proper shell quoting for robustness.The command arguments embed file paths directly in f-strings (e.g.,
f"-b {snakemake.input.bam}"). If any input paths contain spaces or special characters, the command will fail. While Snakemake inputs are typically controlled, using proper quoting makes the wrapper more robust.Consider using
shlex.quote()to properly escape values:+import shlex + # BAM input if snakemake.input.get("bam", ""): - cmd.append(f"-b {snakemake.input.bam}") + cmd.extend(["-b", shlex.quote(snakemake.input.bam)]) else: raise ValueError("BAM input required") # Reference if snakemake.input.get("reference", ""): - cmd.append(f"-r {snakemake.input.reference}") + cmd.extend(["-r", shlex.quote(snakemake.input.reference)]) else: raise ValueError("Reference input required") # Optional VCF if snakemake.input.get("vcf", ""): - cmd.append(f"-v {snakemake.input.vcf}") + cmd.extend(["-v", shlex.quote(snakemake.input.vcf)]) # Optional BED if snakemake.input.get("bed", ""): - cmd.append(f"--bed {snakemake.input.bed}") + cmd.extend(["--bed", shlex.quote(snakemake.input.bed)])Alternatively, if you prefer to keep the f-string style, you can quote within the f-string:
cmd.append(f"-b {shlex.quote(snakemake.input.bam)}")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
bio/alignoth/meta.yaml(1 hunks)bio/alignoth/test/Snakefile(1 hunks)bio/alignoth/wrapper.py(1 hunks)test_wrappers.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- bio/alignoth/test/Snakefile
- bio/alignoth/meta.yaml
- test_wrappers.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
bio/alignoth/wrapper.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
bio/alignoth/wrapper.py
🪛 Ruff (0.14.1)
bio/alignoth/wrapper.py
9-9: Undefined name snakemake
(F821)
10-10: Undefined name snakemake
(F821)
11-11: Undefined name snakemake
(F821)
16-16: Undefined name snakemake
(F821)
17-17: Undefined name snakemake
(F821)
19-19: Avoid specifying long messages outside the exception class
(TRY003)
22-22: Undefined name snakemake
(F821)
23-23: Undefined name snakemake
(F821)
25-25: Avoid specifying long messages outside the exception class
(TRY003)
28-28: Undefined name snakemake
(F821)
29-29: Undefined name snakemake
(F821)
32-32: Undefined name snakemake
(F821)
33-33: Undefined name snakemake
(F821)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: docs
- GitHub Check: testing
- GitHub Check: Summary
🔇 Additional comments (1)
bio/alignoth/wrapper.py (1)
37-48: Well-structured output handling.The three-way output handling (HTML, JSON, directory) is correctly implemented and matches the expected behavior described in the PR summary. The use of
Path(output).mkdir(parents=True, exist_ok=True)properly ensures the directory exists before writing.
| if snakemake.input.get("bed", ""): | ||
| cmd.append(f"--bed {snakemake.input.bed}") | ||
|
|
||
| cmd.append(f"{extra}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid appending empty extra parameter.
Line 35 appends extra even when it's an empty string (the default). This adds an unnecessary empty element to the command list. Additionally, if extra contains multiple space-separated arguments, appending it as a single string means it will be treated as one argument when the list is joined.
Apply this diff to fix both issues:
-cmd.append(f"{extra}")
+if extra:
+ import shlex
+ cmd.extend(shlex.split(extra))This ensures:
- Empty
extrais not appended - Multiple arguments in
extraare properly split and added as separate list elements
🤖 Prompt for AI Agents
In bio/alignoth/wrapper.py around line 35, the code unconditionally appends the
extra string which can add an empty element or treat multiple space-separated
args as one; change it so that you only add extra when non-empty and split it
into separate arguments (use shlex.split or equivalent) and extend the cmd list
with the resulting tokens instead of appending the raw string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
bio/alignoth/wrapper.py (1)
35-35: Still appending emptyextraparameter and not splitting multiple arguments.This issue was flagged in a previous review but remains unresolved. The current code:
- Appends an empty string when
extra=""(the default), adding unnecessary empty element to the command- Treats space-separated arguments as a single argument instead of splitting them properly
Apply this diff to fix both issues:
-cmd.append(f"{extra}") +if extra: + import shlex + cmd.extend(shlex.split(extra))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
bio/alignoth/test/Snakefile(1 hunks)bio/alignoth/wrapper.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
bio/alignoth/wrapper.py
**/wrapper.py
⚙️ CodeRabbit configuration file
Do not complain about use of undefined variable called
snakemake.
Files:
bio/alignoth/wrapper.py
🪛 Ruff (0.14.1)
bio/alignoth/wrapper.py
9-9: Undefined name snakemake
(F821)
10-10: Undefined name snakemake
(F821)
11-11: Undefined name snakemake
(F821)
16-16: Undefined name snakemake
(F821)
17-17: Undefined name snakemake
(F821)
19-19: Avoid specifying long messages outside the exception class
(TRY003)
22-22: Undefined name snakemake
(F821)
23-23: Undefined name snakemake
(F821)
25-25: Avoid specifying long messages outside the exception class
(TRY003)
28-28: Undefined name snakemake
(F821)
29-29: Undefined name snakemake
(F821)
32-32: Undefined name snakemake
(F821)
33-33: Undefined name snakemake
(F821)
49-51: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: docs
- GitHub Check: testing
- GitHub Check: Summary
🔇 Additional comments (6)
bio/alignoth/wrapper.py (6)
1-8: LGTM!Standard wrapper metadata and appropriate imports for command execution and path handling.
9-11: LGTM!Standard Snakemake wrapper setup: log formatting, parameter retrieval, and output path extraction.
13-19: LGTM!Proper input validation for required BAM file with clear error messaging.
21-25: LGTM!Proper validation for required reference input with appropriate error handling.
27-29: LGTM!Correct handling of optional VCF input.
31-33: LGTM!Correct handling of optional BED input.
| rule alignoth_tsv: | ||
| input: | ||
| bam="NA12878.bam", | ||
| reference="ref.fa", | ||
| vcf="1257A.vcf.gz" | ||
| output: | ||
| "out/plot.html" | ||
| params: | ||
| extra="-g 1:50-280 -h coolregion:200-210" | ||
| log: | ||
| "logs/log.txt" | ||
| wrapper: | ||
| "master/bio/alignoth" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misleading rule name: outputs HTML but named alignoth_tsv.
The rule name suggests TSV output, but it actually produces an HTML file (out/plot.html). Consider renaming to alignoth_html for clarity.
Apply this diff to fix the rule name:
-rule alignoth_tsv:
+rule alignoth_html:
input:
bam="NA12878.bam",
reference="ref.fa",
vcf="1257A.vcf.gz"
output:
"out/plot.html"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rule alignoth_tsv: | |
| input: | |
| bam="NA12878.bam", | |
| reference="ref.fa", | |
| vcf="1257A.vcf.gz" | |
| output: | |
| "out/plot.html" | |
| params: | |
| extra="-g 1:50-280 -h coolregion:200-210" | |
| log: | |
| "logs/log.txt" | |
| wrapper: | |
| "master/bio/alignoth" | |
| rule alignoth_html: | |
| input: | |
| bam="NA12878.bam", | |
| reference="ref.fa", | |
| vcf="1257A.vcf.gz" | |
| output: | |
| "out/plot.html" | |
| params: | |
| extra="-g 1:50-280 -h coolregion:200-210" | |
| log: | |
| "logs/log.txt" | |
| wrapper: | |
| "master/bio/alignoth" |
🤖 Prompt for AI Agents
In bio/alignoth/test/Snakefile around lines 15 to 27 the rule is named
alignoth_tsv but its output is an HTML file (out/plot.html); rename the rule to
alignoth_html to reflect the actual output, update any references/calls to this
rule in the Snakefile or other workflow files to use alignoth_html, and ensure
the new name is consistent with logging or wrapper usage.
QC
snakemake-wrappers.While the contributions guidelines are more extensive, please particularly ensure that:
test.pywas updated to call any added or updated example rules in aSnakefileinput:andoutput:file paths in the rules can be chosen arbitrarilyinput:oroutput:)tempfile.gettempdir()points tometa.yamlcontains a link to the documentation of the respective tool or command underurl:Summary by CodeRabbit