rfc: sprocket test #1
Conversation
This looks great; a few thoughts.
Thanks for the review and feedback!
Are you referring to the builtin tests? Because yes, those will be written in Rust, although I consider that an implementation detail users shouldn't have to be concerned with.
As the proposal currently stands, that is correct. No arbitrary code, although:
I think this is probably going to be a "need to have" feature for many users. I have not fully landed on any implementation specifics right now, which is why it's off at the end of the document. I'd welcome some brainstorming on what this might look like!
Yes, absolutely!
I'm not entirely sure. Could you elaborate? Perhaps a bit off-target from the question, but I think another must-have feature (which I neglected to include in this first draft) will be some test annotation and filtering capabilities. At the most basic level, users need to be able to note "this test is slow" and only run the "slow" tests conditionally (not on every commit). A more advanced use case would be something like our …
I'll comment if I have any ideas.
Agree, filtering would be really nice to have. For example, I use …
I don't know the Rust space very well, so these questions may be completely stupid, but here goes:
I welcome other people's thoughts on this, but my gut instinct is that this would be out of scope for this PR. I am most immediately focused on enabling unit testing, where the core target is really the individual tasks which together compose a workflow, as opposed to testing workflows in their entirety. This is mentioned in my RFC, though maybe not elaborated on very far: I view testing workflows in their entirety as a sufficiently distinct use case. Most tasks can be configured to run fast and light; the same is often not true for workflows. However, the API differences between tasks and workflows are practically non-existent (they have the same specifications for inputs and outputs; the rest is reduced to implementation details), which means it would be odd to block workflows from being run in this framework.
So my initial answer to this question (though I could certainly have my mind changed) is that we are punting on this problem to deal with further down the line. I do want the problem of workflow validation to be better addressed, but I'm also trying not to bite off more than I can chew 😅
This is partially addressed in the prior art portion of the RFC. There are some existing frameworks and tools, but IMO they aren't a great fit for what I'm setting out to achieve. I think we can build something better suited for WDL users than currently exists. If there is existing tooling I haven't mentioned that sufficiently addresses this case, I am not aware of it and would appreciate being corrected before I dedicate more of my time and energy to this 🤣
I will say that I have not investigated what Rust crates are currently out there for enabling something like what I'm setting out to build, but I would like to avoid re-inventing any wheels and will happily outsource any work I think could be better handled by someone else!
The builtin conditions I wrote up are very similar to what … The builtins I included "for initial release" are intended to be: …
As stated in Future Possibilities, the tests detailed are meant as a starting point, and they can and should be added to! But again, trying not to take too large a bite here 😂
]
prefix = "test.merged"
[merge_sam_files.tests]
custom = "quickcheck.sh"
Based on the below, it looks like anything found in custom is always passed a single argument (the outputs.json file), but that should be clarified here.
Agree, and I just generally feel that you should explicitly include the command line that is going to be run (e.g., `<custom_executable> inputs.json outputs.json`).
out_bam=$(jq -r .bam "$out_json")
samtools quickcheck "$out_bam"
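For reference, a minimal self-contained sketch of what such a custom assertion script might look like, assuming the single-positional-argument convention discussed above; the shebang, `set -euo pipefail`, and the `.bam` output key are assumptions based on the quoted snippet, not text from the RFC:

#!/usr/bin/env bash
# quickcheck.sh: hypothetical custom assertion script.
# Assumed invocation: quickcheck.sh <path to outputs.json>
set -euo pipefail

out_json="$1"

# Pull the BAM path out of the task's outputs.json (key name taken from the quoted example).
out_bam=$(jq -r .bam "$out_json")

# samtools quickcheck exits non-zero for a truncated or corrupt BAM,
# which the framework would report as a failed assertion.
samtools quickcheck "$out_bam"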
How does this work in a CI environment? Does the CI need to maintain and install a list of tools? Should the users be encouraged to run these in containers or should the test framework allow specification of a container?
Good question. In short: "Does the CI need to maintain and install a list of tools?" Yes.
Running sprocket test within a container would cause "docker in docker" headaches, but maybe the test framework could support spinning up a container just for the custom test execution? Although that might be more headache than it's worth.
I don't think expecting CI maintainers to get their dependencies in order before running sprocket test is a terrible dealbreaker. We're already doing that on workflows for pytest-workflow - https://github.com/stjudecloud/workflows/blob/main/.github/workflows/pytest.yaml#L37-L44
In a future version, I wonder if it would be a good idea to allow custom bash invocations within a container. For example, you can elide the creation of a bash script altogether if you did something like
[[merge_sam_files.assertions.custom]]
container = "ubuntu:latest"
command = "samtools quickcheck $1"There was a problem hiding this comment.
My concern with this is that it's adding complexity (for both user and implementer). I'm not opposed to the idea in itself, but I question whether it's worth the complexity. I can add it to the future possibilities section, as I think it's worthwhile to track more formally than just this comment thread.
{ include_if_all = "0x0", exclude_if_any = "0x900", include_if_any = "0x0", exclude_if_all = "0x0" },
{ include_if_all = "00", exclude_if_any = "0x904", include_if_any = "3", exclude_if_all = "0" },
]
[[bam_to_fastq.matrix]]
Is there a reason to separate out some inputs into separate matrix tables? It seems like it would be much clearer to have a single matrix table like:
[[bam_to_fastq.matrix]]
bam = [
"$FIXTURES/test1.bam",
"$FIXTURES/test2.bam",
"$FIXTURES/test3.bam",
]
bam_index = [
"$FIXTURES/test1.bam.bai",
"$FIXTURES/test2.bam.bai",
"$FIXTURES/test3.bam.bai",
]
bitwise_filter = [
{ include_if_all = "0x0", exclude_if_any = "0x900", include_if_any = "0x0", exclude_if_all = "0x0" },
{ include_if_all = "00", exclude_if_any = "0x904", include_if_any = "3", exclude_if_all = "0" },
]
paired_end = [true, false]
retain_collated_bam = [true, false]
append_read_number = [true, false]
output_singletons = [true, false]
Wait, is this saying that every combination found under a matrix is run against every other matrix? So your example would yield 3 * 2 * 2 * 2 * 2 * 2 (96) tests?
I should have kept reading. I see now that you were demonstrating the second case with 96 tests.
Users will be able to annotate each test with arbitrary tags which will allow them to run subsets of the entire test suite. They will also be able to run the tests in a specific file, as opposed to the default `sprocket test` behavior which will be to recurse the `test` directory and run all found tests. This will facilitate a variety of applications, most notably restricting the run to only what the developer knows has changed and parallelizing CI runs.
We may also want to give some tags special meaning: it is common to annotate "slow" tests and to exclude them from runs by default, and we may want to reduce friction in configuring that case.
I think `sprocket test` should end up with arguments like `--include-tag` and `--exclude-tag` that would control which tags get included or excluded from the default set.
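For illustration, the suggested flags might be used like this; neither flag exists yet, and the names are simply the suggestion above:

# Default per-commit CI job: skip anything tagged "slow".
sprocket test --exclude-tag slow

# Nightly job: run only the tests tagged "slow".
sprocket test --include-tag slow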
I think that makes sense.
Yeah, I'm not able to help on prior art - just thought I'd ask the question - I'm happy with your answer. The part on custom tests (https://stjude-rust-labs.github.io/rfcs/branches/rfcs/sprocket-test/0001-sprocket-test.html#custom-tests) looks good to me. We're currently using Python for WDL tests, so that's perfect if Python scripts are supported.
If you don't have one already, will there be a repo for demonstration purposes where folks can see how you structure these tests for one or more WDLs?
(Disregard if not a good fit for this PR.) Maybe this is just an implementation detail, but I'm curious to see what the top of the output will look like. E.g., I like pytest's output that has useful info about versions/etc. for …
Just copied and pasted the existing examples into this branch - stjudecloud/workflows#263
Yes, perfect
acfoltzer left a comment
Thank you for writing this up, Ari!
My comments fall into two categories: first, there are a few specific comments regarding alternatives I'd like to explore that don't put us on the hook for authoring and supporting an entirely new tool for an entirely novel (semantically, at least) test definition language. The remainder and majority of the comments are premised on us exploring those possibilities but deciding to stick with the TOML-based approach that you've outlined so far in the draft.
Please take a look at the higher-level approach comments first, down in the Drawbacks and Alternatives sections. I don't want to get you too bogged down in the details if you end up finding one of the alternatives compelling enough to change course.
text/0001-sprocket-test.md
The Sprocket test framework is primarily specified in TOML, which is expected to be within a `tests/` directory at the root of the WDL workspace. `sprocket test` does not require any special syntax or modification of actual WDL files, and any spec-compliant WDL workspace can create a `tests` directory respected by Sprocket.
The `tests/` directory is expected to mirror the WDL workspace, where it should have the same path structure, but with `.wdl` file extensions replaced by `.toml` extensions. The TOML files contain tests for tasks and workflows defined at their respective WDL counterpart in the main workspace. e.g. a WDL file located at `data_structures/flag_filter.wdl` would have accompanying tests defined in `tests/data_structures/flag_filter.toml`. Following this structure frees the TOML from having to contain any information about _where_ to find the entrypoints of each test. All the test entrypoints (tasks or workflows) in `tests/data_structures/flag_filter.toml` are expected to be defined in the WDL located at `data_structures/flag_filter.wdl`.
I do not think we should be separating test definitions from the code they're testing like this. Having parallel file system hierarchies is advantageous when writing a set of tests external to workflows that the test author does not control, but other than that it introduces headaches.
Just from an editor experience point of view, if I'm looking at /some/path/to/foo.wdl and want to create a corresponding test, I might have to manually create up to four directories (counting /tests/) instead of making a new file in the same directory, which is a whole lot simpler in Emacs at least.
Is "spec-compliant WDL workspace" a defined concept? We'd need to nail that down in order to resolve questions like "what happens if there are multiple tests/ directories in the hierarchy?"
More importantly, the further a test definition is from the code it is testing, the more likely it will be neglected when changes are made. If we're lucky, any deficiencies in the change will trigger a test failure that will make us come back around and improve the test as we fix the bug, but otherwise it's way too easy for something to stay out-of-sight and out-of-mind if it's shunted off in this way.
I do not think we should be separating test definitions from the code they're testing like this.
I agree with you after having this all pointed out. I can rework this to instead be based off a sibling file of the same basename, but with the differing file extension.
Is "spec-compliant WDL workspace" a defined concept? We'd need to nail that down in order to resolve questions like "what happens if there are multiple tests/ directories in the hierarchy?"
No, it is currently not defined, although `doc` is based off the same concept. I think the definition would just be a directory which is recursively searched for `.wdl` files. My instinct here is that none of the sprocket commands should "go up" and search parent directories, but instead "look down" from the CWD (as many of the current commands operate).
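To make the two layouts under discussion concrete (the paths are borrowed from the quoted `flag_filter` example):

Mirrored hierarchy, as currently proposed:
  data_structures/flag_filter.wdl
  tests/data_structures/flag_filter.toml

Sibling file with the same basename, as suggested above:
  data_structures/flag_filter.wdl
  data_structures/flag_filter.toml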
## E2E testing
As stated in the "motivation" section, this proposal is ignoring end-to-end (or E2E) tests and is really just focused on enabling unit testing for CI purposes. Perhaps some of this could be re-used for an E2E API, but I have largely ignored that aspect. (Also I have lots of thoughts about what that might look like, but for brevity will not elaborate further.)
If we can use this tool to run workflows, providing their inputs and making assertions about their outputs, what distinguishes this proposal from E2E testing?
I think a user could use this for E2E testing if they really wanted to; it just wouldn't be very ergonomic.
I think that the proposed TOML tests are going to be too limited for proper E2E testing, especially given that many bioinformatics tools/pipelines are non-deterministic. E2E testing using this framework would probably end up making hefty use of the custom feature, which is not the best UX.
To wax poetic: this framework can be used as a hammer if we treat all forms of testing as nails, but I'd argue that not all forms of testing are nails and different tools should be used.
Hm, do you have a reference for what constitutes "proper E2E testing"? It sounds like there are much more interesting properties involved than I was thinking about.
I suppose I don't really have any citations here 😅 It's something that we've been discussing as a desire to implement for a long time, but we haven't quite nailed down specifics. I think an example would be most illustrative:
We want to provide some form of evidence that version X of a pipeline (more concretely, a WDL workflow) is functionally equivalent to version Y. This problem of functional equivalence is a bit nebulous, and the current state of affairs is that fear of non-equivalence locks people into not updating their software. E2E testing (by my definition at least) is about proving functional equivalence by some metric(s). This would probably have to be defined per-workflow and be very domain-specific (i.e. not something easily generalized).
For the rnaseq-standard pipeline, an E2E test we've discussed is comparing some "truth" run of the pipeline's feature_counts output to the current commit's output, and considering an R^2 value above ~99% as being functionally equivalent.
We can then show this evidence to biologists and say "have no fear! Please update your software 🙏 "
CC to see if @adthrasher is in agreement with the above.
Thanks, that makes a lot of sense! I think these properties fall under the classic V&V banner, though there's a lot of grey area and overlap between these terms.
The approach you describe with rnaseq-standard seems like something we should absolutely target. I'll have to look into literature about fuzzing probabilistic systems, but it could also be interesting to set up a test bench that continuously runs and compares the output for randomly-generated or randomly-perturbed inputs between tools.
Some bio context you may not be aware of yet:
Fuzzing biological data gets tricky fast. There's been some vague discussion of building out that functionality into this Rust Labs tool - https://github.com/stjude-rust-labs/fq?tab=readme-ov-file#generate
My understanding of the general consensus in bioinformatics is that "fake" data has very limited value. Grain of salt (I'm not super well read in this area), but as I understand it, the distrust of synthetic data goes further than the current methods not being robust enough; it's a matter of principle that synthetic data will never capture the "unknown unknowns" inherent to biological data.
A more common approach (again, from my limited POV) is to subset real biological data until it's useful for test purposes, but ultimately everything needs to run the gamut on real cohorts.
Fuzzed data would be helpful to have for this RFC in that it's useful for integration testing and just ensuring nothing is broken. But for V&V or E2E or whatever we call it, I don't think synthetic data will be suitable for convincing biologists that the methods are sound.
Not to say that what you're describing is a dead end, but more so that the scope of what it could be used for is maybe more limited than in other fields of computation.
# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives
REVIEWERS: I've thought through quite a wide variety of implementations that have not made it into writing, and I'm not sure how valuable my musings on alternatives I _didn't like_ are. I can expand on this section if it would be informative.
Another alternative to explore is to make it possible for a test author to drive Sprocket from pytest, a Rust #[test], or another general purpose programming environment. I believe this is the direction that pytest-wdl takes, and I think that is worth a closer look even though that particular project seems abandoned. Taking advantage of a widely-used and widely-supported testing framework would save us from having to support yet another CLI front-end, and would probably offer smoother integration into CI and reporting systems.
This would probably work best by creating a programmatic interface from Python to Sprocket, which would be a whole lot of work. That effort could pay off beyond testing, though, given Python's popularity in bioinformatics. Alternately, a wrapper around the Sprocket CLI probably wouldn't be that bad to write, even if doing an FFI sounds more fun 😉
text/0001-sprocket-test.md
## Custom tests
While the builtin test conditions should try and address many common use cases, users need a way to test for things outside the scope of the builtins (especially at launch, when the builtins will be minimal). There needs to be a way for users to execute arbitrary code on the outputs of a task or workflow for validation. This will be exposed via the `tests.custom` test, which will accept a name or array of names of user-supplied executables (most commonly shell or Python scripts) which are expected to be found in a `tests/custom/` directory. These executables will be invoked with a positional argument which is a path to the task or workflow's `outputs.json`. Users will be responsible for parsing that JSON and performing any validation they desire. So long as the invoked executable exits with a code of zero, the test will be considered as passed.
Having the inputs.json available as well would be useful for writing custom assertions where the output is dependent on some property of the input.
Yes, I started considering adding either other positional args or ENV variables that could be useful to have access to. I think there's some additional info we may want to expose beyond just an outputs.json, but I'm not sure how important nailing all that down is for the RFC. This seems like something we can iterate on during development.
not sure how important nailing all that down is for the RFC. This seems like something we can iterate on during development
The thing I'd watch out for here is the potential for churn if positional arguments are changed around. Since we're targeting an experimental opt-in mode for now, that probably isn't a big threat.
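To illustrate the kind of assertion under discussion, here is a sketch of a custom script that would receive both files as positional arguments. The two-argument calling convention, the script name, and the `prefix`/`bam` JSON keys are all assumptions made for the example, not part of the RFC:

#!/usr/bin/env bash
# Hypothetical invocation: check_prefix.sh <inputs.json> <outputs.json>
set -euo pipefail

in_json="$1"
out_json="$2"

# An assertion where the expected output depends on a property of the input:
# the output BAM's filename should start with the requested prefix.
prefix=$(jq -r .prefix "$in_json")
out_bam=$(jq -r .bam "$out_json")

case "$(basename "$out_bam")" in
  "$prefix"*) exit 0 ;;
  *)
    echo "expected output BAM name to start with '$prefix', got '$out_bam'" >&2
    exit 1
    ;;
esac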
text/0001-sprocket-test.md
## Custom tests
While the builtin test conditions should try and address many common use cases, users need a way to test for things outside the scope of the builtins (especially at launch, when the builtins will be minimal). There needs to be a way for users to execute arbitrary code on the outputs of a task or workflow for validation. This will be exposed via the `tests.custom` test, which will accept a name or array of names of user-supplied executables (most commonly shell or Python scripts) which are expected to be found in a `tests/custom/` directory. These executables will be invoked with a positional argument which is a path to the task or workflow's `outputs.json`. Users will be responsible for parsing that JSON and performing any validation they desire. So long as the invoked executable exits with a code of zero, the test will be considered as passed.
I could also imagine providing a hook for a custom executable which could provide the inputs rather than asserting on the outputs. This would be a way for a user to write their own version of matrix testing by e.g. emitting an array of objects that each would be suitable to use as an input.json.
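A rough sketch of what such an input-generating hook might emit, purely as an illustration; the hook mechanism, the script itself, and the JSON shape are all hypothetical:

#!/usr/bin/env bash
# Hypothetical input generator: prints a JSON array where each element could
# serve as the inputs.json for one generated test case.
set -euo pipefail

cat <<'EOF'
[
  { "bam": "$FIXTURES/test1.bam", "paired_end": true },
  { "bam": "$FIXTURES/test2.bam", "paired_end": false }
]
EOF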
a-frantz left a comment
Some of the big picture Qs raised are being discussed in Slack. This response addresses some of the medium picture Qs 😅
## Test Data
Most WDL tasks and workflows have `File` type inputs and outputs, so there should be an easy way to incorporate test files into the framework. This can be accomplished with a `tests/fixtures/` directory in the root of the workspace which can be referred to from any TOML test. If the string `$FIXTURES` is found within a TOML string value within the `inputs` table, the correct path to the `fixtures` directory will be dynamically inserted at test run time. This avoids having to track relative paths from TOML that may be arbitrarily nested in relation to test data. For example, let's assume there are `test.bam`, `test.bam.bai`, and `reference.fa.gz` files located within the `tests/fixtures/` directory; the following TOML `inputs` table could be used regardless of where that actual `.toml` file resides within the WDL workspace:
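For illustration, such an `inputs` table might look like the following; the task name `some_task` and the input key names are placeholders, while the file names come from the quoted text:

[some_task.inputs]
bam = "$FIXTURES/test.bam"
bam_index = "$FIXTURES/test.bam.bai"
reference = "$FIXTURES/reference.fa.gz"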
Splitting the fixtures into a tests/ directory while keeping the tests themselves alongside the WDL documents in place seems a bit strange to me. Just wanted to flag it here, maybe we should choose one or the other to prevent the splitting of context for users.
I've realized that the TOML format is incompatible with WDL due to it lacking an equivalent to WDL's … The obvious alternatives are YAML and JSON, both of which I find painful to write by hand. YAML is easier for humans to read and write than JSON, so I plan to have this test framework specified in YAML. It should be a pretty straightforward drop-in replacement for TOML in this RFC.
I do not intend to update this document at this point in time, as I think the RFC has more or less run its course, but I'm leaving this comment for posterity and also to solicit ideas for alternatives I may not have considered. Alternatives I considered but decided against: …
Please see PR stjude-rust-labs/sprocket#468 for the proposed YAML test syntax. The linked PR is meant to solidify the matrix computation from user-defined inputs. It is possible to see all the executions which will be run (in a future version of the …
Open questions not yet settled which will need to be addressed soon: …
If anyone has any thoughts on these open questions, I'd love to hear them! They will be addressed in future PRs which will link back to here.
Given the statement of intentions merged in #4, this RFC will be merged tomorrow, as all the above conversations seem to have run their course. Future Sprocket PRs (like stjude-rust-labs/sprocket#468) will continue to link to this PR for context, although the details of …
New conversations may continue to use this PR as an anchor of sorts, particularly if this document has made any glaring or fundamental mistake. The remaining details (as outlined in this comment) would be best discussed on upcoming PRs which deal with their implementation, rather than this PR, which will now function primarily as an archive of design goals and intentions.
* Create 0001-sprocket-test.md
* Update ci-build.sh
* Update 0000-placeholder.md
* docs: TODO -> REVIEWERS
* docs: link to template
* revise: elaborate on pytest-workflow
* feat: more prior art
* feat: elaborate on some other future possibilities
* feat: test filtering
* feat: future possibility: caching
* feat: custom test details
* feat: more discussion of the custom test design
* chore: typos, rephrasing awkward clauses, etc
* chore: review feedback
* revise: "test" -> "assertion"
* chore: review feedback elaborating on pytest-workflow and mentioning adoption by WDL spec
* ci: remove bad key (4793746)
View the rendered doc here:
https://stjude-rust-labs.github.io/rfcs/branches/rfcs/sprocket-test/0001-sprocket-test.html
https://stjude-rust-labs.github.io/rfcs/0001-sprocket-test.html