-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Correct Runtime Artifact custom path logic/testing. #12402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correct Runtime Artifact custom path logic/testing. #12402
Conversation
|
Skipping CI for Draft Pull Request. |
5afbe00 to
a867d8a
Compare
d3b3810 to
52f2f5a
Compare
0bca2f8 to
987593b
Compare
| def validate_custom_artifact_path(num: int, out_dataset: Output[Dataset]): | ||
| with open(out_dataset.path, 'w') as f: | ||
| f.write(str(2 * num)) | ||
| out_dataset._set_path('/etc/test/file/path') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the user was supposed to call it via out_dataset.set_path('/etc/test/file/path') or out_dataset.custom_path = '/etc/test/file/path'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Jira for this task specifies I can call set_path on the output artifact - I took that to mean that when a user calls set_path, it sets the custom path. But I also see both interpretations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mprahl let me know if you think it makes more sense to update to your comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alyssacgoins it was more that it should be set_path and not _set_path to test the way the user would leverage the feature.
1d4b9d4 to
c6042ed
Compare
| output_artifact_task = generate_artifact() | ||
| output_artifact_task.output.set_path('/etc/test/file/path') | ||
| # Generate an artifact, set a custom path and validate the path. | ||
| task = validate_custom_artifact_path(num=1).set_caching_options(False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add another task that imports the artifact to ensure the correct thing got uploaded and then redownloaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
| def validate_custom_artifact_path(num: int, out_dataset: Output[Dataset]): | ||
| with open(out_dataset.path, 'w') as f: | ||
| f.write(str(2 * num)) | ||
| out_dataset.set_path('/etc/test/file/path') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use a writable path for non-root users like something in /tmp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
88b62d4 to
c3341e1
Compare
| def component_output_artifact(out_dataset: Output[Dataset]): | ||
| with open(out_dataset.path, 'w') as f: | ||
| f.write('Hello, World!') | ||
| out_dataset.set_path('/tmp/output/dataset') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call this before writing to out_dataset.path. Then validate_input_artifact should assert the contents are as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow - to clarify, which line should be executed before writing to out_dataset.path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
253f9f3 to
0e0b1d2
Compare
backend/src/v2/metadata/converter.go
Outdated
| } | ||
| // If runtime artifact is set with a custom path, add this path to the custom properties map. | ||
| if runtimeArtifact.CustomPath != nil { | ||
| artifact.CustomProperties["custom_path"] = stringMLMDValue(*runtimeArtifact.CustomPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should be storing this as a custom property. It's not relevant to the artifact once the artifact is uploaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that metadata_store.Artifact, which I believe is a third-party resource, does not have a custom_path field. If I'm following the logic correctly, type metadata_store.Artifact is uploaded to MLMD - I added custom_path as a property in order to preserve this value upon retrieval from MLMD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved
backend/src/v2/metadata/converter.go
Outdated
| // Retrieve custom_path value from artifact.CustomProperties, if present. | ||
| var customPathStr string | ||
| if artifact.CustomProperties != nil { | ||
| customPath := artifact.CustomProperties["custom_path"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The custom path flow should be:
- Executor sets the custom path property on the output artifact in the executor output
- The
uploadOutputArtifactsfunction checks if the custom path is present and uses that to upload instead of the predefined path - The custom path is no longer needed and be ignored and not set on the MLMD artifact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue I found is that, while I modified pipeline_spec.RuntimeArtifact to include the custom_path field, and uploadOutputArtifacts uploads to that custom path, when the MLMD artifact is retrieved with no custom path set on the object, the executor does not know what path to retrieve the file from when launcher_v2.retrieveArtifactPath() executes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved
e5f5704 to
9eb3e37
Compare
| raise ValueError(f"File uri is {input_list.path} but should be {exp_path}.") | ||
| def validate_input_artifact(in_dataset: Input[Dataset]) -> bool: | ||
| if in_dataset is None: | ||
| raise ValueError("Input artifact is None.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please have the test read the input dataset at in_dataset.path and assert it equals Hello, World!?
| def _get_custom_path(self) -> str: | ||
| return self._custom_path | ||
|
|
||
| def _set_path(self, path: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: It'd be nice if this was kept at its original location to not shadow the original commit history.
6315e42 to
e91af08
Compare
mprahl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
/lgtm
Signed-off-by: agoins <[email protected]>
e91af08 to
19d000b
Compare
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mprahl The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Description of your changes:
Updates the logic added in #12248 and also adds spec files to compiled workflow and pipeline files in order to correct the
pipeline_with_artifact_custom_pathtest case.Checklist: