Contributor

@alyssacgoins alyssacgoins commented Oct 31, 2025

Description of your changes:
Updates the logic added in #12248 and adds spec files to the compiled workflow and pipeline files to correct the pipeline_with_artifact_custom_path test case.

@google-oss-prow

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch 2 times, most recently from 5afbe00 to a867d8a on October 31, 2025 20:26
@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch from d3b3810 to 52f2f5a on November 3, 2025 13:02
@alyssacgoins alyssacgoins changed the title from "Update logic and testing." to "Correct Runtime Artifact custom path logic/testing." on Nov 3, 2025
@alyssacgoins alyssacgoins marked this pull request as ready for review on November 3, 2025 13:35
@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch from 0bca2f8 to 987593b on November 3, 2025 13:40

def validate_custom_artifact_path(num: int, out_dataset: Output[Dataset]):
    with open(out_dataset.path, 'w') as f:
        f.write(str(2 * num))
    out_dataset._set_path('/etc/test/file/path')

Collaborator

I thought the user was supposed to call it via out_dataset.set_path('/etc/test/file/path') or out_dataset.custom_path = '/etc/test/file/path'?

Contributor Author

The Jira for this task specifies that I can call set_path on the output artifact. I took that to mean that when a user calls set_path, it sets the custom path, but I can see both interpretations.

Contributor Author

@mprahl let me know if you think it makes more sense to update this per your comment.

Collaborator

@alyssacgoins it was more that it should be set_path and not _set_path to test the way the user would leverage the feature.

@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch 2 times, most recently from 1d4b9d4 to c6042ed on November 3, 2025 21:25

output_artifact_task = generate_artifact()
output_artifact_task.output.set_path('/etc/test/file/path')
# Generate an artifact, set a custom path and validate the path.
task = validate_custom_artifact_path(num=1).set_caching_options(False)

Collaborator

Please add another task that imports the artifact to ensure the correct thing got uploaded and then redownloaded.

Contributor Author

Updated

def validate_custom_artifact_path(num: int, out_dataset: Output[Dataset]):
    with open(out_dataset.path, 'w') as f:
        f.write(str(2 * num))
    out_dataset.set_path('/etc/test/file/path')

Collaborator

Could you use a path that is writable by non-root users, such as something under /tmp?

Contributor Author

Updated
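
For context, one quick way to see the difference on a given image is to probe a directory for write access. This is a minimal sketch using only the standard library; `is_writable_dir` is a hypothetical helper and not part of the KFP SDK, and the result for any particular directory depends on the user the container runs as.

```python
import os
import tempfile


def is_writable_dir(directory: str) -> bool:
    """Return True if `directory` exists and the current user may create files in it."""
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


# The system temp directory (usually /tmp on Linux) is normally writable by any user.
tmp_dir = tempfile.gettempdir()
print(tmp_dir, is_writable_dir(tmp_dir))

# A directory that does not exist is reported as not writable.
missing_dir = os.path.join(tmp_dir, "no-such-dir-for-this-sketch")
print(missing_dir, is_writable_dir(missing_dir))
```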

@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch 4 times, most recently from 88b62d4 to c3341e1 on November 7, 2025 15:52

def component_output_artifact(out_dataset: Output[Dataset]):
    with open(out_dataset.path, 'w') as f:
        f.write('Hello, World!')
    out_dataset.set_path('/tmp/output/dataset')

Collaborator

Call set_path before writing to out_dataset.path. Then validate_input_artifact should assert that the contents are as expected.

Contributor Author

Not sure I follow - to clarify, which line should be executed before writing to out_dataset.path?

Contributor Author

Updated
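
For reference, the ordering discussed in this thread can be sketched with a stand-in class. `FakeDataset` is hypothetical and only mimics the behavior assumed here (once `set_path` has been called, `path` returns the custom path); it is not the KFP `Dataset` implementation, and the paths are temporary directories created for the example.

```python
import os
import tempfile


class FakeDataset:
    """Hypothetical stand-in for an output artifact; not the KFP SDK class."""

    def __init__(self, default_path: str):
        self._default_path = default_path
        self._custom_path = None

    def set_path(self, path: str) -> None:
        # Assumption: setting a custom path redirects later reads of `path`.
        self._custom_path = path

    @property
    def path(self) -> str:
        return self._custom_path or self._default_path


def component_output_artifact(out_dataset: FakeDataset, custom_path: str) -> None:
    # Set the custom path first so the write below lands at that location.
    out_dataset.set_path(custom_path)
    os.makedirs(os.path.dirname(out_dataset.path), exist_ok=True)
    with open(out_dataset.path, "w") as f:
        f.write("Hello, World!")


work_dir = tempfile.mkdtemp()
default_path = os.path.join(work_dir, "default", "dataset")
custom_path = os.path.join(work_dir, "output", "dataset")

artifact = FakeDataset(default_path)
component_output_artifact(artifact, custom_path)

# Read the file back from the custom location.
with open(custom_path) as f:
    written = f.read()
print(written)
```

If the write happened before `set_path` in this model, the contents would end up at the default path and nothing would exist at the custom one.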

@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch 3 times, most recently from 253f9f3 to 0e0b1d2 on November 7, 2025 21:09

}
// If runtime artifact is set with a custom path, add this path to the custom properties map.
if runtimeArtifact.CustomPath != nil {
	artifact.CustomProperties["custom_path"] = stringMLMDValue(*runtimeArtifact.CustomPath)

Collaborator

I don't think we should be storing this as a custom property. It's not relevant to the artifact once the artifact is uploaded.

Contributor Author

The issue is that metadata_store.Artifact, which I believe is a third-party resource, does not have a custom_path field. If I'm following the logic correctly, the metadata_store.Artifact type is what gets uploaded to MLMD, so I added custom_path as a property to preserve the value upon retrieval from MLMD.

Contributor Author

Resolved

// Retrieve custom_path value from artifact.CustomProperties, if present.
var customPathStr string
if artifact.CustomProperties != nil {
	customPath := artifact.CustomProperties["custom_path"]

Collaborator

The custom path flow should be:

  1. Executor sets the custom path property on the output artifact in the executor output
  2. The uploadOutputArtifacts function checks if the custom path is present and uses that to upload instead of the predefined path
  3. The custom path is then no longer needed, so it can be ignored and is not set on the MLMD artifact.
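
The three steps above can be sketched as follows. This is an illustrative Python model, not the launcher's Go code: `choose_upload_source`, `to_mlmd_record`, the dictionary keys, and the example URI and default path are all hypothetical names chosen for the sketch.

```python
from typing import Optional


def choose_upload_source(default_path: str, custom_path: Optional[str]) -> str:
    """Step 2: upload from the custom path when the executor set one."""
    return custom_path if custom_path else default_path


def to_mlmd_record(runtime_artifact: dict) -> dict:
    """Step 3: build the stored record without the custom path."""
    return {k: v for k, v in runtime_artifact.items() if k != "custom_path"}


# Step 1: the executor output carries the custom path for the output artifact.
runtime_artifact = {
    "name": "out_dataset",
    "uri": "s3://example-bucket/run/out_dataset",  # hypothetical remote URI
    "custom_path": "/tmp/output/dataset",
}

source = choose_upload_source(
    "/hypothetical/default/path", runtime_artifact.get("custom_path")
)
record = to_mlmd_record(runtime_artifact)
print(source)
print(record)
```

Under this model the custom path only decides which local file the upload reads from; the stored record keeps the artifact's name and URI and nothing else.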

Contributor Author

The issue I found is this: I modified pipeline_spec.RuntimeArtifact to include the custom_path field, and uploadOutputArtifacts uploads to that custom path. However, when the MLMD artifact is retrieved with no custom path set on the object, the executor does not know which path to retrieve the file from when launcher_v2.retrieveArtifactPath() executes.

Contributor Author

Resolved

@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch 3 times, most recently from e5f5704 to 9eb3e37 on November 10, 2025 23:00

        raise ValueError(f"File uri is {input_list.path} but should be {exp_path}.")


def validate_input_artifact(in_dataset: Input[Dataset]) -> bool:
    if in_dataset is None:
        raise ValueError("Input artifact is None.")

Collaborator

Could you please have the test read the input dataset at in_dataset.path and assert it equals Hello, World!?
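
A content check along these lines might look like the sketch below. It takes a plain file path rather than `Input[Dataset]` so that it runs without the KFP SDK; `validate_input_file` is a hypothetical name, and the temporary file stands in for the downloaded artifact.

```python
import os
import tempfile


def validate_input_file(path: str, expected: str = "Hello, World!") -> bool:
    """Read the file at `path` and fail loudly if its contents differ from `expected`."""
    with open(path) as f:
        contents = f.read()
    if contents != expected:
        raise ValueError(f"File contents are {contents!r} but should be {expected!r}.")
    return True


# Simulate the downloaded artifact with a temporary file.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("Hello, World!")

result = validate_input_file(sample_path)
os.remove(sample_path)
```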

    def _get_custom_path(self) -> str:
        return self._custom_path

    def _set_path(self, path: str) -> None:

Collaborator

Optional: It'd be nice if this was kept at its original location to not shadow the original commit history.

@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch from 6315e42 to e91af08 Compare November 11, 2025 15:45
Collaborator

@mprahl mprahl left a comment

/approve
/lgtm

@alyssacgoins alyssacgoins force-pushed the model-upload-custom-path-fix branch from e91af08 to 19d000b on November 11, 2025 16:26
@google-oss-prow google-oss-prow bot removed the lgtm label Nov 11, 2025
Collaborator

mprahl commented Nov 11, 2025

/approve
/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Nov 11, 2025
@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mprahl

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mprahl mprahl merged commit b01a82a into kubeflow:master Nov 11, 2025
126 of 129 checks passed
2 participants