Description
Describe the bug
Local Mode is supposed to closely mirror remote execution behaviour, yet execution ids in Local Mode appear to be created with uuid.uuid4():
class _LocalPipeline(object):
    ...
    def start(self, **kwargs):
        execution_id = str(uuid4())
        execution = _LocalPipelineExecution(
            execution_id=execution_id,
            pipeline=self.pipeline,
            local_session=self.local_session,
            **kwargs,
        )
This gives a 36-character string like 88f5746b-e64b-4c62-96bb-232a9ad513e7 instead of a 12-character alphanumeric sequence like 2drr2511ngo3, which is what I get from all my remote executions.
This leads to problems downstream. For example, I build variables from the execution id and pass them to other AWS services that have a length restriction (typically 63 characters). In local mode I get an error because a variable (e.g. a SageMaker unversioned Model Name) exceeds the restriction, while in remote execution the problem does not appear.
More generally, the id just looks different and is much longer, which is inconsistent.
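To make the length problem concrete, here is a minimal sketch. The helper build_model_name and the prefix are hypothetical and only stand in for however a resource name gets derived from the execution id; the 63-character limit is the typical value mentioned above, not a limit checked against any specific service.

```python
import uuid

# Typical limit mentioned in this issue; individual services may differ.
MAX_NAME_LENGTH = 63


def build_model_name(prefix: str, execution_id: str) -> str:
    """Hypothetical helper: derive a resource name from an execution id."""
    return f"{prefix}-{execution_id}"


# Hypothetical prefix, chosen so that only the UUID-based name is too long.
prefix = "my-team-churn-model-unversioned-candidate"

remote_style_id = "2drr2511ngo3"    # shape of a remote execution id
local_style_id = str(uuid.uuid4())  # what Local Mode currently produces

remote_name = build_model_name(prefix, remote_style_id)
local_name = build_model_name(prefix, local_style_id)

for label, name in (("remote", remote_name), ("local", local_name)):
    verdict = "ok" if len(name) <= MAX_NAME_LENGTH else "too long"
    print(f"{label}: {len(name)} characters ({verdict})")
```

The same prefix fits the limit with a 12-character id but not with a 36-character UUID string, so the name is accepted remotely and rejected locally.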
To reproduce
The code below partially works if you have a python:3.13 image available. It will fail before completion, but here is the head of the logs, which is sufficient to show the problem:
2025-09-10 07:27:45,244 sagemaker.remote_function INFO Uploading serialized function code to s3://sagemaker-eu-central-1-982361546614/minimal-step-decorator-local/NoopStep/2025-09-10-07-27-44-120/function
2025-09-10 07:27:45,370 sagemaker.remote_function INFO Uploading serialized function arguments to s3://sagemaker-eu-central-1-982361546614/minimal-step-decorator-local/NoopStep/2025-09-10-07-27-44-120/arguments
WARNING:sagemaker.workflow.utilities:Popping out 'TrainingJobName' from the pipeline definition by default since it will be overridden at pipeline execution time. Please utilize the PipelineDefinitionConfig to persist this field in the pipeline definition if desired.
INFO:sagemaker.local.entities:Starting execution for pipeline minimal-step-decorator-local. Execution ID is 6c61cb71-f1eb-4e33-bfed-7509da4c7e32
# logs cut off here
The line INFO:sagemaker.local.entities:Starting execution for pipeline minimal-step-decorator-local. Execution ID is 6c61cb71-f1eb-4e33-bfed-7509da4c7e32 shows the UUID structure described above. Here is the code that produces the log excerpt:
# minimal_step_decorator_local_pipeline.py
import os
import json

from sagemaker.workflow.function_step import step
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.pipeline_context import LocalPipelineSession


# ---- 1) Define a do-nothing function and make it a pipeline step ----
@step(name="NoopStep", image_uri="python:3.13", instance_type='ml.m5.large', instance_count=1)
def noop():
    """
    Minimal @step: returns a small JSON-serializable object.
    No inputs, no training, no processing, no hyperparams.
    """
    # Nothing happens here; returning a constant keeps it serializable.
    return {"status": "ok", "note": "hello from a minimal step"}


# ---- 2) Build the pipeline object for Local Mode ----
def build_pipeline():
    # Create the LocalPipelineSession so the run is local
    local_sess = LocalPipelineSession()
    # Calling the function returns a DelayedReturn proxy that Pipelines understands
    leaf = noop()
    # You only pass leaf nodes; SDK infers any dependencies automatically.
    # Even though we run locally, Pipeline API still expects an IAM role for create/upsert.
    pipeline = Pipeline(
        name="minimal-step-decorator-local",
        steps=[leaf],
        sagemaker_session=local_sess,
    )
    return pipeline


# ---- 3) Run locally ----
if __name__ == "__main__":
    pipeline = build_pipeline()
    # Provide a role ARN for the pipeline definition (required by .create/.upsert).
    # You can export SAGEMAKER_ROLE_ARN in your shell to avoid editing code.
    role_arn = os.environ.get("SAGEMAKER_ROLE_ARN", "SageMakerFullAccess")
    if not role_arn:
        raise RuntimeError(
            "Set an IAM role ARN in env var SAGEMAKER_ROLE_ARN, e.g.\n"
            "export SAGEMAKER_ROLE_ARN=arn:aws:iam::<account-id>:role/<sagemaker-execution-role>"
        )
    # Create or update the pipeline definition (metadata goes to the service;
    # steps will execute locally thanks to LocalPipelineSession).
    pipeline.upsert(role_arn=role_arn)
    # Start the local execution
    execution = pipeline.start()
    print(f"Started local pipeline execution: {execution.arn}")
    # In Local Mode, execution.result() isn't supported; list steps instead.
    steps = execution.list_steps()
    print("Pipeline steps:")
    print(json.dumps(steps, indent=2))
    print("Done.")
I tried to find an appropriate unit test first, but they all work around the start() method of _LocalPipeline, so there is no easy case.
Expected behavior
Local Mode pipeline executions should return a 12-character, alphanumeric execution id to mimic the remote execution behaviour better.
Screenshots or logs
System information
A description of your system. Please provide:
- SageMaker Python SDK version: v2.251.1 (latest release).
- Python version: 3.13, but I don't think it matters.
- CPU or GPU: Doesn't matter.
- Custom Docker image (Y/N): Well, official Python 3.13 for reproducibility but I don't think that matters.
Additional context
This seems like a trivial fix. Please help improve the Local Mode experience; it makes my developer workflow so much more productive. A single inconsistency in Local Mode is enough to break the flow and force a switch to remote-only execution, which is a waste of time.
I also opened a support case with AWS Premium Support: 175757749100726