Execution ID has different length and characters in SageMaker Local Mode vs remote execution #5269

@lorenzwalthert


Describe the bug

Local Mode is supposed to closely mirror remote execution behaviour. Yet execution IDs in Local Mode appear to be created with uuid.uuid4():

class _LocalPipeline(object): 
    ... 
    def start(self, **kwargs):
        execution_id = str(uuid4())
        execution = _LocalPipelineExecution(
            execution_id=execution_id,
            pipeline=self.pipeline,
            local_session=self.local_session,
            **kwargs,
        )

This gives a string like 88f5746b-e64b-4c62-96bb-232a9ad513e7 (36 characters, including hyphens) instead of a 12-character alphanumeric sequence like 2drr2511ngo3, which is what I get from all my remote executions.

This causes problems downstream. For example, I build variables from the execution ID and pass them to other AWS services that enforce a length restriction (typically 63 characters). In Local Mode I get an error because the variable (e.g. a SageMaker unversioned Model Name) exceeds the limit, while in remote execution the problem does not appear.

More generally, the id just looks different and is much longer, which is inconsistent.
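To make the length problem concrete, here is a minimal sketch. The helper `build_model_name` and the prefix are hypothetical stand-ins for whatever naming scheme a pipeline uses downstream; the 63-character limit is the typical value mentioned above, not a documented constant.

```python
from uuid import uuid4

# Typical limit mentioned in this issue; the exact value depends on the service.
MAX_NAME_LENGTH = 63


def build_model_name(prefix: str, execution_id: str) -> str:
    """Hypothetical helper: derive a resource name from an execution ID."""
    return f"{prefix}-{execution_id}"


# Hypothetical prefix, chosen only for illustration.
prefix = "minimal-step-decorator-local-unversioned"

remote_style_id = "2drr2511ngo3"  # example ID from a remote execution
local_style_id = str(uuid4())     # what Local Mode currently generates

for label, execution_id in [("remote", remote_style_id), ("local", local_style_id)]:
    name = build_model_name(prefix, execution_id)
    status = "ok" if len(name) <= MAX_NAME_LENGTH else "too long"
    print(f"{label}: {len(name)} characters ({status})")
```

The same prefix stays within the limit with a 12-character ID but exceeds it with the 36-character UUID string.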

To reproduce

The code below runs partially if a python:3.13 image is available. It fails before completion, but the head of the logs is enough to show the problem:

2025-09-10 07:27:45,244 sagemaker.remote_function INFO     Uploading serialized function code to s3://sagemaker-eu-central-1-982361546614/minimal-step-decorator-local/NoopStep/2025-09-10-07-27-44-120/function
2025-09-10 07:27:45,370 sagemaker.remote_function INFO     Uploading serialized function arguments to s3://sagemaker-eu-central-1-982361546614/minimal-step-decorator-local/NoopStep/2025-09-10-07-27-44-120/arguments
WARNING:sagemaker.workflow.utilities:Popping out 'TrainingJobName' from the pipeline definition by default since it will be overridden at pipeline execution time. Please utilize the PipelineDefinitionConfig to persist this field in the pipeline definition if desired.
INFO:sagemaker.local.entities:Starting execution for pipeline minimal-step-decorator-local. Execution ID is 6c61cb71-f1eb-4e33-bfed-7509da4c7e32
# logs cut off here

The last log line reports "Execution ID is 6c61cb71-f1eb-4e33-bfed-7509da4c7e32", which has the UUID structure described above. Here is the code that produces the log excerpt:

# minimal_step_decorator_local_pipeline.py
import os
import json

from sagemaker.workflow.function_step import step
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.pipeline_context import LocalPipelineSession


# ---- 1) Define a do-nothing function and make it a pipeline step ----
@step(name="NoopStep", image_uri="python:3.13", instance_type='ml.m5.large', instance_count=1)
def noop():
    """
    Minimal @step: returns a small JSON-serializable object.
    No inputs, no training, no processing, no hyperparams.
    """
    # Nothing happens here; returning a constant keeps it serializable.
    return {"status": "ok", "note": "hello from a minimal step"}


# ---- 2) Build the pipeline object for Local Mode ----
def build_pipeline():
    # Create the LocalPipelineSession so the run is local
    local_sess = LocalPipelineSession()

    # Calling the function returns a DelayedReturn proxy that Pipelines understands
    leaf = noop()

    # You only pass leaf nodes; SDK infers any dependencies automatically.
    # Even though we run locally, Pipeline API still expects an IAM role for create/upsert.
    pipeline = Pipeline(
        name="minimal-step-decorator-local",
        steps=[leaf],
        sagemaker_session=local_sess,
    )
    return pipeline


# ---- 3) Run locally ----
if __name__ == "__main__":
    pipeline = build_pipeline()

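    # Provide a role ARN for the pipeline definition (required by .create/.upsert).
    # You can export SAGEMAKER_ROLE_ARN in your shell to avoid editing code.
    # Note: because of the placeholder default, the check below only fires
    # when the variable is set to an empty string.
    role_arn = os.environ.get("SAGEMAKER_ROLE_ARN", "SageMakerFullAccess")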
    if not role_arn:
        raise RuntimeError(
            "Set an IAM role ARN in env var SAGEMAKER_ROLE_ARN, e.g.\n"
            "export SAGEMAKER_ROLE_ARN=arn:aws:iam::<account-id>:role/<sagemaker-execution-role>"
        )

    # Create or update the pipeline definition (metadata goes to the service;
    # steps will execute locally thanks to LocalPipelineSession).
    pipeline.upsert(role_arn=role_arn)

    # Start the local execution
    execution = pipeline.start()
    print(f"Started local pipeline execution: {execution.arn}")

    # In Local Mode, execution.result() isn't supported — list steps instead.
    steps = execution.list_steps()
    print("Pipeline steps:")
    print(json.dumps(steps, indent=2))
    print("Done.")

I first looked for an existing unit test to extend, but they all bypass the start() method of _LocalPipeline, so there is no easy test case to adapt.

Expected behavior

Local Mode pipeline executions should return a 12-character, alphanumeric execution id to mimic the remote execution behaviour better.

Screenshots or logs

System information

  • SageMaker Python SDK version: v2.251.1 (latest release).
  • Python version: 3.13, but I don't think it matters.
  • CPU or GPU: Doesn't matter.
  • Custom Docker image (Y/N): Well, official Python 3.13 for reproducibility but I don't think that matters.

Additional context

This seems like a trivial fix. Please help improve the Local Mode experience; it makes my development workflow so much more productive. A single inconsistency in Local Mode is enough to break the flow and force a switch to remote-only execution, which is a waste of time.

I also opened a support case with AWS Premium Support: 175757749100726
