
Conversation

@nathan-weinberg
Collaborator

@nathan-weinberg nathan-weinberg commented Oct 16, 2025

What does this PR do?

this commit changes the server start command to be dynamically generated based on Llama Stack version

it allows users to pass a custom run.yaml if they so choose, while keeping the official run.yaml we ship with the distro image as the default

Closes #83

Summary by CodeRabbit

  • Refactor
    • Container startup is now configurable at build time; build logic selects an appropriate startup command and emits clearer validation/info when generating container configs.
    • The default runtime argument (path to the run YAML) is now provided separately so the startup command is more flexible.
  • Documentation
    • Added "Running with a custom run YAML" guidance in two places with examples and a dependency notice.

@coderabbitai

coderabbitai bot commented Oct 16, 2025

Walkthrough

Decouples the container ENTRYPOINT from the run.yaml path: templates now accept a configurable entrypoint and a CMD for /opt/app-root/run.yaml. distribution/build.py adds get_entrypoint() (version-aware), updates containerfile generation to inject the entrypoint and validate/clean output. README adds "Running with a custom run YAML" guidance.
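As a rough sketch of the generation side described above (illustrative only; the actual distribution/build.py may differ, and the template placeholder names and header text here are assumptions):

import os
import sys


def generate_containerfile(dependencies: str, install_command: str, entrypoint: str) -> str:
    """Render Containerfile.in: inject the entrypoint, prepend a header, drop blank lines."""
    template_path = "distribution/Containerfile.in"
    if not os.path.exists(template_path):
        sys.exit(f"Error: template {template_path} not found")
    with open(template_path) as f:
        template = f.read()
    rendered = template.format(
        dependencies=dependencies,
        install_command=install_command,
        entrypoint=entrypoint,  # e.g. 'ENTRYPOINT ["llama", "stack", "run"]'
    )
    header = "# Generated by distribution/build.py -- do not edit by hand"
    lines = [header] + [line for line in rendered.splitlines() if line.strip()]
    return "\n".join(lines) + "\n"

In this shape, main() computes the entrypoint once via get_entrypoint(LLAMA_STACK_VERSION) and passes it in, which is what the sequence diagram below depicts.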

Changes

Cohort / File(s) Summary
Containerfile
distribution/Containerfile
ENTRYPOINT changed from ["llama","stack","run","/opt/app-root/run.yaml"] to ["llama","stack","run"]; new CMD ["/opt/app-root/run.yaml"] added.
Containerfile template
distribution/Containerfile.in
Replaced fixed ENTRYPOINT with templated {entrypoint} and added CMD ["/opt/app-root/run.yaml"].
Build & entrypoint logic
distribution/build.py
Added get_entrypoint(llama_stack_version) using packaging.version; updated generate_containerfile(..., entrypoint) signature and logic to validate template existence, inject entrypoint, prepend a generated-header, and strip blank lines; main now computes and passes entrypoint.
Docs
README.md
Added "Running with a custom run YAML" section (inserted twice) with a podman example and a dependency note about matching image dependencies to custom YAML.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as build.py:main
  participant GE as get_entrypoint()
  participant PV as packaging.version
  participant GC as generate_containerfile()
  participant TF as Containerfile.in
  participant Out as distribution/Containerfile

  CLI->>GE: get_entrypoint(llama_stack_version)
  GE->>PV: parse & compare version (or detect source install)
  PV-->>GE: comparison result
  GE-->>CLI: return templated `entrypoint` string
  CLI->>GC: generate_containerfile(deps, install_flag, entrypoint)
  GC->>TF: load template & substitute `{entrypoint}`
  GC->>GC: prepend header, strip blank lines
  GC->>Out: write formatted Containerfile
  Note right of Out: Output now uses configurable ENTRYPOINT + CMD "/opt/app-root/run.yaml"

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review get_entrypoint() logic and version comparison edge-cases in distribution/build.py.
  • Verify template substitution and blank-line stripping behavior in generate_containerfile.
  • Confirm resulting distribution/Containerfile preserves intended quoting/JSON array ENTRYPOINT format.

Poem

🐰
I nudged the entry, set the path free,
versions checked kindly, then wiggled with glee.
Template trimmed neat, CMD tucked in line,
the container hops off, all tidy and fine—
carrots packed, deployment feels divine 🥕

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "feat: allow for dynamic server start command and custom run.yaml" directly and accurately reflects the primary changes in the changeset. The modifications to Containerfile, Containerfile.in, build.py, and README.md all center on implementing these two core features: replacing the hardcoded entrypoint with a version-aware dynamic mechanism and enabling users to supply custom run.yaml files. The title is concise, specific, and clearly communicates the main changes without unnecessary detail or vagueness.
Linked Issues Check ✅ Passed The code changes comprehensively address all four primary objectives from issue #83. The PR implements dynamic entrypoint selection via the new get_entrypoint() function that adapts the startup command based on Llama Stack version [#83], separates ENTRYPOINT and CMD to allow users to override run.yaml by mounting custom files and providing them as CMD arguments [#83], and includes README documentation demonstrating how to use custom run.yaml [#83]. The distribution image now handles custom run.yaml internally rather than requiring operator-level handling [#83]. Version-aware entrypoint logic ensures maintainers need not manually update the command for CLI changes [#83].
Out of Scope Changes Check ✅ Passed All changes in the pull request are directly scoped to implementing issue #83 objectives. The modifications to Containerfile and Containerfile.in implement the dynamic entrypoint and custom run.yaml support; build.py changes provide the version-aware entrypoint selection logic needed for that feature; and README.md additions document how users can leverage the new custom run.yaml capability. No unrelated functionality, refactoring, or features outside the scope of supporting dynamic server startup and custom run.yaml handling are present in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
distribution/build.py (1)

55-73: Consider making the version threshold configurable.

The hardcoded threshold version "0.2.23" (line 64) matches the current CURRENT_LLAMA_STACK_VERSION (line 17). When the version is bumped in the future, maintainers must remember to update this threshold if the entrypoint behavior needs to change, creating a maintenance burden.

Consider extracting the threshold as a module-level constant or deriving it from CURRENT_LLAMA_STACK_VERSION:

# Near the top of the file, after CURRENT_LLAMA_STACK_VERSION
ENTRYPOINT_CHANGE_VERSION = "0.2.23"  # Version where entrypoint changed from python -m to llama stack run

def get_entrypoint(llama_stack_version):
    """Determine the appropriate ENTRYPOINT based on llama-stack version."""
    if is_install_from_source(llama_stack_version):
        return 'ENTRYPOINT ["llama", "stack", "run"]'

    try:
        current_version = version.parse(llama_stack_version)
        threshold_version = version.parse(ENTRYPOINT_CHANGE_VERSION)
        # ... rest of function

Additionally, consider narrowing the exception handling on line 70 from except Exception to except version.InvalidVersion to catch only version parsing errors, allowing other unexpected errors to surface properly.
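Taken together, the named constant and the narrower handler would leave the function looking roughly like this (a sketch only; the source-install branch of the original function is omitted for brevity, and ENTRYPOINT_CHANGE_VERSION is the assumed name from the suggestion above):

import sys
from packaging import version

ENTRYPOINT_CHANGE_VERSION = "0.2.23"  # version where the entrypoint command changed


def get_entrypoint(llama_stack_version):
    """Sketch of get_entrypoint() with a named threshold and narrower exception handling."""
    try:
        if version.parse(llama_stack_version) < version.parse(ENTRYPOINT_CHANGE_VERSION):
            return 'ENTRYPOINT ["python", "-m", "llama_stack.core.server.server"]'
        return 'ENTRYPOINT ["llama", "stack", "run"]'
    except version.InvalidVersion as e:
        # Only malformed version strings are treated as a build error;
        # any other exception propagates instead of being swallowed.
        print(f"Error: Could not parse version {llama_stack_version}: {e}")
        sys.exit(1)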

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c752da and 07f27cb.

📒 Files selected for processing (3)
  • distribution/Containerfile (1 hunks)
  • distribution/Containerfile.in (1 hunks)
  • distribution/build.py (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-15T14:25:54.837Z
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.

Applied to files:

  • distribution/Containerfile
  • distribution/Containerfile.in
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (3)
distribution/build.py (2)

15-15: LGTM!

The packaging.version import is appropriate for semantic version comparison and ensures correct handling of version strings.


195-216: LGTM!

The integration of the entrypoint parameter into generate_containerfile() and the main flow is correct and complete. The function signature update, template substitution, and call site changes properly implement the dynamic entrypoint feature.

Also applies to: 246-250

distribution/Containerfile (1)

62-63: Critical: Environment variable won't expand in CMD exec form.

Docker does not expand environment variables in JSON array (exec) form. The literal string "${APP_ROOT}/run.yaml" will be passed to the entrypoint instead of the expanded path /opt/app-root/run.yaml, causing the default startup to fail.

Solution: Use the literal path in the JSON array:

 ENTRYPOINT ["llama", "stack", "run"]
-CMD ["${APP_ROOT}/run.yaml"]
+CMD ["/opt/app-root/run.yaml"]

Note: Users can still override the config path at runtime with docker run <image> /custom/path/to/run.yaml, which satisfies the PR objective to allow custom run.yaml files.

⛔ Skipped due to learnings
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.

@nathan-weinberg nathan-weinberg added the do-not-merge Apply to PRs that should not be merged (yet) label Oct 16, 2025
@nathan-weinberg nathan-weinberg force-pushed the entrypoint branch 2 times, most recently from 2d74b58 to 0382ced on October 16, 2025 at 18:30
@nathan-weinberg nathan-weinberg removed the do-not-merge Apply to PRs that should not be merged (yet) label Oct 16, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0382ced and a90eb0c.

📒 Files selected for processing (1)
  • README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...

(EN_EXCESSIVE_EXCLAMATION)


[grammar] ~70-~70: Ensure spelling is correct
Context: ...es already pre-installed. There is no guarentee that your custom run YAML will nessesar...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarentee that your custom run YAML will nessesarily work with the ...

(QB_NEW_EN_HYPHEN)


[grammar] ~70-~70: Ensure spelling is correct
Context: ...uarentee that your custom run YAML will nessesarily work with the included dependencies.

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~70-~70: There might be a mistake here.
Context: ...ily work with the included dependencies.

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
README.md (1)

59-59: Consider more concise wording for clarity.

The phrase "To do so, run the image in the following way. The "path" mentioned should be the path to your custom run YAML file." is slightly verbose. Consider condensing to:

"To use a custom run YAML, run the image with the path to your file as the final argument, mounted as a volume:"

This removes redundancy and gets straight to the instruction.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a90eb0c and 0d59530.

📒 Files selected for processing (1)
  • README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...

(EN_EXCESSIVE_EXCLAMATION)


[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...

(QB_NEW_EN_HYPHEN)


[grammar] ~70-~70: There might be a mistake here.
Context: ...ily work with the included dependencies.

(QB_NEW_EN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (1)
README.md (1)

57-70: Documentation addition is clear and well-aligned with PR objectives.

The new section provides users with clear instructions on how to supply a custom run.yaml file, complete with a practical example and an important caveat about dependency compatibility. The spelling corrections from the previous review ("guarantee" and "necessarily") are in place.

However, verify that the example command syntax (<path_in_container> as a positional argument after the image name) reflects the actual container behavior with the new entrypoint mechanism. This assumes the container ENTRYPOINT is a script or binary that accepts the path as a CMD argument—which aligns with the PR objective to decouple the entrypoint from the run.yaml path.

Confirm that:

  1. The new entrypoint script or binary correctly interprets the <path_in_container> argument
  2. The default behavior (when no custom path is provided) still uses /opt/app-root/run.yaml
  3. The container can run without arguments and still work with the default run.yaml

You may find this easier to verify once the container images are built or if you can review the entrypoint implementation (likely in distribution/build.py or a shell script entrypoint).

Collaborator

@cdoern cdoern left a comment


If the dependencies we package remain locked: that pretty much only allows for the same set of providers but with different config options or maybe different server level config options, right?

Not sure if this is possible or worth it, but if the deps are locked, it might make sense to validate that people are not changing the providers or introducing a provider with different dependencies. Otherwise this feature will commonly break

@nathan-weinberg
Collaborator Author

If the dependencies we package remain locked: that pretty much only allows for the same set of providers but with different config options or maybe different server level config options, right?

Correct

Not sure if this is possible or worth it, but if the deps are locked, it might make sense to validate that people are not changing the providers or introducing a provider with different dependencies. Otherwise this feature will commonly break

See the README changes

@cdoern
Collaborator

cdoern commented Oct 16, 2025

@nathan-weinberg

See the README changes

are you pointing me to the readme where it says There is *no* guarantee that your custom run YAML will necessarily work with the included dependencies.? I think this is kind of a non-answer to my question since I am asking if there is anything we can do to avoid these breakages (which will be frequent).

I am suggesting a simple diff against the provided run.yaml to check if the providers changed and if the new providers require net-new dependencies. This to me, seems like introducing a feature which will more often than not cause failures so some sort of guardrails is advisable IMO.

Let me know if this makes sense and is possible! Thanks
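For illustration, the guardrail being suggested could look roughly like the sketch below. It is hypothetical and not part of this PR; it assumes the run.yaml schema keeps a top-level providers: mapping whose entries carry a provider_type field, and the helper names are made up:

import sys

import yaml  # PyYAML


def provider_types(run_yaml_path):
    """Collect the provider_type values declared in a run.yaml."""
    with open(run_yaml_path) as f:
        config = yaml.safe_load(f) or {}
    types = set()
    for providers in (config.get("providers") or {}).values():
        for entry in providers or []:
            if isinstance(entry, dict) and "provider_type" in entry:
                types.add(entry["provider_type"])
    return types


def check_custom_run_yaml(shipped_path, custom_path):
    """Flag providers in the custom run.yaml that the image was not built with."""
    extra = provider_types(custom_path) - provider_types(shipped_path)
    if extra:
        print(f"Custom run.yaml adds providers not in the shipped config: {sorted(extra)}")
        print("Their dependencies may be missing from the distribution image.")
        sys.exit(1)

Whether such a check should fail hard or just warn is a separate design choice; the point is only to surface net-new providers before the server starts.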

@nathan-weinberg
Collaborator Author

@nathan-weinberg

See the README changes

are you pointing me to the readme where it says There is *no* guarantee that your custom run YAML will necessarily work with the included dependencies.? I think this is kind of a non-answer to my question since I am asking if there is anything we can do to avoid these breakages (which will be frequent).

Why do we want to avoid them? This is just to let people try their own run YAMLs; we aren't invested in making sure they work. That's why I pointed you here.

I am suggesting a simple diff against the provided run.yaml to check if the providers changed and if the new providers require net-new dependencies. This to me, seems like introducing a feature which will more often than not cause failures so some sort of guardrails is advisable IMO.

Let me know if this makes sense and is possible! Thanks

It's really more of a utility for development - any user trying to actually use this as a production server (either standalone or via the operator) should be using the official run YAML

@cdoern
Collaborator

cdoern commented Oct 16, 2025

It's really more of a utility for development

Fair, all I am saying is that more often than not this'll break without manual manipulation of the dependencies in the container

@nathan-weinberg
Collaborator Author

It's really more of a utility for development

Fair, all I am saying is that more often than not this'll break without manual manipulation of the dependencies in the container

True, but note we are already doing these custom run YAMLs from the operator - see this comment llamastack/llama-stack-k8s-operator#171 (comment) that led to the issue this PR is resolving. This simply allows us to move the functionality from the operator to the distro image, which is preferred.

We can iterate with some additional checking like you've mentioned, but this is more focused on the movement of that functionality and removing some hardcoded things that really shouldn't be hardcoded.

@kami619
Collaborator

kami619 commented Oct 16, 2025

@nathan-weinberg do we know what common customizations users are expecting to perform by following this path? For example, I could see the example you linked below; I was hoping to see if we have any insight into any broader customizations we can test proactively.

This handles the case where we use the ConfigMap to store the run.yaml; when this happens we need to override the entrypoint to give the path of the run.yaml file.

    # Parse current LLS version and compare with threshold LLS version
    try:
        current_version = version.parse(llama_stack_version)
        threshold_version = version.parse("0.2.23")
Collaborator


Wouldn't it be better to avoid that hardcoded conditional and instead use a Git branch for 0.2.23 releases (so that different build.py scripts live on different branches, each specific to the version it creates)? The idea is that each distribution image knows how to start itself, but a distribution image is connected to a particular LLS version.

We should consider this when we are doing real branching and backporting. This is to avoid piling up legacy stuff that might not be needed anymore in the future.

Collaborator


We can go with this, it's already better to have that here instead of in the operator.

Collaborator


I'm with your first point: this is a script for a specific LLS version, so it should only know about the ENTRYPOINT for that version.

Even if we do backporting, we're unlikely to backport a change that updates the entrypoint.

Collaborator Author


Are y'all suggesting a different git branch within this repo for each LLS version we ship at a given time?

Collaborator


I suggest branching off for each release and doing z-stream releases on that branch. Changes for the next version then just go directly into the main branch, overwriting the previous behaviour (which lives on in the branch).

Collaborator Author


We are doing that currently for RHOAI releases - trying to do it for Llama Stack releases as well would be a significant maintenance effort

I could just change the hardcoding to what is currently in Llama Stack - however, I feel it makes the container less versatile, in case anyone ever wanted to build the current distro with an older Llama Stack version.

@rhuss
Collaborator

rhuss commented Oct 17, 2025

@nathan-weinberg do we know what common customizations users are expecting to perform by following this path? For example, I could see the example you linked below; I was hoping to see if we have any insight into any broader customizations we can test proactively.

This handles the case where we use the ConfigMap to store the run.yaml; when this happens we need to override the entrypoint to give the path of the run.yaml file.

The customization is just that you can provide an alternative run.yaml to the default included in the distribution image. For this to work you need (a) to mount a new run.yaml into the container and then (b) to provide a startup CMD that points to the run.yaml in the mounted filesystem. The operator takes care of that, but you can do the same with vanilla podman, too:

podman run --rm -it \
  --mount type=bind,src="$(pwd)/config",target=/mnt,ro \
  lls-distribution:latest /mnt/run.yaml

assuming you have a ./config/run.yaml prepared.

@derekhiggins
Collaborator

Shouldn't we be doing this upstream first and establishing a consensus there? If the end goal is to remove the code from the operator for how it runs custom images, then it will need to work with upstream images also

@nathan-weinberg
Collaborator Author

Shouldn't we be doing this upstream first and establishing a consensus there? If the end goal is to remove the code from the operator for how it runs custom images, then it will need to work with upstream images also

I don't quite follow - is upstream creating distribution containers similar to this? We aren't consuming any base container image layers apart from the Red Hat Python3.12 base image, as far as I know


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
distribution/build.py (1)

55-73: Consider using a constant for the version threshold.

The hardcoded "0.2.23" at line 64 creates coupling with CURRENT_LLAMA_STACK_VERSION and will require manual updates each time the entrypoint command changes. As discussed in past reviews, this approach accumulates legacy conditionals over time.

Apply this diff to introduce a named constant:

 CURRENT_LLAMA_STACK_VERSION = "0.2.23"
 LLAMA_STACK_VERSION = os.getenv("LLAMA_STACK_VERSION", CURRENT_LLAMA_STACK_VERSION)
+# Version at which entrypoint changed from python module to llama stack run
+ENTRYPOINT_CHANGE_VERSION = "0.2.23"
 BASE_REQUIREMENTS = [
     f"llama-stack=={LLAMA_STACK_VERSION}",
 ]

Then update line 64:

-        threshold_version = version.parse("0.2.23")
+        threshold_version = version.parse(ENTRYPOINT_CHANGE_VERSION)

This makes the relationship explicit and easier to maintain when future entrypoint changes occur. Based on learnings from past review comments.

README.md (1)

57-70: LGTM: Documentation clearly explains custom run YAML feature with appropriate warning.

The new section provides a clear example for mounting and using a custom run.yaml file. The IMPORTANT note appropriately warns users about potential dependency incompatibilities, which aligns with the concerns raised in PR discussions about locked dependencies.

Optional: Static analysis suggests hyphenating compound modifiers ("custom run-YAML"), though "custom run YAML" is also acceptable in technical documentation. If you prefer strict style conformance:

-### Running with a custom run YAML
+### Running with a custom run-YAML
-> The distribution image ships with various dependencies already pre-installed. There is *no* guarantee that your custom run YAML will necessarily work with the included dependencies.
+> The distribution image ships with various dependencies already pre-installed. There is *no* guarantee that your custom run-YAML will necessarily work with the included dependencies.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d59530 and d973d39.

📒 Files selected for processing (4)
  • README.md (1 hunks)
  • distribution/Containerfile (1 hunks)
  • distribution/Containerfile.in (1 hunks)
  • distribution/build.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • distribution/Containerfile
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...

(EN_EXCESSIVE_EXCLAMATION)


[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...

(QB_NEW_EN_HYPHEN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (4)
distribution/build.py (3)

15-15: LGTM: Appropriate import for version comparison.

The packaging.version import is the standard approach for semantic version parsing and comparison in Python.


195-216: LGTM: Entrypoint parameter correctly threaded through template.

The updated signature and template formatting properly propagate the dynamic entrypoint to distribution/Containerfile.in. The template validation and blank-line cleanup logic remain sound.


246-250: LGTM: Main flow correctly integrates entrypoint retrieval.

The entrypoint is obtained at the appropriate point in the build flow and passed to generate_containerfile() as expected.

distribution/Containerfile.in (1)

11-12: LGTM: Template variable and CMD correctly enable custom run.yaml.

The {entrypoint} template variable will be substituted by build.py with the version-appropriate ENTRYPOINT, and the separate CMD with a literal path allows users to override the run.yaml at runtime (e.g., podman run <image> /custom/path/to/run.yaml). The use of a literal path in exec form avoids environment variable expansion issues.

@nathan-weinberg nathan-weinberg force-pushed the entrypoint branch 2 times, most recently from acfa515 to b350292 on October 20, 2025 at 17:11

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
README.md (1)

57-70: Documentation clearly explains the custom run YAML feature.

The section appropriately documents the new capability and includes a necessary warning about dependency compatibility, addressing the concerns raised in PR discussions.

Optional: Consider hyphenating the compound modifier for clarity:

-### Running with a custom run YAML
+### Running with a custom run-YAML

or

-The distribution image allows you to run a custom run YAML file within it.
+The distribution image allows you to run a custom run-YAML file within it.

This is a minor style improvement and is entirely optional.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between acfa515 and b350292.

📒 Files selected for processing (4)
  • README.md (1 hunks)
  • distribution/Containerfile (1 hunks)
  • distribution/Containerfile.in (1 hunks)
  • distribution/build.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • distribution/Containerfile
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...

(EN_EXCESSIVE_EXCLAMATION)


[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...

(QB_NEW_EN_HYPHEN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (5)
distribution/Containerfile.in (1)

11-12: Template correctly separates ENTRYPOINT from CMD.

The design properly allows the entrypoint to be dynamically determined at build time while enabling users to override the run.yaml path at runtime. The literal path in CMD avoids shell expansion issues.

distribution/build.py (4)

15-15: LGTM!

Appropriate import for semantic version comparison.


195-195: LGTM!

Function signature appropriately extended to accept the dynamic entrypoint parameter.


215-215: LGTM!

Template formatting correctly includes the entrypoint parameter for substitution.


246-250: LGTM!

The main flow correctly integrates the entrypoint logic, retrieving the version-appropriate ENTRYPOINT and passing it to the containerfile generation.

Comment on lines +55 to +73
def get_entrypoint(llama_stack_version):
    """Determine the appropriate ENTRYPOINT based on llama-stack version."""
    # If installing from source (commit SHA), use the new entrypoint
    if is_install_from_source(llama_stack_version):
        return 'ENTRYPOINT ["llama", "stack", "run"]'

    # Parse current LLS version and compare with threshold LLS version
    try:
        current_version = version.parse(llama_stack_version)
        threshold_version = version.parse("0.2.23")

        if current_version < threshold_version:
            return 'ENTRYPOINT ["python", "-m", "llama_stack.core.server.server"]'
        else:
            return 'ENTRYPOINT ["llama", "stack", "run"]'
    except Exception as e:
        print(f"Error: Could not parse version {llama_stack_version}: {e}")
        sys.exit(1)


🛠️ Refactor suggestion | 🟠 Major

Replace hardcoded threshold version with named constant.

The function logic correctly determines the appropriate ENTRYPOINT based on version. However, line 64 hardcodes "0.2.23", duplicating the CURRENT_LLAMA_STACK_VERSION constant defined on line 17. This creates a maintenance burden if the version changes.

Apply this diff to use the existing constant:

 def get_entrypoint(llama_stack_version):
     """Determine the appropriate ENTRYPOINT based on llama-stack version."""
     # If installing from source (commit SHA), use the new entrypoint
     if is_install_from_source(llama_stack_version):
         return 'ENTRYPOINT ["llama", "stack", "run"]'
 
     # Parse current LLS version and compare with threshold LLS version
     try:
         current_version = version.parse(llama_stack_version)
-        threshold_version = version.parse("0.2.23")
+        threshold_version = version.parse(CURRENT_LLAMA_STACK_VERSION)
 
         if current_version < threshold_version:
             return 'ENTRYPOINT ["python", "-m", "llama_stack.core.server.server"]'
         else:
             return 'ENTRYPOINT ["llama", "stack", "run"]'
     except Exception as e:
         print(f"Error: Could not parse version {llama_stack_version}: {e}")
         sys.exit(1)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 def get_entrypoint(llama_stack_version):
     """Determine the appropriate ENTRYPOINT based on llama-stack version."""
     # If installing from source (commit SHA), use the new entrypoint
     if is_install_from_source(llama_stack_version):
         return 'ENTRYPOINT ["llama", "stack", "run"]'
     # Parse current LLS version and compare with threshold LLS version
     try:
         current_version = version.parse(llama_stack_version)
-        threshold_version = version.parse("0.2.23")
+        threshold_version = version.parse(CURRENT_LLAMA_STACK_VERSION)
         if current_version < threshold_version:
             return 'ENTRYPOINT ["python", "-m", "llama_stack.core.server.server"]'
         else:
             return 'ENTRYPOINT ["llama", "stack", "run"]'
     except Exception as e:
         print(f"Error: Could not parse version {llama_stack_version}: {e}")
         sys.exit(1)
🤖 Prompt for AI Agents
In distribution/build.py around lines 55 to 73, the threshold version "0.2.23"
is hardcoded; replace that literal with the existing CURRENT_LLAMA_STACK_VERSION
constant defined on line 17 so the comparison uses the shared constant instead
of duplicating the version string; ensure the constant is in scope (import or
reference it directly) and update the version.parse call to use
CURRENT_LLAMA_STACK_VERSION.

@derekhiggins
Collaborator

Shouldn't we be doing this upstream first and establishing a consensus there? If the end goal is to remove the code from the operator for how it runs custom images, then it will need to work with upstream images also

I don't quite follow - is upstream creating distribution containers similar to this? We aren't consuming any base container image layers apart from the Red Hat Python3.12 base image, as far as I know

Upstream produces these images https://hub.docker.com/u/llamastack
For those the entrypoint is /usr/local/bin/llama-stack-entrypoint.sh (see https://github.com/llamastack/llama-stack/blob/main/containers/Containerfile )

For those a custom run.yaml is set in $RUN_CONFIG_PATH

In theory the lls operator also supports these images (although I haven't tried it recently). It would make sense to have both images use a common API for setting a custom run.yaml.

this commit changes the server start command to be
dynamically generated based on Llama Stack version

it allows users to pass a custom run.yaml if they so
choose while keeping the official run.yaml we ship with
the distro image as a default

Signed-off-by: Nathan Weinberg <[email protected]>
@nathan-weinberg
Collaborator Author

Shouldn't we be doing this upstream first and establishing a consensus there? If the end goal is to remove the code from the operator for how it runs custom images, then it will need to work with upstream images also

I don't quite follow - is upstream creating distribution containers similar to this? We aren't consuming any base container image layers apart from the Red Hat Python3.12 base image, as far as I know

Upstream produces these images https://hub.docker.com/u/llamastack For those the entrypoint is /usr/local/bin/llama-stack-entrypoint.sh (see https://github.com/llamastack/llama-stack/blob/main/containers/Containerfile )

For those a custom run.yaml is set in $RUN_CONFIG_PATH

In theory the lls operator also supports these images (although I haven't tried it recently). It would make sense to have both images use a common API for setting a custom run.yaml.

I agree - we're in a bit of a strange spot in that we build these containers differently than upstream does - I'm wondering if we might be able to merge in the functionality first and try to find alignment with upstream as a followup?


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
distribution/build.py (1)

55-73: Add explanatory comment for the threshold version.

The function logic correctly determines the appropriate ENTRYPOINT based on version. The hardcoded "0.2.23" on line 64 represents the historical llama-stack version when the entrypoint command changed upstream (from python -m to llama stack run). This is semantically different from CURRENT_LLAMA_STACK_VERSION, which represents the version being built. However, this distinction is not obvious without a comment.

Add a comment explaining the threshold:

     # Parse current LLS version and compare with threshold LLS version
     try:
         current_version = version.parse(llama_stack_version)
+        # 0.2.23 is when llama-stack upstream changed the entrypoint command
         threshold_version = version.parse("0.2.23")
 
         if current_version < threshold_version:
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b350292 and 6cb349a.

📒 Files selected for processing (4)
  • README.md (1 hunks)
  • distribution/Containerfile (1 hunks)
  • distribution/Containerfile.in (1 hunks)
  • distribution/build.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • distribution/Containerfile.in
🧰 Additional context used
🪛 LanguageTool
README.md

[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...

(EN_EXCESSIVE_EXCLAMATION)


[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...

(QB_NEW_EN_HYPHEN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (4)
README.md (1)

57-70: Clear documentation for the new custom run YAML feature.

The documentation effectively explains how to mount and use a custom run YAML file, and the IMPORTANT note appropriately warns users about potential dependency incompatibilities with the pre-installed packages.

distribution/Containerfile (1)

62-63: LGTM! Correct separation of ENTRYPOINT and CMD.

Splitting the ENTRYPOINT from CMD follows the standard container pattern and allows users to override just the run YAML path (via CMD) without having to replace the entire ENTRYPOINT. This is the right approach for the custom run YAML feature.

distribution/build.py (2)

195-216: LGTM! Clean integration of entrypoint parameter.

The generate_containerfile signature correctly accepts the new entrypoint parameter and properly substitutes it into the template on line 215. The implementation is straightforward and maintains the existing validation and formatting logic.


246-250: LGTM! Correct wiring of entrypoint generation.

The main function properly computes the entrypoint using get_entrypoint(LLAMA_STACK_VERSION) and passes it to generate_containerfile, maintaining consistency with the existing pattern for dependencies and install commands.

@derekhiggins
Collaborator

Upstream produces these images https://hub.docker.com/u/llamastack For those the entrypoint is /usr/local/bin/llama-stack-entrypoint.sh (see https://github.com/llamastack/llama-stack/blob/main/containers/Containerfile )
For those a custom run.yaml is set in $RUN_CONFIG_PATH
In theory the lls operator also supports these images (although I haven't tried it recently). It would make sense to have both images use a common API for setting a custom run.yaml.

I agree - we're in a bit of a strange spot in that we build these containers differently than upstream does - I'm wondering if we might be able to merge in the functionality first and try to find alignment with upstream as a followup?

We should make this decision before changing anything; this PR will change the API into the container, and we don't want to have to change it again.

For what it's worth, I would prefer to see us do it the same way as upstream (call /usr/local/bin/llama-stack-entrypoint.sh). This will then give us a place to run code in the future before the server is started, e.g. data migrations, sanity checks, CA bundle setup, etc.

@nathan-weinberg
Collaborator Author

Upstream produces these images https://hub.docker.com/u/llamastack For those the entrypoint is /usr/local/bin/llama-stack-entrypoint.sh (see https://github.com/llamastack/llama-stack/blob/main/containers/Containerfile )
For those a custom run.yaml is set in $RUN_CONFIG_PATH
In theory the lls operator also supports these images (although I haven't tried it recently). It would make sense to have both images use a common API for setting a custom run.yaml.

I agree - we're in a bit of a strange spot in that we build these containers differently than upstream does - I'm wondering if we might be able to merge in the functionality first and try to find alignment with upstream as a followup?

We should make this decision before changing anything; this PR will change the API into the container, and we don't want to have to change it again.

For what it's worth, I would prefer to see us do it the same way as upstream (call /usr/local/bin/llama-stack-entrypoint.sh). This will then give us a place to run code in the future before the server is started, e.g. data migrations, sanity checks, CA bundle setup, etc.

Ack, that makes sense - I'll update the implementation to align more closely with what's upstream

@nathan-weinberg nathan-weinberg added the do-not-merge Apply to PRs that should not be merged (yet) label Oct 29, 2025
