feat: allow for dynamic server start command and custom run.yaml #84
base: main
Conversation
Walkthrough: The PR decouples the container ENTRYPOINT from the run.yaml path; templates now accept a configurable entrypoint and a CMD for the run.yaml path.
Sequence Diagram(s):

```mermaid
sequenceDiagram
    autonumber
    participant CLI as build.py:main
    participant GE as get_entrypoint()
    participant PV as packaging.version
    participant GC as generate_containerfile()
    participant TF as Containerfile.in
    participant Out as distribution/Containerfile
    CLI->>GE: get_entrypoint(llama_stack_version)
    GE->>PV: parse & compare version (or detect source install)
    PV-->>GE: comparison result
    GE-->>CLI: return templated `entrypoint` string
    CLI->>GC: generate_containerfile(deps, install_flag, entrypoint)
    GC->>TF: load template & substitute `{entrypoint}`
    GC->>GC: prepend header, strip blank lines
    GC->>Out: write formatted Containerfile
    note right of Out: Output now uses configurable ENTRYPOINT + CMD "/opt/app-root/run.yaml"
```
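The `packaging.version` step in the diagram matters because release segments must compare numerically rather than lexically; a tiny standalone illustration:

```python
from packaging import version

# "0.2.9" is lower than "0.2.23" numerically, even though "9" > "2" as a character,
# so a plain string comparison would get the threshold check wrong.
assert version.parse("0.2.9") < version.parse("0.2.23")
assert not ("0.2.9" < "0.2.23")  # lexicographic comparison disagrees
```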
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (5 passed)
49bc262 to 07f27cb
Actionable comments posted: 1
🧹 Nitpick comments (1)
distribution/build.py (1)
55-73: Consider making the version threshold configurable.

The hardcoded threshold version `"0.2.23"` (line 64) matches the current `CURRENT_LLAMA_STACK_VERSION` (line 17). When the version is bumped in the future, maintainers must remember to update this threshold if the entrypoint behavior needs to change, creating a maintenance burden. Consider extracting the threshold as a module-level constant or deriving it from `CURRENT_LLAMA_STACK_VERSION`:

```python
# Near the top of the file, after CURRENT_LLAMA_STACK_VERSION
ENTRYPOINT_CHANGE_VERSION = "0.2.23"  # Version where entrypoint changed from python -m to llama stack run

def get_entrypoint(llama_stack_version):
    """Determine the appropriate ENTRYPOINT based on llama-stack version."""
    if is_install_from_source(llama_stack_version):
        return 'ENTRYPOINT ["llama", "stack", "run"]'
    try:
        current_version = version.parse(llama_stack_version)
        threshold_version = version.parse(ENTRYPOINT_CHANGE_VERSION)
        # ... rest of function
```

Additionally, consider narrowing the exception handling on line 70 from `except Exception` to `except version.InvalidVersion` to catch only version parsing errors, allowing other unexpected errors to surface properly.
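For illustration, here is a minimal sketch of that narrowed handling. It reuses build.py's existing `is_install_from_source` helper as described in the diff, and assumes `from packaging import version`; `version.InvalidVersion` is what current `packaging` releases raise on an unparsable version string. This is a sketch of the reviewer's suggestion, not the PR's actual code:

```python
import sys

from packaging import version

def get_entrypoint(llama_stack_version):
    """Sketch: pick the ENTRYPOINT, failing only on unparsable version strings."""
    if is_install_from_source(llama_stack_version):  # existing helper in build.py
        return 'ENTRYPOINT ["llama", "stack", "run"]'
    try:
        current_version = version.parse(llama_stack_version)
        threshold_version = version.parse("0.2.23")
    except version.InvalidVersion as e:
        # Only parse failures are fatal here; any other bug still propagates.
        print(f"Error: Could not parse version {llama_stack_version}: {e}")
        sys.exit(1)
    if current_version < threshold_version:
        return 'ENTRYPOINT ["python", "-m", "llama_stack.core.server.server"]'
    return 'ENTRYPOINT ["llama", "stack", "run"]'
```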
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- distribution/Containerfile (1 hunks)
- distribution/Containerfile.in (1 hunks)
- distribution/build.py (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-15T14:25:54.837Z
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.
Applied to files:
- distribution/Containerfile
- distribution/Containerfile.in
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
🔇 Additional comments (3)
distribution/build.py (2)
15-15: LGTM! The `packaging.version` import is appropriate for semantic version comparison and ensures correct handling of version strings.
195-216: LGTM! The integration of the `entrypoint` parameter into `generate_containerfile()` and the main flow is correct and complete. The function signature update, template substitution, and call site changes properly implement the dynamic entrypoint feature.

Also applies to: 246-250
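The body of `generate_containerfile()` is not reproduced in this review. Based on the walkthrough above (load the template, substitute `{entrypoint}`, prepend a header, strip blank lines), the rendering step presumably looks something like the sketch below; the paths, header text, and placeholder names other than `entrypoint` are illustrative assumptions rather than the PR's actual code:

```python
from pathlib import Path

# Illustrative paths and header text: the real build.py may derive these differently.
TEMPLATE_PATH = Path("distribution/Containerfile.in")
OUTPUT_PATH = Path("distribution/Containerfile")
HEADER = "# DO NOT EDIT: generated by distribution/build.py\n\n"

def generate_containerfile(deps: str, install_flag: str, entrypoint: str) -> None:
    """Sketch: render Containerfile.in with a build-time chosen ENTRYPOINT line."""
    template = TEMPLATE_PATH.read_text()
    rendered = template.format(
        deps=deps,
        install_flag=install_flag,
        entrypoint=entrypoint,  # e.g. 'ENTRYPOINT ["llama", "stack", "run"]'
    )
    # Strip blank lines left behind by empty substitutions, then prepend the header.
    lines = [line for line in rendered.splitlines() if line.strip()]
    OUTPUT_PATH.write_text(HEADER + "\n".join(lines) + "\n")
```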
distribution/Containerfile (1)
62-63: Critical: Environment variable won't expand in CMD exec form.

Docker does not expand environment variables in JSON array (exec) form. The literal string `"${APP_ROOT}/run.yaml"` will be passed to the entrypoint instead of the expanded path `/opt/app-root/run.yaml`, causing the default startup to fail.

Solution: Use the literal path in the JSON array:

```diff
 ENTRYPOINT ["llama", "stack", "run"]
-CMD ["${APP_ROOT}/run.yaml"]
+CMD ["/opt/app-root/run.yaml"]
```

Note: Users can still override the config path at runtime with `docker run <image> /custom/path/to/run.yaml`, which satisfies the PR objective to allow custom run.yaml files.

⛔ Skipped due to learnings
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.
07f27cb to 67fc897
2d74b58 to 0382ced
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md
[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...
(EN_EXCESSIVE_EXCLAMATION)
[grammar] ~70-~70: Ensure spelling is correct
Context: ...es already pre-installed. There is no guarentee that your custom run YAML will nessesar...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarentee that your custom run YAML will nessesarily work with the ...
(QB_NEW_EN_HYPHEN)
[grammar] ~70-~70: Ensure spelling is correct
Context: ...uarentee that your custom run YAML will nessesarily work with the included dependencies.
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~70-~70: There might be a mistake here.
Context: ...ily work with the included dependencies.
(QB_NEW_EN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
a90eb0c to 0d59530
Actionable comments posted: 0
🧹 Nitpick comments (1)
README.md (1)
59-59: Consider more concise wording for clarity. The phrase "To do so, run the image in the following way. The "path" mentioned should be the path to your custom run YAML file." is slightly verbose. Consider condensing to:
"To use a custom run YAML, run the image with the path to your file as the final argument, mounted as a volume:"
This removes redundancy and gets straight to the instruction.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md
[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...
(EN_EXCESSIVE_EXCLAMATION)
[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...
(QB_NEW_EN_HYPHEN)
[grammar] ~70-~70: There might be a mistake here.
Context: ...ily work with the included dependencies.
(QB_NEW_EN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
🔇 Additional comments (1)
README.md (1)
57-70: Documentation addition is clear and well-aligned with PR objectives.

The new section provides users with clear instructions on how to supply a custom `run.yaml` file, complete with a practical example and an important caveat about dependency compatibility. The spelling corrections from the previous review ("guarantee" and "necessarily") are in place.

However, verify that the example command syntax (`<path_in_container>` as a positional argument after the image name) reflects the actual container behavior with the new entrypoint mechanism. This assumes the container ENTRYPOINT is a script or binary that accepts the path as a CMD argument, which aligns with the PR objective to decouple the entrypoint from the run.yaml path.

Confirm that:
- The new entrypoint script or binary correctly interprets the `<path_in_container>` argument
- The default behavior (when no custom path is provided) still uses `/opt/app-root/run.yaml`
- The container can run without arguments and still work with the default run.yaml

You may find this easier to verify once the container images are built or if you can review the entrypoint implementation (likely in `distribution/build.py` or a shell script entrypoint).
If the dependencies we package remain locked: that pretty much only allows for the same set of providers but with different config options, or maybe different server-level config options, right?
Not sure if this is possible or worth it, but if the deps are locked, it might make sense to validate that people are not changing the providers or introducing a provider with different dependencies. Otherwise this feature will commonly break.
Correct
See the README changes
Are you pointing me to the README where it says...? I am suggesting a simple diff against the provided run.yaml to check if the providers changed and if the new providers require net-new dependencies. This, to me, seems like introducing a feature which will more often than not cause failures, so some sort of guardrails is advisable IMO. Let me know if this makes sense and is possible! Thanks
Why do we want to avoid them? This is just to let people try their own run YAMLs; we aren't invested in making sure they work. That's why I pointed you here.
It's really more of a utility for development - any user trying to actually use this as a production server (either standalone or via the operator) should be using the official run YAML.
Fair, all I am saying is that more often than not this'll break without manual manipulation of the dependencies in the container.
True, but note we are already doing these custom run YAMLs from the operator - note this comment llamastack/llama-stack-k8s-operator#171 (comment) that led to the issue that this PR is resolving. This simply allows us to move the functionality from the operator to the distro image, which is preferred. We can iterate with some additional checking like you've mentioned, but this is more focused on the movement of that functionality and removing some hardcoded things that really shouldn't be hardcoded.
@nathan-weinberg do we know what common customizations users are expecting to perform by following this path? For example, I could see the example you linked below; I was asking whether we have any insight into broader customizations we can test proactively.
```python
    # Parse current LLS version and compare with threshold LLS version
    try:
        current_version = version.parse(llama_stack_version)
        threshold_version = version.parse("0.2.23")
```
Wouldn't it be better to avoid that hardcoded conditional and instead use a Git branch for 0.2.23 releases (so that different build.py scripts live on different branches, each specific to the version it creates)? The idea is that each distribution image knows how to start itself, but a distribution image is connected to a particular LLS version.
We should consider this when we are doing real branching and backporting. This is to avoid piling up legacy stuff that might not be needed anymore in the future.
We can go with this; it's already better to have that here instead of in the operator.
I'm with your first point. This is a script for a specific LLS version; it should only know about the ENTRYPOINT for that version.
Even if we do backporting, we're unlikely to backport a change to update the entrypoint.
Are y'all suggesting a different git branch within this repo for each LLS version we ship at a given time?
I suggest branching off for each release and doing z-stream releases on that branch. Changes for the next version then just go directly into the main branch, overwriting the previous behaviour (which lives on, on the release branch).
We are doing that currently for RHOAI releases - trying to do it for Llama Stack releases as well would be a significant maintenance effort
I could just change the hardcoding to what is currently in Llama Stack - however, I feel it makes the container less versatile, in case anyone ever wanted to build the current distro with an older Llama Stack version.
The customization is just that you can provide an alternative assuming you have a
Shouldn't we be doing this upstream first and establishing a consensus there? If the end goal is to remove from the operator the code for how it runs custom images, then it will need to work with upstream images as well.
I don't quite follow - is upstream creating distribution containers similar to this? We aren't consuming any base container image layers apart from the Red Hat Python 3.12 base image, as far as I know.
0d59530 to d973d39
Actionable comments posted: 0
🧹 Nitpick comments (2)
distribution/build.py (1)
55-73: Consider using a constant for the version threshold.

The hardcoded `"0.2.23"` at line 64 creates coupling with `CURRENT_LLAMA_STACK_VERSION` and will require manual updates each time the entrypoint command changes. As discussed in past reviews, this approach accumulates legacy conditionals over time.

Apply this diff to introduce a named constant:

```diff
 CURRENT_LLAMA_STACK_VERSION = "0.2.23"
 LLAMA_STACK_VERSION = os.getenv("LLAMA_STACK_VERSION", CURRENT_LLAMA_STACK_VERSION)
+# Version at which entrypoint changed from python module to llama stack run
+ENTRYPOINT_CHANGE_VERSION = "0.2.23"
 BASE_REQUIREMENTS = [
     f"llama-stack=={LLAMA_STACK_VERSION}",
 ]
```

Then update line 64:

```diff
-        threshold_version = version.parse("0.2.23")
+        threshold_version = version.parse(ENTRYPOINT_CHANGE_VERSION)
```

This makes the relationship explicit and easier to maintain when future entrypoint changes occur. Based on learnings from past review comments.
README.md (1)
57-70: LGTM: Documentation clearly explains custom run YAML feature with appropriate warning. The new section provides a clear example for mounting and using a custom run.yaml file. The IMPORTANT note appropriately warns users about potential dependency incompatibilities, which aligns with the concerns raised in PR discussions about locked dependencies.
Optional: Static analysis suggests hyphenating compound modifiers ("custom run-YAML"), though "custom run YAML" is also acceptable in technical documentation. If you prefer strict style conformance:
```diff
-### Running with a custom run YAML
+### Running with a custom run-YAML
-> The distribution image ships with various dependencies already pre-installed. There is *no* guarantee that your custom run YAML will necessarily work with the included dependencies.
+> The distribution image ships with various dependencies already pre-installed. There is *no* guarantee that your custom run-YAML will necessarily work with the included dependencies.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- README.md (1 hunks)
- distribution/Containerfile (1 hunks)
- distribution/Containerfile.in (1 hunks)
- distribution/build.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- distribution/Containerfile
🧰 Additional context used
🪛 LanguageTool
README.md
[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...
(EN_EXCESSIVE_EXCLAMATION)
[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
🔇 Additional comments (4)
distribution/build.py (3)
15-15: LGTM: Appropriate import for version comparison. The `packaging.version` import is the standard approach for semantic version parsing and comparison in Python.

195-216: LGTM: Entrypoint parameter correctly threaded through template. The updated signature and template formatting properly propagate the dynamic entrypoint to `distribution/Containerfile.in`. The template validation and blank-line cleanup logic remain sound.

246-250: LGTM: Main flow correctly integrates entrypoint retrieval. The entrypoint is obtained at the appropriate point in the build flow and passed to `generate_containerfile()` as expected.

distribution/Containerfile.in (1)

11-12: LGTM: Template variable and CMD correctly enable custom run.yaml. The `{entrypoint}` template variable will be substituted by `build.py` with the version-appropriate ENTRYPOINT, and the separate CMD with a literal path allows users to override the run.yaml at runtime (e.g., `podman run <image> /custom/path/to/run.yaml`). The use of a literal path in exec form avoids environment variable expansion issues.
acfa515 to b350292
Actionable comments posted: 1
🧹 Nitpick comments (1)
README.md (1)
57-70: Documentation clearly explains the custom run YAML feature. The section appropriately documents the new capability and includes a necessary warning about dependency compatibility, addressing the concerns raised in PR discussions.
Optional: Consider hyphenating the compound modifier for clarity:
```diff
-### Running with a custom run YAML
+### Running with a custom run-YAML
```

or

```diff
-The distribution image allows you to run a custom run YAML file within it.
+The distribution image allows you to run a custom run-YAML file within it.
```

This is a minor style improvement and is entirely optional.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- README.md (1 hunks)
- distribution/Containerfile (1 hunks)
- distribution/Containerfile.in (1 hunks)
- distribution/build.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- distribution/Containerfile
🧰 Additional context used
🪛 LanguageTool
README.md
[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...
(EN_EXCESSIVE_EXCLAMATION)
[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
🔇 Additional comments (5)
distribution/Containerfile.in (1)
11-12: Template correctly separates ENTRYPOINT from CMD. The design properly allows the entrypoint to be dynamically determined at build time while enabling users to override the run.yaml path at runtime. The literal path in CMD avoids shell expansion issues.

distribution/build.py (4)

15-15: LGTM! Appropriate import for semantic version comparison.

195-195: LGTM! Function signature appropriately extended to accept the dynamic entrypoint parameter.

215-215: LGTM! Template formatting correctly includes the entrypoint parameter for substitution.

246-250: LGTM! The main flow correctly integrates the entrypoint logic, retrieving the version-appropriate ENTRYPOINT and passing it to the containerfile generation.
```python
def get_entrypoint(llama_stack_version):
    """Determine the appropriate ENTRYPOINT based on llama-stack version."""
    # If installing from source (commit SHA), use the new entrypoint
    if is_install_from_source(llama_stack_version):
        return 'ENTRYPOINT ["llama", "stack", "run"]'

    # Parse current LLS version and compare with threshold LLS version
    try:
        current_version = version.parse(llama_stack_version)
        threshold_version = version.parse("0.2.23")

        if current_version < threshold_version:
            return 'ENTRYPOINT ["python", "-m", "llama_stack.core.server.server"]'
        else:
            return 'ENTRYPOINT ["llama", "stack", "run"]'
    except Exception as e:
        print(f"Error: Could not parse version {llama_stack_version}: {e}")
        sys.exit(1)
```
🛠️ Refactor suggestion | 🟠 Major
Replace hardcoded threshold version with named constant.
The function logic correctly determines the appropriate ENTRYPOINT based on version. However, line 64 hardcodes "0.2.23", duplicating the CURRENT_LLAMA_STACK_VERSION constant defined on line 17. This creates a maintenance burden if the version changes.
Apply this diff to use the existing constant:
```diff
 def get_entrypoint(llama_stack_version):
     """Determine the appropriate ENTRYPOINT based on llama-stack version."""
     # If installing from source (commit SHA), use the new entrypoint
     if is_install_from_source(llama_stack_version):
         return 'ENTRYPOINT ["llama", "stack", "run"]'
     # Parse current LLS version and compare with threshold LLS version
     try:
         current_version = version.parse(llama_stack_version)
-        threshold_version = version.parse("0.2.23")
+        threshold_version = version.parse(CURRENT_LLAMA_STACK_VERSION)
         if current_version < threshold_version:
             return 'ENTRYPOINT ["python", "-m", "llama_stack.core.server.server"]'
         else:
             return 'ENTRYPOINT ["llama", "stack", "run"]'
     except Exception as e:
         print(f"Error: Could not parse version {llama_stack_version}: {e}")
         sys.exit(1)
```
🤖 Prompt for AI Agents
In distribution/build.py around lines 55 to 73, the threshold version "0.2.23"
is hardcoded; replace that literal with the existing CURRENT_LLAMA_STACK_VERSION
constant defined on line 17 so the comparison uses the shared constant instead
of duplicating the version string; ensure the constant is in scope (import or
reference it directly) and update the version.parse call to use
CURRENT_LLAMA_STACK_VERSION.
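Whichever way the threshold constant lands, the version gate itself is easy to pin down with a test. The following is an illustrative pytest sketch, not part of the PR; it assumes `get_entrypoint` can be imported from the build script and that plain release strings are not treated as source installs:

```python
import pytest

from build import get_entrypoint  # assumed import path for distribution/build.py

OLD_ENTRYPOINT = 'ENTRYPOINT ["python", "-m", "llama_stack.core.server.server"]'
NEW_ENTRYPOINT = 'ENTRYPOINT ["llama", "stack", "run"]'

@pytest.mark.parametrize(
    ("lls_version", "expected"),
    [
        ("0.2.22", OLD_ENTRYPOINT),  # below the 0.2.23 threshold -> legacy module entrypoint
        ("0.2.23", NEW_ENTRYPOINT),  # at the threshold -> llama stack run
        ("0.3.0", NEW_ENTRYPOINT),   # above the threshold -> llama stack run
    ],
)
def test_get_entrypoint_version_gate(lls_version, expected):
    assert get_entrypoint(lls_version) == expected
```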
Upstream produces these images: https://hub.docker.com/u/llamastack. For those, a custom run.yaml is set in $RUN_CONFIG_PATH. In theory the lls operator also supports these images (although I haven't tried it recently). It would make sense to have both images use a common API for setting a custom run.yaml.
this commit changes the server start command to be dynamically generated based on Llama Stack version

it allows users to pass a custom run.yaml if they so choose, while keeping the official run.yaml we ship with the distro image as a default

Signed-off-by: Nathan Weinberg <[email protected]>
Signed-off-by: Nathan Weinberg <[email protected]>
b350292 to 6cb349a
I agree - we're in a bit of a strange spot in that we build these containers differently than upstream does. I'm wondering if we might be able to merge in the functionality first and try to find alignment with upstream as a follow-up?
Actionable comments posted: 0
🧹 Nitpick comments (1)
distribution/build.py (1)
55-73: Add explanatory comment for the threshold version.

The function logic correctly determines the appropriate ENTRYPOINT based on version. The hardcoded `"0.2.23"` on line 64 represents the historical llama-stack version when the entrypoint command changed upstream (from `python -m` to `llama stack run`). This is semantically different from `CURRENT_LLAMA_STACK_VERSION`, which represents the version being built. However, this distinction is not obvious without a comment.

Add a comment explaining the threshold:

```diff
     # Parse current LLS version and compare with threshold LLS version
     try:
         current_version = version.parse(llama_stack_version)
+        # 0.2.23 is when llama-stack upstream changed the entrypoint command
         threshold_version = version.parse("0.2.23")
         if current_version < threshold_version:
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- README.md (1 hunks)
- distribution/Containerfile (1 hunks)
- distribution/Containerfile.in (1 hunks)
- distribution/build.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- distribution/Containerfile.in
🧰 Additional context used
🪛 LanguageTool
README.md
[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...
(EN_EXCESSIVE_EXCLAMATION)
[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
🔇 Additional comments (4)
README.md (1)
57-70: Clear documentation for the new custom run YAML feature. The documentation effectively explains how to mount and use a custom run.yaml file, and the IMPORTANT note appropriately warns users about potential dependency incompatibilities with the pre-installed packages.

distribution/Containerfile (1)

62-63: LGTM! Correct separation of ENTRYPOINT and CMD. Splitting the ENTRYPOINT from CMD follows the standard container pattern and allows users to override just the run YAML path (via CMD) without having to replace the entire ENTRYPOINT. This is the right approach for the custom run YAML feature.
distribution/build.py (2)
195-216: LGTM! Clean integration of entrypoint parameter. The `generate_containerfile` signature correctly accepts the new `entrypoint` parameter and properly substitutes it into the template on line 215. The implementation is straightforward and maintains the existing validation and formatting logic.

246-250: LGTM! Correct wiring of entrypoint generation. The main function properly computes the entrypoint using `get_entrypoint(LLAMA_STACK_VERSION)` and passes it to `generate_containerfile`, maintaining consistency with the existing pattern for dependencies and install commands.
We should make this decision before changing anything; this PR will change the API into the container, and we don't want to have to change it again. For what it's worth, I would prefer to see us do it the same way as upstream (call /usr/local/bin/llama-stack-entrypoint.sh). This will then give us a place to run code in the future before the server is started, e.g. data migrations, sanity checks, CA bundle setup, etc.
Ack, that makes sense - I'll update the implementation to align more closely with what's upstream.
What does this PR do?
this commit changes the server start command to be dynamically generated based on Llama Stack version
it allows users to pass a custom run.yaml if they so choose, while keeping the official run.yaml we ship with the distro image as a default
Closes #83
Summary by CodeRabbit