
Conversation

Contributor

@zdtsw commented Jul 7, 2025


We need to wait for a new Docker image to be released, llamastack/distribution-starter:0.2.15, which includes llamastack/llama-stack#2516; otherwise all providers are still enabled in image 0.2.14, which requires more environment variables to disable them when using the starter template.

@zdtsw marked this pull request as draft July 8, 2025 10:05

mergify bot commented Jul 15, 2025

This pull request has merge conflicts that must be resolved before it can be merged. @zdtsw please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label Jul 15, 2025
mergify bot removed the needs-rebase label Jul 20, 2025
@zdtsw marked this pull request as ready for review July 20, 2025 14:07
README.md Outdated
 env:
 - name: INFERENCE_MODEL
-  value: "llama3.2:1b"
+  value: "llama3.2:3b"
Collaborator

AFAIK this doesn't need to be changed from 1b to 3b - 1b should continue to work just fine

However, INFERENCE_MODEL will need to change to OLLAMA_INFERENCE_MODEL, see https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/starter.html#model-configuration
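
In concrete terms, the suggested change would look something like this (a minimal sketch, keeping the 1b model as suggested above; only the env var name changes):

```yaml
env:
- name: OLLAMA_INFERENCE_MODEL  # renamed from INFERENCE_MODEL for the starter distribution
  value: "llama3.2:1b"
```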

 env:
 - name: INFERENCE_MODEL
-  value: 'llama3.2:1b'
+  value: 'llama3.2:3b'
Collaborator

Same comment here regarding 3b versus 1b

Contributor Author

reverted

 url: "http://ollama-server-service.ollama-dist.svc.cluster.local:11434"
 models:
-- model_id: "llama3.2:1b"
+- model_id: "ollama/llama3.2:3b"
Collaborator

Same comment here about 1b versus 3b

Collaborator

Why do we need the ollama prefix here ?

Contributor Author

reverted

Comment on lines 43 to 38
- name: ENABLE_OLLAMA
value: ollama
- name: OLLAMA_EMBEDDING_MODEL
value: all-minilm:l6-v2
Collaborator

Does OLLAMA_INFERENCE_MODEL need to be added here as well?

Contributor Author

ENABLE_OLLAMA is now removed.
As for OLLAMA_INFERENCE_MODEL, I think it will be passed from the ConfigMap llama-stack-config, right? (lines 19-22)
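
Roughly, the model definition would then live in the ConfigMap's run.yaml rather than in the container env. A sketch only, assuming a ConfigMap named llama-stack-config with a data.run.yaml key as referenced by userConfig elsewhere in this PR:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: llama-stack-config
data:
  run.yaml: |
    # model registration picked up from the ConfigMap instead of OLLAMA_INFERENCE_MODEL in env
    models:
    - model_id: "llama3.2:1b"
      provider_id: ollama
      model_type: llm
```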

port: 8321
env:
- name: OLLAMA_INFERENCE_MODEL
value: "llama3.2:3b"
Collaborator

Same comment around 3b versus 1b here

Contributor Author

updated

"tgi": "docker.io/llamastack/distribution-tgi:latest",
"together": "docker.io/llamastack/distribution-together:latest",
"vllm-gpu": "docker.io/llamastack/distribution-vllm-gpu:latest"
"starter": "docker.io/llamastack/distribution-starter:0.2.15",
Collaborator

can we keep it on latest ?

Collaborator

same below

Contributor Author

updated, all set to use latest tag now

Contributor Author

@zdtsw Aug 28, 2025

I did not add

  "starter-gpu": "docker.io/llamastack/distribution-starter-gpu:latest",
  "dell": "docker.io/llamastack/distribution-dell:latest" 

in this file (ref https://github.com/llamastack/llama-stack-ops/blob/main/actions/publish-dockers/main.sh#L71); we can have a follow-up PR for these 2

 providers:
   inference:
-  - provider_id: ollama
+  - provider_id: ${env.ENABLE_OLLAMA:=__disabled__}
Collaborator

please keep ollama

Contributor Author

updated

capabilities:
drop:
- "ALL"
- "ALL"
Collaborator

Why? Please try to keep your changes focused on the PR. It would be great to check, before submitting the PR, that your AI assistant doesn't touch things outside the scope of the PR (even when it makes such changes, please open another PR for the unrelated stuff).

Contributor Author

@zdtsw Aug 28, 2025

I do not think the change came from the IDE/AI; the main reason for this change was to align with the indentation of other lines in this YAML file.
I can revert it here.

README.md Outdated
3. Verify the server pod is running in the user defined namespace.

-### Using a ConfigMap for run.yaml configuration
+### Using a ConfigMap to override default run.yaml configuration from distribution
Collaborator

This is outside the scope of this PR.

Contributor Author

reverted

model_id: ${env.ENABLE_OLLAMA:=__disabled__}/${env.OLLAMA_EMBEDDING_MODEL:=__disabled__}
provider_id: ${env.ENABLE_OLLAMA:=__disabled__}
provider_model_id: ${env.OLLAMA_EMBEDDING_MODEL:=__disabled__}
model_type: embedding
Collaborator

Why do you need this extra entry ?

Collaborator

Also, I think the LLS moved away from the ENABLE_OLLAMA "trick", and just uses OLLAMA_URL. See the run.yaml of the starter distribution.
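
For context, the starter distribution wires Ollama up via OLLAMA_URL alone; a sketch of the relevant run.yaml fragment (reconstructed from memory of the upstream template, so check the starter run.yaml for the authoritative version):

```yaml
providers:
  inference:
  - provider_id: ollama
    provider_type: remote::ollama
    config:
      # provider is driven purely by OLLAMA_URL; no ENABLE_OLLAMA toggle
      url: ${env.OLLAMA_URL:=http://localhost:11434}
```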

Contributor Author

I think when this PR was opened it was based on 0.2.15, so changes after that release might not be well captured here.
I've removed the extra changes, so only OLLAMA_INFERENCE_MODEL and OLLAMA_URL are set.
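
So the sample's env block ends up roughly as below (a sketch of the intent described above; the service URL is the one already used in this PR's snippets):

```yaml
env:
- name: OLLAMA_INFERENCE_MODEL
  value: "llama3.2:1b"
- name: OLLAMA_URL
  value: "http://ollama-server-service.ollama-dist.svc.cluster.local:11434"
```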

README.md Outdated
- name: OLLAMA_URL
value: "http://ollama-server-service.ollama-dist.svc.cluster.local:11434"
- name: ENABLE_OLLAMA
value: ollama
Contributor

I find this value for the name a bit odd... 😄 but I guess it is outside of scope here


value: 'llama3.2:3b'
- name: OLLAMA_URL
value: 'http://ollama-server-service.ollama-dist.svc.cluster.local:11434'
- name: ENABLE_OLLAMA
Collaborator

this is not needed.

Contributor Author

removed

 value: all-minilm:l6-v2
 userConfig:
-  configMapName: llama-stack-config
+  configMapName: llama-stack-config # use ConfigMap's data.run.yaml
Collaborator

please remove the comment

Contributor Author

removed

matzew added a commit to matzew/llama-stack-stack that referenced this pull request Aug 28, 2025
…ama-stack-k8s-operator#96. Seems like the ollama distribution is no longer maintained and switching to 'starter', and adjustments for OLLAMA env settings

Signed-off-by: Matthias Wessendorf <[email protected]>
Collaborator

@rhuss left a comment

@zdtsw thanks a lot for the PR! It looks good in general.

Since you have created it with the help of an AI assistant (which is fine!), please help us reviewers by using some guidelines as laid out in rvc-guide.github.io. E.g. this PR touches quite a few things outside of its scope (which is a doc update).

Especially these pieces of advice might be helpful here:

Stay on target – avoid irrelevant code changes: AI models sometimes get “distracted”, modifying parts of the codebase that you didn’t ask to change (like drive-by “improvements” that are not needed). Such changes can confuse reviewers and bloat your PR. Avoid touching code that is not relevant to the task at hand. Each PR should have a laser-focused diff - if you notice unrelated edits (no matter how well-intentioned by the AI), strip them out.

Do a thorough self-review and test your changes: Never treat an AI-generated patch as ready to go without reviewing it yourself. Before requesting a peer review, inspect and verify the code yourself. This means reading through the diff carefully, understanding every change, and checking that it solves the problem without breaking anything else. Try to run the code end-to-end, and run all relevant unit tests or sample scenarios. If the project has automated test suites or linters, run those tools locally to catch obvious issues. You are responsible for the code you submit, whether you or an AI wrote it, and AI tools are no substitute for your own review of code quality, correctness, style, security, and licensing.

Contributor Author

@zdtsw left a comment

did updates to address review comments

Contributor Author

zdtsw commented Aug 28, 2025

> @zdtsw thanks a lot for the PR! It looks good in general. […]

Thanks for the review comments and suggestions!
The original commit was made quite a while ago, so some parts are out of date compared with the latest code in llama-stack.
My initial plan was to only update README.md, but as usual it became a snowball PR.
I will keep this in mind in the future.
I have kept the changes in the samples, but if they should be moved to a dedicated PR, do let me know and I can split them out.

distribution:
name: starter
containerSpec:
port: 8321
Contributor

@matzew Aug 28, 2025

you removed the port in the README; perhaps it is not needed here either?

Contributor Author

yep, this is a default one, not needed.
let me remove it as well
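
With the default port dropped, the sample would shrink to roughly the following (a sketch only; the nesting follows the flattened snippet quoted above and the env values are the ones discussed in this thread):

```yaml
distribution:
  name: starter
containerSpec:
  # port: 8321 removed - it is the default
  env:
  - name: OLLAMA_INFERENCE_MODEL
    value: "llama3.2:1b"
  - name: OLLAMA_URL
    value: "http://ollama-server-service.ollama-dist.svc.cluster.local:11434"
```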

- model_id: "llama3.2:1b"
provider_id: ollama
model_type: llm
provider_model_id: llama3.2:1b
Contributor

Is this needed, since a model_id and a provider_id are already given above?

Contributor Author

not necessary, I can remove it
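
The trimmed entry would then look like this (a sketch based on the snippet quoted above, with the redundant provider_model_id dropped):

```yaml
models:
- model_id: "llama3.2:1b"
  provider_id: ollama
  model_type: llm
```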

README.md Outdated
containerSpec:
port: 8321
env:
- name: INFERENCE_MODEL
Collaborator

Shouldn't this also be OLLAMA_INFERENCE_MODEL, like in the sample change below?

Contributor Author

you are right!
let me fix this

@zdtsw force-pushed the chore_2 branch 3 times, most recently from 1950dad to cbc3dbc on September 2, 2025 08:33
@VaishnaviHire
Collaborator

@nathan-weinberg @rhuss Is it okay to merge this PR ?

@VaishnaviHire
Collaborator

@mergify rebase

- update example and create one without using userconfigmap
- set new env to enable ollama
- use the same llama model as in llama-stack
- remove deprecated distro images from distribution.json

Signed-off-by: Wen Zhou <[email protected]>
- revert back to use llama3.2:1b
- remove unnecessary/unrelated comments/changes
- set INFERENCE_MODEL to OLLAMA_INFERENCE_MODEL
- remove ENABLE_OLLAMA
- set images to use "latest" tag instead of 0.2.15

Signed-off-by: Wen Zhou <[email protected]>
- remove default port 8321 in sample

Signed-off-by: Wen Zhou <[email protected]>

mergify bot commented Sep 4, 2025

rebase

✅ Branch has been successfully rebased

Collaborator

After #156 is merged, this file will have the same properties as example-with-custom-config.yaml, and example-with-configmap will be removed. If this one goes through first I'll update the name:
config/samples/example-withoutconfigmpa.yaml ->
config/samples/example-withoutconfigmap.yaml

Contributor Author

I saw #156 is closed; is any action still needed for this PR?

Collaborator

If #156 merges before this PR, this file can be deleted.

@rhuss
Collaborator

rhuss commented Oct 13, 2025

Where do we stand with this PR? In general I think it's a great addition and we should move forward with it.
