Commit c48963b

add configuration options
1 parent 8065b82 commit c48963b

File tree

1 file changed: +27 −6 lines changed


content/manuals/ai/compose/model-runner.md

Lines changed: 27 additions & 6 deletions
````diff
@@ -102,15 +102,36 @@ services:
         type: model
         options:
           model: ai/smollm2
+          context-size: 1024
+          runtime-flags: "--no-prefill-assistant"
 ```

-Notice the dedicated `provider` attribute in the `ai_runner` service.
-This attribute specifies that the service is a model provider and lets you define options such as the name of the model to be used.
+Notice the following:

-There is also a `depends_on` attribute in the `my-chat-app` service.
-This attribute specifies that the `my-chat-app` service depends on the `ai_runner` service.
-This means that the `ai_runner` service will be started before the `my-chat-app` service to allow injection of model information to the `my-chat-app` service.
+In the `ai_runner` service:

-## Reference
+- `provider.type`: Specifies that the service is a `model` provider.
+- `provider.options`: Specifies the options of the model:
+
+  - We want to use the `ai/smollm2` model.
+
+  - We set the context size to `1024` tokens.
+
+  > [!NOTE]
+  > Each model has its own maximum context size. When increasing the context length,
+  > consider your hardware constraints. In general, try to use the smallest context size
+  > possible for your use case.
+
+- We pass the llama.cpp server the `--no-prefill-assistant` parameter,
+  see [the available parameters](https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md).
+
+In the `chat` service:
+
+- `depends_on` specifies that the `chat` service depends on the `ai_runner` service. The
+  `ai_runner` service will be started before the `chat` service, to allow injection of model
+  information to the `chat` service.
+
+## Related pages

 - [Docker Model Runner documentation](/manuals/ai/model-runner.md)
````
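For context, a complete Compose file using the options added in this commit might look like the sketch below. Only the `ai_runner` provider options and the `depends_on` relationship come from the diff; the `chat` service's image name is a hypothetical placeholder, and surrounding details (ports, environment) are assumptions, not part of the commit:

```yaml
# compose.yaml — illustrative sketch based on the diff above.
# The "my-chat-app" image is a hypothetical placeholder.
services:
  chat:
    image: my-chat-app          # hypothetical application image
    depends_on:
      - ai_runner               # start the model provider first so model
                                # information can be injected into this service

  ai_runner:
    provider:
      type: model               # marks this service as a model provider
      options:
        model: ai/smollm2       # the model to run
        context-size: 1024      # tokens; keep as small as your use case allows
        runtime-flags: "--no-prefill-assistant"  # passed to the llama.cpp server
```

With this layout, `docker compose up` brings up the model provider before the dependent `chat` service, matching the `depends_on` behavior the documentation describes.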
