Make sure your prompt contains the tags your processor expects, so that the assets are injected at the right place. For some vision multimodal models, for instance, the prompt needs as many `<image>` tags as there are image assets in the model input. The `Chat` interface, by contrast, does not require this step.
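The tag-count rule above can be checked before calling the model. The following is a minimal sketch in plain Python, not part of the Outlines API: `check_image_tags` is a hypothetical helper, and the `<image>` literal is an assumption, since the exact tag depends on the model's processor.

```python
def check_image_tags(prompt: str, num_images: int, tag: str = "<image>") -> None:
    """Raise if the prompt does not contain exactly one tag per image asset.

    Hypothetical helper: the tag literal is model-specific and is only
    assumed to be "<image>" here.
    """
    found = prompt.count(tag)
    if found != num_images:
        raise ValueError(
            f"prompt has {found} {tag!r} tag(s) but {num_images} image asset(s)"
        )


# Two image assets, two tags: the check passes silently.
check_image_tags("<image><image>Compare the two pictures.", num_images=2)
```

A check like this only counts tags; where the tags must sit inside the prompt is still dictated by the processor.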
+

-The `TransformersMultiModal` model supports batch generation. To use it, invoke the `batch` method with a list of lists. You will receive as a result a list of completions.
+### Chat
+The `Chat` interface offers a more convenient way to work with multimodal inputs: you don't need to add asset tags such as `<image>` by hand, because the model's Hugging Face processor handles chat templating and asset placement for you.
+To use it, call the model with a `Chat` instance in a multimodal chat format. Assets must be wrapped in the matching `outlines.inputs` class (`Image`, `Audio`, or `Video`); only the `image`, `video`, and `audio` asset types are supported.

For instance:

```python
+import outlines
+from outlines.inputs import Chat, Image
+from transformers import AutoModelForImageTextToText, AutoProcessor
            {"type": "text", "text": "Describe the image in few words."}
+        ],
+    }
+])
+
+# Call the model to generate a response
+response = model(prompt, max_new_tokens=50)
+print(response)  # 'A Siamese cat with blue eyes is sitting on a cat tree, looking alert and curious.'
```

-### Chat
-You can use chat inputs with the `TransformersMultiModal` model. To do so, call the model with a `Chat` instance.
+### Batching
+The `TransformersMultiModal` model supports batching through the `batch` method. Pass it a list of prompts (in any of the formats described above) and it returns a list of completions.

-For instance:
+An example using the Chat format:

```python
import outlines
@@ -133,18 +145,22 @@ from PIL import Image as PILImage
print([Animal.model_validate_json(i) for i in responses]) # [Animal(animal='cat', color='white and gray'), Animal(animal='dog', color='white')]
```
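The last line of the example above turns each completion, a JSON string, into a structured object. As a standalone illustration of that step, here is a minimal sketch that uses only the standard library in place of Pydantic's `model_validate_json`. The `Animal` fields mirror the output shown above; the `responses` list is hand-written stand-in data, not real model output, and `parse_completions` is a hypothetical helper.

```python
import json
from dataclasses import dataclass


@dataclass
class Animal:
    animal: str
    color: str


def parse_completions(responses: list[str]) -> list[Animal]:
    """Parse one JSON completion per prompt into an Animal, preserving order."""
    return [Animal(**json.loads(r)) for r in responses]


# Stand-in for the list of strings a batch call would return (assumed data).
responses = [
    '{"animal": "cat", "color": "white and gray"}',
    '{"animal": "dog", "color": "white"}',
]
animals = parse_completions(responses)
print(animals)
# [Animal(animal='cat', color='white and gray'), Animal(animal='dog', color='white')]
```

Unlike a Pydantic model, the dataclass does not validate field types; it only fails when a key is missing or unexpected.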


-!!! Warning
+An example using a list of lists with tag assets:

-    Make sure your prompt contains the tags expected by your processor to correctly inject the assets in the prompt. For some vision multimodal models for instance, you need to add as many `<image>` tags in your prompt as there are image assets included in your model input.