Further edits

christian-pinto · christian-pinto · commit 8d928ec3c80e · 2025-09-05T16:55:31.000+01:00
Signed-off-by: Christian Pinto <christian.pinto@ibm.com>
diff --git a/_posts/2025-09-03-beyond-text-generation.md b/_posts/2025-09-03-beyond-text-generation.md
@@ -36,7 +36,7 @@ These patches are then fed to the model for inference, with the resulting output
 </p>
 
 Given these requirements, the obvious choice was to integrate vision transformers in vLLM as pooling models.
-In vLLM pooling models allow extracting the raw model output of the model via an identity pooler. 
+In vLLM pooling models allow extracting the raw model output via an identity pooler. 
 Identity poolers do not apply any transformation to the data and return it as is - exactly what we need. 
 For the input, we exploit the existing multimodal input capabilities of vLLM to pre-proces images into tensors that are then fed to vLLM for inference.