Vision Service: Improve caption prompt and allow submitting a custom prompt with the request data

This task includes improving the default caption generation prompt(s) and allowing a custom prompt to be submitted:

- [ ] For the intended use, image captions should not start with phrases like "Here's a detailed description of the image:...", "This image...", or "An image of...". They should simply describe the content of the image, without being extremely long or markdown formatted (see screenshot below).
- [x] In addition, it should be possible to submit a custom `prompt` in the request data along with the [model name and version](https://github.com/photoprism/photoprism-vision/pull/10), so that users can customize the prompt for the model they have chosen without touching the `photoprism-vision` service.
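
As a rough sketch of the second item, the service could fall back to a default prompt whenever the request data contains no usable `prompt`. Only the `prompt` field comes from this issue. The `model` and `version` key names, the helper names, and the default prompt wording are assumptions for illustration, not the actual `photoprism-vision` code:

```python
from typing import Optional

# Hypothetical default prompt; the wording is illustrative only.
DEFAULT_CAPTION_PROMPT = (
    "Describe the content of this image in one or two plain sentences. "
    "Do not begin with phrases such as 'This image' or 'An image of', "
    "and do not use markdown formatting."
)


def resolve_prompt(request_data: dict) -> str:
    """Return the custom prompt from the request data, or the default prompt."""
    prompt = request_data.get("prompt")
    # Ignore missing, non-string, or whitespace-only prompts.
    if isinstance(prompt, str) and prompt.strip():
        return prompt.strip()
    return DEFAULT_CAPTION_PROMPT


def build_request_data(model: str, version: str, prompt: Optional[str] = None) -> dict:
    """Build request data; the "model" and "version" keys are assumed names."""
    data = {"model": model, "version": version}
    if prompt is not None:
        # Only send a prompt when the user has customized it.
        data["prompt"] = prompt
    return data


if __name__ == "__main__":
    custom = build_request_data("gemma3", "latest", "Describe the image in one sentence.")
    print(resolve_prompt(custom))
    print(resolve_prompt(build_request_data("gemma3", "latest")))
```

With this approach, a request without a `prompt` behaves exactly as before, so existing clients would not need to change.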

Examples of good/better image captions:

- *A cat sleeping with its head resting on the strings of an instrument.*
- *A vibrant, full poppy flower in rich shades of red and pink.*
- *A young woman, likely in her late 20s or early 30s. She is smiling broadly, appearing friendly and approachable.*

Of course, the caption doesn't have to start with "A" and can be longer than these examples.
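
When comparing prompt candidates, the criteria above could be checked automatically with something like the following sketch. The function name, the markdown heuristics, and the 300-character limit are assumptions chosen for illustration. The issue itself does not define a maximum length:

```python
import re
from typing import List

# Openings the issue lists as unwanted (compared in lowercase).
UNWANTED_OPENINGS = (
    "here's a detailed description",
    "this image",
    "an image of",
)

# Simple heuristics for markdown: bold markers, headings, list bullets, backticks.
MARKDOWN_PATTERN = re.compile(r"(\*\*|__|^#{1,6}\s|^\s*[-*]\s|`)", re.MULTILINE)

# Assumed limit; the issue only says captions should not be "extremely long".
MAX_CAPTION_LENGTH = 300


def caption_problems(caption: str) -> List[str]:
    """Return a list of problems found in a generated caption."""
    problems = []
    # Normalize case and typographic apostrophes before comparing openings.
    text = caption.strip().lower().replace("\u2019", "'")
    if text.startswith(UNWANTED_OPENINGS):
        problems.append("unwanted opening")
    if MARKDOWN_PATTERN.search(caption):
        problems.append("markdown formatting")
    if len(caption) > MAX_CAPTION_LENGTH:
        problems.append("too long")
    return problems


if __name__ == "__main__":
    print(caption_problems("A cat sleeping with its head resting on the strings of an instrument."))
    print(caption_problems("This image shows a cat."))
```

An empty result means the caption passes these checks. It says nothing about whether the description is accurate.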

To illustrate, here is a screenshot of the captions generated by the `gemma3` model with the current prompt:

![Image](https://github.com/user-attachments/assets/63786c6c-230c-472b-994f-0e4af5725e92)

Related Issues:

- https://github.com/photoprism/photoprism-vision/issues/1#issuecomment-2865844378
