Open
Description
This task includes improving the default caption generation prompt(s) and allowing a custom prompt to be submitted:
- For the intended use, image captions should not start with "Here's a detailed description of the image:...", "This image...", or "An image of...", but should simply describe the content of the image (and not be extremely long or markdown formatted, see screenshot below).
- In addition, it should be possible to submit a custom prompt in the request data along with the model name and version, so that users can customize the prompt for the model they have chosen without touching the photoprism-vision service.
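As a rough illustration of the second point, the request data could carry an optional prompt next to the model name and version, with the service falling back to its default prompt when none is submitted. This is only a sketch: the field names `model`, `version`, and `prompt`, the helper `build_caption_request`, the version value `latest`, and the wording of `DEFAULT_CAPTION_PROMPT` are all assumptions made for this example and are not taken from the photoprism-vision API.

```python
import json
from typing import Optional

# Hypothetical default prompt; the wording is illustrative only.
DEFAULT_CAPTION_PROMPT = (
    "Describe the content of this image in one or two plain sentences. "
    "Do not begin with phrases such as 'This image' or 'An image of', "
    "and do not use markdown formatting."
)


def build_caption_request(model: str, version: str, prompt: Optional[str] = None) -> dict:
    """Build request data, using the default prompt when no custom one is given."""
    custom = (prompt or "").strip()
    return {
        "model": model,      # assumed field name
        "version": version,  # assumed field name
        "prompt": custom if custom else DEFAULT_CAPTION_PROMPT,  # assumed field name
    }


if __name__ == "__main__":
    # "latest" is a placeholder version value for this sketch.
    payload = build_caption_request(
        "gemma3", "latest", "Describe the photo in one short sentence."
    )
    print(json.dumps(payload, indent=2))
```

Treating an empty or whitespace-only prompt the same as a missing one keeps clients from accidentally sending a blank prompt to the model.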
Examples of good/better image captions:
- A cat sleeping with its head resting on the strings of an instrument.
- A vibrant, full poppy flower in rich shades of red and pink.
- A young woman, likely in her late 20s or early 30s. She is smiling broadly, appearing friendly and approachable.
Of course, the caption doesn't have to start with "A" and can be longer than these examples.
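One way to evaluate a candidate prompt against these criteria would be a small check that flags captions with the unwanted openers, markdown formatting, or excessive length. The sketch below is not part of the existing service; the helper name `caption_problems`, the markdown heuristic, and the 300-character limit are assumptions chosen for illustration (the issue only asks that captions not be "extremely long").

```python
import re
from typing import List

# Openers the issue lists as undesirable (compared case-insensitively).
UNWANTED_OPENERS = (
    "here's a detailed description of the image",
    "this image",
    "an image of",
)

# Rough heuristic for markdown: headings, bullets, bold markers, inline code.
MARKDOWN_PATTERN = re.compile(r"(^\s*#{1,6}\s)|(^\s*[-*]\s)|(\*\*)|(`)", re.MULTILINE)

# Assumed limit for this sketch; not specified in the issue.
MAX_CAPTION_LENGTH = 300


def caption_problems(caption: str) -> List[str]:
    """Return a list of problems found in a caption (empty if none)."""
    problems = []
    text = caption.strip()
    lowered = text.lower()
    if any(lowered.startswith(opener) for opener in UNWANTED_OPENERS):
        problems.append("unwanted opener")
    if MARKDOWN_PATTERN.search(text):
        problems.append("markdown formatting")
    if len(text) > MAX_CAPTION_LENGTH:
        problems.append("too long")
    return problems


if __name__ == "__main__":
    samples = [
        "A cat sleeping with its head resting on the strings of an instrument.",
        "Here's a detailed description of the image:\n\n**Subject:** A cat",
    ]
    for sample in samples:
        print(caption_problems(sample))
```

A check like this could be run over a batch of generated captions when comparing prompts, rather than being used to rewrite captions after the fact.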
To illustrate, here is a screenshot of the captions generated by the gemma3 model with the current prompt:
Related Issues:
Status: Upcoming ⏳