Request: DOI

#9
by zss001 - opened

Describe this picture in 15 words.
image (4).png

Google org

To generate a 15-word description for an image, follow these steps:

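1. Phrase the prompt so that it asks the model for a caption of at most 15 words. This states the word limit to the model directly.
2. Use max_new_tokens=15 to cap the output at 15 generated tokens. Tokens are not the same as words (one word can be split into several tokens), so this is a hard upper bound rather than an exact word count, and the caption may come out shorter than 15 words or be cut off mid-sentence.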

Please refer to the sample code below:

image.png
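
Since the sample code was attached as an image, here is a minimal text sketch of the two steps above. It assumes `model` and `processor` were already loaded with Hugging Face transformers for this checkpoint, and that the model echoes the prompt tokens at the start of its output. The helper names `build_prompt`, `truncate_to_words` and `caption_image` are made up for illustration and are not part of any library.

```python
def build_prompt(max_words: int) -> str:
    """Hypothetical helper: state the word limit inside the prompt itself."""
    return f"Describe this picture in {max_words} words."


def truncate_to_words(text: str, max_words: int) -> str:
    """Keep at most max_words whitespace-separated words."""
    return " ".join(text.split()[:max_words])


def caption_image(model, processor, image, max_words=15, max_new_tokens=15):
    """Sketch only: `model` and `processor` are assumed to be loaded elsewhere."""
    prompt = build_prompt(max_words)
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    # Assumption: the generated sequence starts with the prompt tokens,
    # so we skip them before decoding.
    input_len = inputs["input_ids"].shape[-1]
    generated = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # caps tokens, not words
        do_sample=False,
    )
    text = processor.decode(generated[0][input_len:], skip_special_tokens=True)
    # Enforce the word limit on the decoded text as well.
    return truncate_to_words(text, max_words)
```

Because the token cap can cut a caption short, one option is to pass a somewhat larger `max_new_tokens` and rely on the word truncation at the end to enforce the 15-word limit.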

However, for some prompts, the model may return the message: "Sorry, as a base VLM I am not trained to answer this question". The base model falls back to this response when a prompt is outside the tasks it was designed for, or is too specific or complex for it to process properly.
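
One way to handle that message in code is to detect it and try a differently worded prompt. The sketch below is only an illustration: `is_fallback` and `first_usable_caption` are hypothetical helpers, and `generate` stands for whatever function you use to run the model on a prompt.

```python
from typing import Callable, Iterable, Optional

# The message quoted in this thread.
FALLBACK_MESSAGE = "Sorry, as a base VLM I am not trained to answer this question"


def is_fallback(text: str) -> bool:
    """True if the model output contains the fallback message."""
    return FALLBACK_MESSAGE.lower() in text.lower()


def first_usable_caption(
    prompts: Iterable[str], generate: Callable[[str], str]
) -> Optional[str]:
    """Try each prompt in order and return the first non-fallback output."""
    for prompt in prompts:
        caption = generate(prompt)
        if not is_fallback(caption):
            return caption
    return None  # every prompt produced the fallback message
```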

Thank you.
