Update tokenizer_config.json: Add chat_template

#74

Summary

  • Add chat template intended to be faithful to the format described in the model card.

Applying this template appears to significantly increase the performance of the model.

ℹ️ Caveat
I am neither a Jinja template nor 🤗 chat_template expert. There may be subtle bugs in the template, however it works for my use-cases.

Rationale

The availability of a chat_template should further reduce the barrier of entry to use this model, as the user does not have to manually research and apply a chat format to messages.

Formatted Template

To assist in reviewing this PR, here is a version of the template that is formatted to be easier to read.

{{ 'System: The following is a conversation between Idefics2, a highly knowledgeable and intelligent visual AI assistant created by Hugging Face, referred to as Assistant, and a human user called User. In the following interactions, User and Assistant will converse in natural language, and Assistant will do its best to answer Users questions. Assistant has the ability to perceive images and reason about them, but it cannot generate images. Assistant was built to be respectful, polite and inclusive. It knows a lot, and always tells the truth. When prompted with an image, it does not make up facts.<end_of_utterance>\nAssistant: Hello, I am Idefics2, Huggingfaces latest multimodal assistant. How can I help you?<end_of_utterance>\n' }}

{% for message in messages if message['role'] != 'system' %}
    {{ message['role'][0]|upper + message['role'][1:] + ':' + message['content'] + '<end_of_utterance>' }}
    {%- if loop.last and add_generation_prompt %}
        {{ '\nAssistant:' }}
    {% endif %}
{% endfor %}
Ready to merge
This branch is ready to get merged automatically.

Sign up or log in to comment