Can't get any reasonable output

#3
by Sebbecking - opened

Hi!

Thanks for your work on improving the German capabilities of open LLMs :)
I tried to use your model in a toy example, but I only seem to get repetitions of the input prompt.
I tried several temperatures and prompts. Any hints on what I'm doing wrong?

This is my full code:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path="LeoLM/leo-hessianai-13b",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("LeoLM/leo-hessianai-13b")

# taken from https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat/blob/main/model.py#L20
def get_prompt(message: str, chat_history: list[tuple[str, str]],
               system_prompt: str) -> str:
    texts = [f'<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n']
    # The first user input is _not_ stripped
    do_strip = False
    for user_input, response in chat_history:
        user_input = user_input.strip() if do_strip else user_input
        do_strip = True
        texts.append(f'{user_input} [/INST] {response.strip()} </s><s>[INST] ')
    message = message.strip() if do_strip else message
    texts.append(f'{message} [/INST]')
    return ''.join(texts)

prompt = get_prompt(message="Hi, kannst du mit mir reden?", chat_history=[], system_prompt="Du bist ein netter, hilfsbereiter Sprachassistent.")
inputs = tokenizer([prompt], return_tensors='pt', add_special_tokens=False)

# Generate
generate_ids = model.generate(inputs.input_ids.to("cuda"), 
                              max_length=300,
                              do_sample=True,
                              top_p=0.95,
                              top_k=50,
                              temperature=0.8,
                              num_beams=1
                             )
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])

Output was

'[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n<</SYS>>\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi, kannst du mit mir reden? [/INST]\n\n[INST] <<SYS>>\nDu bist ein netter, hilfsbereiter Sprachassistent.\n\nHi'
Sebbecking changed discussion title from Can't get any to Can't get any reasonable output
LAION LeoLM org

LeoLM/leo-hessianai-13b and LeoLM/leo-hessianai-7b are our base models and are not intended for direct use in a chat format. For chat models, check out LeoLM/leo-hessianai-13b-chat and LeoLM/leo-hessianai-7b-chat. Let me know if these work better for you :)

Thanks! Works way better now.
Besides using the wrong model, I also used the wrong prompt template...
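For anyone else landing here, a rough sketch of what the corrected setup can look like: the -chat model plus a ChatML-style prompt (which is what the chat variants appear to expect), with the sampling settings copied from my snippet above. Treat it as a starting point rather than a reference implementation.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the chat variant instead of the base model
model = AutoModelForCausalLM.from_pretrained(
    "LeoLM/leo-hessianai-13b-chat",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("LeoLM/leo-hessianai-13b-chat")

# ChatML-style prompt instead of the Llama-2 [INST] template
prompt = (
    "<|im_start|>system\n"
    "Du bist ein netter, hilfsbereiter Sprachassistent.<|im_end|>\n"
    "<|im_start|>user\n"
    "Hi, kannst du mit mir reden?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a reply; max_new_tokens bounds the completion length only
generate_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    top_p=0.95,
    top_k=50,
    temperature=0.8,
)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0])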

Sebbecking changed discussion status to closed

If the chat models give the better results, what is the general use case for the base models LeoLM/leo-hessianai-13b and LeoLM/leo-hessianai-7b?
