Nice work! config.json has "max_position_embeddings": 8192, but the model doesn't work well past 4096

#1
by Panchovix - opened

Hi there, I noticed this in the config.json:

"max_position_embeddings": 8192

But when trying to do inference at contexts above 4K, I get gibberish. Are we supposed to use a RoPE scaling factor different from 1?

Thanks!

Hi, we definitely trained with a length of 8192, although I think most of the data was <= 4k tokens long, so that might be what's going on. Additionally, the DPO training data didn't go over 6k tokens. So some extra work is probably required to get these models to work well at such long contexts; we didn't really test long-context settings much.
