use_flash_attention_2=True

#9
by TillFetzer - opened

One question, since use_flash_attention_2=True does not work: is trust_remote_code=True equivalent in this context, or does it depend on a specific package version?

LAION LeoLM org

This should work without using trust_remote_code=True. You can then load the model with attn_implementation="flash_attention_2", or without it if you would prefer not to have the flash-attn dependency.
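
For reference, a minimal loading sketch with that keyword (the model id LeoLM/leo-hessianai-7b is assumed here for illustration; substitute the checkpoint you are actually using). Note that attn_implementation is only recognised by newer transformers releases:

```python
# Minimal sketch, assuming the LeoLM/leo-hessianai-7b checkpoint (substitute your own).
# attn_implementation="flash_attention_2" needs a recent transformers release and flash-attn installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeoLM/leo-hessianai-7b"  # assumed model id for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # flash attention expects fp16/bf16 weights
    attn_implementation="flash_attention_2",  # newer keyword; older versions use use_flash_attention_2=True
)
```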

Can you briefly say which transformers version was used? I get ValueError: The following model_kwargs are not used by the model: ['attn_implementation']. I will test different versions, but maybe you can save me some time by stating the right one. I have transformers==4.34.0 and flash-attn==2.3.2.

It works for me when I load the model directly with use_flash_attention_2=True.
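
For anyone hitting the same ValueError: a sketch of the call that works on transformers==4.34.0 / flash-attn==2.3.2, with the model id again assumed for illustration. use_flash_attention_2=True is the flag accepted by this transformers version, before attn_implementation replaced it in later releases:

```python
# Sketch of a load that works on transformers==4.34.0 / flash-attn==2.3.2;
# model id assumed for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeoLM/leo-hessianai-7b"  # assumed model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    use_flash_attention_2=True,  # older flag name understood by transformers 4.34
)
```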

TillFetzer changed discussion status to closed
