Not able to use FlashAttention

#2
by fansheguang - opened

My current GPU does not support FlashAttention. How will performance be affected without it?

fansheguang changed discussion title from "Suggestion on slow download" to "Not able to use FlashAttention"

@fansheguang what is your GPU? Is it local or cloud based?

We'd like to see if there are other means for you to run the model without a considerable performance impact.

fansheguang changed discussion status to closed
fansheguang changed discussion status to open

Hi Nick, my GPU is on AWS. It's a g4dn.8xlarge [T4] (128G). Thank you!
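
For reference, here is a minimal sketch of how an attention backend could be chosen for a card like the T4. It assumes the model is loaded through Hugging Face `transformers` (whose `from_pretrained` accepts an `attn_implementation` argument) and that FlashAttention-2 needs CUDA compute capability 8.0 or higher, which the T4 (7.5) does not meet. The helper names `pick_attn_implementation` and `detect_capability` are hypothetical, and nothing in this thread confirms that this particular model honors the argument.

```python
from typing import Optional, Tuple

# Assumption: FlashAttention-2 kernels require compute capability >= 8.0
# (Ampere or newer). The T4 is a Turing card with capability 7.5.
FLASH_ATTN_MIN_CAPABILITY = (8, 0)


def pick_attn_implementation(capability: Optional[Tuple[int, int]]) -> str:
    """Return a value for the `attn_implementation` argument of from_pretrained.

    `capability` is the (major, minor) CUDA compute capability,
    or None when no CUDA device is available.
    """
    if capability is None:
        # No GPU detected: use the plain PyTorch attention path (conservative default).
        return "eager"
    if tuple(capability) >= FLASH_ATTN_MIN_CAPABILITY:
        return "flash_attention_2"
    # Older GPUs: fall back to PyTorch's scaled_dot_product_attention.
    return "sdpa"


def detect_capability() -> Optional[Tuple[int, int]]:
    """Query the first CUDA device, returning None if torch or CUDA is missing."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    # The result would be passed as
    # AutoModelForCausalLM.from_pretrained(..., attn_implementation=<result>)
    print(pick_attn_implementation(detect_capability()))
```

On a T4 this selects `"sdpa"` rather than `"flash_attention_2"`. How much slower that is depends on the model and sequence length, so it is worth measuring on the actual workload.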
