Björn Plüster (bjoernp)
AI & ML interests: None yet
bjoernp's activity
Can you share how you converted this? · 7 replies · #1 opened 3 months ago by bjoernp
HF safetensors version · 9 replies · #3 opened 3 months ago by ehartford
use_flash_attention_2=True · 3 replies · #9 opened 4 months ago by TillFetzer
leo-mistral-hessianai-7b-chat for privateGPT · 3 replies · #8 opened 5 months ago by Dodo124
Update tokenizer_config.json · #1 opened 5 months ago by bjoernp
Problems with flash-attention2 · 1 reply · #13 opened 6 months ago by omaer0
Loss function? · 1 reply · #10 opened 9 months ago by narvind2003
No multi GPU inference support? · 8 replies · #4 opened 9 months ago by dataautogpt3
Llama2 vs Mistral · 1 reply · #2 opened 9 months ago by lightningRalf
Add languages · #8 opened 9 months ago by lbourdois
Missing module/classes: from transformers.cache_utils import Cache, DynamicCache · 1 reply · #7 opened 9 months ago by panopstor
changed "tokenizer" typo to be the one we create. · #4 opened 9 months ago by dyngnosis
Which transformers version is being used here? · 2 replies · #6 opened 9 months ago by Promptengineering
Flash dependency (locks out non-NVIDIA GPUs) · 3 replies · #4 opened 9 months ago by Thalesian
Update modeling_moe_mistral.py · #5 opened 9 months ago by bjoernp
Trying to quantize. Running into the issue below. Any suggestions? · 1 reply · #5 opened 9 months ago by BigDeeper
small readme fix · #1 opened 10 months ago by jphme
Update modeling_moe_mistral.py · 2 replies · #1 opened 10 months ago by bjoernp
AWQ variant · 4 replies · #2 opened 10 months ago by SebastianBodza
Little Mistake :) · 1 reply · #1 opened 10 months ago by DRXD1000
Can you incorporate madlad400 training data? · 1 reply · #11 opened 10 months ago by cmp-nct
Is this an instruction-following model? · 1 reply · #1 opened 10 months ago by rjmehta
fix vocab size · 4 replies · #4 opened 10 months ago by jphme
Inconsistency in effective batch size reporting · 3 replies · #1 opened 10 months ago by bjoernp
Update README.md · 1 reply · #2 opened 10 months ago by waler4ik28
Update README.md to include system prompt · 1 reply · #3 opened 11 months ago by aari1995
Quantise this model - missing file · 1 reply · #10 opened 11 months ago by cuh008
gguf version? · 1 reply · #2 opened 11 months ago by guido1893
Sentence Transformers · 2 replies · #8 opened 11 months ago by jdjayakaran
Ambiguity in Language detection · 5 replies · #7 opened 11 months ago by jdjayakaran
tokenizer.model missing? · 2 replies · #2 opened 12 months ago by darule
Quantised models by thebloke · #1 opened 12 months ago by choltha
Training code · 2 replies · #2 opened 11 months ago by robert-h
First sentence of the description wrong? · 1 reply · #1 opened 11 months ago by h3ndrik
Add a `chat template` to this repository · 1 reply · #6 opened 12 months ago by LLukas22
How to achieve better results with fine-tuning · 1 reply · #5 opened 12 months ago by jdjayakaran
Some weights of LlamaForCausalLM were not initialized from the model checkpoint · 1 reply · #3 opened 12 months ago by fcivardi
CUDA out of memory applying to a dataset of texts · 3 replies · #4 opened 12 months ago by fcivardi
how to prompt · 2 replies · #5 opened 12 months ago by g58892881
Incorrect output when querying state capitals · 1 reply · #4 opened 12 months ago by darule
Flash attention NVCC requirements · 3 replies · #2 opened 12 months ago by jdjayakaran
missing tokenizer.model? · 7 replies · #2 opened 12 months ago by b0968
Can't get any reasonable output · 3 replies · #3 opened 12 months ago by Sebbecking
tokenizer.model missing? · 1 reply · #1 opened 12 months ago by b0968
Commercial use · 5 replies · #1 opened 12 months ago by BramVanroy
Is there a problem with year numbers? · 3 replies · #1 opened 12 months ago by stelterlab
Fixed typo in FP16 and 8bit examples · 1 reply · #4 opened over 1 year ago by bjoernp