
llm.c checkpoint: GPT-2 774M

This is a Hugging Face/safetensors conversion of an llm.c checkpoint: a 774M-parameter GPT-2 model trained on 150B tokens from the FineWeb dataset.

Training was conducted on a single 8xA100 80GB SXM node for ~6 days.

See the accompanying discussion in the llm.c GitHub repository for more information.
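
Since the conversion targets standard GPT-2 weights in safetensors format, the checkpoint should load with the stock transformers GPT-2 classes. A minimal usage sketch, assuming the repo id from this card works with `AutoModelForCausalLM` (the prompt and sampling settings are illustrative, not from this card):

```python
# Minimal generation sketch for the converted checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mdouglas/llmc-gpt2-774M-150B"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # tensors are stored in BF16 (see below)
)

prompt = "The history of computing began"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_k=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```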

Format: Safetensors
Model size: 774M params
Tensor type: BF16
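
The parameter count and dtype can be checked from the safetensors header without loading the weights into memory. A small sketch, assuming the weights file uses the default name `model.safetensors` (an assumption, not confirmed by this card):

```python
# Count parameters from the safetensors header alone.
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Assumed default filename; adjust if the repo uses a different name.
path = hf_hub_download("mdouglas/llmc-gpt2-774M-150B", "model.safetensors")

total = 0
with safe_open(path, framework="pt") as f:
    for name in f.keys():
        shape = f.get_slice(name).get_shape()  # shape only, no tensor data read
        n = 1
        for dim in shape:
            n *= dim
        total += n

print(f"Total parameters: {total / 1e6:.0f}M")  # expected: ~774M
```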

Dataset used to train mdouglas/llmc-gpt2-774M-150B: FineWeb (HuggingFaceFW/fineweb)
