Rough estimates for text generation?
Hi there,
I'm new to transformers, torch, and basically any ML development from the last decade and I'm trying to get back into it.
I've set up a Jupyter notebook with torch and CUDA enabled, and I have an RTX 2080 with 8 GB of VRAM. I'm not expecting blistering performance, but should that be sufficient to build a pipeline from a pretrained model and get it to give me answers in, say, less than 10 minutes?
This code runs without error in about 8 minutes or so (imports added for context; `InstructionTextGenerationPipeline` is from the Dolly repo's `instruct_pipeline.py`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from instruct_pipeline import InstructionTextGenerationPipeline  # from the Dolly repo

tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-12b", padding_side="left")
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dolly-v2-12b",
    offload_folder="offload",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    load_in_8bit=True,
)
generate_text = InstructionTextGenerationPipeline(model=model, tokenizer=tokenizer)
```
but `generate_text("tell a short story")` just seems to hang.
I thought the pipeline inference would be relatively quick compared to loading the model. Are my expectations wrong?
`device_map='auto'` is causing so much confusion. You don't have nearly enough GPU RAM to load the model, so it loads most of it on the CPU, and it works, but very slowly. Maybe we should just set the example to force CUDA 0 so it fails explicitly if it doesn't fit.
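A rough back-of-the-envelope calculation (my own sketch, not from the thread) shows why the 12B model can't fit on an 8 GB card even in 8-bit, which is what pushes most layers to the CPU:

```python
def model_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Rough weight-only memory footprint in GiB (ignores activations and KV cache)."""
    return n_params * bytes_per_param / 2**30

params = 12e9  # dolly-v2-12b: ~12 billion parameters
print(model_memory_gib(params, 1))  # 8-bit: ~11.2 GiB, already over an 8 GB card
print(model_memory_gib(params, 2))  # bf16:  ~22.4 GiB
```

Real usage is higher still, since activations and the KV cache need room on top of the weights.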
For 16 GB GPUs you can get it to load in 8-bit. For 8 GB it won't work. Use the 2.7B model?
To answer your question: it should be more like 10-20 seconds on an A10.
Hi @srowen, sorry to follow up on a closed discussion, but I'm wondering how to specify the `device_map` argument to force CUDA 0 and fail explicitly, as you suggested?
You just set `device="cuda:0"` then, and you don't need `accelerate` to figure out a device mapping in that case.
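For anyone landing here later, a minimal sketch of the single-GPU variant, assuming a smaller model that actually fits in 8 GB (I'm using `databricks/dolly-v2-3b` here as an example; `InstructionTextGenerationPipeline` again comes from the Dolly repo's `instruct_pipeline.py`):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from instruct_pipeline import InstructionTextGenerationPipeline  # from the Dolly repo

tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-3b", padding_side="left")

# No device_map="auto": move the whole model to GPU 0 explicitly.
# If the weights don't fit, this raises a CUDA out-of-memory error
# instead of silently offloading layers to the CPU.
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dolly-v2-3b", torch_dtype=torch.bfloat16
).to("cuda:0")

generate_text = InstructionTextGenerationPipeline(model=model, tokenizer=tokenizer)
print(generate_text("tell a short story"))
```

With everything on one device there is no CPU offload, so generation speed should be limited by the GPU rather than host-memory transfers.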
Thank you! That's clear and works like a charm.