PseudoTerminal X committed on
Commit
abe779e
1 Parent(s): 012d2fd

update model_max_length for a valuable signal on whether this is Schnell or Dev model

Currently we have to check the model name or its guidance embedding configuration to tell the two apart, but both can be changed by continued fine-tuning. The sequence length cannot be changed through fine-tuning alone: it requires continued pretraining and corrected attn_mask handling during SDPA.

This is a humble request that should improve the utility of Schnell with less work for downstream adaptations.
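
As a rough illustration of how downstream code could use this signal, the sketch below reads `model_max_length` from `tokenizer_2/tokenizer_config.json` and maps it to a variant. The helper names are hypothetical, and the mapping (256 for Schnell, 512 for Dev) is an assumption based on the values in this commit, not a documented contract.

```python
import json
from pathlib import Path

# Assumption: after this commit, 256 marks Schnell and 512 marks Dev.
SCHNELL_MAX_LENGTH = 256
DEV_MAX_LENGTH = 512


def variant_from_max_length(max_length):
    """Map a tokenizer model_max_length value to a variant name (hypothetical helper)."""
    if max_length == SCHNELL_MAX_LENGTH:
        return "schnell"
    if max_length == DEV_MAX_LENGTH:
        return "dev"
    # Any other value (or a missing key) is treated as undetermined.
    return "unknown"


def guess_variant(model_dir):
    """Read tokenizer_2/tokenizer_config.json under model_dir and guess the variant."""
    config_path = Path(model_dir) / "tokenizer_2" / "tokenizer_config.json"
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return variant_from_max_length(config.get("model_max_length"))
```

A caller would pass the local checkpoint directory, for example `guess_variant("path/to/checkpoint")`, and fall back to the model name or guidance embedding configuration when the result is `"unknown"`.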

Files changed (1)
  1. tokenizer_2/tokenizer_config.json +1 -1
tokenizer_2/tokenizer_config.json CHANGED
@@ -932,7 +932,7 @@
   "eos_token": "</s>",
   "extra_ids": 100,
   "legacy": true,
-  "model_max_length": 512,
+  "model_max_length": 256,
   "pad_token": "<pad>",
   "sp_model_kwargs": {},
   "tokenizer_class": "T5Tokenizer",