---
license: openrail
datasets:
- WelfCrozzo/kupalinka
language:
- be
- en
- ru
metrics:
- bleu
library_name: transformers
tags:
- translation
widget:
- text: "да зорак праз цяжкасці"
  example_title: "be -> ru"
- text: "да зорак праз цяжкасці"
  example_title: "be -> en"
- text: "к звездам через трудности"
  example_title: "ru -> be"
- text: "к звездам через трудности"
  example_title: "ru -> en"
- text: "to the stars through difficulties."
  example_title: "en -> be"
- text: "to the stars through difficulties."
  example_title: "en -> ru"
---

# T5 for the Belarusian language

![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)

This model is based on T5-small with a sequence length of 128 tokens. It was trained from scratch on a single RTX 3090 (24 GB).

# Supported tasks:

- translation BE to RU: ``
- translation BE to EN: ``
- translation RU to BE: ``
- translation RU to EN: ``
- translation EN to BE: ``
- translation EN to RU: ``

# Metrics:

- [eval/BLEU](https://api.wandb.ai/links/miklgr500/31mq4s36)
- [eval/loss](https://api.wandb.ai/links/miklgr500/rvi2p69n)
- [train/loss](https://api.wandb.ai/links/miklgr500/z9alu3n5)

# How to Get Started with the Model
```python
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("WelfCrozzo/T5-L128-belarusian")
model = T5ForConditionalGeneration.from_pretrained("WelfCrozzo/T5-L128-belarusian")

# Encode the source sentence and generate a translation (up to 128 tokens).
x = tokenizer.encode('да зорак праз цяжкасці', return_tensors='pt')
result = model.generate(x, return_dict_in_generate=True, output_scores=True, max_length=128)

print(tokenizer.decode(result["sequences"][0]))
```
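Because the model was trained with a 128-token sequence length, longer inputs should be truncated to that limit. Below is a minimal batch-translation sketch; it assumes, as in the example above, that source text is passed without an explicit task prefix, and the example sentences are taken from the widget samples:

```python
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("WelfCrozzo/T5-L128-belarusian")
model = T5ForConditionalGeneration.from_pretrained("WelfCrozzo/T5-L128-belarusian")

sentences = [
    "да зорак праз цяжкасці",       # BE source (widget sample)
    "к звездам через трудности",    # RU source (widget sample)
]

# Truncate to the 128-token training length and pad so the batch is rectangular.
batch = tokenizer(sentences, return_tensors="pt", padding=True,
                  truncation=True, max_length=128)

outputs = model.generate(**batch, max_length=128)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```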
# References

- [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://jmlr.org/papers/volume21/20-074/20-074.pdf)