JustinDu
/

BARTxiv



	---
	language: en
	license: mit
	library_name: transformers
	tags:
	- summarization
	- bart
	datasets: ccdv/arxiv-summarization
	model-index:
	- name: BARTxiv
	  results:
	  - task:
	      type: summarization
	    dataset:
	      name: arxiv-summarization
	      type: ccdv/arxiv-summarization
	      split: validation
	    metrics:
	    - type: rouge1
	      value: 41.70204016592095
	    - type: rouge2
	      value: 15.134827404979639
	---


	# BARTxiv

	See the model implementation [here](https://interrsect.web.app).

	This model is a fine-tuned version of [facebook/bart-large-cnn](https://huggingface.co/facebook/bart-large-cnn) on the [arxiv-summarization](https://huggingface.co/datasets/ccdv/arxiv-summarization) dataset.
	It achieves the following results on the validation set:
	- Loss: 0.86
	- Rouge1: 41.70
	- Rouge2: 15.13
	- RougeL: 22.85
	- RougeLsum: 37.77
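
A minimal sketch of loading the checkpoint for inference with the `transformers` summarization pipeline. The repository id `JustinDu/BARTxiv` is taken from the page header, and the helper names and length limits are illustrative assumptions, not settings documented in this card.

```python
from functools import lru_cache

# Assumption: Hub repository id, taken from the page header (JustinDu / BARTxiv).
MODEL_ID = "JustinDu/BARTxiv"


@lru_cache(maxsize=1)
def load_summarizer():
    """Build the summarization pipeline once and reuse it on later calls."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline("summarization", model=MODEL_ID)


def summarize(text: str, max_length: int = 256, min_length: int = 64) -> str:
    """Return a generated summary of `text`.

    The length limits are illustrative defaults, not values from this card.
    """
    summarizer = load_summarizer()
    # truncation=True clips inputs that exceed the model's maximum input length.
    outputs = summarizer(
        text, max_length=max_length, min_length=min_length, truncation=True
    )
    return outputs[0]["summary_text"]
```

Calling `summarize(paper_text)` on a string downloads the weights on first use. Because over-length inputs are truncated, a long paper is summarized from its opening portion only.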

	## Model description

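	BARTxiv is a BART sequence-to-sequence model for abstractive summarization of scientific papers. It starts from the `facebook/bart-large-cnn` checkpoint and is fine-tuned on arXiv articles.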

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

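	The model was fine-tuned on the `ccdv/arxiv-summarization` dataset. The metrics reported in this card were computed on its validation split.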

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-6
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 42
	- optimizer: Adafactor
	- num_epochs: 9
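
The hyperparameters above could be passed to the `transformers` Trainer roughly as follows. This is a sketch under assumptions, not the author's training script. It maps the listed batch sizes to per-device values (single-device training is assumed), selects Adafactor through the `optim` argument, and adds per-epoch evaluation with generation so ROUGE can be computed. The output directory name is hypothetical.

```python
# Values copied from the hyperparameter list; keys are Seq2SeqTrainingArguments names.
HYPERPARAMETERS = {
    "learning_rate": 1e-6,
    "per_device_train_batch_size": 1,  # assumes a single training device
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "optim": "adafactor",  # assumption: Adafactor was selected via this flag
    "num_train_epochs": 9,
}


def build_training_args(output_dir: str = "bartxiv-checkpoints"):
    """Create training arguments; `output_dir` is a hypothetical path."""
    # Imported lazily so the dictionary above is usable without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        # Name used by Transformers 4.25.1; newer releases call it `eval_strategy`.
        evaluation_strategy="epoch",
        # Generate summaries during evaluation so ROUGE can be scored.
        predict_with_generate=True,
        **HYPERPARAMETERS,
    )
```

The resulting object would be handed to a `Seq2SeqTrainer` together with the model, tokenizer, datasets, and a metric function.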

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
	| 1.24          | 1.0   | 1073 | 1.24            | 38.32  | 12.80  | 20.55  | 34.50     |
	| 1.04          | 2.0   | 2146 | 1.04            | 39.65  | 13.74  | 21.28  | 35.83     |
	| 0.979         | 3.0   | 3219 | 0.98            | 40.19  | 14.30  | 21.87  | 36.38     |
	| 0.970         | 4.0   | 4292 | 0.97            | 40.87  | 14.44  | 22.14  | 36.89     |
	| 0.918         | 5.0   | 5365 | 0.92            | 41.17  | 14.94  | 22.54  | 37.40     |
	| 0.901         | 6.0   | 6438 | 0.90            | 41.02  | 14.65  | 22.46  | 37.05     |
	| 0.889         | 7.0   | 7511 | 0.89            | 41.32  | 15.09  | 22.64  | 37.42     |
	| 0.900         | 8.0   | 8584 | 0.90            | 41.23  | 15.02  | 22.67  | 37.28     |
	| 0.869         | 9.0   | 9657 | 0.87            | 41.70  | 15.13  | 22.85  | 37.77     |
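
The table above can be checked with a few lines of Python. The sketch below copies the rows as given, derives the number of optimizer steps per epoch, and finds the best epoch by ROUGE-1 and by validation loss. The `Row` type and variable names are illustrative.

```python
from collections import namedtuple

Row = namedtuple("Row", "epoch step val_loss rouge1 rouge2 rougeL rougeLsum")

# Values copied from the per-epoch results table.
RESULTS = [
    Row(1, 1073, 1.24, 38.32, 12.80, 20.55, 34.50),
    Row(2, 2146, 1.04, 39.65, 13.74, 21.28, 35.83),
    Row(3, 3219, 0.98, 40.19, 14.30, 21.87, 36.38),
    Row(4, 4292, 0.97, 40.87, 14.44, 22.14, 36.89),
    Row(5, 5365, 0.92, 41.17, 14.94, 22.54, 37.40),
    Row(6, 6438, 0.90, 41.02, 14.65, 22.46, 37.05),
    Row(7, 7511, 0.89, 41.32, 15.09, 22.64, 37.42),
    Row(8, 8584, 0.90, 41.23, 15.02, 22.67, 37.28),
    Row(9, 9657, 0.87, 41.70, 15.13, 22.85, 37.77),
]

# Every cumulative step count should be an exact multiple of its epoch number.
assert all(row.step % row.epoch == 0 for row in RESULTS)
steps_per_epoch = {row.step // row.epoch for row in RESULTS}

best_rouge1 = max(RESULTS, key=lambda row: row.rouge1)
lowest_loss = min(RESULTS, key=lambda row: row.val_loss)

# Epochs where ROUGE-1 fell relative to the previous epoch.
dips = [cur.epoch for prev, cur in zip(RESULTS, RESULTS[1:]) if cur.rouge1 < prev.rouge1]

print("steps per epoch:", steps_per_epoch)
print("best ROUGE-1 at epoch", best_rouge1.epoch)
print("lowest validation loss at epoch", lowest_loss.epoch)
print("ROUGE-1 dipped at epochs", dips)
```

Every epoch spans 1073 steps, and the final epoch is best on both ROUGE-1 and validation loss, although ROUGE-1 dips slightly at epochs 6 and 8. With a train batch size of 1 and no gradient accumulation (an assumption, since the card does not say), 1073 steps would correspond to 1073 training examples per epoch.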

	### Framework versions

	- Transformers 4.25.1
	- PyTorch 1.13.0+cu117
	- Datasets 2.6.1
	- Tokenizers 0.13.1
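
To compare a local environment against the versions listed above, a small standard-library check along these lines may help. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the PyPI names of the listed frameworks, and `version_mismatches` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; keys are the assumed PyPI distribution names.
PINNED = {
    "transformers": "4.25.1",
    "torch": "1.13.0+cu117",
    "datasets": "2.6.1",
    "tokenizers": "0.13.1",
}


def version_mismatches(pinned=PINNED):
    """Map each package whose installed version differs to (expected, installed)."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in version_mismatches().items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means the environment matches the card exactly. A mismatch is not necessarily a problem, since newer releases may load the checkpoint without issue.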