kenhktsui committed
Commit 55b3b75
1 Parent(s): 11666d1

Add SetFit model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
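The pooling config above selects mean-token pooling: token embeddings are averaged over non-padding positions, as indicated by the attention mask. A minimal NumPy sketch of that computation (the function name `mean_pool` is illustrative, not from this repository):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token embeddings over non-padding positions.

    token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len).
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (b, s, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (b, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid div-by-zero
    return summed / counts

# Two toy sequences; the second is padded after one token. dim matches the config (768).
emb = np.ones((2, 3, 768))
emb[1, 0] = 3.0
mask = np.array([[1, 1, 1], [1, 0, 0]])
pooled = mean_pool(emb, mask)
print(pooled.shape)   # (2, 768)
print(pooled[1, 0])   # 3.0 — padding positions are ignored
```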
README.md ADDED
@@ -0,0 +1,225 @@
+ ---
+ base_model: sentence-transformers/paraphrase-mpnet-base-v2
+ library_name: setfit
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: At least 27 people were killed and over 200 injured in a devastating gas explosion
+     that ripped through a residential area in central Mexico City, officials said
+     on Tuesday. The blast, which occurred at around 8pm local time, also left hundreds
+     of people homeless and caused widespread destruction. The explosion was so powerful
+     that it shattered windows and damaged buildings several blocks away. Rescue teams
+     were working through the night to search for anyone who may still be trapped under
+     the rubble. The cause of the explosion is still unknown, but authorities have
+     launched an investigation into the incident.
+ - text: Just got back from the most disappointing concert of my life. The artist was
+     late, the sound quality was terrible, and they only played 2 songs from their
+     new album. I was expecting so much more. 1/10 would not recommend.
+ - text: The new smartphone from Samsung has exceeded our expectations in every way.
+     The camera is top-notch, the battery life is impressive, and the display is vibrant
+     and clear. We were blown away by the seamless performance and the sleek design.
+     Overall, this phone is a game-changer in the tech industry and a must-have for
+     anyone looking for a high-quality device.
+ - text: 'Are you kidding me?! I just got a parking ticket for a spot that was clearly
+     marked as free for 1 hour. The city is just trying to rip us off. Unbelievable.
+     #Frustrated #ParkingTicket'
+ - text: Renowned actress Emma Stone took home the coveted Golden Globe award for Best
+     Actress in a Motion Picture last night, marking her second consecutive win in
+     the category. The 33-year-old actress was visibly emotional as she accepted the
+     award, thanking her team and family for their unwavering support. Stone's performance
+     in the critically acclaimed film 'The Favourite' earned her widespread critical
+     acclaim and a spot in the running for the prestigious award. This win solidifies
+     her position as one of the most talented and sought-after actresses in Hollywood.
+ inference: true
+ model-index:
+ - name: SetFit with sentence-transformers/paraphrase-mpnet-base-v2
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.89
+       name: Accuracy
+ ---
+
+ # SetFit with sentence-transformers/paraphrase-mpnet-base-v2
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model for text classification. It uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model, with a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance as the classification head.
+
+ The model was trained with an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head on features from the fine-tuned Sentence Transformer.
+
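The second step above — fitting a logistic-regression head on top of fixed sentence embeddings — can be sketched in isolation. This toy illustration is not the actual training code: random vectors stand in for the embeddings a fine-tuned Sentence Transformer would produce, and the head is trained with plain NumPy gradient descent (the names `train_logistic_head`, `pos`, and `neg` are hypothetical):

```python
import numpy as np

def train_logistic_head(X, y, lr=0.1, epochs=200):
    """Step 2: fit a logistic-regression head on sentence embeddings."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
        grad = p - y                            # gradient of log loss w.r.t. logits
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

# Stand-in for step 1: pretend embeddings for the two classes, dim 768 as in this model.
rng = np.random.default_rng(42)
pos = rng.normal(loc=+1.0, size=(16, 768))  # "label 1" texts
neg = rng.normal(loc=-1.0, size=(16, 768))  # "label 0" texts
X = np.vstack([pos, neg])
y = np.array([1] * 16 + [0] * 16)

w, b = train_logistic_head(X, y)
preds = ((X @ w + b) > 0).astype(int)
print((preds == y).mean())  # the toy classes are well separated, so this reaches 1.0
```

In the real pipeline, step 1 (contrastive fine-tuning of the embedding body) is what makes the classes separable enough for such a simple head to work from only a handful of labeled examples.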
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 2 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:------|:---------|
+ | 1 | <ul><li>"The latest smartwatch from Apple has been making waves in the tech world, and for good reason. Its sleek design and vibrant display make it a fashion statement on the wrist. The watch's minimalist aesthetic is both elegant and understated, making it perfect for those who want a stylish accessory without drawing too much attention. With its impressive array of features, including built-in GPS and heart rate monitoring, this watch is a must-have for anyone looking to upgrade their fitness game. Overall, the Apple smartwatch is a stunning piece of technology that seamlessly blends form and function, making it a standout in the world of wearable tech."</li><li>"Just hit 1 year of consistent meditation practice and I can already feel the difference in my mental clarity and focus. It's crazy how much of an impact it's had on my relationships and overall well-being. I'm so grateful for the journey so far and excited to see where it takes me next #personal growth #mindfulness"</li><li>'Just wanted to say a huge thank you to @JohnDoe for helping me move into my new apartment today! His kindness and willingness to lend a hand made a huge difference in my day. I really appreciate everything he did for me!'</li></ul> |
+ | 0 | <ul><li>"Just lost my grandma today. Still can't believe she's gone. She was the most selfless person I've ever known. Always putting others before herself. I'm going to miss her so much. RIP grandma, you will be deeply missed."</li><li>"The latest economic figures have revealed a dismal picture, with the country's GDP growth rate plummeting to a 10-year low. Analysts warn that this trend is unlikely to reverse anytime soon, citing a lack of investment and stagnant consumer spending. As a result, the government is facing mounting pressure to implement policies that can stimulate growth and create jobs. However, with the current political climate, it remains to be seen whether such efforts will bear fruit."</li><li>"The highly anticipated new restaurant in town has been a major letdown for many customers. Despite its promising menu and sleek interior, the service has been slow and the food quality has been inconsistent. Many have taken to social media to express their disappointment, with some even going so far as to say that the restaurant is a 'complete waste of time and money.' The restaurant's management has yet to issue a statement addressing the concerns, leaving many to wonder if they will be able to turn things around."</li></ul> |
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.89 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First, install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference:
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("setfit_model_id")
+ # Run inference
+ preds = model("Are you kidding me?! I just got a parking ticket for a spot that was clearly marked as free for 1 hour. The city is just trying to rip us off. Unbelievable. #Frustrated #ParkingTicket")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 32 | 65.6129 | 112 |
+
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 1 | 13 |
+ | 0 | 18 |
+
+ ### Training Hyperparameters
+ - batch_size: (16, 16)
+ - num_epochs: (5, 5)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: True
+
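The hyperparameters listed above map onto SetFit's `TrainingArguments`. The configuration sketch below shows how a comparable run could be set up; treat it as an illustration of the setfit 1.x API rather than the exact training script, and note that the tiny inline dataset is a made-up stand-in (the real training set is not published here):

```python
from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical stand-in for the (unpublished) training data.
train_ds = Dataset.from_dict({
    "text": ["Great product, works perfectly!", "Terrible service, very upset."],
    "label": [1, 0],
})

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

args = TrainingArguments(
    batch_size=(16, 16),              # (embedding phase, classifier phase)
    num_epochs=(5, 5),
    body_learning_rate=(2e-5, 1e-5),  # Sentence Transformer body
    head_learning_rate=0.01,          # classification head
    loss=CosineSimilarityLoss,
    sampling_strategy="oversampling",
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    seed=42,
    load_best_model_at_end=True,
)

trainer = Trainer(model=model, args=args, train_dataset=train_ds)
trainer.train()
```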
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:-------:|:-------:|:-------------:|:---------------:|
+ | 0.0303 | 1 | 0.3052 | - |
+ | 1.0 | 33 | - | 0.0154 |
+ | 1.5152 | 50 | 0.0008 | - |
+ | 2.0 | 66 | - | 0.0039 |
+ | 3.0 | 99 | - | 0.0019 |
+ | 3.0303 | 100 | 0.0001 | - |
+ | 4.0 | 132 | - | 0.0017 |
+ | 4.5455 | 150 | 0.0002 | - |
+ | **5.0** | **165** | **-** | **0.0014** |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Framework Versions
+ - Python: 3.9.19
+ - SetFit: 1.1.0.dev0
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.39.0
+ - PyTorch: 2.4.0
+ - Datasets: 2.20.0
+ - Tokenizers: 0.15.2
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+   doi = {10.48550/ARXIV.2209.11055},
+   url = {https://arxiv.org/abs/2209.11055},
+   author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+   keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+   title = {Efficient Few-Shot Learning Without Prompts},
+   publisher = {arXiv},
+   year = {2022},
+   copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "setfit/step_165",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.39.0",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.39.0",
+     "pytorch": "2.4.0"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
config_setfit.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "normalize_embeddings": false,
+   "labels": [
+     "1",
+     "0"
+   ]
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c845ced8aca9af3b803de4faf6366ff3424c36181fa8dba3d5515ab0165127d3
+ size 437967672
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e57eb002eea752eee7a003a4ebf912f13ad54d80cb6b7974a4010f76251958be
+ size 6991
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,66 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 512,
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff