elsayovita committed
Commit 509f7cc
1 Parent(s): 85819ba

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
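
This pooling configuration keeps only the `[CLS]` token vector, so every text is represented by its 384-dimensional CLS embedding. A minimal sketch of the `Pooling` module that sentence-transformers builds from this file, constructed by hand here purely for illustration:

```python
from sentence_transformers.models import Pooling

# Mirror 1_Pooling/config.json: CLS-token pooling only
# (mean pooling is the library default, so it is disabled explicitly).
pooling = Pooling(
    word_embedding_dimension=384,
    pooling_mode_cls_token=True,
    pooling_mode_mean_tokens=False,
    pooling_mode_max_tokens=False,
)
print(pooling.get_pooling_mode_str())  # "cls"
```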
README.md ADDED
@@ -0,0 +1,825 @@
+ ---
+ base_model: BAAI/bge-small-en-v1.5
+ datasets: []
+ language:
+ - en
+ library_name: sentence-transformers
+ license: apache-2.0
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:11863
+ - loss:MatryoshkaLoss
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: In the fiscal year 2022, the emissions were categorized into different
+     scopes, with each scope representing a specific source of emissions
+   sentences:
+   - 'Question: What is NetLink proactive in identifying to be more efficient in? '
+   - What standard is the Environment, Health, and Safety Management System (EHSMS)
+     audited to by a third-party accredited certification body at the operational assets
+     level of CLI?
+   - What do the different scopes represent in terms of emissions in the fiscal year
+     2022?
+ - source_sentence: NetLink is committed to protecting the security of all information
+     and information systems, including both end-user data and corporate data. To this
+     end, management ensures that the appropriate IT policies, personal data protection
+     policy, risk mitigation strategies, cyber security programmes, systems, processes,
+     and controls are in place to protect our IT systems and confidential data
+   sentences:
+   - '"What recognition did NetLink receive in FY22?"'
+   - What measures does NetLink have in place to protect the security of all information
+     and information systems, including end-user data and corporate data?
+   - 'Question: What does Disclosure 102-10 discuss regarding the organization and
+     its supply chain?'
+ - source_sentence: In the domain of economic performance, the focus is on the financial
+     health and growth of the organization, ensuring sustainable profitability and
+     value creation for stakeholders
+   sentences:
+   - What does NetLink prioritize by investing in its network to ensure reliability
+     and quality of infrastructure?
+   - What percentage of the total energy was accounted for by heat, steam, and chilled
+     water in 2021 according to the given information?
+   - What is the focus in the domain of economic performance, ensuring sustainable
+     profitability and value creation for stakeholders?
+ - source_sentence: Disclosure 102-41 discusses collective bargaining agreements and
+     is found on page 98
+   sentences:
+   - What topic is discussed in Disclosure 102-41 on page 98 of the document?
+   - What was the number of cases in 2021, following a decrease from 42 cases in 2020?
+   - What type of data does GRI 101 provide in relation to connecting the nation?
+ - source_sentence: Employee health and well-being has never been more topical than
+     it was in the past year. We understand that people around the world, including
+     our employees, have been increasingly exposed to factors affecting their physical
+     and mental wellbeing. We are committed to creating an environment that supports
+     our employees and ensures they feel valued and have a sense of belonging. We utilised
+   sentences:
+   - What aspect of the standard covers the evaluation of the management approach?
+   - 'Question: What is the company''s commitment towards its employees'' health and
+     well-being based on the provided context information?'
+   - What types of skills does NetLink focus on developing through their training and
+     development opportunities for employees?
+ model-index:
+ - name: BAAI BGE small en v1.5 ESG
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 384
+       type: dim_384
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.786984742476608
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.9269156199949422
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.944617718958105
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9597066509314676
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.786984742476608
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.3089718733316474
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.18892354379162102
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09597066509314678
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.021860687291016895
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.025747656110970626
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.026239381082169593
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.026658518081429664
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.19459455903970813
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8588156921146056
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.023886995279989515
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 256
+       type: dim_256
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7815055213689623
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.9236280873303548
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.9421731433870016
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9596223552221191
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7815055213689623
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.30787602911011824
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.18843462867740032
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09596223552221193
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.021708486704693403
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.025656335759176533
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.026171476205194496
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.02665617653394776
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.19396598426779785
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8550811914864019
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.023784308256522512
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 128
+       type: dim_128
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7713057405378067
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.9141869678833348
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.9346708252549946
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9532158813116413
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7713057405378067
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.3047289892944449
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.18693416505099894
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09532158813116413
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.021425159459383523
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.025394082441203752
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.025963078479305412
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.026478218925323375
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.192049680708846
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8456702445512195
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.023531692780408037
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 64
+       type: dim_64
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7428137907780494
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.892438674871449
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.9184860490601029
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9411615948748209
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7428137907780494
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.297479558290483
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.1836972098120206
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09411615948748209
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.02063371641050138
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.024789963190873596
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.02551350136278064
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.026143377635411698
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.18745029665008597
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.8220114494981732
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.022884160441989647
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 32
+       type: dim_32
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.6668633566551463
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8242434460085981
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8640310208210402
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.8987608530725786
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.6668633566551463
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27474781533619935
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.17280620416420805
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.08987608530725787
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.018523982129309623
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.022895651278016623
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.024000861689473345
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.02496557925201608
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.17367624271978654
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.7532998425142056
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.02100792923667254
+       name: Cosine Map@100
+ ---
+
+ # BAAI BGE small en v1.5 ESG
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) <!-- at revision 5c38ec7c405ec4b44b94cc5a9bb96e735b38267a -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("elsayovita/bge-small-en-v1.5-esg-v2")
+ # Run inference
+ sentences = [
+     'Employee health and well-being has never been more topical than it was in the past year. We understand that people around the world, including our employees, have been increasingly exposed to factors affecting their physical and mental wellbeing. We are committed to creating an environment that supports our employees and ensures they feel valued and have a sense of belonging. We utilised',
+     "Question: What is the company's commitment towards its employees' health and well-being based on the provided context information?",
+     'What types of skills does NetLink focus on developing through their training and development opportunities for employees?',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 384]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
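
Because the card reports retrieval quality at 384, 256, 128, 64 and 32 dimensions (the MatryoshkaLoss dimensions listed under Training Details), the embeddings can also be truncated at load time. A minimal sketch using the library's `truncate_dim` option; the choice of 128 here is illustrative:

```python
from sentence_transformers import SentenceTransformer

# encode() now returns embeddings truncated to the first 128 dimensions,
# one of the Matryoshka dimensions this model was trained with.
model = SentenceTransformer("elsayovita/bge-small-en-v1.5-esg-v2", truncate_dim=128)

embeddings = model.encode(["What platform is used to communicate press releases?"])
print(embeddings.shape)  # (1, 128)
```

Truncated vectors are no longer unit length, but `model.similarity` scores with cosine similarity by default, which renormalizes, so the scores remain comparable.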
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+ * Dataset: `dim_384`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.787      |
+ | cosine_accuracy@3   | 0.9269     |
+ | cosine_accuracy@5   | 0.9446     |
+ | cosine_accuracy@10  | 0.9597     |
+ | cosine_precision@1  | 0.787      |
+ | cosine_precision@3  | 0.309      |
+ | cosine_precision@5  | 0.1889     |
+ | cosine_precision@10 | 0.096      |
+ | cosine_recall@1     | 0.0219     |
+ | cosine_recall@3     | 0.0257     |
+ | cosine_recall@5     | 0.0262     |
+ | cosine_recall@10    | 0.0267     |
+ | cosine_ndcg@10      | 0.1946     |
+ | cosine_mrr@10       | 0.8588     |
+ | **cosine_map@100**  | **0.0239** |
+
+ #### Information Retrieval
+ * Dataset: `dim_256`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.7815     |
+ | cosine_accuracy@3   | 0.9236     |
+ | cosine_accuracy@5   | 0.9422     |
+ | cosine_accuracy@10  | 0.9596     |
+ | cosine_precision@1  | 0.7815     |
+ | cosine_precision@3  | 0.3079     |
+ | cosine_precision@5  | 0.1884     |
+ | cosine_precision@10 | 0.096      |
+ | cosine_recall@1     | 0.0217     |
+ | cosine_recall@3     | 0.0257     |
+ | cosine_recall@5     | 0.0262     |
+ | cosine_recall@10    | 0.0267     |
+ | cosine_ndcg@10      | 0.194      |
+ | cosine_mrr@10       | 0.8551     |
+ | **cosine_map@100**  | **0.0238** |
+
+ #### Information Retrieval
+ * Dataset: `dim_128`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.7713     |
+ | cosine_accuracy@3   | 0.9142     |
+ | cosine_accuracy@5   | 0.9347     |
+ | cosine_accuracy@10  | 0.9532     |
+ | cosine_precision@1  | 0.7713     |
+ | cosine_precision@3  | 0.3047     |
+ | cosine_precision@5  | 0.1869     |
+ | cosine_precision@10 | 0.0953     |
+ | cosine_recall@1     | 0.0214     |
+ | cosine_recall@3     | 0.0254     |
+ | cosine_recall@5     | 0.026      |
+ | cosine_recall@10    | 0.0265     |
+ | cosine_ndcg@10      | 0.192      |
+ | cosine_mrr@10       | 0.8457     |
+ | **cosine_map@100**  | **0.0235** |
+
+ #### Information Retrieval
+ * Dataset: `dim_64`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.7428     |
+ | cosine_accuracy@3   | 0.8924     |
+ | cosine_accuracy@5   | 0.9185     |
+ | cosine_accuracy@10  | 0.9412     |
+ | cosine_precision@1  | 0.7428     |
+ | cosine_precision@3  | 0.2975     |
+ | cosine_precision@5  | 0.1837     |
+ | cosine_precision@10 | 0.0941     |
+ | cosine_recall@1     | 0.0206     |
+ | cosine_recall@3     | 0.0248     |
+ | cosine_recall@5     | 0.0255     |
+ | cosine_recall@10    | 0.0261     |
+ | cosine_ndcg@10      | 0.1875     |
+ | cosine_mrr@10       | 0.822      |
+ | **cosine_map@100**  | **0.0229** |
+
+ #### Information Retrieval
+ * Dataset: `dim_32`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value     |
+ |:--------------------|:----------|
+ | cosine_accuracy@1   | 0.6669    |
+ | cosine_accuracy@3   | 0.8242    |
+ | cosine_accuracy@5   | 0.864     |
+ | cosine_accuracy@10  | 0.8988    |
+ | cosine_precision@1  | 0.6669    |
+ | cosine_precision@3  | 0.2747    |
+ | cosine_precision@5  | 0.1728    |
+ | cosine_precision@10 | 0.0899    |
+ | cosine_recall@1     | 0.0185    |
+ | cosine_recall@3     | 0.0229    |
+ | cosine_recall@5     | 0.024     |
+ | cosine_recall@10    | 0.025     |
+ | cosine_ndcg@10      | 0.1737    |
+ | cosine_mrr@10       | 0.7533    |
+ | **cosine_map@100**  | **0.021** |
+
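The tables above were produced by running `InformationRetrievalEvaluator` once per truncation dimension. A minimal sketch of such a run, with toy placeholder data rather than the actual evaluation split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("elsayovita/bge-small-en-v1.5-esg-v2", truncate_dim=256)

# Placeholder data: query id -> text, corpus id -> text,
# and query id -> set of relevant corpus ids.
queries = {"q1": "What topic is discussed in Disclosure 102-41?"}
corpus = {
    "d1": "Disclosure 102-41 discusses collective bargaining agreements",
    "d2": "The engagement with key stakeholders involves various topics",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_256")
results = evaluator(model)  # dict of metrics keyed like "dim_256_cosine_map@100"
```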
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 11,863 training samples
+ * Columns: <code>context</code> and <code>question</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | context                                                                              | question                                                                           |
+   |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+   | type    | string                                                                               | string                                                                             |
+   | details | <ul><li>min: 13 tokens</li><li>mean: 40.74 tokens</li><li>max: 277 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 24.4 tokens</li><li>max: 62 tokens</li></ul> |
+ * Samples:
+   | context                                                                                                                                                              | question                                                                                                                                                       |
+   |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
+   | <code>The engagement with key stakeholders involves various topics and methods throughout the year</code>                                                           | <code>Question: What does the engagement with key stakeholders involve throughout the year?</code>                                                            |
+   | <code>For unitholders and analysts, the focus is on business and operations, the release of financial results, and the overall performance and announcements</code> | <code>Question: What is the focus for unitholders and analysts in terms of business and operations, financial results, performance, and announcements?</code> |
+   | <code>These are communicated through press releases and other required disclosures via SGXNet and NetLink's website</code>                                          | <code>What platform is used to communicate press releases and required disclosures for NetLink?</code>                                                        |
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+   ```json
+   {
+       "loss": "MultipleNegativesRankingLoss",
+       "matryoshka_dims": [
+           384,
+           256,
+           128,
+           64,
+           32
+       ],
+       "matryoshka_weights": [
+           1,
+           1,
+           1,
+           1,
+           1
+       ],
+       "n_dims_per_step": -1
+   }
+   ```
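
A minimal sketch of how a loss with these parameters is constructed in Sentence Transformers; the base-model load is shown only for context, and the variable names are illustrative:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# In-batch-negatives ranking loss, applied at every Matryoshka dimension
# with equal weight, matching the parameters above.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model=model,
    loss=inner_loss,
    matryoshka_dims=[384, 256, 128, 64, 32],
    matryoshka_weights=[1, 1, 1, 1, 1],
)
```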
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `gradient_accumulation_steps`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 4
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `tf32`: True
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
+ - `batch_sampler`: no_duplicates
+
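A minimal sketch of these settings as `SentenceTransformerTrainingArguments`; the output directory is a placeholder, and `save_strategy="epoch"` is an assumption added because `load_best_model_at_end=True` requires the save and eval strategies to match:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="bge-small-en-v1.5-esg",  # placeholder
    eval_strategy="epoch",
    save_strategy="epoch",  # assumption: not listed above, needed for load_best_model_at_end
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=True,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```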
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 4
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: True
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch      | Step   | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_32_cosine_map@100 | dim_384_cosine_map@100 | dim_64_cosine_map@100 |
+ |:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|:---------------------:|
+ | 0.4313     | 10     | 4.3426        | -                      | -                      | -                     | -                      | -                     |
+ | 0.8625     | 20     | 2.7083        | -                      | -                      | -                     | -                      | -                     |
+ | 1.0350     | 24     | -             | 0.0229                 | 0.0233                 | 0.0195                | 0.0234                 | 0.0220                |
+ | 1.2264     | 30     | 2.6835        | -                      | -                      | -                     | -                      | -                     |
+ | 1.6577     | 40     | 2.1702        | -                      | -                      | -                     | -                      | -                     |
+ | 1.9164     | 46     | -             | 0.0230                 | 0.0234                 | 0.0197                | 0.0235                 | 0.0221                |
+ | 0.4313     | 10     | 2.2406        | -                      | -                      | -                     | -                      | -                     |
+ | 0.8625     | 20     | 1.8606        | -                      | -                      | -                     | -                      | -                     |
+ | 1.0350     | 24     | -             | 0.0233                 | 0.0236                 | 0.0204                | 0.0237                 | 0.0225                |
+ | 1.2264     | 30     | 2.0645        | -                      | -                      | -                     | -                      | -                     |
+ | 1.6577     | 40     | 1.6752        | -                      | -                      | -                     | -                      | -                     |
+ | 2.0458     | 49     | -             | 0.0235                 | 0.0237                 | 0.0208                | 0.0238                 | 0.0228                |
+ | 2.0216     | 50     | 1.7855        | -                      | -                      | -                     | -                      | -                     |
+ | 2.4528     | 60     | 1.7333        | -                      | -                      | -                     | -                      | -                     |
+ | 2.8841     | 70     | 1.5116        | -                      | -                      | -                     | -                      | -                     |
+ | 3.0566     | 74     | -             | 0.0235                 | 0.0238                 | 0.0210                | 0.0239                 | 0.0229                |
+ | 3.2480     | 80     | 1.7812        | -                      | -                      | -                     | -                      | -                     |
+ | 3.6792     | 90     | 1.4886        | -                      | -                      | -                     | -                      | -                     |
+ | **3.7655** | **92** | **-**         | **0.0235**             | **0.0238**             | **0.021**             | **0.0239**             | **0.0229**            |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.42.4
+ - PyTorch: 2.4.0+cu121
+ - Accelerate: 0.32.1
+ - Datasets: 2.21.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "BAAI/bge-small-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.42.4",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
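
The config above is a standard BERT encoder configuration, so the checkpoint can also be used directly through 🤗 Transformers. A minimal sketch that reproduces the card's pipeline (CLS-token pooling followed by L2 normalisation); the query string is just an example:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("elsayovita/bge-small-en-v1.5-esg-v2")
model = AutoModel.from_pretrained("elsayovita/bge-small-en-v1.5-esg-v2")

inputs = tokenizer(
    ["What recognition did NetLink receive in FY22?"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

# CLS pooling (pooling_mode_cls_token: true) followed by the Normalize module.
embeddings = F.normalize(outputs.last_hidden_state[:, 0], p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 384])
```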
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.42.4",
+     "pytorch": "2.4.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f6ea25a3d99b6b322fbc77ab626355ba6a50682af2fc3506185174184dd4cd1e
+ size 133462128
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
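
modules.json wires the three modules together in order: the Transformer backbone, then the CLS Pooling configured in `1_Pooling`, then Normalize. A minimal sketch of the equivalent manual assembly; normally `SentenceTransformer` reads this file for you, so building the pipeline by hand is shown only for illustration:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Normalize, Pooling, Transformer

# The same three-stage pipeline that modules.json describes.
transformer = Transformer("elsayovita/bge-small-en-v1.5-esg-v2", max_seq_length=512)
pooling = Pooling(transformer.get_word_embedding_dimension(), pooling_mode="cls")
model = SentenceTransformer(modules=[transformer, pooling, Normalize()])
```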
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff