
transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("flumboyantApple/transformers_issues_topics")

topic_model.get_topic_info()
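
Once loaded, the model can also label new documents. The snippet below is a minimal sketch: the example issue titles are invented for illustration, and topic_model.transform assumes the embedding model used at training time can be resolved from the repository (or is passed explicitly via the embedding_model argument of BERTopic.load).

# Keywords and scores for a single topic, e.g. topic 0 (tokenization issues)
print(topic_model.get_topic(0))

# Assign topics to new documents (hypothetical issue titles)
new_docs = [
    "Tokenizer returns different ids after saving and reloading",
    "CUDA out of memory when fine-tuning on multiple GPUs",
]
topics, probs = topic_model.transform(new_docs)
print(topics)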

Topic overview

  • Number of topics: 30
  • Number of training documents: 9000
An overview of all topics is given in the table below.
Topic ID | Topic Keywords | Topic Frequency | Label
-------- | -------------- | --------------- | -----
-1 | bert - pytorch - tensorflow - pretrained - gpu | 12 | -1_bert_pytorch_tensorflow_pretrained
0 | tokenizer - tokenizers - tokenization - tokenize - encoderdecoder | 2279 | 0_tokenizer_tokenizers_tokenization_tokenize
1 | cuda - pytorch - tensorflow - gpu - gpus | 1830 | 1_cuda_pytorch_tensorflow_gpu
2 | modelcard - modelcards - card - model - cards | 887 | 2_modelcard_modelcards_card_model
3 | seq2seq - s2s - seq2seqtrainer - seq2seqdataset - runseq2seq | 451 | 3_seq2seq_s2s_seq2seqtrainer_seq2seqdataset
4 | trainer - trainertrain - trainers - training - evaluateduringtraining | 445 | 4_trainer_trainertrain_trainers_training
5 | albertbasev2 - albertforpretraining - albert - albertformaskedlm - albertmodel | 435 | 5_albertbasev2_albertforpretraining_albert_albertformaskedlm
6 | gpt2 - gpt2tokenizer - gpt2xl - gpt2tokenizerfast - gpt | 347 | 6_gpt2_gpt2tokenizer_gpt2xl_gpt2tokenizerfast
7 | typos - typo - fix - correction - fixed | 278 | 7_typos_typo_fix_correction
8 | readmemd - readmetxt - readme - file - camembertbasereadmemd | 274 | 8_readmemd_readmetxt_readme_file
9 | t5 - t5model - tf - t5base - t5large | 259 | 9_t5_t5model_tf_t5base
10 | transformerscli - transformers - transformer - importerror - import | 228 | 10_transformerscli_transformers_transformer_importerror
11 | ci - testing - tests - testgeneratefp16 - test | 198 | 11_ci_testing_tests_testgeneratefp16
12 | longformerforquestionanswering - questionansweringpipeline - tfalbertforquestionanswering - distilbertforquestionanswering - questionanswering | 142 | 12_longformerforquestionanswering_questionansweringpipeline_tfalbertforquestionanswering_distilbertforquestionanswering
13 | pipeline - pipelines - ner - fixpipeline - nerpipeline | 140 | 13_pipeline_pipelines_ner_fixpipeline
14 | longformer - longformers - longform - longformerlayer - longformermodel | 136 | 14_longformer_longformers_longform_longformerlayer
15 | benchmark - benchmarks - accuracy - precision - hardcoded | 113 | 15_benchmark_benchmarks_accuracy_precision
16 | onnx - onnxexport - onnxonnxruntime - onnxruntime - 04onnxexport | 77 | 16_onnx_onnxexport_onnxonnxruntime_onnxruntime
17 | generationbeamsearchpy - generatebeamsearch - beamsearch - nonbeamsearch - beam | 76 | 17_generationbeamsearchpy_generatebeamsearch_beamsearch_nonbeamsearch
18 | flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel | 75 | 18_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax
19 | datacollatorforlanguagemodelingfile - datacollatorforlanguagemodeling - datacollatorforlanguagemodelling - datacollatorforpermutationlanguagemodeling - runlanguagemodelingpy | 49 | 19_datacollatorforlanguagemodelingfile_datacollatorforlanguagemodeling_datacollatorforlanguagemodelling_datacollatorforpermutationlanguagemodeling
20 | huggingfacetokenizers297 - huggingfacetransformers - huggingface - huggingfaces - huggingfacecn | 43 | 20_huggingfacetokenizers297_huggingfacetransformers_huggingface_huggingfaces
21 | cachedir - cache - cachedpath - caching - cached | 43 | 21_cachedir_cache_cachedpath_caching
22 | notebook - notebooks - blenderbot3b - community - blenderbot | 35 | 22_notebook_notebooks_blenderbot3b_community
23 | wandbproject - ga - wandbcallback - wandb - fork | 33 | 23_wandbproject_ga_wandbcallback_wandb
24 | closed - adding - add - bort - added | 32 | 24_closed_adding_add_bort
25 | electra - electrapretrainedmodel - electraformaskedlm - electralarge - electraformultiplechoice | 27 | 25_electra_electrapretrainedmodel_electraformaskedlm_electralarge
26 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 23 | 26_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased
27 | pplm - pr - deprecated - variable - ppl | 18 | 27_pplm_pr_deprecated_variable
28 | isort - blackisortflake8 - github - repo - version | 15 | 28_isort_blackisortflake8_github_repo
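
The keywords and counts in this table can also be queried from the loaded model itself; a short sketch using standard BERTopic accessors:

# Frequency of each topic (the "Topic Frequency" column above)
print(topic_model.get_topic_freq())

# All topics at once, as a dict mapping topic id to (keyword, score) pairs
all_topics = topic_model.get_topics()
print(all_topics[-1])  # the outlier topic listed as -1 above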

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 30
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
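
These settings map directly onto BERTopic's constructor arguments. As a rough sketch (not the exact training script used for this model), retraining with the same configuration would look like the following, where docs stands in for your own corpus of issue texts (the card reports roughly 9,000 training documents):

from bertopic import BERTopic

topic_model = BERTopic(
    calculate_probabilities=False,
    language="english",
    low_memory=False,
    min_topic_size=10,
    n_gram_range=(1, 1),
    nr_topics=30,
    seed_topic_list=None,
    top_n_words=10,
    verbose=True,
    zeroshot_min_similarity=0.7,
    zeroshot_topic_list=None,
)

# Fit on your own corpus (placeholder; supply a list of strings):
# topics, probs = topic_model.fit_transform(docs)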

Framework versions

  • Numpy: 1.23.3
  • HDBSCAN: 0.8.38.post1
  • UMAP: 0.5.6
  • Pandas: 1.5.3
  • Scikit-Learn: 1.1.2
  • Sentence-transformers: 3.0.1
  • Transformers: 4.44.1
  • Numba: 0.60.0
  • Plotly: 5.10.0
  • Python: 3.9.18
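
To sanity-check that a local environment roughly matches the versions above before loading the model, each package exposes __version__ in the usual way; a minimal check (only two packages shown):

import transformers
import sentence_transformers

print(transformers.__version__)           # 4.44.1 was used for this model
print(sentence_transformers.__version__)  # 3.0.1 was used for this model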