melissasanabria committed
Commit f6ed259 (1 parent: 6efc18c)

Update README.md

Files changed (1): README.md (+1 -2)
README.md CHANGED
@@ -9,7 +9,6 @@ This is the official pre-trained model introduced in [DNA language model GROVER
 
 
  from transformers import AutoTokenizer, AutoModelForMaskedLM
- import torch
 
  # Import the tokenizer and the model
  tokenizer = AutoTokenizer.from_pretrained("PoetschLab/GROVER")
@@ -17,7 +16,7 @@ This is the official pre-trained model introduced in [DNA language model GROVER
 
 
  Some preliminary analysis shows that sequence re-tokenization using Byte Pair Encoding (BPE) changes significantly if the sequence is less than 50 nucleotides long. Longer than 50 nucleotides, you should still be careful with sequence edges.
- We advice to add 100 nucleotides at the beginning and end of every sequence in order to garantee that your sequence is represented with the same tokens as the original tokenization.
+ We advice to add 100 nucleotides at the beginning and end of every sequence in order to guarantee that your sequence is represented with the same tokens as the original tokenization.
  We also provide the tokenized chromosomes with their respective nucleotide mappers (They are available in the folder tokenized chromosomes).
 
  ### BibTeX entry and citation info
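For readers applying the updated recommendation, here is a minimal sketch (not part of the commit itself) of loading GROVER and tokenizing a region with 100-nucleotide flanks. The model-loading line and the example sequences are assumptions for illustration: in practice the flanks should be the nucleotides that actually surround your region in the genome, and the tokenizer is assumed to accept a raw nucleotide string.

```python
# Minimal sketch: load GROVER and tokenize a sequence with 100-nt flanks,
# following the README's recommendation for stable BPE tokenization.
# The flank/region strings below are placeholders, not real genomic data.
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("PoetschLab/GROVER")
model = AutoModelForMaskedLM.from_pretrained("PoetschLab/GROVER")  # assumed to mirror the tokenizer call

# Placeholder flanks; use the 100 nucleotides that actually precede and
# follow your region so edge tokens match the original tokenization.
left_flank = "A" * 100
right_flank = "T" * 100
region = "ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT"

sequence = left_flank + region + right_flank
inputs = tokenizer(sequence, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, num_tokens, vocab_size)
```

After inference, you would typically discard the tokens that fall entirely within the flanks and keep only those covering your region of interest.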