mo137 committed
Commit a32abda
1 Parent(s): 4b18183

Update README.md

Files changed (1)
  1. README.md +24 -2
README.md CHANGED
@@ -26,12 +26,34 @@ You're probably better off using Q8_0, but I thought I'll share these – maybe
  Higher bits per weight (bpw) numbers result in slower computation:
  ```
  20 s Q8_0
- 23 s 11.0bpw-txt16
+ 23 s 11.024bpw-txt16.gguf
  30 s fp16
- 37 s 16.4bpw-txt32
+ 37 s 16.422bpw-txt32.gguf
  310 s fp32
  ```
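A bpw figure like 11.024 is, roughly, the file's size in bits divided by the number of model weights, so these mixed files land between Q8_0 (8.5 bpw in GGUF) and fp16/fp32. Below is a minimal sketch of that arithmetic, assuming FLUX.1-dev's roughly 12-billion-weight transformer and using a placeholder file path; it is not the script used to produce these numbers.

```python
import os

def estimate_bpw(path: str, n_weights: float) -> float:
    """Rough average bits per weight: file size in bits over weight count.
    GGUF metadata is included in the file size, so this slightly overestimates."""
    return os.path.getsize(path) * 8 / n_weights

# Assumption: FLUX.1-dev's transformer has ~12e9 weights, so an ~11 bpw file
# should come out to roughly 12e9 * 11 / 8 bytes ≈ 16.5 GB on disk.
print(estimate_bpw("flux1-dev-Q8_0-fp32-11.763bpw.gguf", 12e9))
```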
+ ### Update 2024-08-26
+ Three new files. This time the only tensors in Q8_0 are some or all of:
+ ```
+ double_blocks.*.img_mlp.0.weight
+ double_blocks.*.img_mlp.2.weight
+ double_blocks.*.txt_mlp.0.weight
+ double_blocks.*.txt_mlp.2.weight
+
+ double_blocks.*.img_mod.lin.weight
+ double_blocks.*.txt_mod.lin.weight
+ single_blocks.*.linear1.weight
+ single_blocks.*.linear2.weight
+ single_blocks.*.modulation.lin.weight
+ ```
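The entries above are glob-style patterns over the tensor names in the FLUX state dict, with `*` standing in for the block index. Purely as an illustration (this is not the conversion script used for these files), such a selection could be written with Python's `fnmatch`; the helper name is made up:

```python
from fnmatch import fnmatch

# Patterns copied from the list above.
Q8_PATTERNS = [
    "double_blocks.*.img_mlp.0.weight",
    "double_blocks.*.img_mlp.2.weight",
    "double_blocks.*.txt_mlp.0.weight",
    "double_blocks.*.txt_mlp.2.weight",
    "double_blocks.*.img_mod.lin.weight",
    "double_blocks.*.txt_mod.lin.weight",
    "single_blocks.*.linear1.weight",
    "single_blocks.*.linear2.weight",
    "single_blocks.*.modulation.lin.weight",
]

def is_q8_candidate(name: str) -> bool:
    """True if a tensor name matches one of the Q8_0 candidate patterns."""
    return any(fnmatch(name, pattern) for pattern in Q8_PATTERNS)

# is_q8_candidate("double_blocks.3.img_mlp.0.weight")    -> True
# is_q8_candidate("double_blocks.3.img_attn.qkv.weight") -> False
```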
+
+ - `flux1-dev-Q8_0-fp32-11.763bpw.gguf`
+   This version has all the above layers in Q8_0.
+ - `flux1-dev-Q8_0-fp32-13.962bpw.gguf`
+   This version keeps the first **2** layers of every kind, and the first **4** MLP layers, in fp32.
+ - `flux1-dev-Q8_0-fp32-16.161bpw.gguf`
+   This one keeps the first **4** layers of every kind and the first **8** MLP layers in fp32.
+
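Taking the "first N layers" wording above literally, one way to implement it is to parse the block index out of each matching tensor name and compare it against two thresholds, one for MLP weights and one for everything else on the list. The helper below is only a sketch of that reading; the function name, the defaults, and the interpretation of "layers" as numbered double/single blocks are assumptions, not code from this repo:

```python
import re

def keep_fp32(name: str, keep_blocks: int = 2, keep_mlp_blocks: int = 4) -> bool:
    """For a tensor that matches the Q8_0 candidate list, decide whether it
    stays in fp32: yes if it sits in one of the first `keep_blocks` blocks,
    or in one of the first `keep_mlp_blocks` blocks when it is an MLP weight."""
    match = re.match(r"(?:double|single)_blocks\.(\d+)\.", name)
    if match is None:
        return True  # anything outside the numbered blocks is left unquantized
    block = int(match.group(1))
    limit = keep_mlp_blocks if "_mlp." in name else keep_blocks
    return block < limit

# With the 13.962bpw settings (2 blocks, 4 MLP blocks):
# keep_fp32("double_blocks.3.img_mlp.0.weight")   -> True  (MLP weight, block 3 < 4)
# keep_fp32("double_blocks.3.img_mod.lin.weight") -> False (block 3 >= 2)
# keep_fp32("single_blocks.1.linear1.weight")     -> True  (block 1 < 2)
```

Raising the thresholds keeps more tensors in fp32, which is what moves the average from 11.763 bpw up to 13.962 and 16.161 bpw.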
  In the txt16/32 files, I quantized only these layers to Q8_0, unless they were one-dimensional:
  ```
  img_mlp.0