slavashe committed a22d5ff (1 parent: f11347d)

update README

Files changed (1)
  1. README.md +17 -6
README.md CHANGED
@@ -3,21 +3,32 @@ license: cdla-permissive-2.0
  ---

  ## Model Summary
- [DAC auto-encoder models](https://github.com/descriptinc/descript-audio-codec) provide compact discrete tokenization of speech and audio signals that facilitate signal generation by cascaded generative AI models (e.g. multi-modal generative AI models) and high-quality reconstruction of the original signals. [The current models](https://www.isca-archive.org/interspeech_2024/shechtman24_interspeech.pdf) improve upon the [original DAC models](https://github.com/descriptinc/descript-audio-codec) by allowing a more compact representation for speech-only signals with high-quality signal reconstruction.

  ## Usage
- follow [DAC](https://github.com/descriptinc/descript-audio-codec) installation instructions
- download the model weights from the current repo (e.g., *weights_24khz_1.5kbps_v1.0*)
  ### Compress audio
  ```
- python3 -m dac encode /path/to/input --output /path/to/output/codes --weights_path /path/to/weights_24khz_1.5kbps_v1.0
  ```

  This command will create `.dac` files with the same name as the input files. It will also preserve the directory structure relative to input root and re-create it in the output directory. Please use `python -m dac encode --help` for more options.

  ### Reconstruct audio from compressed codes
  ```
- python3 -m dac decode /path/to/output/codes --output /path/to/reconstructed_input --weights_path /path/to/weights_24khz_1.5kbps_v1.0
  ```

  This command will create `.wav` files with the same name as the input files. It will also preserve the directory structure relative to input root and re-create it in the output directory. Please use `python -m dac decode --help` for more options.
@@ -28,7 +39,7 @@ import dac
  from audiotools import AudioSignal

  # Download a model
- model_path = /path/to/weights_24khz_1.5kbps_v1.0
  model = dac.DAC.load(model_path)

  model.to('cuda')
 
  ---

  ## Model Summary
+ [DAC auto-encoder models](https://github.com/descriptinc/descript-audio-codec) provide compact discrete tokenization of speech and audio signals that facilitate signal generation by cascaded generative AI models (e.g. multi-modal generative AI models) and high-quality reconstruction of the original signals. [The current models](https://www.isca-archive.org/interspeech_2024/shechtman24_interspeech.pdf) improve upon the [original DAC models](https://github.com/descriptinc/descript-audio-codec) by allowing a more compact representation for wide-band speech-only signals with high-quality signal reconstruction.
+
+ | Model | Speech Sample Rate | Codebooks | Bit Rate | Token Rate | Version |
+ | :---: | :---: | :---: | :---: | :---: | :---: |
+ | weights_24khz_3.0kbps_v1.0.pth | 24 kHz | 4 | 3 kbps | 300 Hz | 1.0 |
+ | weights_24khz_1.5kbps_v1.0.pth | 24 kHz | 2 | 1.5 kbps | 150 Hz | 1.0 |
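As a rough sanity check, the token rates listed for the two models can be related to the bit rates in the weight file names (3.0 kbps and 1.5 kbps). The sketch below assumes each token indexes a codebook of 1024 entries (10 bits per token), which is the default in the original DAC but is not stated in this README, and that the token rate counts tokens across all codebooks. The function names are hypothetical and used only for illustration.

```python
import math


def bit_rate_bps(token_rate_hz: float, codebook_size: int = 1024) -> float:
    """Bit rate implied by a token rate, assuming one codebook index per token.

    codebook_size=1024 is an assumption (10 bits per token), not a value
    taken from this README.
    """
    bits_per_token = math.log2(codebook_size)
    return token_rate_hz * bits_per_token


def frame_rate_hz(token_rate_hz: float, n_codebooks: int) -> float:
    """Frames per second, assuming each frame emits one token per codebook."""
    return token_rate_hz / n_codebooks


# (file name, token rate in Hz, number of codebooks) as listed for the models
models = [
    ("weights_24khz_3.0kbps_v1.0.pth", 300, 4),
    ("weights_24khz_1.5kbps_v1.0.pth", 150, 2),
]

for name, token_rate, n_codebooks in models:
    kbps = bit_rate_bps(token_rate) / 1000
    frames = frame_rate_hz(token_rate, n_codebooks)
    print(f"{name}: {kbps:.1f} kbps, {frames:.0f} frames/s")
```

Under these assumptions both models would run at the same frame rate and differ only in the number of codebooks used per frame.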

  ## Usage
+ * follow [DAC](https://github.com/descriptinc/descript-audio-codec) installation instructions
+
+ * clone the current repo
+ ```
+ git clone https://huggingface.co/ibm/DAC.speech.v1.0
+ cd DAC.speech.v1.0
+ ```
+
  ### Compress audio
  ```
+ python3 -m dac encode /path/to/input --output /path/to/output/codes --weights_path weights_24khz_3.0kbps_v1.0.pth
  ```

  This command will create `.dac` files with the same name as the input files. It will also preserve the directory structure relative to input root and re-create it in the output directory. Please use `python -m dac encode --help` for more options.
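The directory mirroring described above can be pictured with a small path-mapping sketch. This is only an illustration of the stated behaviour, not the code that `dac` itself runs; the helper name `mirrored_output_path` and the example paths are hypothetical.

```python
from pathlib import PurePath, PurePosixPath


def mirrored_output_path(
    input_file: PurePath,
    input_root: PurePath,
    output_root: PurePath,
    suffix: str = ".dac",
) -> PurePath:
    """Map an input file to its output location.

    Keeps the file's path relative to the input root, re-creates it under
    the output root, and swaps the extension for the given suffix.
    """
    relative = input_file.relative_to(input_root)
    return (output_root / relative).with_suffix(suffix)


# Hypothetical example paths, written as POSIX paths so the result is the
# same on every platform.
encoded = mirrored_output_path(
    PurePosixPath("/path/to/input/speaker1/utt1.wav"),
    PurePosixPath("/path/to/input"),
    PurePosixPath("/path/to/output/codes"),
)
print(encoded)  # /path/to/output/codes/speaker1/utt1.dac
```

The same mapping with `suffix=".wav"` would describe the decode direction, where `.dac` files are turned back into `.wav` files.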

  ### Reconstruct audio from compressed codes
  ```
+ python3 -m dac decode /path/to/output/codes --output /path/to/reconstructed_input --weights_path weights_24khz_3.0kbps_v1.0.pth
  ```

  This command will create `.wav` files with the same name as the input files. It will also preserve the directory structure relative to input root and re-create it in the output directory. Please use `python -m dac decode --help` for more options.
@@ -28,7 +39,7 @@ import dac
  from audiotools import AudioSignal

  # Download a model
+ model_path = 'weights_24khz_3.0kbps_v1.0.pth'
  model = dac.DAC.load(model_path)

  model.to('cuda')