Readme updates
README.md
CHANGED
@@ -18,7 +18,8 @@ tags:
 ---
 
 ## Model Summary
-[DAC auto-encoder models](https://github.com/descriptinc/descript-audio-codec) provide compact discrete tokenization of speech and audio signals that facilitate signal generation by cascaded generative AI models (e.g. multi-modal generative AI models) and high-quality reconstruction of the original signals. [The current models](https://www.isca-archive.org/interspeech_2024/shechtman24_interspeech.pdf) improve upon the [original DAC models](https://github.com/descriptinc/descript-audio-codec) by allowing a more compact representation for wide-band speech
+[DAC auto-encoder models](https://github.com/descriptinc/descript-audio-codec) provide compact discrete tokenization of speech and audio signals that facilitate signal generation by cascaded generative AI models (e.g. multi-modal generative AI models) and high-quality reconstruction of the original signals. [The current finetuned models](https://www.isca-archive.org/interspeech_2024/shechtman24_interspeech.pdf) improve upon the [original DAC models](https://github.com/descriptinc/descript-audio-codec) by allowing a more compact representation for wide-band speech signals with high-quality signal reconstruction. The models achieve speech reconstruction, which is [nearly indistinguishable from PCM](https://ibm.biz/IS24SpeechRVQ) with a rate of 150-300 tokens per second
+(1500-3000 bps). [The evaluation](https://www.isca-archive.org/interspeech_2024/shechtman24_interspeech.pdf) used comprehensive English speech data encompassing different recording conditions, including studio settings.
 
 | Model | Speech Sample Rate | codebooks | Bit Rate | Token Rate| version|
 | :---: | :---: | :---: | :---: | :---: | :---: |
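
The token-rate and bit-rate figures in the added README text (150-300 tokens per second, 1500-3000 bps) are consistent with 10 bits per token, since 1500 / 150 = 10. That would correspond to a 1024-entry codebook, but the codebook size is an inference from these numbers and is not stated in this diff. A minimal sketch of the arithmetic, using a hypothetical helper name `bit_rate_bps`:

```python
import math


def bit_rate_bps(token_rate_hz: float, codebook_size: int = 1024) -> float:
    """Bit rate of a discrete token stream: tokens per second times bits per token.

    codebook_size=1024 is an assumption inferred from the README figures
    (1500 bps / 150 tokens per second = 10 bits per token), not a value
    stated in the diff.
    """
    bits_per_token = math.log2(codebook_size)
    return token_rate_hz * bits_per_token


# The two ends of the token-rate range quoted in the README text.
for rate in (150, 300):
    print(f"{rate} tokens/s -> {bit_rate_bps(rate):.0f} bps")
```

Under that assumption, the two ends of the quoted token-rate range map onto the two ends of the quoted bit-rate range.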