Open-sourced code for the paper GENERATIVE DE-QUANTIZATION FOR NEURAL SPEECH CODEC VIA LATENT DIFFUSION (accepted at ICASSP 2024).
Cite as: Yang, Haici, Inseon Jang, and Minje Kim. "Generative De-Quantization for Neural Speech Codec Via Latent Diffusion." ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024.
```
pip install -r requirements.txt
```
LibriSpeech
- EnCodec - https://github.com/facebookresearch/encodec
- Descript Audio Codec (DAC) - https://github.com/descriptinc/descript-audio-codec
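If you want to experiment with these baseline codecs directly, both are published on PyPI. Installing them this way is optional and assumes your Python environment meets their own requirements:

```
# Optional: install the baseline codecs from PyPI
# (package names as published by their respective repositories).
pip install encodec
pip install descript-audio-codec
```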
| Argument | Description |
|---|---|
| run_diff | Run the diffusion model |
| diff_dims | Dimensionality of the input features to the diffusion model |
| cond_quantization | Whether the conditioning features should be quantized. Turn it on when training the diffusion model on codecs. |
| cond_bandwidth | The target bitrate of the codec model |
| scaling_feature | Apply scaling to each feature map individually |
| scaling_global | Apply scaling globally |
| ratios | The downsampling ratios of the encoder (and decoder) |
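For illustration, a training invocation combining the arguments above might look like the sketch below. The `srcs.main` entry point and this particular flag combination are assumptions inferred from the synthesis command further down, not a verified training recipe; run `python -m srcs.main --help` for the authoritative list of options.

```
# Hypothetical training command (sketch): train the diffusion model on top of a
# quantized codec at a chosen bitrate. Flag names come from the table above;
# placeholder values in [BRACKETS] must be filled in.
python -m srcs.main --run_diff --diff_dims 256 \
    --cond_quantization --cond_bandwidth [BANDWIDTH] \
    --scaling_feature --ratios [RATIOS] \
    --input_dir [INPUT_DIR] --output_dir [OUTPUT_DIR]
```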
We provide a pretrained LaDiffCodec checkpoint with scalable bitrates at link. The bitrate can be chosen from 1.5 kbps, 3 kbps, 6 kbps, 9 kbps, or 12 kbps.
To use the pretrained models:
```
python -m srcs.main --synthesis \
    --load_model [path]/0907_diffusor.amlt \
    --continuous_AE [path]/continuous_AE.amlt \
    --discrete_AE [path]/discrete_AE.amlt \
    --cond_bandwidth [BANDWIDTH] --diff_dims 256 \
    --input_dir [INPUT_DIR] --output_dir [OUTPUT_DIR] \
    --orig_sampling
```
You can also remove --orig_sampling to use midway infilling for much faster sampling, with a slight compromise in quality.
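For instance, the same synthesis command with midway infilling is obtained simply by dropping that flag (no other changes):

```
# Faster sampling via midway infilling: identical to the command above,
# but without --orig_sampling.
python -m srcs.main --synthesis \
    --load_model [path]/0907_diffusor.amlt \
    --continuous_AE [path]/continuous_AE.amlt \
    --discrete_AE [path]/discrete_AE.amlt \
    --cond_bandwidth [BANDWIDTH] --diff_dims 256 \
    --input_dir [INPUT_DIR] --output_dir [OUTPUT_DIR]
```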