Discrepancy between the paper and the model

by erceguder - opened Feb 9, 2024

Feb 9, 2024

Hey,

Thank you for the great work. While playing around with the code, I noticed that some parts of the method are not implemented as described in the paper. For example, the paper says the vocoder operates on 128 mel bins, whereas the provided vocoder clearly works on 64 mel bins. I could not find any version of the model on your HF profile that matches the paper. Is such a model going to be released soon?
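For anyone who wants to reproduce the check, here is a minimal sketch of how the mel-bin count in a vocoder config could be compared against the paper's value. The key names in `CANDIDATE_KEYS` and the inline example config are assumptions for illustration only; the actual key depends on the vocoder implementation, so adjust it to whatever the released config uses.

```python
import json

# Hypothetical key names: the real key depends on the vocoder implementation.
CANDIDATE_KEYS = ("num_mels", "n_mels", "n_mel_channels", "model_in_dim")


def find_mel_bins(config: dict):
    """Return (key, value) for the first candidate key found in the config, else None."""
    for key in CANDIDATE_KEYS:
        if key in config:
            return key, config[key]
    return None


def matches_paper(config: dict, expected_bins: int = 128) -> bool:
    """True only if a mel-bin key is present and equals the value stated in the paper."""
    found = find_mel_bins(config)
    return found is not None and found[1] == expected_bins


# Inline stand-in for the vocoder's config file (made-up content, not the real config).
example_config = json.loads('{"num_mels": 64}')

if __name__ == "__main__":
    print(find_mel_bins(example_config))
    print(matches_paper(example_config))
```

To run this against a downloaded config, replace `example_config` with `json.load(open(path))`, where `path` points to the vocoder's config file.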

Alicization

Nov 4, 2024

I have the same question: the vocoder config is set to 64 mel bins.
By the way, the quality of the waveforms I generated using the official prompt and code is somewhat lower than that of the samples on the official website.
How can I get the same quality as the samples released on the website?
