about Loss_T #6

FlyToYourMooN · 2024-03-11T15:27:37Z

Hello, I'm trying to train a model on the OpenSinger dataset, but Loss_T keeps going up and the resulting speech is almost incomprehensible. Do you have any suggestions? I'm using the default configuration, except that batch_size = 8 and the learning rate is reduced by half. Thank you so much!

yoyolicoris · 2024-03-11T17:35:08Z

Hi @FlyToYourMooN, thanks for asking.

Could you provide some generated audio samples?

The loss_T looks normal, but the loss (the negative ELBO) looks higher than usual; in my experiments it should be around -5.6 at 240k.

You can also check out the repo https://github.com/yoyololicon/duet-svs-diffusion.
We used the 1D UNet from https://github.com/archinetai/audio-diffusion-pytorch as the denoiser (it is stronger than the non-causal WaveNet used in DiffWave) and trained it on 8 singing voice datasets, including OpenSinger. We also made the checkpoint available.

I hope this helps.
