About code "100-106" from dataloader.py #88
Comments
Hi there, thanks for reaching out. I think waveform mean subtraction is not related to mixup. Subtracting the mean of the waveform is a quite commonly used method to remove the DC offset. I do it before mixup just to be safe. I haven't conducted an experiment on the impact of waveform mean subtraction, but I guess the impact is minor as we do another normalization on the spectrogram afterwards. My guess is, if your training and test use a consistent -Yuan |
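To illustrate the DC offset point above, here is a minimal sketch in plain Python. It is not the repository's code: plain lists stand in for the torch tensors used in dataloader.py, and the helper names `remove_dc` and `mixup`, the toy signals, and the mixing weight `lam=0.7` are all assumptions made for illustration.

```python
import math

def remove_dc(waveform):
    """Subtract the mean so the signal is centered on zero (DC offset removal)."""
    mean = sum(waveform) / len(waveform)
    return [x - mean for x in waveform]

def mixup(wave_a, wave_b, lam):
    """Linear mix of two equal-length waveforms with weight lam (hypothetical helper)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(wave_a, wave_b)]

# Two toy clips of 100 samples each, with different constant (DC) offsets.
n = 100
wave_a = [0.5 + math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
wave_b = [-0.2 + math.sin(2 * math.pi * 9 * t / n) for t in range(n)]

# Mixing the raw clips carries a weighted combination of both offsets along.
raw_mix = mixup(wave_a, wave_b, lam=0.7)
raw_mean = sum(raw_mix) / len(raw_mix)

# Removing the mean from each clip first gives a mixture centered on zero.
clean_mix = mixup(remove_dc(wave_a), remove_dc(wave_b), lam=0.7)
clean_mean = sum(clean_mix) / len(clean_mix)
```

In this toy case the raw mixture keeps an offset that depends on the mixing weight, while the mean-subtracted mixture does not, which is one way to read "just to be safe".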
Thank you so much gentlemen. |
Dear Mr. Gong |
I think I answered this in a previous issue (see here). You are exactly correct on that Again, I want to emphasize that, though it doesn't matter which to use, it is important to keep it consistent in training and inference. Specifically, if you want to use our pretrained model, please stick to our dataloader without any change. Finally, I recommend first running our original code to see if you can reproduce our claimed results; if yes, then you can play with the model with various settings. -Yuan |
Dear Yuan, thank you for the great code repository and for maintaining it! I have a small follow-up question regarding the waveform normalization ( The dataloader is used both for generating the normalization stats and for training, and it normalizes the waveform before the transition into the frequency domain. The predict code, however, loads the audio itself and has no waveform normalization.
|
Hi Harald, this is not intentional, That said, this is a minor thing that just removes the DC constant, if you check The thing that really makes a big difference is the spectrogram normalization at Line 202 in 9e3bd99
Without DC removal, the code probably still runs well; without the fbank norm, the inference is almost sure to fail. Finally, I recommend using https://colab.research.google.com/github/YuanGongND/ast/blob/master/colab/AST_Inference_Demo.ipynb for inference instead of -Yuan |
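As a rough sketch of why the spectrogram normalization has to match between training and inference: the stats are computed once on the training features and then reused unchanged at prediction time. This is a generic dataset-level normalization, not the exact formula at the referenced line; the function names `dataset_stats` and `normalize_fbank` and the toy filterbank values are hypothetical.

```python
import math

def dataset_stats(fbanks):
    """Mean and standard deviation over every value in a list of spectrograms."""
    values = [v for fbank in fbanks for frame in fbank for v in frame]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

def normalize_fbank(fbank, norm_mean, norm_std):
    """Apply fixed dataset-level stats to one spectrogram (frames x bins)."""
    return [[(v - norm_mean) / norm_std for v in frame] for frame in fbank]

# Toy "training set": two spectrograms, each 2 frames x 2 bins (made-up values).
train_fbanks = [
    [[-6.0, -4.0], [-2.0, -8.0]],
    [[-5.0, -3.0], [-7.0, -5.0]],
]
norm_mean, norm_std = dataset_stats(train_fbanks)

# At inference, reuse the SAME stats instead of skipping or recomputing them.
test_fbank = [[-5.0, -1.5]]
test_normalized = normalize_fbank(test_fbank, norm_mean, norm_std)
```

If the inference path skipped `normalize_fbank`, the model would see inputs on a very different scale from the ones it was trained on, which is consistent with the warning above that inference is almost sure to fail without the fbank norm.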
Dear Mr. Gong,
Thanks a lot for your pioneering work in the field of audio processing, and for your warmhearted comments every time.
I have a question about using the MixUp method in AST. I saw this code at line 102 of dataloader.py:
waveform = waveform - waveform.mean()
My question is why the mean of the waveform needs to be subtracted from the waveform. Does this subtraction come from the original MixUp, or is there another reason behind it?