Some questions about the details of AST. #81

Open
TungyuYoung opened this issue Sep 17, 2022 · 1 comment
Labels
question: Further information is requested

Comments

@TungyuYoung

How can one explain why ImageNet-pretrained models are able to classify audio from spectrograms? Most of the images in ImageNet are everyday photos, such as cats, dogs, and cars. Are the features of these images and objects correlated with those of audio spectrograms? Why can knowledge learned from natural images be transferred to spectrogram classification?

I would appreciate it if you could answer my questions.

@YuanGongND added the "question" label on Oct 9, 2022
@YuanGongND
Owner

Hi there,

This is an interesting question, but I don't have a clear answer. It is worth noting that using ImageNet (IN) pretraining for audio tasks is not new to AST; it can be traced back to 2014.

-Yuan
