How to configure the dataset or modify the code if I want to do the one class binary classification #91

Open
nanyyyyyy opened this issue Jan 15, 2023 · 14 comments
Labels
question Further information is requested

Comments

@nanyyyyyy

thanks

@YuanGongND
Owner

Please follow the ESC-50 recipe (50-class classification with the AudioSet-pretrained model) and just change

--label-csv ./data/esc_class_labels_indices.csv --n_class 50 \

so that it uses --n_class 2 for binary classification. I recommend first running the ESC-50 recipe without any modification; it is one-click and automatically generates the JSON datafiles. Once you can run it successfully (which means there are no environment issues or other problems), you can start modifying from there.

-Yuan

@YuanGongND YuanGongND added the question Further information is requested label Jan 15, 2023
@nanyyyyyy
Author

nanyyyyyy commented Jan 15, 2023

I am constructing the dataset JSON file with this dictionary: {'/m/spcmd00' : 0 , '/m/spcmd01':1}, and setting n_class to 2. Here 0 and 1 are the two binary states of the single class. Does this make sense? Thank you so much! Great work, by the way.
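
A minimal sketch of building such a datafile from the dictionary above. It assumes the JSON layout that the ESC-50 recipe generates is a top-level "data" list whose entries have "wav" and "labels" keys; that layout is an assumption not stated in this thread, and the helper names and file paths are hypothetical.

```python
import json

# Mapping quoted in the comment above: label string -> class index.
LABEL_TO_INDEX = {"/m/spcmd00": 0, "/m/spcmd01": 1}
INDEX_TO_LABEL = {index: label for label, index in LABEL_TO_INDEX.items()}


def build_datafile(samples):
    """Turn (wav_path, class_index) pairs into the assumed datafile layout."""
    entries = [
        {"wav": wav_path, "labels": INDEX_TO_LABEL[class_index]}
        for wav_path, class_index in samples
    ]
    return {"data": entries}


def to_json(datafile):
    """Serialize the datafile dict so it can be written to a JSON file."""
    return json.dumps(datafile, indent=1)


# Hypothetical file paths, for illustration only.
example = build_datafile([("clips/a.wav", 0), ("clips/b.wav", 1)])
print(to_json(example))
```

Comparing the printed output against a datafile produced by the unmodified ESC-50 recipe is a quick way to confirm the layout before training.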

@YuanGongND
Owner

This makes sense. But you also need to take care of the hyper-parameters. In particular, audio_length should be the maximum length, in frames, of the audio clips in your dataset (e.g., 100 for 1 s audio), and timem should be about 20% of your average audio length (e.g., 25 for 1 s audio). You also need to tune the learning rate, etc.
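
The rule of thumb above can be sketched as a small helper. It assumes 100 frames per second, as implied by "100 for 1s audio"; the function names are hypothetical. The 20% figure is approximate (the comment cites 25 for 1 s audio, while a strict 20% gives 20), so the result is only a starting point for tuning.

```python
import math

# Implied by the thread: 100 frames for 1 s of audio.
FRAMES_PER_SECOND = 100


def audio_length_frames(max_duration_s):
    """Frames needed to cover the longest clip in the dataset."""
    # Round first so float noise (e.g. 1.1 * 100) does not add a frame.
    return math.ceil(round(max_duration_s * FRAMES_PER_SECOND, 6))


def suggested_timem(avg_duration_s, fraction=0.2):
    """Time-mask length as a fraction of the average clip length in frames."""
    return round(avg_duration_s * FRAMES_PER_SECOND * fraction)


# Durations here are illustrative, not measured from any dataset.
print(audio_length_frames(10.0), suggested_timem(10.0))
```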

@1244547821

I want to test ESC-50 on Windows. Is that possible?

@YuanGongND
Owner

YuanGongND commented Feb 20, 2023

It might be possible if you have a torch environment set up on Windows, though many things would need to be changed in https://github.com/YuanGongND/ast/blob/master/egs/esc50/prep_esc50.py and https://github.com/YuanGongND/ast/blob/master/egs/esc50/run_esc.sh, and maybe elsewhere. An easier way might be to use the Google Colab environment; I think it is fine for ESC-50 because the dataset is small.

-Yuan

@1244547821

ok, thanks for your answer.

@1244547821

@nanyyyyyy, have you completed the binary classification? I have some questions about modifying parameters. How did you set --freqm, --timem, --tstride, --fstride, and --audio_length?

@YuanGongND
Owner

YuanGongND commented Feb 22, 2023

I usually suggest first reproducing the ESC-50 recipe and then modifying the hyper-parameters; this helps you rule out other factors that could impact performance.

The hyper-parameters you listed are not related to the number of classes:

audio_length should be your input audio length in frames, e.g., 1000 for 10-second audio. timem is the maximum mask length for augmentation in the time domain and should be around 20% of audio_length, e.g., 200. freqm is the maximum mask length in the frequency domain; you can keep it the same as in the ESC-50 recipe. tstride and fstride are the patch split strides; you should keep them the same as in the ESC-50 recipe.

You do need to modify --label-csv ./data/esc_class_labels_indices.csv --n_class 50 for a binary classification problem. label-csv should point to a CSV that contains only 2 labels, and n_class should be 2.

-Yuan
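
A rough sketch of generating a two-label CSV for --label-csv. The column names (index, mid, display_name) are an assumption about the layout of esc_class_labels_indices.csv, which this thread does not show; the label strings come from the earlier comment, and the display names and helper name are hypothetical.

```python
import csv
import io


def two_class_label_csv(labels):
    """Build label-CSV text from (mid, display_name) pairs, indexed from 0."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Assumed header; check it against the CSV shipped with the recipe.
    writer.writerow(["index", "mid", "display_name"])
    for index, (mid, display_name) in enumerate(labels):
        writer.writerow([index, mid, display_name])
    return buffer.getvalue()


# Display names are placeholders for the two states of the single class.
csv_text = two_class_label_csv(
    [("/m/spcmd00", "negative"), ("/m/spcmd01", "positive")]
)
print(csv_text)
```

Writing csv_text to a file and passing that path to --label-csv, together with --n_class 2, matches the two changes described above.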

@1244547821

Thank you very much, I have modified it.

@1244547821

The values dataset_mean=-6.6268077 and dataset_std=5.358466 in run_esc.sh are different from the ones I calculated for ESC-50. I don't know where I went wrong. Could you please help?

@YuanGongND
Owner

What's your mean and std?

@1244547821

mean=0.000238, std=0.000841. I feel that this is wrong. The help text in run.py says it is the dataset spectrogram mean, so I computed it from the FFT. How should it be calculated?

@YuanGongND
Owner

Is this your own dataset? This is certainly not correct, as the std should not be 0. You can search the issues to find out how to calculate the mean and std.
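
As a rough illustration of the statistic itself (not the repository's own procedure, which this thread does not show), the sketch below computes a single mean and std over every value of every clip's feature matrix. It assumes the features are the same log-scaled spectrogram features the dataloader feeds to the model rather than raw FFT magnitudes; that is an assumption, and the function name is hypothetical.

```python
import math


def running_mean_std(feature_matrices):
    """Mean and std over every value in an iterable of 2-D feature matrices.

    Each matrix is a list of frames; each frame is a list of floats
    (for example, the spectrogram features of one audio clip).
    """
    count = 0
    total = 0.0
    total_sq = 0.0
    for matrix in feature_matrices:
        for frame in matrix:
            for value in frame:
                count += 1
                total += value
                total_sq += value * value
    if count == 0:
        raise ValueError("no feature values were provided")
    mean = total / count
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / count - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Toy values only; real input would be the features of the training set.
mean, std = running_mean_std([[[1.0, 3.0]], [[5.0, 7.0]]])
print(mean, std)
```

Accumulating sums this way avoids holding the whole dataset in memory; for very large datasets a numerically more stable method such as Welford's algorithm may be preferable.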

@1244547821

Hi Yuan, when training on my own two-class dataset, the Avg Precision in every epoch is 0.5 and the Avg Recall is always 1. Could you help me understand why? My results are below.
start validation
acc: 0.934211
AUC: 0.981920
Avg Precision: 0.500000
Avg Recall: 1.000000
d_prime: 2.962967
train_loss: 0.260971
valid_loss: 0.466797
validation finished
Epoch-2 lr: 1e-05
