Standardization of Initial Weights (SIW) Plugin implementation #527
base: master
Conversation
Pull Request Test Coverage Report for Build 742476754 (Coveralls).
Thanks @EdenBelouadah! Just a couple of comments, then I'll leave the final check to @AntonioCarta:
Apart from that, I don't have any other comments! Well done 😄
Thanks @EdenBelouadah! I have a couple of comments:
This is fine, we can change it later if needed. Let me know if you are able to reproduce the results.
Thank you very much @AndreaCossu and @AntonioCarta for your remarks. They are indeed helpful! I will integrate them this week and let you know about the results reproduction later.
Hello again,
2- At which point can I specify the
3- Is there any way in Avalanche to use hyper-parameters for the first, non-incremental state that are different from those used in the incremental states?
Please tell me if there are still some issues with the code. Thank you very much.
If you use PyCharm, you can run tests separately (also in debug mode).
I did a quick implementation of a wrapper for the LR scheduler and added a print inside it.
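For reference, a minimal sketch of what such a wrapper might look like (this assumes the `StrategyPlugin` interface and that the strategy exposes its optimizer as `strategy.optimizer`; the class name, callbacks and milestone values below are illustrative and may differ between Avalanche versions):

```python
from torch.optim.lr_scheduler import MultiStepLR
from avalanche.training.plugins import StrategyPlugin


class LRSchedulerPlugin(StrategyPlugin):
    """Steps a PyTorch LR scheduler after every training epoch and
    re-creates it at the start of every experience."""

    def __init__(self, milestones=(30, 60), gamma=0.1):
        super().__init__()
        self.milestones = milestones
        self.gamma = gamma
        self.scheduler = None

    def before_training_exp(self, strategy, **kwargs):
        # Restart the schedule so every experience begins from the base lr.
        self.scheduler = MultiStepLR(
            strategy.optimizer, milestones=self.milestones, gamma=self.gamma
        )

    def after_training_epoch(self, strategy, **kwargs):
        self.scheduler.step()
        # Print the current lr to verify that the schedule is being applied.
        print("current lr:", strategy.optimizer.param_groups[0]["lr"])
```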
What hyperparameters? Can you explain in more detail?
Thank you
You could do something like this to change the hyper-parameters dynamically:
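A sketch of one possible version (the original snippet is not preserved here; the class and parameter names are illustrative, and it assumes the `StrategyPlugin` interface with `strategy.optimizer` and `strategy.experience.current_experience` available):

```python
from avalanche.training.plugins import StrategyPlugin


class DynamicLRPlugin(StrategyPlugin):
    """Illustrative plugin: uses one learning rate for the first
    (non-incremental) experience and another one afterwards."""

    def __init__(self, first_exp_lr=0.1, incremental_lr=0.01):
        super().__init__()
        self.first_exp_lr = first_exp_lr
        self.incremental_lr = incremental_lr

    def before_training_exp(self, strategy, **kwargs):
        # current_experience is the index of the experience being trained.
        is_first = strategy.experience.current_experience == 0
        new_lr = self.first_exp_lr if is_first else self.incremental_lr
        for group in strategy.optimizer.param_groups:
            group["lr"] = new_lr
```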
and add it as a plugin to your strategy. Other customized parameters can be modified at runtime in a similar way.
This is fine. You could leave
Oh no! It seems there are some PEP8 errors! 😕
Hello @AntonioCarta,
2- While checking why I cannot reproduce the results of my paper, I found that the problem is in the vanilla fine-tuning itself. I removed the SIW plugin and also my image lists. Instead, I just use the Naive strategy with the already-prepared SplitCIFAR100 benchmark, and the results during testing are weird. During training, it seems that the model is learning (accuracy on the new task ~80%), but during testing the accuracy on the current task drops to ~20%, while it is nearly 0% for older tasks. Here's the code:
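Roughly, the setup looks like this (the model and hyperparameters shown here are placeholders, not necessarily the exact values used):

```python
import torch
from torch.nn import CrossEntropyLoss
from torch.optim import SGD
from torchvision.models import resnet18

from avalanche.benchmarks.classic import SplitCIFAR100
from avalanche.evaluation.metrics import accuracy_metrics
from avalanche.logging import InteractiveLogger
from avalanche.training.plugins import EvaluationPlugin
from avalanche.training.strategies import Naive

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
benchmark = SplitCIFAR100(n_experiences=10, seed=1234)
model = resnet18(num_classes=100)

eval_plugin = EvaluationPlugin(
    accuracy_metrics(experience=True, stream=True),
    loggers=[InteractiveLogger()],
)

strategy = Naive(
    model,
    SGD(model.parameters(), lr=0.1, momentum=0.9),
    CrossEntropyLoss(),
    train_mb_size=128,
    train_epochs=10,
    eval_mb_size=128,
    device=device,
    evaluator=eval_plugin,
)

# Train on each experience, then evaluate on the whole test stream.
for experience in benchmark.train_stream:
    strategy.train(experience)
    strategy.eval(benchmark.test_stream)
```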
and here's the testing accuracy at each experience: Top1_Acc_Stream/eval_phase/test_stream = 0.1860. Thank you again for your help and patience.
This is suspiciously low. I ran your example and I see that the accuracy on the current experience drops. For example, on Exp. 9 you get
while it should reach a much higher value, forgetting all the previous experiences (as expected for Naive). Notice that the values on the training set are OK (above 80%), so it could also be overfitting? I have never experimented with CIFAR100, so I can't say for sure, but it seems too extreme to me. I will try to reproduce this problem on a smaller dataset.
@EdenBelouadah I did a quick experiment on CIFAR10 (10 epochs, train_mb_size 512). The results seem OK to me:
The model learns the last experience correctly and catastrophically forgets the previous ones. The Naive strategy seems to work as expected. Your error probably has to do with your choice of hyperparameters. Can you check this? Are you using the latest version of Avalanche? Can you pull from master to make sure?
Hello,
2- I also suspected overfitting. Therefore, I changed the hyperparameters and used exactly those from my paper, but now it seems to be underfitting (which is strange to me). The training accuracy on the first task does not exceed 10%. I now think we can be fairly sure that the problem is in the optimization of the Naive strategy. I will search for other hyperparameters and let you know if the problem is solved.
3- Can you please run a unit test on Synaptic Intelligence and tell me if you get the same bug as me? If yes, can you please fix it so that it does not affect the SIW PR? Thanks a lot.
Are you able to obtain good results without using Avalanche's Naive strategy? If you can provide me with a minimal script that gets good results using Avalanche's scenarios but not Avalanche's strategies, I can try to investigate this problem in more detail.
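Something minimal along these lines would be enough (a plain PyTorch loop over Avalanche experiences; the hyperparameters are placeholders, and the dataset item format may vary between Avalanche versions):

```python
import torch
from torch.nn import CrossEntropyLoss
from torch.optim import SGD
from torch.utils.data import DataLoader
from torchvision.models import resnet18

from avalanche.benchmarks.classic import SplitCIFAR100

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
benchmark = SplitCIFAR100(n_experiences=10, seed=1234)
model = resnet18(num_classes=100).to(device)
criterion = CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9)

for experience in benchmark.train_stream:
    loader = DataLoader(experience.dataset, batch_size=128, shuffle=True)
    model.train()
    for epoch in range(10):
        for batch in loader:
            # Avalanche datasets usually yield (input, target, task_id);
            # only the first two entries are needed here.
            x, y = batch[0].to(device), batch[1].to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
```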
I see that in
Hi @EdenBelouadah, how is this progressing? Let us know if you encountered any problems and want some help!
Hello,
I implemented the SIW approach from https://arxiv.org/pdf/2008.13710.pdf as a plugin.
I just need you to check whether my implementation follows Avalanche's conventions.
I also added an example of how to run it on SplitCIFAR100.
SIW is based on three components:
If the implementation seems okay to you, I will run full experiments to check whether I can reproduce the results of the paper before submitting the definitive PR.
Thank you! :)