
SimCLR performance #13

Open
jlindsey15 opened this issue Jun 17, 2020 · 5 comments

@jlindsey15

Does the performance of the SimCLR implementation in this repo match the published results?

@HobbitLong (Owner)

@jlindsey15, in terms of CIFAR-10, yes! You can check Fig. B.4 in the SimCLR paper.

@jlindsey15 (Author)

Ah, I see, so there is no ImageNet support yet. Would you expect plugging the ImageNet dataset into the current code to work? Thanks for the help!

@HobbitLong (Owner)

@jlindsey15, logically it should work, but it may just take too much time to run on ImageNet.

One note: I am currently not using DistributedDataParallel, which might give a further speedup on ImageNet. If your dataset is not very big, it might be fine to just use DataParallel here.
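For anyone following along, a minimal sketch of the single-process multi-GPU setup described above, assuming the repo's SupConResNet encoder (any nn.Module can be wrapped the same way; adjust the import if the path differs):

```python
import torch
import torch.nn as nn

from networks.resnet_big import SupConResNet  # assumed import from this repo

# Build the encoder; SupConResNet here is just an example nn.Module.
model = SupConResNet(name='resnet50')

if torch.cuda.is_available():
    model = model.cuda()
    # DataParallel splits each batch across all visible GPUs inside a single
    # process, so the contrastive loss still sees the full batch and no
    # torch.distributed setup or gradient gathering is needed.
    model = nn.DataParallel(model)
```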

@vtddggg commented Sep 22, 2020


@HobbitLong Can I ask for the hyper-parameter settings for ImageNet training?
In your paper, I found that your best model uses epochs=700, batch_size=8192, the LARS optimizer with cosine LR decay, temperature=0.07, and AutoAugment.
However, there are some other parameters that I cannot find in the paper, such as base_learning_rate, warm_up_init_learning_rate, and warm_up_epochs. I would appreciate it if you could provide more information about them.
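For context, the kind of schedule these parameters describe (a linear warm-up into cosine decay) can be sketched as below; the default values are illustrative placeholders, not the settings used for the paper's ImageNet runs:

```python
import math

def learning_rate(epoch, base_lr=0.3, warmup_init_lr=0.0,
                  warmup_epochs=10, total_epochs=700):
    """Linear warm-up followed by cosine decay; all defaults are placeholders."""
    if epoch < warmup_epochs:
        # Ramp linearly from warmup_init_lr up to base_lr over warmup_epochs.
        return warmup_init_lr + (base_lr - warmup_init_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```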

@haohang96 commented Jan 27, 2021


@HobbitLong I think this code cannot be applied with DistributedDataParallel directly, because in DDP mode PyTorch computes the loss on each GPU separately. For example, on an 8-GPU machine with batch_size=1024, each GPU is assigned only 128 samples. In this case, SimCLR (or SupContrast) will search for positive pairs among just those 128 candidates, which will hurt downstream performance.

If DDP mode is used, I think a gather_layer op needs to be implemented, as in SimCLR.
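For reference, the gather op being suggested, an all_gather that still lets gradients flow back to each rank's local features (the pattern used in SimCLR-style DDP implementations), could look roughly like this; it is an illustrative sketch, not code from this repo:

```python
import torch
import torch.distributed as dist

class GatherLayer(torch.autograd.Function):
    """All-gather tensors from every DDP process while keeping gradients.

    dist.all_gather alone is not differentiable; this Function gathers the
    features in forward and routes the summed gradient slice for this rank
    back in backward.
    """

    @staticmethod
    def forward(ctx, x):
        gathered = [torch.zeros_like(x) for _ in range(dist.get_world_size())]
        dist.all_gather(gathered, x)
        return tuple(gathered)

    @staticmethod
    def backward(ctx, *grads):
        all_grads = torch.stack(grads)
        dist.all_reduce(all_grads)  # sum gradient contributions from all ranks
        return all_grads[dist.get_rank()]

# Usage sketch inside the training step (hypothetical variable names):
# local_features: [local_batch, feat_dim] computed on this rank
# features = torch.cat(GatherLayer.apply(local_features), dim=0)
# loss = criterion(features, ...)  # the loss now sees the global batch
```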
