
SimCLR performance #13

Open
jlindsey15 opened this issue Jun 17, 2020 · 5 comments

@jlindsey15

Does the performance of the SimCLR implementation in this repo match the published results?

@HobbitLong (Owner)

@jlindsey15, in terms of CIFAR-10, yes! You can check Fig. B.4 in the SimCLR paper.

@jlindsey15 (Author)

Ah, I see, so there is no ImageNet support yet. Would you expect plugging the ImageNet dataset into the current code to work? Thanks for the help!

@HobbitLong (Owner)

@jlindsey15, logically it should work, but it may just take too much time to run on ImageNet.

One note: I am currently not using DistributedDataParallel, which might give a further speedup on ImageNet. If your dataset is not very big, it might be fine to just use DataParallel here.
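For anyone following along, a minimal sketch of the single-process multi-GPU setup described above, assuming the repo's SupConResNet encoder (any nn.Module can be wrapped the same way; adjust the import if the path differs):

```python
import torch
import torch.nn as nn

from networks.resnet_big import SupConResNet  # assumed import from this repo

# Build the encoder; SupConResNet here is just an example nn.Module.
model = SupConResNet(name='resnet50')

if torch.cuda.is_available():
    model = model.cuda()
    # DataParallel splits each batch across all visible GPUs inside a single
    # process, so the contrastive loss still sees the full batch and no
    # torch.distributed setup or gradient gathering is needed.
    model = nn.DataParallel(model)
```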

@vtddggg commented Sep 22, 2020


@HobbitLong Can I ask for the hyper-parameter settings for ImageNet training?
In your paper, I found that your best model uses epochs=700, batch_size=8192, the LARS optimizer with cosine LR decay, temperature=0.07, and AutoAugment.
However, there are some other parameters that I cannot find in the paper, such as base_learning_rate, warm_up_init_learning_rate, and warm_up_epochs. I would appreciate it if you could provide more information about them.
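For context, the kind of schedule these parameters describe (a linear warm-up into cosine decay) can be sketched as below; the default values are illustrative placeholders, not the settings used for the paper's ImageNet runs:

```python
import math

def learning_rate(epoch, base_lr=0.3, warmup_init_lr=0.0,
                  warmup_epochs=10, total_epochs=700):
    """Linear warm-up followed by cosine decay; all defaults are placeholders."""
    if epoch < warmup_epochs:
        # Ramp linearly from warmup_init_lr up to base_lr over warmup_epochs.
        return warmup_init_lr + (base_lr - warmup_init_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```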

@haohang96 commented Jan 27, 2021


@HobbitLong I think this code cannot be applied with DistributedDataParallel directly, because in DDP mode PyTorch computes the loss on each GPU separately. For example, on an 8-GPU machine with batch_size=1024, each GPU is assigned only 128 samples. In this case, SimCLR (or SupContrast) will search for positive pairs among just those 128 candidates, which will hurt downstream performance.

If DDP mode is used, I think a gather_layer op needs to be implemented, as in SimCLR.
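For reference, the gather op being suggested, an all_gather that still lets gradients flow back to each rank's local features (the pattern used in SimCLR-style DDP implementations), could look roughly like this; it is an illustrative sketch, not code from this repo:

```python
import torch
import torch.distributed as dist

class GatherLayer(torch.autograd.Function):
    """All-gather tensors from every DDP process while keeping gradients.

    dist.all_gather alone is not differentiable; this Function gathers the
    features in forward and routes the summed gradient slice for this rank
    back in backward.
    """

    @staticmethod
    def forward(ctx, x):
        gathered = [torch.zeros_like(x) for _ in range(dist.get_world_size())]
        dist.all_gather(gathered, x)
        return tuple(gathered)

    @staticmethod
    def backward(ctx, *grads):
        all_grads = torch.stack(grads)
        dist.all_reduce(all_grads)  # sum gradient contributions from all ranks
        return all_grads[dist.get_rank()]

# Usage sketch inside the training step (hypothetical variable names):
# local_features: [local_batch, feat_dim] computed on this rank
# features = torch.cat(GatherLayer.apply(local_features), dim=0)
# loss = criterion(features, ...)  # the loss now sees the global batch
```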
