As I understand it, the proposed training strategy is: 1) train the backbone with the labels and the supervised contrastive loss, 2) train the last linear layer on top of the frozen backbone. The baseline trains the backbone and the last linear layer jointly with cross-entropy loss. Do you have a reference for a baseline that 1) trains the backbone with cross-entropy loss, then 2) re-trains the last linear layer from scratch?
The reason I ask is that the current baseline differs from the proposed solution in more than one way. The gain could simply come from more iterations, i.e. the iterations in pre-training plus the iterations in fine-tuning. A rough sketch of the baseline I have in mind is below.
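For concreteness, here is a minimal sketch of that baseline (the function and variable names are placeholders, not from this repo): stage 1 trains the backbone plus a linear head end-to-end with cross-entropy, stage 2 freezes the backbone, discards the head, and re-trains a fresh linear layer from scratch, so the total iteration budget matches the two-stage contrastive method.

```python
import torch
import torch.nn as nn

def train_ce_baseline(backbone, feat_dim, num_classes, loader,
                      epochs_stage1, epochs_stage2, device="cuda"):
    # Stage 1: end-to-end cross-entropy pre-training of backbone + linear head.
    backbone = backbone.to(device)
    head = nn.Linear(feat_dim, num_classes).to(device)
    criterion = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(list(backbone.parameters()) + list(head.parameters()),
                          lr=0.1, momentum=0.9, weight_decay=1e-4)
    backbone.train(); head.train()
    for _ in range(epochs_stage1):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            loss = criterion(head(backbone(images)), labels)
            opt.zero_grad(); loss.backward(); opt.step()

    # Stage 2: freeze the backbone, re-train a *new* linear layer from scratch.
    for p in backbone.parameters():
        p.requires_grad = False
    backbone.eval()
    head = nn.Linear(feat_dim, num_classes).to(device)  # fresh, randomly initialized head
    opt = torch.optim.SGD(head.parameters(), lr=0.1, momentum=0.9)
    for _ in range(epochs_stage2):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():
                feats = backbone(images)                 # frozen features
            loss = criterion(head(feats), labels)
            opt.zero_grad(); loss.backward(); opt.step()
    return backbone, head
```

This assumes `backbone(images)` returns pooled features of shape `[batch, feat_dim]`; the point is only to isolate the effect of the extra linear-probe stage from the choice of pre-training loss.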
@amsword, excellent question, but I don't think this is a fairness issue.
IIRC, the baseline you propose (cross-entropy pre-training followed by re-training the last linear layer from scratch) is sub-optimal compared to end-to-end cross-entropy training, and in theory it should be. Still, you can run such a baseline to verify.
If cross-entropy and SCL (supervised contrastive learning) are treated as a multi-task problem, i.e. the network has a shared backbone with multiple heads, what would the result be? Could it be better? This would also avoid the separate fine-tuning of the last linear layer.
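A minimal sketch of that multi-task idea (again with placeholder names, not this repo's API): one backbone, a projection head fed into a simplified supervised contrastive loss, and a linear head fed into cross-entropy, trained jointly with a weighted sum of the two losses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNet(nn.Module):
    def __init__(self, backbone, feat_dim, num_classes, proj_dim=128):
        super().__init__()
        self.backbone = backbone                               # e.g. a ResNet without its fc layer
        self.proj_head = nn.Sequential(                        # branch for the contrastive loss
            nn.Linear(feat_dim, feat_dim), nn.ReLU(inplace=True),
            nn.Linear(feat_dim, proj_dim))
        self.cls_head = nn.Linear(feat_dim, num_classes)       # branch for cross-entropy

    def forward(self, x):
        h = self.backbone(x)
        z = F.normalize(self.proj_head(h), dim=1)              # normalized embedding for SCL
        logits = self.cls_head(h)                              # logits for CE
        return z, logits

def supcon_loss(z, labels, temperature=0.1):
    # Simplified supervised contrastive loss over a single view per sample.
    n = z.size(0)
    sim = torch.matmul(z, z.T) / temperature
    logits_mask = ~torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & logits_mask
    sim = sim.masked_fill(~logits_mask, -1e9)                  # exclude self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(1).clamp(min=1)
    loss = -(log_prob * pos_mask).sum(1) / pos_counts
    return loss[pos_mask.sum(1) > 0].mean()                    # skip samples with no positives

def multitask_step(model, images, labels, ce_weight=1.0, scl_weight=1.0):
    # One training step: weighted sum of cross-entropy and the contrastive loss.
    z, logits = model(images)
    return ce_weight * F.cross_entropy(logits, labels) + scl_weight * supcon_loss(z, labels)
```

The relative weighting of the two losses (and whether a second augmented view is used for the contrastive branch, as in the paper) would likely matter a lot for how this compares to the two-stage recipe.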