
batch_size>1 is not working in the testing phase #10

Open
yuzehui1996 opened this issue Feb 5, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@yuzehui1996

Hi,
Have you ever tried multi-GPU training? I simply added DataParallel, but the AP and AR are lower than when training with a single GPU.
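
Roughly the change, as a minimal sketch (the model below is just a stand-in, not the actual training code in this repo):

```python
import torch
import torch.nn as nn

# Stand-in for the actual hourglass (HG) model used in the repo.
model = nn.Sequential(
    nn.Conv2d(2, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
)

if torch.cuda.device_count() > 1:
    # DataParallel splits each input batch across the visible GPUs and
    # gathers the outputs back on the default device. Note that BatchNorm
    # statistics are then computed per GPU over a smaller chunk of the batch.
    model = nn.DataParallel(model)
model = model.cuda()
```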

Thanks!

@yizhou-wang
Owner

Sorry, we didn't try multiple GPUs before, but the performance should not degrade when using multiple GPUs. You may need to tune some parameters or monitor the loss during training.

@yuzehui1996
Author

Thanks for your reply!
I have trained the hourglass model (HG) with multiple GPUs. When I test the checkpoints and increase the batch size from 1 to 4, the AP and AR degrade greatly. Could the accuracy drop be caused by the BatchNorm layers during multi-GPU training? (See the SyncBatchNorm sketch below.)
BTW, I use 4 GPU cards for training.
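
If the per-GPU batch statistics are the concern, a common mitigation is synchronized BatchNorm. A sketch, assuming a switch to DistributedDataParallel, since SyncBatchNorm does not synchronize under nn.DataParallel:

```python
import torch
import torch.nn as nn

# Stand-in module with BatchNorm layers (not the actual HG model).
model = nn.Sequential(
    nn.Conv2d(2, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
)

# Replace every BatchNorm*d layer with SyncBatchNorm so the statistics are
# computed over the whole global batch instead of each GPU's chunk.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# SyncBatchNorm only synchronizes under DistributedDataParallel (one process
# per GPU); it has no effect inside nn.DataParallel.
# model = nn.parallel.DistributedDataParallel(model.cuda(), device_ids=[local_rank])
```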

@yizhou-wang yizhou-wang added the bug Something isn't working label Feb 9, 2021
@yizhou-wang
Owner

I double-checked the inference code. It seems the batch size is hardcoded as 1. We will fix this bug in the future, but for now, you can just use batch_size = 1 in the testing phase.
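
For illustration, this is the kind of pattern that implicitly hardcodes the test batch size to 1 (a sketch only, not the actual inference code):

```python
import torch

# Illustrative only -- pretend the network already produced confidence maps
# for a batch of 4 test samples.
outputs = torch.randn(4, 3, 128, 128)

# Hardcoded-to-1 style: only the first sample in the batch is post-processed,
# so the remaining predictions are silently dropped and AP/AR collapse.
confmap = outputs[0]

# Batch-safe version: iterate over the batch dimension explicitly.
for b in range(outputs.shape[0]):
    confmap_b = outputs[b]
    # ... run peak detection / evaluation on confmap_b ...
```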

@yizhou-wang yizhou-wang changed the title Multi-GPUs train? batch_size>1 is not working in the testing phase Feb 9, 2021