
batch_size>1 is not working in the testing phase #10

Open
yuzehui1996 opened this issue Feb 5, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@yuzehui1996

Hi,
Have you ever tried multi-GPU training? I simply added DataParallel, but the AP and AR are lower than when training with a single GPU.
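
Roughly the change, as a minimal sketch (the model below is just a stand-in, not the actual training code in this repo):

```python
import torch
import torch.nn as nn

# Stand-in for the actual hourglass (HG) model used in the repo.
model = nn.Sequential(
    nn.Conv2d(2, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
)

if torch.cuda.device_count() > 1:
    # DataParallel splits each input batch across the visible GPUs and
    # gathers the outputs back on the default device. Note that BatchNorm
    # statistics are then computed per GPU over a smaller chunk of the batch.
    model = nn.DataParallel(model)
model = model.cuda()
```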

Thanks!

@yizhou-wang
Owner

Sorry, we didn't try multiple GPUs before, but the performance should not degrade when using multiple GPUs. You may need to tune some parameters or monitor the loss during training.

@yuzehui1996
Author

Thanks for your reply!
I have trained the hourglass model (HG) with multiple GPUs. When I test the checkpoints and increase the batch size from 1 to 4, the AP and AR degrade greatly. Could the accuracy drop be caused by the BatchNorm layers during multi-GPU training? (See the SyncBatchNorm sketch below.)
BTW, I use 4 GPU cards for training.
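
If the per-GPU batch statistics are the concern, a common mitigation is synchronized BatchNorm. A sketch, assuming a switch to DistributedDataParallel, since SyncBatchNorm does not synchronize under nn.DataParallel:

```python
import torch
import torch.nn as nn

# Stand-in module with BatchNorm layers (not the actual HG model).
model = nn.Sequential(
    nn.Conv2d(2, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
)

# Replace every BatchNorm*d layer with SyncBatchNorm so the statistics are
# computed over the whole global batch instead of each GPU's chunk.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# SyncBatchNorm only synchronizes under DistributedDataParallel (one process
# per GPU); it has no effect inside nn.DataParallel.
# model = nn.parallel.DistributedDataParallel(model.cuda(), device_ids=[local_rank])
```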

@yizhou-wang yizhou-wang added the bug Something isn't working label Feb 9, 2021
@yizhou-wang
Owner

I double-checked the inference code. It seems the batch size is hardcoded as 1. We will fix this bug in the future, but for now, you can just use batch_size = 1 in the testing phase.
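
For illustration, this is the kind of pattern that implicitly hardcodes the test batch size to 1 (a sketch only, not the actual inference code):

```python
import torch

# Illustrative only -- pretend the network already produced confidence maps
# for a batch of 4 test samples.
outputs = torch.randn(4, 3, 128, 128)

# Hardcoded-to-1 style: only the first sample in the batch is post-processed,
# so the remaining predictions are silently dropped and AP/AR collapse.
confmap = outputs[0]

# Batch-safe version: iterate over the batch dimension explicitly.
for b in range(outputs.shape[0]):
    confmap_b = outputs[b]
    # ... run peak detection / evaluation on confmap_b ...
```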

@yizhou-wang yizhou-wang changed the title Multi-GPUs train? batch_size>1 is not working in the testing phase Feb 9, 2021