
When I run train.py on the COCO dataset with the resnet-101 model, it's stuck for a long time. #59

Open
lji72 opened this issue Dec 4, 2018 · 3 comments

lji72 commented Dec 4, 2018

```
/home/liuji/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:97: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
INFO:tensorflow:Restoring parameters from /home/liuji/light_head_rcnn/data/imagenet_weights/res101.ckpt

^CTraceback (most recent call last):
  File "train.py", line 264, in <module>
    train(args)
  File "train.py", line 186, in train
    blobs_list = prefetch_data_layer.forward()
  File "/home/liuji/light_head_rcnn/lib/utils/dpflow/prefetching_iter.py", line 78, in forward
    if self.iter_next():
  File "/home/liuji/light_head_rcnn/lib/utils/dpflow/prefetching_iter.py", line 65, in iter_next
    e.wait()
  File "/home/liuji/anaconda3/envs/tensorflow/lib/python3.6/threading.py", line 551, in wait
    signaled = self._cond.wait(timeout)
  File "/home/liuji/anaconda3/envs/tensorflow/lib/python3.6/threading.py", line 295, in wait
    waiter.acquire()
```

Hello, I've run into this problem. Could you give a more detailed solution? Thanks.

@XingLiuJia

Hello, can you send me the COCO dataset? Thank you. Email: [email protected]

@mbruchalski1

I'm having the same problem. The dataset, JSON, and odgt files all look good, and I'm able to run the test code, but I can't tell what the problem is since the error message isn't detailed. Does anyone have a solution for this issue, or does the code only work for evaluation and not for training?

@masotrix

I solved it by adjusting "nr_dataflow" in config.py (in the experiment folder you are training from, per README.md) from 16 to 2 for the 1-GPU case, because train_batch_per_gpu = 2 (so 8 GPUs x 2 images = 16, while 1 GPU x 2 images = 2); see the sketch below. Hope this helps you ✌️
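
For reference, a minimal sketch of the relevant config.py lines, assuming the names nr_dataflow and train_batch_per_gpu from the comment above match the repository's config (nr_gpus is a hypothetical name used here only to show the arithmetic):

```python
# config.py (sketch): nr_dataflow should equal the total number of images
# consumed per training step, i.e. number of GPUs x train_batch_per_gpu.
# The shipped value of 16 assumes 8 GPUs x 2 images; with fewer GPUs the
# prefetching iterator otherwise waits forever on e.wait() for batches
# that are never consumed, which is the hang seen in the traceback above.

nr_gpus = 1                  # GPUs you actually train on (assumed name)
train_batch_per_gpu = 2      # images per GPU per step

nr_dataflow = nr_gpus * train_batch_per_gpu   # 1 x 2 = 2, not the default 16
```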
