I tried to run "train.py" (step 5, "train", in the usage instructions), but I get the "ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape" error shown below.
Should I make the batch size smaller? If so, how do I change the batch size?
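From what I can tell, "train.py" probably follows the usual PointNet++-style argparse pattern, so the batch size would be exposed as a command-line flag; the flag name and default below are my assumptions, so the real parser near the top of "train.py" should be checked:

```python
# Sketch of the argparse pattern commonly used in PointNet++-style train.py files.
# The flag name "--batch_size" and the default of 16 are assumptions, not taken
# from this repository -- check the actual parser definitions in train.py.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--batch_size", type=int, default=16,
                    help="Training batch size; lower it (e.g. 8 or 4) to avoid GPU OOM")
FLAGS = parser.parse_args()

BATCH_SIZE = FLAGS.batch_size  # used when building placeholders and feeding batches
print("Training with batch size", BATCH_SIZE)
```

If such a flag exists, running something like `python train.py --batch_size 4` should be enough; otherwise, lowering the default (or a hard-coded BATCH_SIZE constant) directly in "train.py" has the same effect.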
[my environment]
GPU: GeForce GTX 1050 Ti
Memory: 16 GiB
Swap: 16 GiB
OS: Ubuntu 16.04
CUDA: 9.0
cuDNN: 7.5.0

[anaconda3]
Python 3.6
tensorflow-gpu 1.12.0
scikit-learn 0.21.3
open3d-python 0.7.0.0
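As a rough back-of-the-envelope check (my own arithmetic, not taken from the log): the single tensor named in the OOM message is float32, so it alone is about 64 MiB, and training keeps many such activations plus their gradients on the 4 GiB of VRAM of a GTX 1050 Ti:

```python
# Size of the one activation named in the OOM message: shape [16, 128, 8192, 1], float32.
batch, channels, points, width = 16, 128, 8192, 1
bytes_per_float = 4
size_mib = batch * channels * points * width * bytes_per_float / 2**20
print(f"{size_mib:.0f} MiB")  # ~64 MiB for this single tensor
```

So with batch size 16 the card plausibly runs out of memory, and halving (or quartering) the batch size is the usual first thing to try.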
2019-08-25 14:54:06.964697: W tensorflow/core/common_runtime/bfc_allocator.cc:271] *******************************xx***
2019-08-25 14:54:06.964735: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at conv_ops.cc:446 : Resource exhausted: OOM when allocating tensor with shape[16,128,8192,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[16,128,8192,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node fa_layer4/conv_2/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](fa_layer4/conv_1/Relu, fa_layer4/conv_2/weights/read/_211)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[{{node gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad/_433}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4838_gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 469, in
train()
File "train.py", line 437, in train
train_one_epoch(sess, ops, train_writer, stack_train)
File "train.py", line 243, in train_one_epoch
feed_dict=feed_dict,
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[16,128,8192,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node fa_layer4/conv_2/Conv2D (defined at /home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/util/tf_util.py:186) = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](fa_layer4/conv_1/Relu, fa_layer4/conv_2/weights/read/_211)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[{{node gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad/_433}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4838_gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Caused by op 'fa_layer4/conv_2/Conv2D', defined at:
File "train.py", line 469, in
train()
File "train.py", line 359, in train
bn_decay=bn_decay,
File "/home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/model.py", line 128, in get_model
scope="fa_layer4",
File "/home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/util/pointnet_util.py", line 323, in pointnet_fp_module
bn_decay=bn_decay,
File "/home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/util/tf_util.py", line 186, in conv2d
data_format=data_format,
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 957, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/home/hiwasawa/anaconda3/envs/pointNet2_py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in init
self._traceback = tf_stack.extract_stack()
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[16,128,8192,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node fa_layer4/conv_2/Conv2D (defined at /home/hiwasawa/PointNet2/Open3D-PointNet2-Semantic3D-master/util/tf_util.py:186) = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](fa_layer4/conv_1/Relu, fa_layer4/conv_2/weights/read/_211)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[{{node gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad/_433}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4838_gradients/layer2/conv2/BiasAdd_grad/BiasAddGrad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
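In case it helps with debugging, the "report_tensor_allocations_upon_oom" hint in the log refers to TF 1.x RunOptions; below is a minimal sketch of how it would be wired into the existing session.run call (the exact call site in train_one_epoch is my assumption):

```python
import tensorflow as tf

# TF 1.x graph mode: ask TensorFlow to dump the list of live tensor allocations
# if an OOM occurs during this run, instead of only printing the failing node.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

# In train.py's train_one_epoch (the sess.run call around line 243), pass the
# options alongside the existing fetches and feed_dict:
#   sess.run(fetches, feed_dict=feed_dict, options=run_options)
```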