is_training not found when restoring checkpoint from multi_gpu_train #138

Open
XeChen opened this issue Jun 14, 2020 · 0 comments
XeChen commented Jun 14, 2020

When I run inference on my data with the checkpoint obtained from multi_gpu_train.py, I get the following error:

Traceback (most recent call last):
File "/root/anaconda3/envs/tensorflow12/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/root/anaconda3/envs/tensorflow12/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/root/anaconda3/envs/tensorflow12/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key is_training not found in checkpoint
[[{{node save/RestoreV2}} = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[{{node save/RestoreV2/_395}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_397_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

I can successfully run inference with the checkpoints from train.py, but it doesn't work with the checkpoint from multi_gpu_train.py because of the is_training variable. Please tell me why this happens. Thank you!
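For reference, this is a minimal sketch of the workaround I am considering (not code from this repo; the checkpoint path is a placeholder and the inference graph is assumed to be built beforehand): construct the tf.train.Saver only over the variables that actually exist in the checkpoint, so a graph-only variable such as is_training is skipped during restore and keeps its initializer value.

import tensorflow as tf

# Hypothetical checkpoint path from multi_gpu_train.py; replace with the real one.
checkpoint_path = "/path/to/multi_gpu_train/model.ckpt"

# ... build the inference graph as usual before this point ...

# Read the variable names that are actually stored in the checkpoint.
reader = tf.train.NewCheckpointReader(checkpoint_path)
ckpt_var_names = set(reader.get_variable_to_shape_map().keys())

# Restore only the graph variables whose names appear in the checkpoint;
# anything missing from it (e.g. is_training) is left to its initializer.
restore_vars = [v for v in tf.global_variables() if v.op.name in ckpt_var_names]
saver = tf.train.Saver(var_list=restore_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess, checkpoint_path)
    # ... run inference with sess here ...

Is this the intended way to load a multi_gpu_train.py checkpoint, or is there a difference in how is_training is defined between the two training scripts?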
