After first epoch of training, it raises ValueError #7

Open
HankyuJang opened this issue Apr 13, 2018 · 5 comments

@HankyuJang

I am trying to run the code on a GPU using an AWS EC2 machine on which CUDA 9 is preinstalled. I made two small changes in inference.py to account for a different version of Keras:

(1) pickle_safe=False -> use_multiprocessing=True
(2) max_q_size -> max_queue_size
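
For reference, the two renames above can be wrapped in a small shim so one call site works with either naming. This is a minimal sketch, not code from the repository: `translate_generator_kwargs` is a hypothetical helper, and it assumes the Keras 2 rename keeps the boolean's meaning, in which case `pickle_safe=False` would correspond to `use_multiprocessing=False` rather than `True`.

```python
# Hypothetical helper, not part of keras-image-captioning or Keras.
# Maps the older generator keyword names mentioned above to their
# Keras 2 names, leaving the values untouched.
LEGACY_TO_KERAS2 = {
    "pickle_safe": "use_multiprocessing",
    "max_q_size": "max_queue_size",
}


def translate_generator_kwargs(kwargs):
    """Return a copy of kwargs with legacy names replaced by Keras 2 names."""
    translated = {}
    for name, value in kwargs.items():
        new_name = LEGACY_TO_KERAS2.get(name, name)
        if new_name in translated:
            # Both the old and the new spelling were passed for one setting.
            raise ValueError("duplicate setting for %r" % new_name)
        translated[new_name] = value
    return translated


legacy = {"pickle_safe": False, "max_q_size": 10, "workers": 1}
keras2 = translate_generator_kwargs(legacy)
```

The translated dictionary could then be passed on as `**keras2` to whichever generator-based call needs it.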

Training seems to run fine for the first epoch, but then it raises a ValueError. I am not sure what to do here. Do you have any suggestions?

2018-04-13 00:52:18 | Training model8 is starting..
/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py:109: UserWarning: Update your `fit_generator` call to the Keras 2 API: `fit_generator(verbose=1, generator=<generator..., workers=1, validation_data=<generator..., steps_per_epoch=938, epochs=33, callbacks=[<keras_im..., max_queue_size=10, validation_steps=157)`
  verbose=self._verbose)
Epoch 1/33
938/938 [==============================] - 523s 557ms/step - loss: 2.5325 - categorical_accuracy_wvt: 0.2447 - val_loss: 2.1476 - val_categorical_accuracy_wvt: 0.3012
  0%|     
Traceback (most recent call last):
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 283, in <module>
    fire.Fire(main)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 276, in main
    training.run()
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 109, in run
    verbose=self._verbose)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/engine/training.py", line 2262, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/callbacks.py", line 77, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "keras_image_captioning/callbacks.py", line 55, in on_epoch_end
    in self._inference.evaluate_training_set().items()})
  File "keras_image_captioning/inference.py", line 48, in evaluate_training_set
    return self._evaluate(self.predict_training_set(include_datum=True),
  File "keras_image_captioning/inference.py", line 35, in predict_training_set
    include_datum)
  File "keras_image_captioning/inference.py", line 79, in _predict
    X, y, datum_batch = generator_output
ValueError: need more than 2 values to unpack
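
The last frame indicates that the generator yielded two items where `_predict` expects three (`X, y, datum_batch`). The sketch below only reproduces that mismatch with placeholder values and shows one defensive way to unpack. `unpack_batch` is a hypothetical helper, and why the generator yields only two items here is not established in this thread.

```python
def unpack_batch(generator_output):
    """Unpack a batch that may or may not carry the datum batch.

    Hypothetical helper for illustration: returns (X, y, datum_batch),
    with datum_batch set to None when the generator yields only (X, y).
    """
    if len(generator_output) == 3:
        X, y, datum_batch = generator_output
    elif len(generator_output) == 2:
        X, y = generator_output
        datum_batch = None
    else:
        raise ValueError(
            "expected 2 or 3 items per batch, got %d" % len(generator_output))
    return X, y, datum_batch


# Reproduce the failure from the traceback with placeholder values.
two_items = ("inputs", "targets")
try:
    X, y, datum_batch = two_items
    unpack_failed = False
except ValueError:
    unpack_failed = True

result = unpack_batch(two_items)
```

A fallback like this hides the symptom rather than the cause, so it is probably more useful as a diagnostic than as a fix.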
@danieljl
Owner

danieljl commented Apr 13, 2018

Have you installed the exact version of each library listed in https://github.com/danieljl/keras-image-captioning/blob/master/requirements.txt? Keras is known to break APIs even in minor version updates.

EDIT: Oh, you have a different version of Keras. I suggest you install the exact same versions of Keras and the other dependencies listed in requirements.txt.
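
One way to check for version drift is to compare the pins against what is actually installed. The following is a minimal sketch under stated assumptions: `parse_pins` and `find_mismatches` are hypothetical helpers, only plain `name==version` lines are handled, and the package names and versions in the example are placeholders, not the repository's real pins.

```python
def parse_pins(requirements_text):
    """Parse 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        found = installed.get(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


# Placeholder data for illustration only.
example_requirements = """
examplelib==1.0.0
otherlib==2.3.1  # pinned
"""
example_installed = {"examplelib": "1.0.0", "otherlib": "2.4.0"}
mismatches = find_mismatches(parse_pins(example_requirements),
                             example_installed)
```

In practice the installed versions could come from the output of `pip freeze`, which uses the same `name==version` form and can be fed through `parse_pins` as well.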

@HankyuJang
Author

I see. One thing to confirm: the code runs on the CPU by default, right? I also installed the exact version of each library from requirements.txt, but I found that it installs tensorflow, not tensorflow-gpu. Each training epoch was taking a long time, so I was trying to run it on a GPU.

@danieljl
Owner

The code can also run on a GPU. Just uninstall tensorflow and install the same version of tensorflow-gpu.

requirements.txt lists tensorflow rather than tensorflow-gpu because I built TensorFlow from source.
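
After swapping packages, it can help to confirm that TensorFlow actually reports a GPU. This is a rough sketch assuming the TensorFlow 1.x module layout (`tensorflow.python.client.device_lib`), which may differ in other versions. `gpu_device_names` and `visible_gpus` are hypothetical helper names, and the `FakeDevice` entries are placeholders so the filter can be tried without TensorFlow installed.

```python
from collections import namedtuple


def gpu_device_names(devices):
    """Return the names of devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def visible_gpus():
    """List GPUs TensorFlow reports, or None if TensorFlow is not importable."""
    try:
        # Assumption: TensorFlow 1.x layout; the path may differ elsewhere.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return gpu_device_names(device_lib.list_local_devices())


# Placeholder devices standing in for what TensorFlow would report.
FakeDevice = namedtuple("FakeDevice", ["name", "device_type"])
sample = [FakeDevice("/cpu:0", "CPU"), FakeDevice("/gpu:0", "GPU")]
found = gpu_device_names(sample)
```

An empty list from `visible_gpus()` would suggest that the CPU-only build is still the one being imported, or that the CUDA libraries were not found.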

@aakashgupta96

I am facing the same issue. requirements.txt installs tensorflow 1.1.0. When I try to install the same version of tensorflow-gpu (as you suggested), I get this error:
DistributionNotFound: No matching distribution found for 1.1.0.

Is there any workaround to run this code on a GPU?
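
One thing that may be worth checking: pip's "No matching distribution found for ..." message normally names the requirement it could not resolve, so "for 1.1.0" suggests the version was passed as a separate argument rather than as `tensorflow-gpu==1.1.0`. That is a guess from the message alone. The sketch below only builds a pinned install command; `pip_install_command` is a hypothetical helper, and it does not confirm that a tensorflow-gpu 1.1.0 build exists for a given platform or CUDA version.

```python
import sys


def pip_install_command(package, version):
    """Build a pip command that pins package to version with '=='."""
    requirement = "%s==%s" % (package, version)
    # Use the current interpreter so the install targets the active environment.
    return [sys.executable, "-m", "pip", "install", requirement]


command = pip_install_command("tensorflow-gpu", "1.1.0")
# To run it: import subprocess; subprocess.check_call(command)
```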

@mememimis

Have you found any way to resolve this issue?
