After first epoch of training, it raises ValueError #7

HankyuJang · 2018-04-13T01:06:52Z

I am trying to run the code on a GPU, using an AWS EC2 machine on which CUDA 9 is preinstalled. I made two small changes in inference.py to account for a different version of Keras:

(1) pickle_safe=False -> use_multiprocessing=True
(2) max_q_size -> max_queue_size
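
The two renames above can be sketched as a small keyword-translation helper. This is only a sketch, not code from the repository: `translate_generator_kwargs` and `RENAMED_KWARGS` are hypothetical names. It also assumes the rename keeps each argument's value unchanged, in which case `pickle_safe=False` would correspond to `use_multiprocessing=False` rather than `True`.

```python
# Hypothetical helper: maps old Keras generator keyword names to the new ones.
RENAMED_KWARGS = {
    'pickle_safe': 'use_multiprocessing',
    'max_q_size': 'max_queue_size',
}


def translate_generator_kwargs(kwargs):
    """Return a copy of kwargs with old keyword names replaced by new ones.

    Values are passed through unchanged (assumption: the rename did not
    change the meaning of the argument).
    """
    translated = {}
    for name, value in kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name in translated:
            # Both the old and the new spelling were supplied.
            raise ValueError('duplicate argument: %s' % new_name)
        translated[new_name] = value
    return translated


old_style = {'pickle_safe': False, 'max_q_size': 10, 'workers': 1}
new_style = translate_generator_kwargs(old_style)
print(sorted(new_style.items()))
```

Keeping the mapping in one dictionary makes it easy to see which arguments were touched when moving between Keras versions.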

Training seems to run fine for the first epoch, but then it raises a ValueError. I am not sure what to do here. Do you have any suggestions?

2018-04-13 00:52:18 | Training model8 is starting..
/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py:109: UserWarning: Update your `fit_generator` call to the Keras 2 API: `fit_generator(verbose=1, generator=<generator..., workers=1, validation_data=<generator..., steps_per_epoch=938, epochs=33, callbacks=[<keras_im..., max_queue_size=10, validation_steps=157)`
  verbose=self._verbose)
Epoch 1/33
938/938 [==============================] - 523s 557ms/step - loss: 2.5325 - categorical_accuracy_wvt: 0.2447 - val_loss: 2.1476 - val_categorical_accuracy_wvt: 0.3012
  0%|     
Traceback (most recent call last):
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 283, in <module>
    fire.Fire(main)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 276, in main
    training.run()
  File "/home/ubuntu/keras-image-captioning/keras_image_captioning/training.py", line 109, in run
    verbose=self._verbose)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/engine/training.py", line 2262, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/ubuntu/anaconda2/envs/tensorflow-2.7/lib/python2.7/site-packages/keras/callbacks.py", line 77, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "keras_image_captioning/callbacks.py", line 55, in on_epoch_end
    in self._inference.evaluate_training_set().items()})
  File "keras_image_captioning/inference.py", line 48, in evaluate_training_set
    return self._evaluate(self.predict_training_set(include_datum=True),
  File "keras_image_captioning/inference.py", line 35, in predict_training_set
    include_datum)
  File "keras_image_captioning/inference.py", line 79, in _predict
    X, y, datum_batch = generator_output
ValueError: need more than 2 values to unpack
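
The last frame of the traceback unpacks `generator_output` into three names, and the error says it held only two values. A more defensive unpack might look like the sketch below. `unpack_batch` is a hypothetical helper, not part of the repository, and it does not explain why the generator yielded a pair in this run.

```python
def unpack_batch(generator_output):
    """Split a generator batch into (X, y, datum_batch).

    Accepts either a 3-tuple (X, y, datum_batch) or a 2-tuple (X, y).
    In the 2-tuple case datum_batch is returned as None.
    """
    if len(generator_output) == 3:
        X, y, datum_batch = generator_output
    elif len(generator_output) == 2:
        X, y = generator_output
        datum_batch = None
    else:
        raise ValueError('expected 2 or 3 values, got %d'
                         % len(generator_output))
    return X, y, datum_batch


# Placeholder strings stand in for real arrays and datum objects.
print(unpack_batch(('inputs', 'targets')))
print(unpack_batch(('inputs', 'targets', 'datums')))
```

A check like this turns the bare unpacking error into a message that names the unexpected length, which can help when tracking down a version mismatch.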

danieljl · 2018-04-13T10:05:51Z

Have you installed the exact version of each library listed in https://github.com/danieljl/keras-image-captioning/blob/master/requirements.txt? Keras is known to break APIs even in minor version updates.

EDIT: Oh, you have a different version of Keras. I suggest you install the exact versions of Keras and the other dependencies listed in requirements.txt.
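
One way to check for such a mismatch is to compare the pinned versions against what is installed. The following is a minimal sketch under stated assumptions: `parse_pins` and `find_mismatches` are hypothetical helpers, the requirements file is assumed to use plain `name==version` lines, and the example versions are made up for illustration.

```python
def parse_pins(lines):
    """Collect exact pins of the form name==version from requirements lines."""
    pins = {}
    for line in lines:
        line = line.split('#', 1)[0].strip()  # drop comments and whitespace
        if '==' in line:
            name, version = line.split('==', 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        actual = installed.get(name)
        if actual != pinned:
            mismatches[name] = (pinned, actual)
    return mismatches


# Illustrative values only; they are not taken from the real requirements.txt.
example_requirements = ['Keras==2.0.4', 'tensorflow==1.1.0  # CPU build']
example_installed = {'keras': '2.1.5', 'tensorflow': '1.1.0'}
print(find_mismatches(parse_pins(example_requirements), example_installed))
```

Because `pip freeze` also prints `name==version` lines, its output could be passed through `parse_pins` to build the `installed` mapping.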

HankyuJang · 2018-04-13T14:05:30Z

I see. One thing I want to confirm: the code runs on the CPU by default, right? I installed the exact version of each library listed in requirements.txt, but it installs tensorflow, not tensorflow-gpu. Each training epoch was taking a long time, so I was trying to run it on a GPU.

danieljl · 2018-04-14T02:54:21Z

The code can also run on a GPU. Just uninstall tensorflow and install the same version of tensorflow-gpu.

requirements.txt lists tensorflow rather than tensorflow-gpu because I built TensorFlow from source.

aakashgupta96 · 2018-05-15T20:13:18Z

I am facing the same issue. requirements.txt installs tensorflow 1.1.0. When I try to install the same version of tensorflow-gpu, as you suggested, I get this error:
DistributionNotFound: No matching distribution found for 1.1.0.

Is there any workaround to run this code on a GPU?

mememimis · 2018-09-22T21:44:52Z

Have you found any way to resolve this issue?
