BUG: validation metrics not logged per epoch, only per batch #10

Closed
ikizhvatov opened this issue Jul 8, 2022 · 4 comments

ikizhvatov commented Jul 8, 2022

When I add the callback and pass a validation set or a validation split to Keras, the validation per-batch metrics are logged under 'test/batch/...', but the validation per-epoch metrics are not logged.

These metrics' names begin with val_, and they are filtered out in impl/__init__.py:23 since PR #9.

Removing this filtering lets them be logged under train/epoch/val_..., but then the categories are inconsistent: per-batch validation metrics end up under test, while per-epoch validation metrics end up under train.
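
To illustrate, the filtering I mean looks roughly like this (a sketch only, not the actual code at impl/__init__.py:23; log_epoch_metrics and log_metric are placeholder names):

def log_epoch_metrics(logs, log_metric):
    # sketch of the filtering described above; log_metric stands in
    # for the integration's actual logging call
    for metric, value in (logs or {}).items():
        if metric.startswith("val_"):
            continue  # val_* metrics are dropped here, so they never show up per epoch
        log_metric(f"train/epoch/{metric}", value)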

It seems that on_test_batch_end is invoked after validation, but on_test_end is not.

This happens with TensorFlow 2.2.0 as well as with TensorFlow 2.9.1.

I hope this can be straightened out.
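
A quick way to see which hooks Keras actually invokes during fit() with a validation split (a minimal sketch, independent of the integration):

from tensorflow import keras

class HookProbe(keras.callbacks.Callback):
    """Prints which validation-related hooks Keras invokes during fit()."""

    def on_test_batch_end(self, batch, logs=None):
        print("on_test_batch_end", batch, logs)

    def on_test_end(self, logs=None):
        print("on_test_end", logs)

    def on_epoch_end(self, epoch, logs=None):
        print("on_epoch_end", epoch, logs)

# usage: model.fit(..., validation_split=0.2, callbacks=[HookProbe()])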

@Blaizzy Blaizzy self-assigned this Jul 8, 2022
@Blaizzy Blaizzy added the enhancement label and removed the bug label Jul 8, 2022

Blaizzy commented Jul 8, 2022

Hi Ilya,

Thank you very much for flagging this on the intercom and writing this comprehensive issue.

As we discussed, I will work on a general solution for it and let you know once we merge the PR!

twolodzko (Contributor) commented

@Blaizzy it seems like with #14 and the subsequent changes this would be fixed, could you confirm?

Blaizzy commented Sep 8, 2022

Unfortunately, it doesn't solve it, @twolodzko.

Issue

When you pass a validation split or a validation dataset, Keras logs the validation metrics with key names starting with "val_". These metrics are passed to the on_epoch_end method instead of the on_test_end method, as they state here.

This introduces a new validation metric field to the integration, and according to the docs it falls under the training namespace. Example:

  • train/epoch/val_loss
  • train/epoch/val_metric-name (where metric-name is accuracy or any other metric)

Proposed Fix 🛠️

PR: #22
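
A rough sketch of the direction (illustrative only, not the actual code from PR #22; log_epoch_metrics and log_metric are placeholder names):

def log_epoch_metrics(logs, log_metric):
    # sketch only; log_metric stands in for the integration's logging call
    for metric, value in (logs or {}).items():
        if metric.startswith("val_"):
            # route validation metrics to their own namespace instead of dropping them
            log_metric(f"validation/epoch/{metric[len('val_'):]}", value)
        else:
            log_metric(f"train/epoch/{metric}", value)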

Blaizzy commented Sep 8, 2022

@ikizhvatov and @twolodzko

Upon closer inspection, I found the following:

  • Keras validation and test (epoch-level) metrics are exactly the same, meaning there is no bug ❌.
  • In other words, the on_test_end metrics/logs (e.g. loss) are identical to the on_epoch_end and on_train_end metrics/logs prefixed with val_, which is why metrics whose names start with val_ are excluded in this condition here.

Below is a full example of my experiment and results.

Test code

from tensorflow import keras

class CustomCallback(keras.callbacks.Callback):
    def _log_metrics(self, logs, category: str, trigger: str):
        if not logs:
            return

        for metric, value in logs.items():
            if metric in ("batch", "size"):
                continue
            print(f"{category}/{trigger}/{metric} ---> {value}")

        print("\n")

    def on_train_end(self, logs=None):  # pylint:disable=unused-argument
        print('on_train_end')
        self._log_metrics(logs, 'train', 'epoch')

    def on_test_end(self, logs=None):  # pylint:disable=unused-argument
        print('on_test_end')
        self._log_metrics(logs, 'test', 'epoch')

    def on_epoch_end(self, epoch, logs=None):  # pylint:disable=unused-argument
        print('on_epoch_end')
        self._log_metrics(logs, 'train', 'epoch')

# get_model(), x_train and y_train come from the surrounding experiment (not shown here)
model = get_model()

model.fit(
    x_train,
    y_train,
    batch_size=128,
    epochs=1,
    verbose=0,
    validation_split=0.5,
    callbacks=[CustomCallback()],
)

Output:

on_test_end
test/epoch/loss ---> 6.655152320861816
test/epoch/mean_absolute_error ---> 2.1341843605041504


on_epoch_end
train/epoch/loss ---> 260.7928771972656
train/epoch/mean_absolute_error ---> 9.972155570983887
train/epoch/val_loss ---> 6.655152320861816
train/epoch/val_mean_absolute_error ---> 2.1341843605041504


on_train_end
train/epoch/loss ---> 260.7928771972656
train/epoch/mean_absolute_error ---> 9.972155570983887
train/epoch/val_loss ---> 6.655152320861816
train/epoch/val_mean_absolute_error ---> 2.1341843605041504

As you can see, test/epoch/loss and train/epoch/val_loss are the same, and so are test/epoch/mean_absolute_error and train/epoch/val_mean_absolute_error.
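
For completeness, the equivalence can also be checked programmatically with a small callback that records the on_test_end logs and compares them to the val_-prefixed keys seen in on_epoch_end (a sketch, meant to be run with the same fit() call as above):

from tensorflow import keras

class CompareValidationLogs(keras.callbacks.Callback):
    """Checks that on_test_end logs match the val_-prefixed entries of on_epoch_end logs."""

    def __init__(self):
        super().__init__()
        self._test_logs = {}

    def on_test_end(self, logs=None):
        # during fit(), on_test_end fires right after the validation pass,
        # before on_epoch_end of the same epoch
        self._test_logs = dict(logs or {})

    def on_epoch_end(self, epoch, logs=None):
        for name, value in self._test_logs.items():
            # exact match expected, per the output above
            assert logs[f"val_{name}"] == value, name

Adding callbacks=[CompareValidationLogs()] to the fit() call above should run without any assertion errors.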

Therefore, I'm closing this issue and PR #22 for now.

Let me know if this logic makes sense or if I may have overlooked something.

Note: if I did indeed overlook something, I'm more than happy to reopen this issue and continue the quest for a solution.

@Blaizzy Blaizzy closed this as completed Sep 8, 2022