BUG: validation metrics not logged per epoch, only per batch #10

Closed
ikizhvatov opened this issue Jul 8, 2022 · 4 comments

ikizhvatov commented Jul 8, 2022

When I add the callback and pass a validation set or a validation split to Keras, the validation per-batch metrics are logged under 'test/batch/...', but the validation per-epoch metrics are not logged.

These metrics' names begin with val_, and they are filtered out in impl/__init__.py:23 since PR #9.

Removing this filtering lets them be logged under train/epoch/val_..., but then the categories are inconsistent: per-batch validation metrics end up under test, while per-epoch validation metrics end up under train.
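
To illustrate, the filtering I mean looks roughly like this (a sketch only, not the actual code at impl/__init__.py:23; log_epoch_metrics and log_metric are placeholder names):

def log_epoch_metrics(logs, log_metric):
    # sketch of the filtering described above; log_metric stands in
    # for the integration's actual logging call
    for metric, value in (logs or {}).items():
        if metric.startswith("val_"):
            continue  # val_* metrics are dropped here, so they never show up per epoch
        log_metric(f"train/epoch/{metric}", value)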

It seems that on_test_batch_end is invoked after validation, but on_test_end is not.

This happens with TensorFlow 2.2.0 as well as with TensorFlow 2.9.1.

I hope this can be straightened out.
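
A quick way to see which hooks Keras actually invokes during fit() with a validation split (a minimal sketch, independent of the integration):

from tensorflow import keras

class HookProbe(keras.callbacks.Callback):
    """Prints which validation-related hooks Keras invokes during fit()."""

    def on_test_batch_end(self, batch, logs=None):
        print("on_test_batch_end", batch, logs)

    def on_test_end(self, logs=None):
        print("on_test_end", logs)

    def on_epoch_end(self, epoch, logs=None):
        print("on_epoch_end", epoch, logs)

# usage: model.fit(..., validation_split=0.2, callbacks=[HookProbe()])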

@Blaizzy Blaizzy self-assigned this Jul 8, 2022
@Blaizzy Blaizzy added the enhancement label and removed the bug label Jul 8, 2022

Blaizzy commented Jul 8, 2022

Hi Ilya,

Thank you very much for flagging this on the intercom and writing this comprehensive issue.

As we discussed, I will work on a general solution for it and let you know once we merge the PR!

twolodzko (Contributor) commented

@Blaizzy it seems like with #14 and the subsequent changes this would be fixed, could you confirm?

Blaizzy commented Sep 8, 2022

Unfortunately, it doesn't solve it, @twolodzko.

Issue

When you pass a validation split or a validation dataset, Keras logs the validation metrics with key names starting with "val_". These metrics are passed to the on_epoch_end method instead of the on_test_end method, as they state here.

This introduces a new validation metric field to the integration, and according to the docs it falls under the training namespace. Example:

  • train/epoch/val_loss
  • train/epoch/val_metric-name (where metric-name is accuracy or any other metric)

Proposed Fix 🛠️

PR: #22
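
A rough sketch of the direction (illustrative only, not the actual code from PR #22; log_epoch_metrics and log_metric are placeholder names):

def log_epoch_metrics(logs, log_metric):
    # sketch only; log_metric stands in for the integration's logging call
    for metric, value in (logs or {}).items():
        if metric.startswith("val_"):
            # route validation metrics to their own namespace instead of dropping them
            log_metric(f"validation/epoch/{metric[len('val_'):]}", value)
        else:
            log_metric(f"train/epoch/{metric}", value)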

Blaizzy commented Sep 8, 2022

@ikizhvatov and @twolodzko

Upon closer inspection, I found the following:

  • Keras validation and test (epoch-level) metrics are exactly the same, meaning there is no bug ❌.
  • In other words, the on_test_end metrics/logs (e.g. loss) are identical to the on_epoch_end and on_train_end metrics/logs prefixed with val_, which is why metrics whose names start with val_ are excluded in this condition here.

Below is a full example of my experiment and results.

Test code

from tensorflow import keras

class CustomCallback(keras.callbacks.Callback):
    def _log_metrics(self, logs, category: str, trigger: str):
        if not logs:
            return

        for metric, value in logs.items():
            if metric in ("batch", "size"):
                continue
            print(f"{category}/{trigger}/{metric} ---> {value}")

        print("\n")

    def on_train_end(self, logs=None):  # pylint:disable=unused-argument
        print('on_train_end')
        self._log_metrics(logs, 'train', 'epoch')

    def on_test_end(self, logs=None):  # pylint:disable=unused-argument
        print('on_test_end')
        self._log_metrics(logs, 'test', 'epoch')

    def on_epoch_end(self, epoch, logs=None):  # pylint:disable=unused-argument
        print('on_epoch_end')
        self._log_metrics(logs, 'train', 'epoch')

# get_model(), x_train and y_train come from the surrounding experiment (not shown here)
model = get_model()

model.fit(
    x_train,
    y_train,
    batch_size=128,
    epochs=1,
    verbose=0,
    validation_split=0.5,
    callbacks=[CustomCallback()],
)

Output:

on_test_end
test/epoch/loss ---> 6.655152320861816
test/epoch/mean_absolute_error ---> 2.1341843605041504


on_epoch_end
train/epoch/loss ---> 260.7928771972656
train/epoch/mean_absolute_error ---> 9.972155570983887
train/epoch/val_loss ---> 6.655152320861816
train/epoch/val_mean_absolute_error ---> 2.1341843605041504


on_train_end
train/epoch/loss ---> 260.7928771972656
train/epoch/mean_absolute_error ---> 9.972155570983887
train/epoch/val_loss ---> 6.655152320861816
train/epoch/val_mean_absolute_error ---> 2.1341843605041504

As you can see, test/epoch/loss and train/epoch/val_loss are the same, and so are test/epoch/mean_absolute_error and train/epoch/val_mean_absolute_error.
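
For completeness, the equivalence can also be checked programmatically with a small callback that records the on_test_end logs and compares them to the val_-prefixed keys seen in on_epoch_end (a sketch, meant to be run with the same fit() call as above):

from tensorflow import keras

class CompareValidationLogs(keras.callbacks.Callback):
    """Checks that on_test_end logs match the val_-prefixed entries of on_epoch_end logs."""

    def __init__(self):
        super().__init__()
        self._test_logs = {}

    def on_test_end(self, logs=None):
        # during fit(), on_test_end fires right after the validation pass,
        # before on_epoch_end of the same epoch
        self._test_logs = dict(logs or {})

    def on_epoch_end(self, epoch, logs=None):
        for name, value in self._test_logs.items():
            # exact match expected, per the output above
            assert logs[f"val_{name}"] == value, name

Adding callbacks=[CompareValidationLogs()] to the fit() call above should run without any assertion errors.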

Therefore, I'm closing this issue and PR #22 for now.

Let me know if this logic makes sense or if I may have overlooked something.

Note: if I did indeed overlook something, I'm more than happy to reopen this issue and continue the quest for a solution.

@Blaizzy Blaizzy closed this as completed Sep 8, 2022