[RFC] How to handle BC breaking changes on Model weights or hyper-parameters #2955

Open
@datumbox

🚀 Feature

To fix bugs, we are sometimes forced to introduce BC-breaking changes. The process for introducing such changes is clear for code, but not for model weights or hyper-parameters. We should therefore define when, why and how to introduce BC-breaking changes to model weights or model hyper-parameters.

Motivation

We have recently run into a few issues that motivate this. Here are a few examples:

Approaches

There are quite a few different approaches for this:

  1. Replace the old parameters and inform the community about the BC-breaking changes. Example: [DONOTMERGE] Update the accuracy metrics of detection models #2942
    • Reasonable approach when the accuracy improvement is substantial or the effect on the model behaviour is negligible.
    • Keeps the code-base clean from workarounds and minimizes the number of weights we provide.
    • Can potentially cause issues for users who rely on transfer learning.
  2. Write code/workarounds to minimize the effect of the changes on existing models. Example: Overwriting FrozenBN eps=0.0 if pretrained=True for detection models. #2940
    • Reasonable approach when the changes lead to a slight decrease in accuracy.
    • Minimizes the effects on users who used pre-trained models.
    • Introduces ugly workarounds in the code and increases the number of weights we provide.
  3. Introduce versioning for model weights:
    • Appropriate when introducing significant changes to the models.
    • Keeps the code-base clean from workarounds.
    • Forces us to maintain multiple versions of weights and model config.
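
To make approach 3 more concrete, here is a minimal sketch of what a versioned weights registry could look like. Everything in it is hypothetical and for illustration only: the names `WeightsEntry`, `resolve_weights`, `latest_version` and `toy_detector`, the placeholder URLs, and the `frozen_bn_eps` values are assumptions, not an existing torchvision API. The idea is that each weights version carries the inference hyper-parameters it was produced with, so old checkpoints keep their old behaviour while new ones pick up the fix.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class WeightsEntry:
    """One published set of weights plus the inference hyper-parameters it expects."""
    url: str  # placeholder, not a real download location
    hyper_params: Dict[str, float] = field(default_factory=dict)


# Hypothetical registry keyed by (model name, weights version).
# The eps values are illustrative assumptions, not measured settings.
_REGISTRY: Dict[Tuple[str, int], WeightsEntry] = {
    ("toy_detector", 1): WeightsEntry(
        url="https://example.invalid/toy_detector_v1.pth",
        hyper_params={"frozen_bn_eps": 0.0},
    ),
    ("toy_detector", 2): WeightsEntry(
        url="https://example.invalid/toy_detector_v2.pth",
        hyper_params={"frozen_bn_eps": 1e-5},
    ),
}


def latest_version(model: str) -> int:
    """Return the highest registered weights version for a model."""
    versions = [v for (name, v) in _REGISTRY if name == model]
    if not versions:
        raise KeyError(f"no weights registered for {model!r}")
    return max(versions)


def resolve_weights(model: str, version: Optional[int] = None) -> WeightsEntry:
    """Look up a weights entry; default to the latest version when none is pinned."""
    if version is None:
        version = latest_version(model)
    try:
        return _REGISTRY[(model, version)]
    except KeyError:
        raise KeyError(f"{model!r} has no weights version {version}") from None


if __name__ == "__main__":
    # Users who need the old behaviour pin the version explicitly.
    old = resolve_weights("toy_detector", version=1)
    new = resolve_weights("toy_detector")
    print(old.hyper_params, new.hyper_params)
```

With a scheme like this, users who depend on the old behaviour (for example for transfer learning) pin `version=1`, while everyone else gets the latest weights by default. The cost is the one already noted above: every version and its config has to be kept around.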

It's worth discussing whether we want to adapt our approach to the characteristics of each problem or use one approach for all cases. It's also worth investigating whether changes to weights need to be handled differently from changes to hyper-parameters used at inference.

cc @fmassa @cpuhrsch @vfdev-5 @mthrok
