Introduce consistent handling of warm_up for components #8823

Open
mathislucka opened this issue Feb 6, 2025 · 2 comments
Labels
P2 Medium priority, add to the next sprint if no P1 available topic:core type:refactor Not necessarily visible to the users

Comments

@mathislucka
Member

This issue was brought up by @davidsbatista during the review of

We defer time-consuming or resource-intensive operations when initializing components so that loading a pipeline to validate it stays fast and lightweight.

Instead of performing expensive operations in a component's __init__ method, we advise moving them to a warm_up method. The warm_up method should only be called when the intention is to actually run the component.

We also advise that components perform an expensive warm_up only once.
For example, our embedders and rankers load and initialize their model only if it has not already been loaded and initialized.
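
As a rough illustration of that guideline, here is a minimal sketch in plain Python. It does not use the real component API: `SketchEmbedder` and `FakeModel` are hypothetical names made up for this example, and `FakeModel` merely stands in for an expensive model load.

```python
from typing import Dict, List, Optional


class FakeModel:
    """Hypothetical stand-in for an expensive model; counts constructions."""

    load_count = 0

    def __init__(self, name: str) -> None:
        FakeModel.load_count += 1  # pretend this is the slow part
        self.name = name

    def embed(self, text: str) -> List[float]:
        # Toy "embedding": a single number derived from the text length.
        return [float(len(text))]


class SketchEmbedder:
    """Hypothetical component: cheap __init__, guarded warm_up."""

    def __init__(self, model_name: str) -> None:
        # Only store configuration here so construction stays lightweight.
        self.model_name = model_name
        self._model: Optional[FakeModel] = None

    def warm_up(self) -> None:
        # Load the model only if it has not been loaded before.
        if self._model is None:
            self._model = FakeModel(self.model_name)

    def run(self, text: str) -> Dict[str, List[float]]:
        if self._model is None:
            raise RuntimeError("Component was not warmed up; call warm_up() first.")
        return {"embedding": self._model.embed(text)}
```

With this guard, calling `warm_up()` repeatedly constructs `FakeModel` only once per component instance.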

In the run method of our Pipeline, we call warm_up on each component to make sure that we never run a component that hasn't been warmed up.

However, if a component doesn't follow our guideline of checking whether it has already been warmed up, slow operations might run on every invocation of the pipeline's run method.
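
To make the failure mode concrete, here is a small sketch. `ToyPipeline` and `UnguardedComponent` are hypothetical and only mimic the behavior described above (warm_up called on every run); they are not the actual Pipeline implementation.

```python
from typing import Any, List


class UnguardedComponent:
    """Hypothetical component whose warm_up does not check prior state."""

    def __init__(self) -> None:
        self.loads = 0

    def warm_up(self) -> None:
        # No guard: the (pretend) expensive load repeats on every call.
        self.loads += 1

    def run(self, value: Any) -> Any:
        return value


class ToyPipeline:
    """Hypothetical pipeline that warms up every component on each run."""

    def __init__(self, components: List[Any]) -> None:
        self.components = components

    def run(self, value: Any) -> Any:
        # Warm up everything that exposes warm_up, on every invocation.
        for component in self.components:
            if hasattr(component, "warm_up"):
                component.warm_up()
        # Then pass the value through the components in order.
        for component in self.components:
            value = component.run(value)
        return value
```

Running such a pipeline three times triggers three loads in the unguarded component, which is exactly the cost the guideline is meant to avoid.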

We should agree on an approach to make sure that this doesn't happen.

@mathislucka mathislucka added topic:core type:refactor Not necessarily visible to the users labels Feb 6, 2025
@mathislucka mathislucka changed the title Introduce consistent handling of warm_up for components. Introduce consistent handling of warm_up for components Feb 6, 2025
@julian-risch julian-risch added the P2 Medium priority, add to the next sprint if no P1 available label Feb 10, 2025
@sjrl
Contributor

sjrl commented Feb 10, 2025

@julian-risch I believe your issue from here is also related. Perhaps we can merge the two?

@julian-risch julian-risch marked this as a duplicate of #8769 Feb 17, 2025
@julian-risch
Member

I have now closed the other issue as a duplicate. Here is the context from that issue.

Currently, the pipeline calls warm_up on every component that implements it each time the pipeline is run,
even though we say in PipelineBase:

It's the node's responsibility to make sure this method can be called at every Pipeline.run() without re-initializing everything.

There is also a ToDo in the pipeline implementation stating:

    # TODO: Remove this warmup once we can check reliably whether a component has been warmed up or not
    # As of now it's here to make sure we don't have failing tests that assume warm_up() is called in run()

We should make this more consistent, for example by introducing an is_warmed_up() method for components that implement warm_up(). It could be hidden from the user and automatically return True after warm_up() has been called for the first time.
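
One possible shape for this, sketched in plain Python, is shown below. It is only an assumption about how such a flag could be set automatically: `WarmUpOnce` is a hypothetical mixin, not an existing class, and real components may need a different mechanism (for example, hooking into whatever registers them as components).

```python
import functools
from typing import Any


class WarmUpOnce:
    """Hypothetical mixin: wraps a subclass's warm_up so it runs only once."""

    _warmed_up = False

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        original = cls.__dict__.get("warm_up")
        if original is None:
            return  # this subclass does not define its own warm_up

        @functools.wraps(original)
        def guarded(self: Any, *args: Any, **kw: Any) -> Any:
            if self._warmed_up:
                return None  # already warmed up, skip the expensive work
            result = original(self, *args, **kw)
            self._warmed_up = True  # set only after warm_up succeeded
            return result

        cls.warm_up = guarded

    def is_warmed_up(self) -> bool:
        return self._warmed_up


class Loader(WarmUpOnce):
    """Example component whose warm_up has no guard of its own."""

    def __init__(self) -> None:
        self.loads = 0

    def warm_up(self) -> None:
        self.loads += 1  # pretend this is an expensive model load
```

In this sketch the component author writes an ordinary warm_up, and a caller such as the pipeline could check is_warmed_up() before deciding whether to call it. The flag is set only after warm_up returns without raising, so a failed warm-up can be retried.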


3 participants