Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug fix in inference_endpoint wait function for proper waiting on update #2867

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

Ajinkya-25
Copy link

#2767

Description of Changes in InferenceEndpoint.wait()

Problem Statement:

Previously, the wait() function in the InferenceEndpoint class waited for the endpoint to become available but did not properly handle the case where the endpoint was updating. This could lead to premature exits before the update was fully completed.

Changes Implemented:

Added a check for InferenceEndpointStatus.UPDATING and InferenceEndpointStatus.UPDATE_FAILED

Before considering the endpoint as ready, the function now ensures that the endpoint is not in the updating state.
This prevents returning an endpoint that is still in the process of updating.

Potential Issues Fixed:

Previously, wait() could return immediately after an update request, even if the endpoint was still being updated.
The function is now more robust and ensures that users do not interact with an endpoint that is still updating.

Copy link
Contributor

@hanouticelina hanouticelina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Ajinkya-25, thanks a lot for your contribution, I left comments to make the approach more robust :)
also, could you add a test in huggingface_hub/tests/test_inference_endpoints.py similar to this one please?

Comment on lines 213 to 216
if response.status_code == 200 and self.status not in [
InferenceEndpointStatus.UPDATING,
InferenceEndpointStatus.UPDATE_FAILED,
]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could allow some intermediate states (like INITIALIZING) to pass through if they return a 200 status code (not sure if it's the case)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks For Reviewing And Suggesting The Changes. I have implemented the suggested changes as they provide a more robust solution to the bug.

src/huggingface_hub/_inference_endpoints.py Outdated Show resolved Hide resolved
Ajinkya-25 and others added 3 commits February 19, 2025 10:46
improve code clarity and added logging based on review

Co-authored-by: Célina <[email protected]>
…ddition of test case in test_inference_endpoint for testing changes in wait function
@Ajinkya-25 Ajinkya-25 force-pushed the fix-endpoint-wait-update branch from b0e8889 to e565f45 Compare February 19, 2025 05:17
Copy link
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good and clean PR, thanks @Ajinkya-25 ! I've left two comments in the test section as I think we should focus the test on "if an endpoint is updating, we should not return until it is "running" again, even if an URL is already provisioned for it".

tests/test_inference_endpoints.py Show resolved Hide resolved
tests/test_inference_endpoints.py Outdated Show resolved Hide resolved
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment on lines +287 to +291
responses = chain(
[InferenceEndpoint.from_raw(MOCK_UPDATE, namespace="foo")] * 3,
repeat(InferenceEndpoint.from_raw(MOCK_RUNNING, namespace="foo")),
)
mock_get_inference_endpoint.side_effect = lambda *args, **kwargs: next(responses)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though this is perfectly valid, I prefer the way you did it previously with an explicit list (it makes it more readable IMO). Is the problem that you need to repeat the last side effect (RUNNING) an unknown number of times?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I first used a list with three MOCK_UPDATE responses, but it caused a StopIteration error. After some research, I realized it was because the iterator ran out of values. Switching to chain() and repeat() fixed it by making sure the test keeps returning MOCK_RUNNING after the updates, preventing the error. Still learning testing, but this worked well! 🚀. Also this test case needs to set status_code to 200 for testing and as i am writing test case first time it was hard to set it as 200 can some function be added to testing files to set all this parameters(maybe a new feature). Thanks for the support.

Copy link
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but let's wait for @hanouticelina 's final review before merging :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants