ci: Rerun tests that fail due to networking issues. #2029
base: main
Conversation
I observed a failed test. If there are any additional issues, let me know, and I'll add them to the PR.
There are some more integration tests.
It seems that we need to rerun tasks with
@Samoed What exceptions do you have for these tests? I'd like to cover them using the CLI call since there will be more than one test, ensuring that all errors are caught.
I thought of something similar, like
Currently, I am handling the requests.exceptions.ReadTimeout error through the CLI call, as I observed this issue in the actions. If additional exceptions occur, they will also need to be identified and added to the rerun list.
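For illustration, a pytest-rerunfailures invocation along these lines (a sketch; the exact retry counts and exception list in this PR may differ) would rerun only tests that fail with that timeout:

```sh
# Sketch of a CI invocation using pytest-rerunfailures:
# rerun failing tests up to 3 times, but only when the failure
# message matches the network timeout seen in the actions.
pytest tests/ --reruns 3 --reruns-delay 5 \
    --only-rerun "requests.exceptions.ReadTimeout"
```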
```
@@ -84,6 +84,12 @@ async def check_datasets_are_available_on_hf(tasks):
        assert False, f"Datasets not available on Hugging Face:\n{pretty_print}"


@pytest.mark.flaky(
```
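The hunk above is truncated at the marker call. As a sketch of what a per-test marker could look like with pytest-rerunfailures (the test name and arguments here are illustrative, not the PR's actual code, and only_rerun on the marker assumes a recent plugin version):

```python
import pytest


# Illustrative only: rerun up to 3 times with a short delay; only_rerun
# restricts reruns to failures matching the listed error.
@pytest.mark.flaky(
    reruns=3,
    reruns_delay=5,
    only_rerun=["requests.exceptions.ReadTimeout"],
)
def test_dataset_download():  # hypothetical test, for illustration
    ...
```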
Is there any reason why we can't simply use this on its own? I would definitely prefer that.
I would advocate doing this only for specific cases; otherwise, we would need to add the marker to many functions, and everyone adding a new test would have to consider it as well. Instead, I'll add it as an argument in pyproject.toml. This should make it cleaner and apply to all cases automatically.
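As a sketch of the kind of pyproject.toml entry meant here (the exact flags and exception list are assumptions, not this PR's final config):

```toml
[tool.pytest.ini_options]
# Apply reruns globally via pytest-rerunfailures, restricted to the
# network error observed in CI.
addopts = "--reruns 3 --reruns-delay 5 --only-rerun requests.exceptions.ReadTimeout"
```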
I think we only have 2 integration tests: the CLI test and, I believe, the reproducible workflow. I would much rather add specific handling to these than add generic handling overall. It also documents, when a test fails locally with e.g. HTTPConnectionError, that this is an expected error for that test. If we add it in pyproject.toml, most people will not see it.
I would argue that this is precisely the approach we should be pursuing. Since the issue appears to be related to networking on a part of the system that we do not control, I believe this is an error that new contributors should not need to account for when adding new tests. Additionally, long-time contributors might overlook this when reviewing the PR.
Therefore, I strongly recommend adding the relevant configurations to the pyproject.toml file. This would allow maintainers to easily document errors that affect all tests. Our goal should be to ensure that contributors, especially those who may not anticipate such issues, can contribute without unnecessary complexity.
Moreover, the test results indicate that a failing test is executed multiple times (the reruns). While I do not expect the test to fail after this change, should it happen, we should consider increasing the timeouts to prevent future issues.
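If the timeouts come from Hugging Face Hub downloads, one way to raise them, as a sketch (assuming the failures originate in huggingface_hub and its timeout environment variable applies), is to set the timeout before running the suite:

```sh
# Sketch: raise the huggingface_hub download timeout (default is 10 seconds)
# for the CI run, then execute the test suite as usual.
export HF_HUB_DOWNLOAD_TIMEOUT=30
pytest tests/
```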
Well, it is always possible to add it after a PR, and it is not an overly huge problem if it gets merged. I also generally don't think that integration tests should be added by new contributors.
@Samoed would love to get a second opinion here
We have some tests that download data from HF or GitHub; most obviously, test_benchmark_integration_with_sentencetransformers and test_benchmark_integration_with_datasets. Then there are:
- test_model_memory_usage
- test_mteb_rerank
- test_cli
- test_encoder_interfaces
- test_reproducible_workflow
I used the files instead of the test functions, and I think there are almost 10 (or maybe more) tests that involve a network connection. I don't think we should add a decorator to each of these tests; we can use some global config instead (CLI or pyproject).
I think there are enough eyes on this PR. I'll sit this one out.
#2012: Rerun test cases that frequently fail in the pipeline. After 3 unsuccessful reruns, the test cases will fail.
Code Quality
- Run `make lint` to maintain consistent style.

Documentation

Testing
- Tested using `make test-with-coverage`.
- Run `make test` or `make test-with-coverage` to ensure no existing functionality is broken.

Closes: #2012
cc: @KennethEnevoldsen @Samoed @isaac-chung