LoadBalancerCacheManager supports refresh cache, not just expire cache #1215
Comments
Hello, our call chain is A->B, and B has two instances (B1, B2). We set the TTL to 30s and keep sending requests to A. If B2 goes offline, A's cache is not updated. As a result, an error occurs when A calls B2. In this case, traffic to A has to stop before A's cache is updated. I think this is a very big bug. Does it need to be fixed or not?
Hello, @xinlunanxinlunan. First of all, this is not a bug; what you need is a health check before invoking a remote service instance. Secondly, refreshing means that if there is a new list, we replace the old one, but it is irresponsible to clear the list if a new one cannot be obtained. In your scenario, if 50% of the instances are unavailable and the new service list cannot be obtained, using the historical list only drops availability to 50%. If all the instances are expired outright, no instance can be obtained and availability is 0%. Finally, this issue is not for discussing when to update. You'd better create a new issue; I will not continue discussing this topic here.
@jizhuozhi @OlgaMaciaszek Is there any known workaround for this? I was thinking about using
It seems this should behave just like a 30s cache, but it would not clear the cache after refetch-interval. WDYT?
Hello, @sobelek. What you need in this scenario is not scheduling
And scheduling |
Hey. |
Hello, @sobelek. Thanks for your reply. We may need to agree on where a boundary lies. Generally speaking, a health check checks the health status of an instance it is given from the instance list, but it is not responsible for changing the instance list. If the logic of refetching instances is incorporated into health checks, the boundary between service discovery and health checks will be broken, and health checks will no longer have a single responsibility. Of course, I tried to abstract your solution. In fact, what we need is still an event source that triggers an event regularly to notify the loadbalancer that the instance list needs to be refreshed, so our solutions are the same.
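For illustration only, here is a minimal plain-Java sketch of such an event source. It is not Spring Cloud LoadBalancer code: `ScheduledInstanceRefresher`, `onRefreshEvent` and the `Supplier` that stands in for a discovery lookup are hypothetical names made up for this example. A timer fires a refresh event at a fixed interval; the handler swaps in the new list when discovery succeeds and keeps the previous list when it fails.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names exist in Spring Cloud LoadBalancer.
final class ScheduledInstanceRefresher implements AutoCloseable {
    // Last list that was fetched successfully; starts empty.
    private final AtomicReference<List<String>> current = new AtomicReference<>(List.of());
    // Stands in for a service discovery lookup (assumption).
    private final Supplier<List<String>> discovery;
    // Single daemon thread acting as the periodic event source.
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "instance-refresh");
                t.setDaemon(true);
                return t;
            });

    ScheduledInstanceRefresher(Supplier<List<String>> discovery) {
        this.discovery = discovery;
    }

    // Fire a refresh event immediately and then after every interval.
    void start(long intervalMillis) {
        timer.scheduleWithFixedDelay(this::onRefreshEvent, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    // Replace the list if a new one is available; otherwise keep the old one.
    void onRefreshEvent() {
        try {
            current.set(List.copyOf(discovery.get()));
        } catch (RuntimeException discoveryFailure) {
            // Discovery is unavailable: keep serving the previous list.
        }
    }

    // Readers never block on discovery; they only see the latest snapshot.
    List<String> instances() {
        return current.get();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

Callers would invoke `start(...)` once and then read `instances()` on every request. The handler never touches instance health, so the boundary between discovery and health checks stays intact.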
Fully agree with a boundary between discovery and loadbalancing. Thanks for clarifying this. Event-driven discovery or a refreshing cache would both be great and would resolve the original issue👍
Hello, @sobelek. As an aside, in my experience usually only application-layer protocols that support multiplexing (such as Apache Dubbo) perform health checks on the client side, because only a simple ping-pong request is required. But for HTTP/1 (Spring MVC or WebFlux), because each request is a new TCP connection, when the number of instances is very large the health check may issue a large number of in-flight requests and leave lots of CLOSE_WAIT connections after the requests end. As a result, normal requests cannot be processed. The same is true for the server side.
Yea, true that. We are actually using k8s, so health checking is done by k8s out of the box. Back to the original issue: do you have any idea whether you guys are going to work on a refreshAfter cache for Spring LoadBalancer?
Hello, @sobelek. There is no out-of-the-box solution for all
Thanks, I am trying to do this. Would you like me to also make a PR here when finished? |
Of course, welcome. This Issue has been assigned to @OlgaMaciaszek. Do you have any good suggestions @OlgaMaciaszek ? |
Thanks, @jizhuozhi, makes sense; however, I'd like to see a separate property for the refresh interval (can be the value you've proposed as default, but needs to be customisable). |
@sobelek Please confirm if you will be submitting this PR. |
@OlgaMaciaszek |
Ok, please do and ping me here. |
Is your feature request related to a problem? Please describe.
Spring Cloud LoadBalancer currently supports only an expiring cache: after a period of time a cache entry becomes invalid, and the next request for the same content blocks while it queries the registry again for the service instances. This causes latency spikes in an otherwise smooth service.
In addition to causing spikes, this also poses a challenge to high availability. When the cache expires, all cache entries are evicted, so no instances of the related service remain in memory. If the registry goes down at that point, none of the client instances can send a request to the service provider, and the service is completely unavailable. With a refresh cache this has no such effect: if the registry goes down, we just don't get the latest data, and slightly older data is better than no data (in fact, service instances change very infrequently, so the impact of old data is negligible).
Describe the solution you'd like
Based on this, I think it is necessary to provide a refresh cache instead of just an expire cache.
A feasible solution is to support configuring a refresh interval and use DiscoveryClient to fetch service instances, and at the same time support configuring an expiration time to avoid using stale data for too long (so that the problem is exposed).
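To make the two settings concrete, here is a rough plain-Java sketch of a cache with both a refresh interval and an expiration time. It is only an illustration under stated assumptions, not the proposed implementation: `RefreshingInstanceCache` is a hypothetical class, the `Supplier` stands in for a DiscoveryClient lookup, the clock is injected so the behaviour can be checked deterministically, and the refresh runs synchronously on the calling thread for brevity (a real refresh cache would more likely reload in the background).

```java
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper; not part of Spring Cloud LoadBalancer.
final class RefreshingInstanceCache {
    private final Supplier<List<String>> loader;  // stands in for a DiscoveryClient lookup (assumption)
    private final long refreshAfterMillis;        // after this age, try to reload
    private final long expireAfterMillis;         // after this age, stale data is no longer served
    private final LongSupplier clock;             // current time in milliseconds, injectable for tests

    private List<String> instances;               // last successfully loaded list, or null
    private long loadedAt;                        // clock value of the last successful load

    RefreshingInstanceCache(Supplier<List<String>> loader, long refreshAfterMillis,
                            long expireAfterMillis, LongSupplier clock) {
        this.loader = loader;
        this.refreshAfterMillis = refreshAfterMillis;
        this.expireAfterMillis = expireAfterMillis;
        this.clock = clock;
    }

    synchronized List<String> get() {
        long now = clock.getAsLong();
        if (instances != null && now - loadedAt < refreshAfterMillis) {
            return instances;                     // still fresh, no registry call
        }
        try {
            instances = List.copyOf(loader.get()); // refresh: replace the old list
            loadedAt = now;
            return instances;
        } catch (RuntimeException registryFailure) {
            if (instances != null && now - loadedAt < expireAfterMillis) {
                return instances;                 // registry down: slightly older data beats no data
            }
            instances = null;                     // too old: expire and expose the problem
            throw registryFailure;
        }
    }
}
```

With a 30s refresh interval and a longer expiration time, a registry outage shorter than the expiration time is invisible to callers, while a longer outage still surfaces as an error instead of silently serving very old data.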