Parallel updates of VPAs and checkpoints. #7951

Open
tkukushkin opened this issue Mar 19, 2025 · 22 comments
Assignees
Labels
area/vertical-pod-autoscaler · help wanted (Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.) · kind/feature (Categorizes issue or PR as related to a new feature.) · triage/accepted (Indicates an issue or PR is ready to be actively worked on.)

Comments

@tkukushkin

Which component are you using?:

/area vertical-pod-autoscaler

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:

Hi! We have more than 2.7k VPAs in our cluster and we ran into a problem: the recommender becomes really slow, and one recommendation cycle takes more than 10 minutes.

We've tried updating all VPAs and all checkpoints in parallel with simple wait groups, relying on the rate limit from --kube-api-qps, and it works pretty well: the UpdateVPAs step takes around 9 seconds and MaintainCheckpoints takes 13 seconds in our cluster.

This also allowed us to remove the --min-checkpoints and --checkpoints-timeout options because, IMHO, they no longer make sense.

I pushed a modified version for reference: master...tkukushkin:autoscaler:parallel-updates
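Roughly, the idea looks like the following. This is a minimal, self-contained sketch rather than the actual code from that branch; updateVPAStatus and the VPA names are placeholders for illustration only.

package main

import (
	"context"
	"log"
	"sync"
)

// updateVPAStatus stands in for the recommender's real per-VPA status update
// call; it is a placeholder for illustration only.
func updateVPAStatus(ctx context.Context, vpaName string) error { return nil }

// updateAllVPAs fans out one goroutine per VPA. All API calls still go through
// the same client-go token-bucket limiter (--kube-api-qps / --kube-api-burst),
// so the effective request rate stays bounded by that limit.
func updateAllVPAs(ctx context.Context, vpaNames []string) {
	var wg sync.WaitGroup
	for _, name := range vpaNames {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if err := updateVPAStatus(ctx, name); err != nil {
				log.Printf("failed to update VPA %s: %v", name, err)
			}
		}(name)
	}
	wg.Wait()
}

func main() {
	updateAllVPAs(context.Background(), []string{"ns1/vpa-a", "ns1/vpa-b"})
}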

Describe the solution you'd like.:

I'm not good at Go, otherwise I would open a Pull Request myself, but I think an approach like this could be implemented.

I'm not completely sure the --min-checkpoints and --checkpoints-timeout options should be removed.

Also, maybe it's better to add a separate rate limit for update operations.

tkukushkin added the kind/feature label Mar 19, 2025
@adrianmoisey
Member

Hi! We have more than 2.7k VPAs in our cluster and we ran into a problem: the recommender becomes really slow, and one recommendation cycle takes more than 10 minutes.

Wow! That's pretty darn big!

Thanks for the code change and description; I think we could improve something here. We'll see if someone wants to take the issue and how they'd solve it.
/help-wanted

@adrianmoisey
Member

/help
/triage accepted

@k8s-ci-robot
Contributor

@adrianmoisey:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help
/triage accepted

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot added the triage/accepted and help wanted labels Mar 19, 2025
@omerap12
Member

Thanks for this! I really like that approach :) Updating the VPAs concurrently is definitely a great idea.
I do think there's something off with min-checkpoints, but I disagree about checkpoints-timeout.

/assign

@adrianmoisey
Member

I'm wondering if this will require a setting to control concurrency, but I'll wait for a PR before I make more comments :P

@omerap12
Member

I'm wondering if this will require a setting to control concurrency, but I'll wait for a PR before I make more comments :P

Yeah, I see your point. I'll dig into this and see if we can come up with a smarter approach. In any case, I plan to put "concurrent mode" behind a flag (defaulting to false for now) so users can choose between both modes. WDYT?

@adrianmoisey
Member

Yeah, I see your point. I'll dig into this and see if we can come up with a smarter approach. In any case, I plan to put "concurrent mode" behind a flag (defaulting to false for now) so users can choose between both modes. WDYT?

That makes perfect sense to me

@voelzmo
Contributor

voelzmo commented Mar 20, 2025

Hey @tkukushkin,

looking at the numbers you provide, it seems that you left the default values for the recommender client-side qps settings:

flag.Float64Var(&cf.KubeApiQps, "kube-api-qps", 5.0, "QPS limit when making requests to Kubernetes apiserver")
flag.Float64Var(&cf.KubeApiBurst, "kube-api-burst", 10.0, "QPS burst limit when making requests to Kubernetes apiserver")

2700 VPAs / 5 QPS = 540 s, i.e. 9 minutes, so my guess is it should take about 9 minutes to run UpdateVPAs and then some additional time until the minimum number of checkpoints has been updated.

As you can see, those values (5 requests/s to the KAPI) are not fit for large-scale usage. There was an upstream discussion about removing client-side rate limiting altogether, now that API Priority and Fairness is available for the KAPIs; maybe we could pick this up for VPA as well and remove the client-side rate limiting entirely?
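For context, client-go only builds its token-bucket limiter when QPS is positive, so a negative QPS on the client config effectively turns client-side throttling off and leaves throttling to server-side API Priority and Fairness. A minimal sketch (not a concrete proposal for the VPA flags; the kubeconfig path is illustrative):

package main

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client config from a kubeconfig path (path is illustrative).
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	// A negative QPS makes client-go skip its token-bucket limiter entirely,
	// leaving throttling to server-side API Priority and Fairness.
	config.QPS = -1
	config.Burst = -1
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	fmt.Printf("built clientset %T without client-side throttling\n", clientset)
}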

@voelzmo
Contributor

voelzmo commented Mar 20, 2025

As additional context: vpa-recommender has some pretty good instrumentation, allowing you to keep an eye on how long the individual steps in its loop take. A few years back, I had an issue showing the histograms for those steps in action – we had similar problems in large-scale environments like you do now: #4498

@tkukushkin
Author

Hey @voelzmo,

looking at the numbers you provide, it seems that you left the default values for the recommender client-side qps settings:

I didn't change the defaults; we override them through the command-line options of our recommender's Deployment. Our current values are --kube-api-qps=200 --kube-api-burst=200.

There was kubernetes/kubernetes#111880 about removing client-side rate limiting altogether, now that API Priority and Fairness is available for the KAPIs; maybe we could pick this up for VPA as well and remove the client-side rate limiting entirely?

My knowledge about it is really poor, so I just trust your opinion.

As additional context: vpa-recommender has some pretty good instrumentation, allowing you to keep an eye on how long the individual steps in its loop take.

Yeah, we know about these metrics and monitor them.

@voelzmo
Contributor

voelzmo commented Mar 20, 2025

Oh that's so confusing, because the numbers seemed to line up so nicely! Which version of VPA are you running?

Maybe the recommender needs more than one call to update a VPA status then?
2700/200 = 13.5, so this whole thing should finish in <14 seconds. Could you try increasing the QPS limits to e.g. 1000?

@tkukushkin
Author

Which version of VPA are you running?

We're running 1.2.1.

Maybe the recommender needs more than one call to update a VPA status then?

As far as I understand, it's one call to update a VPA and one call to update a checkpoint. Updating the VPAs takes 9 seconds and updating the checkpoints takes 13 seconds, so in total one iteration takes less than 25 seconds, which is totally fine for us.

Could you try increasing the QPS limits to e.g. 1000?

I wouldn't like to increase the load on our Kubernetes masters that much.

@voelzmo
Contributor

voelzmo commented Mar 20, 2025

Oh, lol, sorry – I misread seconds for minutes. Not enough coffee this morning ☕

@voelzmo
Contributor

voelzmo commented Mar 20, 2025

No, wait, I did read "minutes" in your original post:

Hi! We have more than 2.7k VPAs in our cluster and we ran into a problem: the recommender becomes really slow, and one recommendation cycle takes more than 10 minutes.

In the scenario where one loop took 10 minutes to execute:

  • what were your qps settings?
  • how long did the steps take?

I understand that with your modified code the UpdateVPAs step takes 9 seconds and MaintainCheckpoints takes 13 seconds. It makes sense that this is fast enough for you ;)

I wouldn't like to increase the load on our Kubernetes masters that much.

That's what the goroutine-based solution does as well, doesn't it?
The reason why using goroutines is so much faster is that it effectively skips the client-side rate limiting by doing more queries in less time. You should see similar execution times with high enough QPS settings.

@omerap12
Member

That's what the goroutine-based solution does as well, doesn't it? The reason why using goroutines is so much faster is that it effectively skips the client-side rate limiting by doing more queries in less time. You should see similar execution times with high enough QPS settings.

So if I got it right, we should completely remove client-side rate limiting (regardless of this issue), since Kubernetes has had built-in flow control (https://kubernetes.io/docs/concepts/cluster-administration/flow-control/) since version 1.20, right?

@tkukushkin
Author

what were your qps settings?

If memory serves me well we had default settings.

how long did the steps take?

I don't have this information anymore, it was long ago 😞

But I remember that even after this change, the default rate limit was not enough to fit into 1 minute, so we increased it through the CLI options.

That's what the goroutine-based solution does as well, doesn't it?
The reason why using goroutines is so much faster is that it effectively skips the client-side rate limiting by doing more queries in less time.

I don't get this point, sorry. Could you please explain why it skips client-side rate limiting?

I believe the goroutine-based solution makes as many concurrent requests as the client-side rate limit allows.

And I see a lot of logs like:

Waited for 9.935918784s due to client-side throttling, not priority and fairness, ...

You should see similar execution times with high enough QPS settings.

The current approach makes only one request at a time. Even if the rate limit is high, say 1000 QPS, the API server would have to reply within 1 ms for the recommender to actually make 1000 requests per second.
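As a rough illustration (the 20 ms latency is a made-up number): sequentially, 20 ms per request caps you at ~50 requests/s, so 2700 updates take ~54 seconds no matter how high the QPS limit is set; with, say, 50 requests in flight concurrently, the same work takes on the order of a second, still bounded by the client-side QPS limit and by whatever the API server can absorb.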

@tkukushkin
Author

I apologize for misleading you with the numbers by not providing the settings under which we measured them.

@voelzmo
Contributor

voelzmo commented Mar 20, 2025

Yeah, sorry, I'm easily confused today. Great point about the query latency, which I absolutely wasn't accounting for!

So to summarize:

  • we understand why an unmodified vpa-recommender with default QPS settings takes ~10 minutes to process a single loop at the scale you're running it at:
    • client-side rate limiting alone accounts for at least 9 minutes (even with a query latency of 0)
    • query latency can make this even worse
  • we understand why a concurrent approach can be helpful:
    • client-side rate limit increases are only helpful up to a certain point. Once you reach a certain scale, it is the query latency to the KAPI that limits your VPA update queries per second
  • we understand that we need/want an option to control the amount of concurrency:
    • you showed that kube-api-qps and kube-api-burst can achieve this implicitly, because client-go seems to be able to track this in a goroutine-safe manner (TIL, I wasn't aware that this works). We still end up creating a goroutine for each VPA, though, even if they cannot do their work concurrently due to client-side rate limiting
    • we can also think about doing it explicitly by limiting the number of goroutines and distributing the VPAs across them (see the sketch after this list)

Does that make sense?
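To illustrate the explicit variant from the last bullet (a sketch only; updateVPAStatus, the VPA names, and the worker count are placeholders): a fixed pool of workers pulls VPA names from a channel, so the number of concurrent updates is capped regardless of the QPS/burst settings.

package main

import (
	"context"
	"log"
	"sync"
)

// updateVPAStatus stands in for the real per-VPA update call.
func updateVPAStatus(ctx context.Context, vpaName string) error { return nil }

// updateAllVPAsBounded distributes the VPAs across a fixed number of workers,
// so the number of concurrent API calls is limited explicitly rather than only
// by the client-side QPS/burst settings.
func updateAllVPAsBounded(ctx context.Context, vpaNames []string, workers int) {
	jobs := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range jobs {
				if err := updateVPAStatus(ctx, name); err != nil {
					log.Printf("failed to update VPA %s: %v", name, err)
				}
			}
		}()
	}
	for _, name := range vpaNames {
		jobs <- name
	}
	close(jobs)
	wg.Wait()
}

func main() {
	updateAllVPAsBounded(context.Background(), []string{"ns1/vpa-a", "ns1/vpa-b"}, 10)
}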

@tkukushkin
Author

tkukushkin commented Mar 20, 2025

Yes, it totally makes sense.

Btw, I've just tested the unmodified version of the recommender with our current rate-limit configuration, and one iteration takes more than 6 minutes, even with a timeout error from MaintainCheckpoints. So it confirms that concurrency really helps here.

@adrianmoisey
Member

So to summarize:

  • we understand why an unmodified vpa-recommender with default QPS settings takes ~10 minutes to process a single loop at the scale you're running it at:

    • client-side rate limiting alone accounts for at least 9 minutes (even with a query latency of 0)
    • query latency can make this even worse
  • we understand why a concurrent approach can be helpful:

    • client-side rate limit increases are only helpful up to a certain point. Once you reach a certain scale, it is the query latency to the KAPI that limits your VPA update queries per second
  • we understand that we need/want an option to control the amount of concurrency:

    • you showed that kube-api-qps and kube-api-burst can achieve this implicitly, because client-go seems to be able to track this in a goroutine-safe manner (TIL, I wasn't aware that this works). We still end up creating a goroutine for each VPA, though, even if they cannot do their work concurrently due to client-side rate limiting
    • we can also think about doing it explicitly by limiting the number of goroutines and distributing the VPAs across them

This matches my understanding of the system.

If concurrency is increased too high, the QPS and burst limits in client-go get hit.

I assume a goroutine per VPA (as in the current solution) is overkill; I think we need to set a sane (safe?) default, but also make it configurable, should users need to increase the throughput.
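For example, following the style of the existing recommender flags (the flag name, config field, and default below are purely illustrative, not an agreed-upon interface):

// Hypothetical flag: caps how many VPA/checkpoint updates run in parallel.
flag.IntVar(&cf.ConcurrentUpdateWorkers, "concurrent-update-workers", 10, "Maximum number of VPA object/checkpoint updates performed concurrently")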

@omerap12
Member

Yeah, sorry, I'm easily confused today. Great point about the query latency, which I absolutely wasn't accounting for!

So to summarize:

  • we understand why an unmodified vpa-recommender with default QPS settings takes ~10 minutes to process a single loop at the scale you're running it at:

    • client-side rate limiting alone accounts for at least 9 minutes (even with a query latency of 0)
    • query latency can make this even worse
  • we understand why a concurrent approach can be helpful:

    • client-side rate limit increases are only helpful up to a certain point. Once you reach a certain scale, it is the query latency to the KAPI that limits your VPA update queries per second
  • we understand that we need/want an option to control the amount of concurrency:

    • you showed that kube-api-qps and kube-api-burst can achieve this implicitly, because client-go seems to be able to track this in a goroutine-safe manner (TIL, I wasn't aware that this works). We still end up creating a goroutine for each VPA, though, even if they cannot do their work concurrently due to client-side rate limiting
    • we can also think about doing it explicitly by limiting the number of goroutines and distributing the VPAs across them

Does that make sense?

Thanks for the great summary!

@omerap12
Member

/unassign
/assign @voelzmo
