-
Notifications
You must be signed in to change notification settings - Fork 410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tpu-client-next: add client selection to validator #5201
tpu-client-next: add client selection to validator #5201
Conversation
If this PR represents a change to the public RPC API:
Thank you for keeping the RPC clients in sync with the server API @KirillLykov. |
0be26b3
to
26233f7
Compare
Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis. |
rpc/src/rpc_service.rs
Outdated
))?; | ||
|
||
let client = TpuClientNextClient::new( | ||
runtime.handle().clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we use for the client code the same runtime as used for RPC itself. When using ConnectionCache, it is contrary uses it's own runtime.
This runtime used by rpc-json server and internally we use it to do blocking spawn. It looks logical to use the same runtime also for sending over the network because it decreases the complexity of the system, increases utilization of the given runtime. Due to limited resources, it seems to be advantageous to use less runtimes to avoid competition between them for the idle core. But maybe there are might be some other considerations, @alexpyattaev .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to pass the runtime handle from a higher layer function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why but the runtime was introduced to be owned by rpc instead of validator (as for other services). This is outside of the scope of this PR.
@@ -600,8 +790,10 @@ pub fn service_runtime( | |||
// NB: `rpc_blocking_threads` shouldn't be set too high (defaults to num_cpus / 2). Too many | |||
// (busy) blocking threads could compete with CPU time with other validator threads and | |||
// negatively impact performance. | |||
let rpc_threads = 1.max(rpc_threads); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two checks were just moved inside this function, nothing new.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting to see how this performs.
Is this ::create_foo()
approach preferable to one of these?
- Split into
JsonRpcServiceLegacy
andJsonRpcService
. Make the choice of the correct one based on the config. Signature stays::new()
for both. Share as much implementation as appropriate.if config.use_tpu_client_next { JsonRpcService::new(/* ... */) } else { JsonRpcServiceLegacy::new(/* ... */) }.map_err(ValidatorError::Other)?
- Keep
JsonRpcService
but change the::new()
method to accept the union of both configs, match on them, and do the right thing internally.let config = if config.use_tpu_client_next { JsonRpcServiceNextConfig { /* ... */ } } else { JsonRpcServiceConfig { /* ... */ } }; let (json_rpc_service, client_updater) = JsonRpcService::new( config, ).map_err(ValidatorError::Other)?;
Anyway, whatever is fine if it works! Just suggestions.
@steveluscher the problem with approach (1) is that it breaks existing API of public crate RPC. So what can be done is to add a new Regarding why I used To the question why do we need this client to be returned: there is Admin RPC call that changes the validator identity and in this I need to close all the client connections and create a new connection with a new certificate (that uses this new validator key). |
You can break public API if you bump the version of the crate. As long as semver is observed breaking API is ok. |
@steveluscher what do you think about introducing a Builder? see 14a08c7 Why i pass two arguments to build instead of having them as part of the Builder structure: because I must use |
2574afd
to
14a08c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicer! Still really don't like the long list of positional arguments, and would rather a config struct, but I'm happy to leave that up to a stylistic preference.
cool(
true,
true,
0,
true,
1,
false,
true,
"0",
true,
"true",
)
… number of lines added to validator.rs has decreased which I think was your concern.
My main concern is actually how easy it will be to delete the deprecated path without leaving JsonRpcService
in a weird state (ie. with weird method names that no longer make sense). My optimization criteria would be:
- be able to delete the old code in as few steps as possible
- once deleted, the remaining code looks like it would if you'd written it from scratch
@steveluscher makes sense. I rewrote this with |
Probably, would be good to get feedback from @t-nelson since these changes are to be backported to 2.2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I missed this! I didn't feel the need to re-review because it was only a suggestion. I like this better, for sure, though!
Add a flag `use_tpu_client_next` to the validator configuration (without cla for now) which allows to use tpu-client-next instead of `ConnectionCache` for RPC. (cherry picked from commit 5df0d56)
i don't understand why the config type was introduced in this change set. why was it not done in a preliminary pr? it is totally independent change and obscures the one that this pr claims to make |
I see what you mean: that this PR description is saying that this PR is about the flag but it also adds quite some code to RPC. Originally, these changes on RPC side were not part of this PR, I introduced them while addressing PR comments. So the point of attention has drifted a bit which creates confusion. But adding this flag alone would be a very small change of introducing a flag which is not used anywhere. |
Problem
This PR adds a flag
use_tpu_client_next
to the validator configuration (without cla for now) which allows to use tpu-client-next instead of ConnectionCache for RPC.Summary of Changes