feat: Seamless Concurrency #46
(the magic number for I/O might be 2 less than thread count on Write, but 1 on Read, and 0 on hard-core number crunching)
There's no universal "magic number" for concurrency levels, as the optimal value depends heavily on the nature of the task and external constraints. Let me break this down.

I/O Bound Tasks

For I/O-bound operations (the primary use case for rill), the optimal concurrency level is often much higher than the number of CPUs. In practice, starting with a small concurrency level like 5 or 10 often provides good results. Some examples:
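A minimal sketch of one such setup, assuming rill's FromSlice and ForEach keep their documented signatures (the URLs and the concurrency level of 10 are placeholders, not recommendations):

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/destel/rill"
)

func main() {
	// Wrap a slice of URLs into a stream of rill.Try values.
	urls := rill.FromSlice([]string{
		"https://example.com/a",
		"https://example.com/b",
		"https://example.com/c",
	}, nil)

	// Fetch with an explicit concurrency level of 10: more than most
	// CPU counts would suggest, which is fine for network-bound work.
	// The level is a starting point to be tuned by trial and error.
	err := rill.ForEach(urls, 10, func(url string) error {
		resp, err := http.Get(url)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		fmt.Println(url, resp.Status)
		return nil
	})
	if err != nil {
		fmt.Println("pipeline failed:", err)
	}
}
```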
The right settings are typically determined through trial and error, considering the nature of the workload and the limits of any external systems involved.

For databases specifically, both concurrency and batch size need to be tuned together to find the right balance between making the application fast and not overwhelming the database.

CPU Bound Tasks

For CPU-intensive operations, the approach depends on the task size.
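To make the database case concrete, here's a sketch of tuning those two knobs together, assuming rill's Batch and ForEach signatures; saveBatch and all the numbers are placeholders:

```go
package main

import (
	"fmt"
	"time"

	"github.com/destel/rill"
)

// saveBatch stands in for a bulk INSERT/UPDATE against a database.
func saveBatch(batch []int) error {
	fmt.Println("writing batch of", len(batch))
	return nil
}

func main() {
	ids := rill.FromSlice([]int{1, 2, 3, 4, 5}, nil)

	// Group items into batches of up to 100, flushing a partial batch
	// after 100ms of inactivity.
	batches := rill.Batch(ids, 100, 100*time.Millisecond)

	// Write batches with a modest concurrency of 5, so the database sees
	// fewer, larger requests. Batch size and concurrency are the two
	// knobs to tune together against the database's limits.
	err := rill.ForEach(batches, 5, func(batch []int) error {
		return saveBatch(batch)
	})
	if err != nil {
		fmt.Println("pipeline failed:", err)
	}
}
```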
Please let me know if this answer was helpful.
Your I/O example was network-based, and I agree that where there's bad code (no pipelining), you want more requests in flight than cores. I do disagree that there's no reasonable rule of thumb a library could provide. A big disclaimer at the top of the function's documentation would be fine.
I apologize - I initially misunderstood your request. You're suggesting adding a helper function that provides sensible defaults for concurrency levels based on different workloads. While I appreciate the suggestion, the library is intentionally designed to let users explicitly choose the concurrency level based on their specific needs.
When I first started learning Go, the question came up (as it naturally does for others, I'd expect): do I just blindly call go for every invocation (100,000 times) and use a WaitGroup? Do I spin up a magic number of workers, like 7 on an 8-thread processor? Or what do I do (channels! who calls close on the sending side, oh, one per worker? oh... channels...)?
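For reference, the hand-rolled version of that question is a bounded worker pool, roughly like this sketch (the worker count of 7 is exactly the arbitrary magic number in question, and the sending side is the one that calls close):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	jobs := make(chan int)
	var wg sync.WaitGroup

	// Spin up a "magic number" of workers instead of one goroutine
	// per invocation.
	const workers = 7
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				fmt.Println("processed", j)
			}
		}()
	}

	// The sending side owns the channel, so it is the one that closes
	// it; the workers simply drain it until it's empty.
	for j := 0; j < 100000; j++ {
		jobs <- j
	}
	close(jobs)

	wg.Wait()
}
```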
There have since been hints like GOMAXPROCS and similar, along with other neat libraries that read cgroup limits and set that environment variable. This is great stuff, and I want to use it for some large jobs, but I don't know what number to put in, and from the examples the choices appear quite haphazard.
If GOMAXPROCS is set, I'd prefer there be some function I can call in rill to guess the magic number of workers (maybe with a parameter for the type of workload I think it is), and if it isn't set, have rill estimate what a correct amount of concurrency would be. Obviously there's no right answer, but a concurrency of 2 on a 4-thread box is very different from 2 on a 64-thread box, and this is the crux of the issue.
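A hypothetical sketch of what such a helper could look like (GuessConcurrency and Workload are invented names, not part of rill, and the adjustments just restate the rule of thumb quoted earlier in this thread):

```go
package main

import (
	"fmt"
	"runtime"
)

// Workload is a hypothetical hint about the kind of work being done.
type Workload int

const (
	CPUBound Workload = iota
	IOBoundRead
	IOBoundWrite
)

// GuessConcurrency is a sketch of the requested helper: it derives a
// starting worker count from GOMAXPROCS (which reflects any explicit
// setting, e.g. one derived from cgroup limits) and a workload hint.
func GuessConcurrency(w Workload) int {
	n := runtime.GOMAXPROCS(0)
	switch w {
	case IOBoundWrite:
		n -= 2 // leave headroom for write-heavy I/O
	case IOBoundRead:
		n -= 1 // leave a little headroom for read-heavy I/O
	}
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	fmt.Println("suggested workers:", GuessConcurrency(IOBoundRead))
}
```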
I'd normally keep this tight and vague, but I tried to explain the who/what/where/when/why of wanting something like this, and it's a gap I think can be filled easily. Uber has a library (automaxprocs) that can get this information from cgroups and similar; maybe even just repackaging that until there's a reason to fork is a way forward.
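For reference, that library (go.uber.org/automaxprocs) is normally used as a blank import whose side effect sets GOMAXPROCS from the cgroup CPU quota at startup, roughly:

```go
package main

import (
	"fmt"
	"runtime"

	// Imported for its side effect: sets GOMAXPROCS from the cgroup
	// CPU quota when the process starts.
	_ "go.uber.org/automaxprocs"
)

func main() {
	// Any "guess the worker count" logic could then build on this value.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```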