vine: count child_count properly in the transfer server process #4078
Proposed Changes
I found this problem while investigating #4076.
In the worker's transfer server process, `child_count` is incremented before we even check whether `lnk` is valid or whether the fork succeeded. This leads to inaccurate counting, especially when the connection is `NULL` (timeout) or the fork fails. As a result, the worker could block indefinitely in the `waitpid` call, waiting for a child process that doesn't exist.
The solution is to increment `child_count` only after successfully forking a child process, and to handle fork failures separately so that the count remains accurate, preventing unnecessary blocking.
Merge Checklist
The following items must be completed before PRs can be merged.
Check these off to verify you have completed all steps.
`make test`: Run local tests prior to pushing.
`make format`: Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python).
`make lint`: Run lint on source code prior to pushing.
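For reviewers, here is a minimal sketch of the counting pattern described under Proposed Changes. It is not the actual worker code: `serve_transfers` and the `accepted` array are hypothetical stand-ins, where a `0` entry simulates an accept that timed out (a `NULL` connection) and a `1` entry simulates a valid one. The point is only that the counter moves after `fork` is known to have succeeded, so the reaping loop never waits for more children than actually exist.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * accepted[i] == 0 simulates a timed-out accept (NULL connection),
 * accepted[i] != 0 simulates a valid connection.
 * Returns the number of children that were forked and then reaped. */
int serve_transfers(const int *accepted, int n)
{
	int child_count = 0;
	int reaped = 0;

	for (int i = 0; i < n; i++) {
		/* No connection: do not fork and do not touch the counter. */
		if (!accepted[i])
			continue;

		pid_t pid = fork();
		if (pid == 0) {
			/* Child: a real server would handle the transfer here. */
			_exit(0);
		} else if (pid > 0) {
			/* Parent: count the child only once the fork has succeeded. */
			child_count++;
		} else {
			/* Fork failed: handled separately, nothing is counted. */
			continue;
		}
	}

	/* Reap exactly as many children as were actually created. */
	while (child_count > 0) {
		int status;
		pid_t done = waitpid(-1, &status, 0);
		if (done > 0) {
			child_count--;
			reaped++;
		} else if (errno != EINTR) {
			/* No children left to wait for (or another error): stop. */
			break;
		}
	}

	assert(child_count >= 0);
	return reaped;
}
```

With the increment placed before the validity check, a run of timeouts would leave `child_count` above zero with nothing to reap; in this sketch an all-timeout input returns immediately with a count of zero.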