-
Notifications
You must be signed in to change notification settings - Fork 2k
Fix child process kill error with npx based servers #850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix child process kill error with npx based servers #850
Conversation
Failing test seems to be a streamable http test which was not modified in this change. Please let me know in case it's a side-effect of this change. |
- Add process group support on POSIX systems via start_new_session=True - Implement _terminate_process_with_children() and _kill_process_and_children() functions that properly handle child process cleanup - On POSIX: Kill entire process group at once to avoid orphaning children when parent dies first - On Windows: Use taskkill /T for process tree termination - Add pid property to FallbackProcess for consistent interface - Add proper type annotations (Process | FallbackProcess) - Replace existing termination logic with new unified approach This addresses issues from PRs #850 (npx child processes) and #729 (Windows process trees) in a unified, dependency-free way. Reported-by:surya-prakash-susarla Reported-by:jingx8885 Github-Issue:#547
- Add process group support on POSIX systems via start_new_session=True - Implement _terminate_process_with_children() and _kill_process_and_children() functions that properly handle child process cleanup - On POSIX: Kill entire process group at once to avoid orphaning children when parent dies first - On Windows: Use taskkill /T for process tree termination - Add pid property to FallbackProcess for consistent interface - Add proper type annotations (Process | FallbackProcess) - Replace existing termination logic with new unified approach This addresses issues from PRs #850 (npx child processes) and #729 (Windows process trees) in a unified, dependency-free way. Reported-by:surya-prakash-susarla Reported-by:jingx8885 Github-Issue:#547 Add regression tests for child process termination - test_stdio_client_child_process_cleanup: Verifies that child processes spawned by the parent (like npx spawning node) are properly terminated - test_stdio_client_nested_process_tree: Tests that deeply nested process trees (parent -> child -> grandchild) are all terminated - test_stdio_client_early_parent_exit: Tests the race condition where parent exits during cleanup but children are still terminated via process group These tests verify the fix works correctly across different subprocess spawning scenarios without requiring external dependencies like psutil.
- Add process group support on POSIX systems via start_new_session=True - Implement _terminate_process_with_children() and _kill_process_and_children() functions that properly handle child process cleanup - On POSIX: Kill entire process group at once to avoid orphaning children when parent dies first - On Windows: Use taskkill /T for process tree termination - Add pid property to FallbackProcess for consistent interface - Add proper type annotations (Process | FallbackProcess) - Replace existing termination logic with new unified approach This addresses issues from PRs #850 (npx child processes) and #729 (Windows process trees) in a unified, dependency-free way. Reported-by:surya-prakash-susarla Reported-by:jingx8885 Github-Issue:#547 Add regression tests for child process termination - test_stdio_client_child_process_cleanup: Verifies that child processes spawned by the parent (like npx spawning node) are properly terminated - test_stdio_client_nested_process_tree: Tests that deeply nested process trees (parent -> child -> grandchild) are all terminated - test_stdio_client_early_parent_exit: Tests the race condition where parent exits during cleanup but children are still terminated via process group These tests verify the fix works correctly across different subprocess spawning scenarios without requiring external dependencies like psutil.
- Add process group support on POSIX systems via start_new_session=True - Implement _terminate_process_with_children() and _kill_process_and_children() functions that properly handle child process cleanup - On POSIX: Kill entire process group at once to avoid orphaning children when parent dies first - On Windows: Use taskkill /T for process tree termination - Add pid property to FallbackProcess for consistent interface - Add proper type annotations (Process | FallbackProcess) - Replace existing termination logic with new unified approach This addresses issues from PRs #850 (npx child processes) and #729 (Windows process trees) in a unified, dependency-free way. Reported-by:surya-prakash-susarla Reported-by:jingx8885 Github-Issue:#547 Add regression tests for child process termination - test_stdio_client_child_process_cleanup: Verifies that child processes spawned by the parent (like npx spawning node) are properly terminated - test_stdio_client_nested_process_tree: Tests that deeply nested process trees (parent -> child -> grandchild) are all terminated - test_stdio_client_early_parent_exit: Tests the race condition where parent exits during cleanup but children are still terminated via process group These tests verify the fix works correctly across different subprocess spawning scenarios without requiring external dependencies like psutil.
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored by: @jingx8885, @surya-prakash-susarla
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
Thank you @surya-prakash-susarla for submitting this PR and the detailed information on reproduction of the issue! I spent yesterday and today trying to unify all the different approaches and fixes we have pending in this process termination space at the moment, as there are several interrelated fixes that either conflict or depend on each other - specifically #555, #729, #765, and #850. I've added your change to #1044 as a draft with you as a co-author + added extensive regression testing. Would you be OK with consolidating this change into #1044 for the comprehensive testing & process handling introduced there? Would love your feedback on the PR as well if you have time! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requesting changes for now to put this PR in your queue - please take a look at my comment above regarding #1044 🙂
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation. This was happening on both POSIX and Windows systems - however because of implementation details, resolving this is non-trivial and requires introducing psutil to introduce cross-platform utilities for dealing with children and process trees. This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown. resolves #850 resolves #729 Co-authored-by: jingx8885 <[email protected]> Co-authored-by: Surya Prakash Susarla <[email protected]>
Motivation and Context
When running npx based servers in stdio mode, the npx process spawns a sub-process. During exit, the main process does not propagate the SIGTERM to the child process which leads to a failed kill signal and the stdin & stdout of the root process getting mangled.
The current change creates a new process group when launching the fork for the server and ensures that a kill signal is sent for the process group is the process does not shutdown cleanly. This would ensure that the process group (which includes all the child process) would be killed successfully to avoid any potential issues.
How Has This Been Tested?
Yes, a simple script like this:
The above script will currently block the stdio of the parent server when run using as:
python test.py
After the change the npx root & the node child process die succesfully allowing the control to flow back to the parent process ensuring a clean shutdown.
Breaking Changes
No, this should not be a breaking change since the process group flag is ignored in non-POSIX systems, and the cleanup via process group functionality is only applied for POSIX style launches.
Types of changes
Checklist
Additional context
Please refer to the issue here (slightly different context):
#547