Subprocess kill win32 #729
- Refactor process termination logic to handle both parent and child processes
- Increase timeout for graceful termination from 1s to 2s
- Consolidate process termination functions to avoid zombie processes
- Add proper type hints for process list handling

This change ensures that all child processes are properly terminated when killing a parent process on Windows, preventing zombie processes from being left behind.
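The "graceful termination, then force kill" pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: the function name `terminate_gracefully` is hypothetical, and the 2-second default simply mirrors the timeout mentioned in the description. It handles only the process it is given; children need the tree handling discussed later in this thread.

```python
import subprocess


def terminate_gracefully(proc: subprocess.Popen, timeout: float = 2.0) -> int:
    """Ask proc to exit; force-kill it if it is still alive after `timeout` seconds.

    Hypothetical helper for illustration. Returns the process exit code.
    """
    # Already exited: nothing to do.
    if proc.poll() is not None:
        return proc.returncode

    # Polite request first (SIGTERM on POSIX, TerminateProcess on Windows).
    proc.terminate()
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The process ignored the request, so escalate and reap it.
        proc.kill()
        return proc.wait()
```

Calling `wait()` after `kill()` matters: it reaps the process so that no zombie entry is left behind on POSIX.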
@dsp-ant requesting code review.
@dsp-ant requesting code review, thanks.
Hi @jingx8885, thank you for working on this and submitting a PR for process tree termination!
We're currently reviewing PR #555 which takes a simpler, cross-platform approach to the hanging process issue by standardizing the termination logic across all platforms. However, it doesn't modify any Windows behavior, so your change may still be needed.
Before we add the complexity of process tree handling and the psutil dependency (which, in my local testing, apparently also requires installing Microsoft C++ 14.0, adding more setup complexity), could you help us understand:
- The Windows platform tests are still failing in this PR - could you investigate those failures first? I checked out this PR on my local machine and the tests do still fail.
- Do you have specific test cases or reproduction scenarios where child processes remain hanging after the parent is terminated? If we could create a test or sample script to illustrate the issue you're seeing clearly, that would help us massively in figuring out the right fix.
Given the additional complexity of adding psutil, we'd prefer to only add complexity if we have concrete cases with failing tests that demonstrate it's needed.
Also I realize this PR has been in queue for a while - if you'd be able to rebase on main to address 1 & 2 above that would be greatly appreciated!
- Add process group support on POSIX systems via start_new_session=True
- Implement _terminate_process_with_children() and _kill_process_and_children() functions that properly handle child process cleanup
- On POSIX: Kill entire process group at once to avoid orphaning children when parent dies first
- On Windows: Use taskkill /T for process tree termination
- Add pid property to FallbackProcess for consistent interface
- Add proper type annotations (Process | FallbackProcess)
- Replace existing termination logic with new unified approach

This addresses issues from PRs #850 (npx child processes) and #729 (Windows process trees) in a unified, dependency-free way.

Reported-by: surya-prakash-susarla
Reported-by: jingx8885
Github-Issue: #547
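The dependency-free approach in this commit message can be sketched as follows. This is an illustration under assumptions, not the commit's code: `spawn_in_own_group` and `kill_process_tree` are hypothetical names, and the `/F` flag (force) is added here so that the Windows branch behaves like the SIGKILL branch.

```python
import os
import signal
import subprocess
import sys


def spawn_in_own_group(cmd: list[str]) -> subprocess.Popen:
    """Start cmd; on POSIX, make it the leader of a new session and process group."""
    kwargs = {}
    if sys.platform != "win32":
        # setsid() runs in the child before exec, so its PGID equals its PID.
        kwargs["start_new_session"] = True
    return subprocess.Popen(cmd, **kwargs)


def kill_process_tree(proc: subprocess.Popen) -> None:
    """Force-kill proc and everything it spawned (hypothetical helper)."""
    if sys.platform == "win32":
        # /T walks the process tree, /F forces termination.
        subprocess.run(
            ["taskkill", "/F", "/T", "/PID", str(proc.pid)],
            capture_output=True,
            check=False,
        )
    else:
        try:
            # Signal the whole group at once. This still reaches the children
            # if the group leader has already exited.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # every process in the group is already gone
    proc.wait()  # reap the parent so it does not linger as a zombie
```

Signalling the group rather than walking the children one by one avoids the race the commit message mentions: if the parent dies first, there is no longer a parent to enumerate children from, but the group ID remains valid while any member is alive.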
Add regression tests for child process termination

- test_stdio_client_child_process_cleanup: Verifies that child processes spawned by the parent (like npx spawning node) are properly terminated
- test_stdio_client_nested_process_tree: Tests that deeply nested process trees (parent -> child -> grandchild) are all terminated
- test_stdio_client_early_parent_exit: Tests the race condition where parent exits during cleanup but children are still terminated via process group

These tests verify the fix works correctly across different subprocess spawning scenarios without requiring external dependencies like psutil.
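One way to check child cleanup without psutil is to have the child write to a marker file and confirm the file stops growing after the tree is killed. The sketch below is an assumption about how such a check could look, not the tests named above: `child_survives_tree_kill`, `CHILD`, and `PARENT` are hypothetical names, and the parent script is only a stand-in for a launcher such as npx.

```python
import os
import signal
import subprocess
import sys
import tempfile
import time

# Child: appends one byte to the marker file every 50 ms, forever.
CHILD = """
import sys, time
while True:
    with open(sys.argv[1], "a") as f:
        f.write("x")
    time.sleep(0.05)
"""

# Parent: launches the child, then idles (stand-in for a launcher like npx).
PARENT = """
import subprocess, sys, time
subprocess.Popen([sys.executable, "-c", sys.argv[1], sys.argv[2]])
time.sleep(60)
"""


def child_survives_tree_kill(settle: float = 0.5) -> bool:
    """Return True if the child keeps writing after the process tree is killed."""
    fd, marker = tempfile.mkstemp()
    os.close(fd)
    posix = sys.platform != "win32"
    parent = subprocess.Popen(
        [sys.executable, "-c", PARENT, CHILD, marker],
        start_new_session=posix,  # own process group on POSIX
    )
    try:
        # Wait until the child has demonstrably started writing.
        deadline = time.monotonic() + 10
        while os.path.getsize(marker) == 0:
            if time.monotonic() > deadline:
                raise RuntimeError("child never started writing")
            time.sleep(0.05)

        if posix:
            os.killpg(parent.pid, signal.SIGKILL)
        else:
            subprocess.run(
                ["taskkill", "/F", "/T", "/PID", str(parent.pid)],
                capture_output=True,
                check=False,
            )
        parent.wait()

        # If the child is really gone, the file size stays constant.
        time.sleep(settle)
        before = os.path.getsize(marker)
        time.sleep(settle)
        return os.path.getsize(marker) > before
    finally:
        if parent.poll() is None:
            parent.kill()
            parent.wait()
        os.unlink(marker)
```

A regression test would then assert that `child_survives_tree_kill()` returns `False`. Replacing the group kill with a plain `parent.kill()` should make the same check return `True` on POSIX, which is the orphaning behaviour this PR is about.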
When terminating MCP servers, child processes were being orphaned because only the parent process was killed. This caused resource leaks and prevented proper cleanup, especially with tools like npx that spawn child processes for the actual server implementation.

This was happening on both POSIX and Windows systems. However, because of implementation details, resolving it is non-trivial and requires adding psutil, which provides cross-platform utilities for dealing with children and process trees.

This addresses critical issues where MCP servers using process spawning tools would leave zombie processes running after client shutdown.

resolves #850
resolves #729
Co-authored-by: jingx8885
Co-authored-by: Surya Prakash Susarla
Thank you again @jingx8885 for submitting this PR and for your detailed responses on what failures you were seeing - extremely helpful.

I spent yesterday and today trying to unify all the different approaches and fixes we have pending in this process termination space at the moment, as there are several interrelated fixes that either conflict or depend on each other - specifically #555, #729, #765, and #850.

I've added your change to #1044 as a draft with you as a co-author + added extensive regression testing. Would you be OK with consolidating this change into #1044 for the comprehensive testing & process handling introduced there?

If you're able to, I'd also love your feedback whether the unified approach in #1044 successfully addresses the issue you're seeing in this PR.
@felixweinberger Great news: I can confirm that my specific changes from this PR have been correctly integrated into the unified approach of #1044. More importantly, the logic addressing the process termination issue I encountered is fully present.

Critical Validation: I want to highlight that this exact logic (or its direct equivalent now in #1044) has been running successfully in both our production environment and daily development/testing cycles for a month. It has proven to be robust and reliable on Windows 11 under our workload conditions. This gives me high confidence that #1044 effectively resolves the core issue I originally aimed to fix.

Therefore, I fully support merging these changes via #1044 and agree to close this PR in favor of it.

I remain very keen to collaborate on finalizing #1044. Please let me know how I can best contribute next – whether it's further testing on specific platforms/scenarios, reviewing recent updates, or assisting with documentation.

Looking forward to getting this comprehensive fix merged!

Best regards,
Closing this PR in favor of #1044, which fixes this bug.