Skip to content

Fix stdio_client kill process after timeout #555

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

cristipufu
Copy link
Contributor

@cristipufu cristipufu commented Apr 21, 2025

Motivation and Context

Be consistent across platforms (Windows, Linux, etc). Some MCP Servers don't react to SIGTERM, so stdio_client will hang indefinitely when exiting the context manager.

Reference: #372

How Has This Been Tested?

Yes

Breaking Changes

No

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Copy link
Contributor

@ihrpr ihrpr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @cristipufu for working on this!

Do you have steps to reproduce? How this changes were tested?

@ihrpr ihrpr modified the milestones: r-05-25, stdio shutdown May 28, 2025
await process.wait()
except TimeoutError:
# Force kill if it doesn't terminate
process.kill()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the same TerminateProcess is called on Windows... Did this worked locally?

https://anyio.readthedocs.io/en/stable/api.html#anyio.abc.Process.kill

@surya-prakash-susarla
Copy link
Contributor

Slightly orthogonal but this does not handle child process spawns which can cause the parent to leave ghost processes which are being re-parented to the root process. Addressed it here: #850

felixweinberger added a commit that referenced this pull request Jun 26, 2025
Add comprehensive tests that validate stdio client cleanup behavior:
- Universal timeout test ensures cleanup completes within reasonable time even with long-running processes
- Immediate completion test ensures fast processes remain fast and timeout mechanism doesn't introduce delays
- Cross-platform compatibility with different commands for Windows vs Unix

These tests will help ensure PR #555's timeout mechanism works correctly without regressing normal operation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
felixweinberger added a commit that referenced this pull request Jun 26, 2025
Changes from PR #555 by @cristipufu:
- Remove Windows-specific terminate_windows_process function
- Apply uniform 2-second timeout mechanism across all platforms:
  - Try process.terminate() first (SIGTERM on Unix, similar on Windows)
  - Wait up to 2 seconds for graceful termination
  - If timeout occurs, force process.kill() (SIGKILL equivalent)

This ensures consistent behavior between Windows and Unix systems when
cleaning up stdio client processes, preventing hanging when MCP servers
don't respond to SIGTERM.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
felixweinberger added a commit that referenced this pull request Jun 26, 2025
This change demonstrates a simpler approach than PR #555's timeout mechanism:

1. **Evidence**: Extensive testing shows PR #559 stream cleanup alone prevents
   hanging, even with servers that ignore SIGTERM and keep streams open.

2. **Simplification**: Removes all timeout logic and Windows-specific termination
   function in favor of unified `process.terminate()` + stream cleanup.

3. **Benefits**:
   - Less code complexity (no timeout handling, no platform branching)
   - Preserves proven stream cleanup protection from PR #559
   - Makes behavior consistent across all platforms
   - All existing timeout tests still pass

4. **Risk reduction**: Avoids changing process termination semantics while
   maintaining hanging protection through stream cleanup.

The core insight: process hanging was caused by stream management issues
(solved by PR #559), not termination timing issues (targeted by PR #555).
felixweinberger added a commit that referenced this pull request Jun 26, 2025
Copy of #555

Testing shows PR #559 stream cleanup alone prevents hanging,
even with servers that ignore SIGTERM and keep streams open.

Removes all timeout logic and Windows-specific termination
function in favor of unified `process.terminate()`.
felixweinberger added a commit that referenced this pull request Jun 26, 2025
NOTE: this test FAIL without #555

This test verifies that stdio_client properly handles processes that
ignore SIGTERM but respond to SIGINT, preventing hanging issues with
Node.js servers and other interactive tools that expect Ctrl+C signals.
felixweinberger added a commit that referenced this pull request Jun 26, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()
felixweinberger added a commit that referenced this pull request Jun 26, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()
felixweinberger added a commit that referenced this pull request Jun 26, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
felixweinberger added a commit that referenced this pull request Jun 26, 2025
felixweinberger added a commit that referenced this pull request Jun 26, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
Copy link
Contributor

@felixweinberger felixweinberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @cristipufu, thank you for working on this and for submitting this PR!

Having looked into this in some detail I think your simplification makes sense:

  • There shouldn't be any change in behavior on Windows from this change, so we're good on that end
  • #559 already made some changes to stream cleanup and I believe addresses SIGTERM hangs already on UNIX systems
  • As you mention on #372 though, MCP servers that ignore SIGTERM may still be an issue on UNIX systems
  • I was able to create a test (see below) that simulates a service that expects SIGINT and ignores SIGTERM - this PR fixes that case.

I created a draft PR #1044 with new regression tests in an attempt to actually reproduce the hanging resolved by both this change (#555), #559, and #765.

  • #559 resolves test_stdio_client_universal_cleanup
  • #555 (this PR) resolves test_stdio_client_sigint_only_process
  • #765 resolves test_stdio_client_graceful_stdin_exit and test_stdio_client_stdin_close_ignored

Please take a look if that reflects what you were going for here - if yes could you rebase your PR here on latest main?

felixweinberger added a commit that referenced this pull request Jun 26, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
felixweinberger added a commit that referenced this pull request Jun 26, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
felixweinberger added a commit that referenced this pull request Jun 26, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
felixweinberger added a commit that referenced this pull request Jun 26, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
@felixweinberger felixweinberger mentioned this pull request Jun 30, 2025
9 tasks
felixweinberger added a commit that referenced this pull request Jun 30, 2025
felixweinberger added a commit that referenced this pull request Jun 30, 2025
**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
felixweinberger added a commit that referenced this pull request Jul 1, 2025
cherry-pick of #555

Add regression tests for stdio cleanup hanging

**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
felixweinberger added a commit that referenced this pull request Jul 1, 2025
cherry-pick of #555

Add regression tests for stdio cleanup hanging

**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
felixweinberger added a commit that referenced this pull request Jul 1, 2025
cherry-pick of #555

Add regression tests for stdio cleanup hanging

**NOTE: These tests FAIL without the changes introduced in #559**

- test_stdio_client_universal_timeout
- test_stdio_client_immediate_completion

await read_stream.aclose()
await write_stream.aclose()
await read_stream_writer.aclose()
await write_stream_reader.aclose()

These tests verify that stdio_client completes cleanup within reasonable
time for both slow-terminating and fast-exiting processes, preventing
the hanging issues reported in #559.

**NOTE: This test FAILS without the changes introduced in #555**

- test_stdio_client_sigint_only_process

try:
    process.terminate()
    with anyio.fail_after(2.0):
        await process.wait()
except TimeoutError:
    # If process doesn't terminate in time, force kill it
    process.kill()

This test verifies that on UNIX systems MCP servers that don't respect
SIGTERM but e.g. SIGINT still get terminated after a grace period.
felixweinberger added a commit that referenced this pull request Jul 1, 2025
The stdio cleanup was hanging indefinitely when processes ignored
termination signals or took too long to exit. This caused the MCP
client to freeze during shutdown, especially with servers that don't
handle SIGTERM properly.

The fix introduces a 2-second timeout for process termination. If a
process doesn't exit gracefully within this window, it's forcefully
killed. This ensures the client always completes cleanup in bounded
time while still giving well-behaved servers a chance to exit cleanly.

This resolves hanging issues reported when MCP servers ignore standard
termination signals or perform lengthy cleanup operations.

Original PR: #555 by @cristipufu

Github-Issue:#559
Github-Issue:#555
felixweinberger added a commit that referenced this pull request Jul 1, 2025
The stdio cleanup was hanging indefinitely when processes ignored
termination signals or took too long to exit. This caused the MCP
client to freeze during shutdown, especially with servers that don't
handle SIGTERM properly.

The fix introduces a 2-second timeout for process termination. If a
process doesn't exit gracefully within this window, it's forcefully
killed. This ensures the client always completes cleanup in bounded
time while still giving well-behaved servers a chance to exit cleanly.

This resolves hanging issues reported when MCP servers ignore standard
termination signals or perform lengthy cleanup operations.

Also adds regression tests for #559.

resolves #555
@cristipufu
Copy link
Contributor Author

@felixweinberger rebased

felixweinberger added a commit that referenced this pull request Jul 1, 2025
The stdio cleanup was hanging indefinitely when processes ignored
termination signals or took too long to exit. This caused the MCP
client to freeze during shutdown, especially with servers that don't
handle SIGTERM properly.

The fix introduces a 2-second timeout for process termination. If a
process doesn't exit gracefully within this window, it's forcefully
killed. This ensures the client always completes cleanup in bounded
time while still giving well-behaved servers a chance to exit cleanly.

This resolves hanging issues reported when MCP servers ignore standard
termination signals or perform lengthy cleanup operations.

Also adds regression tests for #559.

resolves #555

Co-authored by: @cristipufu
felixweinberger added a commit that referenced this pull request Jul 1, 2025
The stdio cleanup was hanging indefinitely when processes ignored
termination signals or took too long to exit. This caused the MCP
client to freeze during shutdown, especially with servers that don't
handle SIGTERM properly.

The fix introduces a 2-second timeout for process termination. If a
process doesn't exit gracefully within this window, it's forcefully
killed. This ensures the client always completes cleanup in bounded
time while still giving well-behaved servers a chance to exit cleanly.

This resolves hanging issues reported when MCP servers ignore standard
termination signals or perform lengthy cleanup operations.

Also adds regression tests for #559.

resolves #555

Co-authored-by: Cristian Pufu <[email protected]>
@felixweinberger
Copy link
Contributor

Thank you @cristipufu for your response and again for submitting this PR!

I spent yesterday and today trying to unify all the different approaches and fixes we have pending in this process termination space at the moment, as there are several interrelated fixes that either conflict or depend on each other - specifically #555, #729, #765, and #850.

I've added your change to #1044 as a draft with you as a co-author + added extensive regression testing. Would you be OK with consolidating this change into #1044 for the comprehensive testing & process handling introduced there?

@cristipufu
Copy link
Contributor Author

@felixweinberger yes, of course, thanks!

@cristipufu cristipufu closed this Jul 2, 2025
felixweinberger added a commit that referenced this pull request Jul 4, 2025
The stdio cleanup was hanging indefinitely when processes ignored
termination signals or took too long to exit. This caused the MCP
client to freeze during shutdown, especially with servers that don't
handle SIGTERM properly.

The fix introduces a 2-second timeout for process termination. If a
process doesn't exit gracefully within this window, it's forcefully
killed. This ensures the client always completes cleanup in bounded
time while still giving well-behaved servers a chance to exit cleanly.

This resolves hanging issues reported when MCP servers ignore standard
termination signals or perform lengthy cleanup operations.

Also adds regression tests for #559.

resolves #555

Co-authored-by: Cristian Pufu <[email protected]>
felixweinberger added a commit that referenced this pull request Jul 4, 2025
The stdio cleanup was hanging indefinitely when processes ignored
termination signals or took too long to exit. This caused the MCP
client to freeze during shutdown, especially with servers that don't
handle SIGTERM properly.

The fix introduces a 2-second timeout for process termination. If a
process doesn't exit gracefully within this window, it's forcefully
killed. This ensures the client always completes cleanup in bounded
time while still giving well-behaved servers a chance to exit cleanly.

This resolves hanging issues reported when MCP servers ignore standard
termination signals or perform lengthy cleanup operations.

Also adds regression tests for #559.

resolves #555

Co-authored-by: Cristian Pufu <[email protected]>
felixweinberger added a commit that referenced this pull request Jul 4, 2025
The stdio cleanup was hanging indefinitely when processes ignored
termination signals or took too long to exit. This caused the MCP
client to freeze during shutdown, especially with servers that don't
handle SIGTERM properly.

This was already being handled on Windows, but not Unix systems.
This Commit unifies the two approaches, removing special logic
for windows process termination.

The fix introduces a 2-second timeout for process termination. If a
process doesn't exit gracefully within this window, it's forcefully
killed. This ensures the client always completes cleanup in bounded
time while still giving well-behaved servers a chance to exit cleanly.

This resolves hanging issues reported when MCP servers ignore standard
termination signals.

resolves #555

Also adds regression tests for #559.

Co-authored-by: Cristian Pufu <[email protected]>
felixweinberger added a commit that referenced this pull request Jul 7, 2025
The stdio cleanup was hanging indefinitely when processes ignored
termination signals or took too long to exit. This caused the MCP
client to freeze during shutdown, especially with servers that don't
handle SIGTERM properly.

This was already being handled on Windows, but not Unix systems.
This Commit unifies the two approaches, removing special logic
for windows process termination.

The fix introduces a 2-second timeout for process termination. If a
process doesn't exit gracefully within this window, it's forcefully
killed. This ensures the client always completes cleanup in bounded
time while still giving well-behaved servers a chance to exit cleanly.

This resolves hanging issues reported when MCP servers ignore standard
termination signals.

resolves #555

Also adds regression tests for #559.

Co-authored-by: Cristian Pufu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants