fix: prevent silent failure on GET stream so clients don't hang #1943
+113
−1
When connecting to an MCP server over streamable HTTP (e.g., GitHub’s MCP server) from Dapr Agents, the client could hang indefinitely on calls like list_tools() when run outside a debugger. With a debugger attached, the changed timing often hid the issue, which pointed to a race.
Root cause: The streamable_http client starts a background GET stream for server‑sent events. If that GET fails (e.g., 405 Method Not Allowed), the client retries twice with a 1s delay and then gives up without surfacing the failure. The GET task exits while the rest of the client still assumes the stream may be used. Subsequent requests that expect or coordinate with that stream then block forever on a dead stream.
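The failure mode described above can be reproduced in miniature with plain asyncio. This is only an illustrative sketch, not the SDK's code: `get_stream_listener`, `wait_for_response`, and the queue are hypothetical stand-ins, the 1s delay is shortened, and the timeout exists only so the demo terminates.

```python
import asyncio


async def get_stream_listener(responses: asyncio.Queue) -> None:
    """Hypothetical stand-in for the background GET/SSE task."""
    for _attempt in range(3):  # first try plus two retries
        try:
            # Simulate the server rejecting the GET on every attempt.
            raise ConnectionError("405 Method Not Allowed")
        except ConnectionError:
            await asyncio.sleep(0.01)  # shortened stand-in for the 1s delay
    # Gives up here without logging or signalling anyone: the silent failure.


async def wait_for_response(responses: asyncio.Queue, timeout: float) -> str:
    """Hypothetical caller that expects a reply on the (dead) stream."""
    try:
        return await asyncio.wait_for(responses.get(), timeout=timeout)
    except asyncio.TimeoutError:
        # Without the timeout, this await would never finish.
        return "hung"


async def demo() -> str:
    responses: asyncio.Queue = asyncio.Queue()
    listener = asyncio.create_task(get_stream_listener(responses))
    await listener  # the listener has already exited before the request
    return await wait_for_response(responses, timeout=0.05)


result = asyncio.run(demo())
print(result)
```

Nothing ever writes to the queue once the listener has exited, so the caller's only way out in this sketch is the timeout.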
Fix: For permanent GET failures such as 405, we no longer retry and we log a clear warning instead of failing silently. The client continues to use POST for request/response (e.g. list_tools()), so it no longer hangs when the GET stream has been abandoned. This addresses the silent-failure behavior; per the spec, servers are expected to support both POST and GET, but when the GET stream cannot be established we now fail visibly and avoid leaving the client in a hung state.
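As a rough sketch of the decision the fix makes (not the actual patch), the retry logic can be thought of as classifying each GET failure as permanent or transient. The names `PERMANENT_GET_FAILURES`, `MAX_RETRIES`, and `next_action` are hypothetical, and treating only 405 as permanent is an assumption taken from the description above.

```python
import logging

logger = logging.getLogger("get_stream_sketch")

# Assumption: only 405 is treated as permanent, since it is the only
# status the description names explicitly.
PERMANENT_GET_FAILURES = frozenset({405})
MAX_RETRIES = 2  # the description mentions two retries


def next_action(status_code: int, attempt: int) -> str:
    """Decide what to do after a GET stream attempt.

    Returns "connected", "retry", or "abandon". `attempt` is the number
    of retries already made (0 for the first try).
    """
    if 200 <= status_code < 300:
        return "connected"
    if status_code in PERMANENT_GET_FAILURES:
        # Permanent failure: retrying cannot help, so warn and stop.
        logger.warning(
            "GET stream rejected with HTTP %d; continuing with POST only",
            status_code,
        )
        return "abandon"
    if attempt < MAX_RETRIES:
        return "retry"
    # Transient failure, but retries are exhausted: still fail visibly.
    logger.warning(
        "GET stream failed after %d retries (HTTP %d)", attempt, status_code
    )
    return "abandon"
```

In this sketch every path that ends in "abandon" logs a warning, so there is no silent exit.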
This fixes an agent hang that I saw in Dapr Agents when it was connecting to the GitHub MCP server over the streamable HTTP transport. I only saw the issue when running locally in a terminal; under a debugger it worked fine, which led me toward a race condition. The MCP streamable_http client starts a background GET stream for SSE events, which immediately gets a 405 error from GitHub (which doesn't support GET), retries twice with a 1-second delay, and then gives up, failing silently. If I called list_tools() after the GET stream had exhausted its retries, the call would hang forever waiting for SSE responses on the dead stream. This PR prevents the hang by handling the 405 error.
Motivation and Context
I'm associated with the Dapr open source project (a CNCF graduated project) and am a maintainer of the Dapr Agents repo. We are experiencing issues that this PR helps correct for our open source community.
dapr/dapr-agents#399
How Has This Been Tested?
I tested and confirmed that this works against my repo for a release note agent. Once it's ready, we eventually want to publish it as a GitHub Action that others can use to generate release notes for open source projects:
https://github.com/sicoyle/release-note-agent
I can confirm locally that this corrects at least my hanging issue.
Breaking Changes
Types of changes
Checklist
Additional context
related issues:
#1941
dapr/dapr-agents#399