Conformance burn-down: server-side InputRequiredResult, Mcp-Method/Name validation, x-mcp-header filter (14 scenarios → green) by maxisbey · Pull Request #2974 · modelcontextprotocol/python-sdk

maxisbey · 2026-06-25T16:18:53Z

Bumps the conformance pin to 0.2.0-alpha.7 and clears 14 scenarios from the expected-failures baseline, leaving only the 4 that depend on unlanded design (client Mcp-Param-* emission, modern stateless SSE for progress, MCPServer prompt-pipeline InputRequiredResult, and the SEP-990 enterprise-auth waiver).

Motivation and Context

#2967 widened the lowlevel server return types for InputRequiredResult and #2968 gave Client.call_tool the allow_input_required / input_responses / request_state surface, but neither was reachable through the conformance harness: the everything-server had no MRTR fixture tools, the client driver had no SEP-2322 handler, and MCPServer's func_metadata.convert_result() still mangled an InputRequiredResult into TextContent. Separately, the SEP-2243 boundary checks (Mcp-Method/Mcp-Name cross-check, x-mcp-header validity) were unimplemented on both sides. This closes those gaps.

How Has This Been Tested?

All five conformance legs run green with --expected-failures at @modelcontextprotocol/conformance@0.2.0-alpha.7:

Leg	Result
server `--suite active`	42 pass / 0 fail
server `--suite draft`	69 pass / 1 expected fail
server `--suite all --spec-version 2026-07-28`	97 pass / 2 expected fail
client `--suite all`	402 pass / 7 expected fail (2 scenarios)
client `--suite all --spec-version 2026-07-28`	352 pass / 5 expected fail (1 scenario)

./scripts/test: 2776 passed, 100% coverage, strict-no-cover clean. pyright: 0 errors.

Breaking Changes

None for v1-lineage callers. At 2026-07-28, ClientSession.list_tools() now drops tools whose x-mcp-header annotations are invalid (the spec MUST is on the client); handshake-era sessions are unaffected. The modern HTTP server entry now rejects requests where Mcp-Method/Mcp-Name headers contradict the body — handshake-era requests don't pass through that classifier rung.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update

Checklist

I have read the MCP Documentation
My code follows the repository's style guidelines
New and existing tests pass locally
I have added appropriate error handling
I have added or updated documentation as needed

Additional context

SDK changes

func_metadata.convert_result() passes InputRequiredResult through; MCPServer._handle_call_tool / call_tool return type widened to CallToolResult | InputRequiredResult. Context gains .input_responses / .request_state / .client_capabilities read-only props so a tool body can read what the client supplied.
shared/inbound.py: classify_inbound_request rung 2 cross-checks Mcp-Method == body.method and Mcp-Name == params.{name|uri} (new NAME_BEARING_METHODS map + decode_header_value). New find_invalid_x_mcp_header() validates the annotation per RFC 9110 token rules + uniqueness + primitive-type.
_streamable_http_modern.py filters tools/list results for invalid x-mcp-header at the modern HTTP boundary (so the version gate is structural, not a conditional in handler code).
ClientSession.list_tools applies the same filter when the negotiated version is modern. _make_modern_stamp now derives Mcp-Name from NAME_BEARING_METHODS (covers prompts/get and resources/read, not just tools/call); send_discover stamps Mcp-Method so the auto-mode probe doesn't fail the cross-check it just introduced.

Fixtures

everything-server: 8 test_input_required_result_* tools (elicitation/sampling/list_roots/request_state/multiple_inputs/multi_round/tampered_state/capabilities) + json_schema_2020_12_tool carrying the 2020-12 keywords the scenario checks. The tampered_state tool uses a fixture-local hmac to sign/verify requestState; an SDK-level helper is follow-up.
.github/actions/conformance/client.py: handlers for sep-2322-client-request-state (drives the four mock tools via allow_input_required=True) and http-invalid-tool-headers (lists then calls every surfaced tool).

Pin

CONFORMANCE_PKG moves from the pkg.pr.new@65fcd39 URL to the published @modelcontextprotocol/conformance@0.2.0-alpha.7; the URL fetch-and-verify steps are dropped now that npm's own integrity check applies. Baselined: alpha.7 introduces no new failures over the previous pin.

Stacked on #2968.

_{AI Disclaimer}

cubic-dev-ai

3 issues found across 17 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="examples/servers/everything-server/mcp_everything_server/server.py">

<violation number="1" location="examples/servers/everything-server/mcp_everything_server/server.py:445">
P2: `request_state` is parsed/indexed unsafely and can crash the tool instead of falling back. Validate decoded state shape before using `state["round"]`.</violation>
</file>

<file name=".github/actions/conformance/client.py">

<violation number="1" location=".github/actions/conformance/client.py:373">
P2: MRTR retry results are not asserted, so repeated `InputRequiredResult` can pass silently.</violation>
</file>

_{Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic}

claude

I didn't find any bugs, but this is a large PR (17 files) adding new protocol-boundary validation (Mcp-Method/Mcp-Name cross-checks, x-mcp-header filtering on both client and server) plus the MRTR/InputRequiredResult plumbing through MCPServer — behavior-changing core code that warrants a human look.

Extended reasoning...

Overview

This PR spans the conformance harness (workflow pin bump, expected-failures burn-down, new client scenario handlers, everything-server fixtures) and substantive SDK changes: shared/inbound.py gains Mcp-Method/Mcp-Name header cross-validation and an x-mcp-header schema validator; ClientSession.list_tools and the modern HTTP server entry now drop tools with invalid x-mcp-header annotations; func_metadata.convert_result passes InputRequiredResult through; and Context gains input_responses/request_state/client_capabilities properties.

Security risks

Low-to-moderate. The new validation is additive and tightening (rejecting header/body mismatches, filtering malformed tool annotations), and the base64 sentinel decode fails closed (None) on malformed input. The fixture HMAC key in the everything-server is hard-coded but is example/conformance-fixture code, not the SDK. The conformance pin moves from a sha256-verified URL to a published npm package, relying on npm's integrity check — a reasonable but supply-chain-relevant change. Nothing looks like an injection or auth-bypass vector.

Level of scrutiny

High for the src/mcp portions: the inbound classifier and client/server tools/list filtering sit on the protocol hot path and change wire-visible behavior at 2026-07-28 (modern requests with mismatched Mcp-Method headers are now rejected; tools can silently disappear from list_tools results). These are spec-interpretation decisions (e.g. admitting number as a primitive type, treating a missing Mcp-Name as a mismatch only when the body carries the param) that a maintainer should confirm against SEP-2243/SEP-2322.

Other factors

Test coverage is thorough — every new branch in inbound.py, the modern HTTP filter, the client filter, and the func_metadata pass-through has dedicated unit tests, and the PR reports all five conformance legs green plus full coverage. The bug-hunting pass found no issues. The size, the new public-ish API surface on Context, and the protocol-behavior changes are what push this to human review rather than auto-approval.

claude

I didn't find any bugs in this change, but it's a large PR that adds protocol-boundary validation (Mcp-Method/Mcp-Name cross-checks, x-mcp-header filtering) and widens the MCPServer tool-call return surface, so it warrants a human look rather than an automated approval.

Extended reasoning...

Overview

This PR spans 17 files across the SDK proper (shared/inbound.py, client/session.py, server/_streamable_http_modern.py, mcpserver/context.py, mcpserver/server.py, func_metadata.py), conformance fixtures (everything-server, conformance client driver), CI workflow, and the expected-failures baselines. The SDK changes implement SEP-2243 header validation (Mcp-Method/Mcp-Name cross-check at the modern server boundary, x-mcp-header annotation validation and tool filtering on both client and server) and SEP-2322 plumbing (InputRequiredResult pass-through in convert_result, new Context properties for input_responses/request_state/client_capabilities).

Security risks

The new code is mostly validation that tightens (not loosens) what the server accepts: header/body cross-checks reject inconsistent requests, and decode_header_value is written so a malformed base64 sentinel can never accidentally match a body value. The fixture-side HMAC for requestState is example code only. No auth, crypto, or permission paths in the SDK itself are touched. The main behavioral risks are protocol-level rather than security-level: the client now silently drops tools with invalid x-mcp-header annotations on modern sessions, and the modern HTTP entry rejects requests with mismatched Mcp-Method/Mcp-Name headers — both are spec-mandated but are observable behavior changes for 2026-07-28 callers.

Level of scrutiny

This deserves real human review: it changes request classification on the modern server path (a rejection here means the handler never runs), changes what tools clients surface, and widens public return types (MCPServer.call_tool → CallToolResult | InputRequiredResult), which is an API-surface decision a maintainer should sign off on. It is far outside the simple/mechanical category eligible for shadow approval.

Other factors

Test coverage is thorough — the new ladder rungs, the codec, the x-mcp-header validator, and the end-to-end MCPServer flows all have dedicated tests, and the PR reports all five conformance legs green plus 100% coverage. The bug hunting system found no issues. There are outstanding cubic-dev-ai comments on the fixture code (defensive parsing in the multi-round/tampered-state example tools and missing assertions in the conformance driver) that the author may want to address, though they only affect example/CI code, not the SDK.

maxisbey

Solid burn-down — the Mcp-Method/Mcp-Name cross-check, the NAME_BEARING_METHODS table shared between client emit and server validate, and the discover probe being kept in lockstep with the new rung are all the right shape, and the test additions over the new classifier rungs are thorough. The inline notes cluster into a few groups, in roughly the order I'd weight them.

Two places find_invalid_x_mcp_header diverges from a normative MUST. Both are in the inline comments on shared/inbound.py. The type: "number" concession is a divergence the spec already resolved against the harness (the spec commit that forbade number predates the 0.2.0-alpha.7 pin; the harness fixture that emits it is on both expected-failures lists, so removing it costs nothing in CI). The bigger one is that the validator only walks top-level properties, while the spec permits nested annotations (which then have to be validated and counted toward the case-insensitive uniqueness set) and says an annotation in an illegal position — under items, a composition keyword, if/then/else, or $ref — makes the tool definition invalid. The current walk never sees any of those, so it keeps tools the spec says the client must exclude. The docstring's claim that the spec restricts the annotation to top-level properties is what the spec text contradicts.

One design question, not a defect. _drop_invalid_header_tools on the server side implements an obligation the spec assigns only to the client, and after this PR the client-side list_tools filter carries both the spec MUST and the conformance scenario, so the server copy carries neither. Detail in the inline comment; my vote is to delete it, but it's a genuine call.

Smaller things, all inline: Context.input_responses / request_state re-derive fields the already-validated CallToolRequestParams in _handle_call_tool's hand carries, with a second alias spelling and a ValidationError path that's unreachable from the wire; the new InputRequiredResult union arm in func_metadata lands before the CallToolResult-in-union InvalidSignature, so a three-arm -> CallToolResult | str | InputRequiredResult is now silently accepted where -> CallToolResult | str raises; and one test docstring claims spec backing it doesn't have.

Two notes I'm keeping in the body rather than as inline comments:

docs/migration.md has an existing section headed "MCPServer.call_tool() returns CallToolResult" whose first sentence is "MCPServer.call_tool() now always returns a CallToolResult." — this PR makes that sentence false. One-line amendment in the existing section; I don't think a new section is needed, the widening only affects the unreleased v2 surface.
Net +4 # pragma: no branch on the stub def fn() -> X: ... lines in test_func_metadata.py. They match the existing in-file precedent for that idiom so I'm fine with them; noting it because we try to keep the pragma count flat.

_{AI Disclaimer}

The pkg.pr.new URL was a stopgap until #357 published; alpha.7 is now on npm. Baseline run on this branch shows zero new failures and zero stale expected-failures entries, so the bump is reconciled with no list changes. Drops the URL fetch-and-verify steps now that npm's own integrity check applies.

…conformance fixtures SDK changes: - MCPServer tools can now return InputRequiredResult: func_metadata.convert_result() passes it through instead of mangling into TextContent; _handle_call_tool / call_tool return type widened; Context gains .input_responses / .request_state / .client_capabilities read-only props. - shared/inbound.py: classify_inbound_request rung 2 now cross-checks Mcp-Method / Mcp-Name headers against the request body (NAME_BEARING_METHODS map + decode_header_value); new find_invalid_x_mcp_header() pure validator. - _streamable_http_modern.py: filters tools/list result for invalid x-mcp-header annotations at the modern HTTP boundary. - ClientSession.list_tools: drops tools with invalid x-mcp-header at modern protocol versions (the spec MUST is on the client); _make_modern_stamp uses NAME_BEARING_METHODS for Mcp-Name; send_discover stamps Mcp-Method. Conformance fixtures: - everything-server: 8 test_input_required_result_* tools (elicitation/sampling/ list_roots/request_state/multiple_inputs/multi_round/tampered_state/ capabilities) + json_schema_2020_12_tool with literal 2020-12 schema. - conformance/client.py: handlers for sep-2322-client-request-state and http-invalid-tool-headers.

…ow pass Removes from both baseline files: - client: sep-2322-client-request-state, http-invalid-tool-headers - server: 11 input-required-result-* (all except non-tool-request), http-header-validation, json-schema-2020-12 Remaining: http-custom-headers (client Mcp-Param-* design unresolved), auth/enterprise-managed-authorization (out of scope), input-required-result-non-tool-request (MCPServer prompt pipeline not yet widened), tools-call-with-progress (modern stateless has no SSE response mode yet).

…ce timeout - find_invalid_x_mcp_header: a JSON-Schema array type (e.g. ["string", "null"]) raised TypeError on the frozenset membership check, crashing list_tools() on modern sessions against untrusted server input. Guard with isinstance(str) so any non-scalar type cleanly maps to the not-primitive reason. Covers dict/list type values via two new parametrize cases. - conformance.yml: bump --timeout to 60s on the client legs. The harness runs all scenarios via unbounded Promise.all; 40 parallel python processes on a 2-core runner starve sse-retry (the only scenario with a real-time SSE reconnect wait) past the 30s default. - everything-server _unseal_state: normalise base64/utf-8 decode errors to the same INVALID_PARAMS as the HMAC mismatch path.

…ter, typed Context params, strip-IIR schema derivation Six review comments addressed in one pass: - Drop the server-side x-mcp-header tools/list filter (_drop_invalid_header_tools). The spec MUST is on the client only; the filter ran on one transport and masked author bugs inconsistently. ClientSession.list_tools carries the requirement. - find_invalid_x_mcp_header: walk nested schemas per the spec's 'statically reachable via a chain of properties keys' rule — an annotation under items/anyOf/allOf/oneOf/not/if/then/else/$ref/$defs/patternProperties makes the whole tool invalid, and uniqueness is over all annotations in the schema. Iterative walk over JSON Schema 2020-12 schema positions; instance- data keywords and $ref are not entered so termination is structural. Also drop 'number' from the admitted types (spec names integer/string/boolean only; conformance#344 tracks the harness's stale wording). - Context.input_responses/.request_state: read from the typed InputResponseRequestParams the runner already parsed instead of re-spelling wire aliases against the raw dict via a parallel TypeAdapter. - func_metadata: strip InputRequiredResult arms from a union return annotation and derive on the residual, so 'SomeModel | InputRequiredResult' keeps its output schema and 'CallToolResult | str | InputRequiredResult' raises InvalidSignature consistently with the two-arm case. One noqa:UP007 on the runtime Union[tuple] rebuild — PEP 604 has no runtime-sequence syntax.

cubic-dev-ai

1 issue found across 8 files (changes from recent commits).

_{Tip: Review your code locally with the cubic CLI to iterate faster.

Fix all with cubic | Re-trigger cubic}

…t; fix migration.md call_tool return Stripping InputRequiredResult arms rebound return_type_expr but not inspected_return_ann, so a single-arm residual that was itself an Annotated[...] (e.g. -> Annotated[CallToolResult, Model] | InputRequiredResult, the documented pattern in README.v2.md) skipped the CallToolResult-with- metadata special case: isinstance(Annotated[...], type) is False and .metadata was still the union's empty tuple. It fell through to _create_wrapped_model, advertising {result: CallToolResult} and raising ValidationError on every successful return. Re-run inspect_annotation on the residual so the post-strip state is indistinguishable from a fresh call without InputRequiredResult — the existing dispatch then handles every case unchanged. This also restores Annotated metadata for the simpler Annotated[X, meta] | InputRequiredResult case. Also: docs/migration.md still said MCPServer.call_tool() 'always returns a CallToolResult'; amend to note the InputRequiredResult arm.

maxisbey marked this pull request as ready for review June 25, 2026 16:19

cubic-dev-ai Bot reviewed Jun 25, 2026

View reviewed changes

Comment thread examples/servers/everything-server/mcp_everything_server/server.py Outdated

Comment thread examples/servers/everything-server/mcp_everything_server/server.py

Comment thread .github/actions/conformance/client.py

claude Bot reviewed Jun 25, 2026

View reviewed changes