Conformance burn-down: server-side InputRequiredResult, Mcp-Method/Name validation, x-mcp-header filter (14 scenarios → green)#2974
Conversation
There was a problem hiding this comment.
3 issues found across 17 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="examples/servers/everything-server/mcp_everything_server/server.py">
<violation number="1" location="examples/servers/everything-server/mcp_everything_server/server.py:445">
P2: `request_state` is parsed/indexed unsafely and can crash the tool instead of falling back. Validate decoded state shape before using `state["round"]`.</violation>
</file>
<file name=".github/actions/conformance/client.py">
<violation number="1" location=".github/actions/conformance/client.py:373">
P2: MRTR retry results are not asserted, so repeated `InputRequiredResult` can pass silently.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Fix all with cubic | Re-trigger cubic
There was a problem hiding this comment.
I didn't find any bugs, but this is a large PR (17 files) adding new protocol-boundary validation (Mcp-Method/Mcp-Name cross-checks, x-mcp-header filtering on both client and server) plus the MRTR/InputRequiredResult plumbing through MCPServer — behavior-changing core code that warrants a human look.
Extended reasoning...
Overview
This PR spans the conformance harness (workflow pin bump, expected-failures burn-down, new client scenario handlers, everything-server fixtures) and substantive SDK changes: shared/inbound.py gains Mcp-Method/Mcp-Name header cross-validation and an x-mcp-header schema validator; ClientSession.list_tools and the modern HTTP server entry now drop tools with invalid x-mcp-header annotations; func_metadata.convert_result passes InputRequiredResult through; and Context gains input_responses/request_state/client_capabilities properties.
Security risks
Low-to-moderate. The new validation is additive and tightening (rejecting header/body mismatches, filtering malformed tool annotations), and the base64 sentinel decode fails closed (None) on malformed input. The fixture HMAC key in the everything-server is hard-coded but is example/conformance-fixture code, not the SDK. The conformance pin moves from a sha256-verified URL to a published npm package, relying on npm's integrity check — a reasonable but supply-chain-relevant change. Nothing looks like an injection or auth-bypass vector.
Level of scrutiny
High for the src/mcp portions: the inbound classifier and client/server tools/list filtering sit on the protocol hot path and change wire-visible behavior at 2026-07-28 (modern requests with mismatched Mcp-Method headers are now rejected; tools can silently disappear from list_tools results). These are spec-interpretation decisions (e.g. admitting number as a primitive type, treating a missing Mcp-Name as a mismatch only when the body carries the param) that a maintainer should confirm against SEP-2243/SEP-2322.
Other factors
Test coverage is thorough — every new branch in inbound.py, the modern HTTP filter, the client filter, and the func_metadata pass-through has dedicated unit tests, and the PR reports all five conformance legs green plus full coverage. The bug-hunting pass found no issues. The size, the new public-ish API surface on Context, and the protocol-behavior changes are what push this to human review rather than auto-approval.
There was a problem hiding this comment.
I didn't find any bugs in this change, but it's a large PR that adds protocol-boundary validation (Mcp-Method/Mcp-Name cross-checks, x-mcp-header filtering) and widens the MCPServer tool-call return surface, so it warrants a human look rather than an automated approval.
Extended reasoning...
Overview
This PR spans 17 files across the SDK proper (shared/inbound.py, client/session.py, server/_streamable_http_modern.py, mcpserver/context.py, mcpserver/server.py, func_metadata.py), conformance fixtures (everything-server, conformance client driver), CI workflow, and the expected-failures baselines. The SDK changes implement SEP-2243 header validation (Mcp-Method/Mcp-Name cross-check at the modern server boundary, x-mcp-header annotation validation and tool filtering on both client and server) and SEP-2322 plumbing (InputRequiredResult pass-through in convert_result, new Context properties for input_responses/request_state/client_capabilities).
Security risks
The new code is mostly validation that tightens (not loosens) what the server accepts: header/body cross-checks reject inconsistent requests, and decode_header_value is written so a malformed base64 sentinel can never accidentally match a body value. The fixture-side HMAC for requestState is example code only. No auth, crypto, or permission paths in the SDK itself are touched. The main behavioral risks are protocol-level rather than security-level: the client now silently drops tools with invalid x-mcp-header annotations on modern sessions, and the modern HTTP entry rejects requests with mismatched Mcp-Method/Mcp-Name headers — both are spec-mandated but are observable behavior changes for 2026-07-28 callers.
Level of scrutiny
This deserves real human review: it changes request classification on the modern server path (a rejection here means the handler never runs), changes what tools clients surface, and widens public return types (MCPServer.call_tool → CallToolResult | InputRequiredResult), which is an API-surface decision a maintainer should sign off on. It is far outside the simple/mechanical category eligible for shadow approval.
Other factors
Test coverage is thorough — the new ladder rungs, the codec, the x-mcp-header validator, and the end-to-end MCPServer flows all have dedicated tests, and the PR reports all five conformance legs green plus 100% coverage. The bug hunting system found no issues. There are outstanding cubic-dev-ai comments on the fixture code (defensive parsing in the multi-round/tampered-state example tools and missing assertions in the conformance driver) that the author may want to address, though they only affect example/CI code, not the SDK.
maxisbey
left a comment
There was a problem hiding this comment.
Solid burn-down — the Mcp-Method/Mcp-Name cross-check, the NAME_BEARING_METHODS table shared between client emit and server validate, and the discover probe being kept in lockstep with the new rung are all the right shape, and the test additions over the new classifier rungs are thorough. The inline notes cluster into a few groups, in roughly the order I'd weight them.
Two places find_invalid_x_mcp_header diverges from a normative MUST. Both are in the inline comments on shared/inbound.py. The type: "number" concession is a divergence the spec already resolved against the harness (the spec commit that forbade number predates the 0.2.0-alpha.7 pin; the harness fixture that emits it is on both expected-failures lists, so removing it costs nothing in CI). The bigger one is that the validator only walks top-level properties, while the spec permits nested annotations (which then have to be validated and counted toward the case-insensitive uniqueness set) and says an annotation in an illegal position — under items, a composition keyword, if/then/else, or $ref — makes the tool definition invalid. The current walk never sees any of those, so it keeps tools the spec says the client must exclude. The docstring's claim that the spec restricts the annotation to top-level properties is what the spec text contradicts.
One design question, not a defect. _drop_invalid_header_tools on the server side implements an obligation the spec assigns only to the client, and after this PR the client-side list_tools filter carries both the spec MUST and the conformance scenario, so the server copy carries neither. Detail in the inline comment; my vote is to delete it, but it's a genuine call.
Smaller things, all inline: Context.input_responses / request_state re-derive fields the already-validated CallToolRequestParams in _handle_call_tool's hand carries, with a second alias spelling and a ValidationError path that's unreachable from the wire; the new InputRequiredResult union arm in func_metadata lands before the CallToolResult-in-union InvalidSignature, so a three-arm -> CallToolResult | str | InputRequiredResult is now silently accepted where -> CallToolResult | str raises; and one test docstring claims spec backing it doesn't have.
Two notes I'm keeping in the body rather than as inline comments:
docs/migration.mdhas an existing section headed "MCPServer.call_tool()returnsCallToolResult" whose first sentence is "MCPServer.call_tool()now always returns aCallToolResult." — this PR makes that sentence false. One-line amendment in the existing section; I don't think a new section is needed, the widening only affects the unreleased v2 surface.- Net +4
# pragma: no branchon the stubdef fn() -> X: ...lines intest_func_metadata.py. They match the existing in-file precedent for that idiom so I'm fine with them; noting it because we try to keep the pragma count flat.
The pkg.pr.new URL was a stopgap until #357 published; alpha.7 is now on npm. Baseline run on this branch shows zero new failures and zero stale expected-failures entries, so the bump is reconciled with no list changes. Drops the URL fetch-and-verify steps now that npm's own integrity check applies.
…conformance fixtures SDK changes: - MCPServer tools can now return InputRequiredResult: func_metadata.convert_result() passes it through instead of mangling into TextContent; _handle_call_tool / call_tool return type widened; Context gains .input_responses / .request_state / .client_capabilities read-only props. - shared/inbound.py: classify_inbound_request rung 2 now cross-checks Mcp-Method / Mcp-Name headers against the request body (NAME_BEARING_METHODS map + decode_header_value); new find_invalid_x_mcp_header() pure validator. - _streamable_http_modern.py: filters tools/list result for invalid x-mcp-header annotations at the modern HTTP boundary. - ClientSession.list_tools: drops tools with invalid x-mcp-header at modern protocol versions (the spec MUST is on the client); _make_modern_stamp uses NAME_BEARING_METHODS for Mcp-Name; send_discover stamps Mcp-Method. Conformance fixtures: - everything-server: 8 test_input_required_result_* tools (elicitation/sampling/ list_roots/request_state/multiple_inputs/multi_round/tampered_state/ capabilities) + json_schema_2020_12_tool with literal 2020-12 schema. - conformance/client.py: handlers for sep-2322-client-request-state and http-invalid-tool-headers.
…ow pass Removes from both baseline files: - client: sep-2322-client-request-state, http-invalid-tool-headers - server: 11 input-required-result-* (all except non-tool-request), http-header-validation, json-schema-2020-12 Remaining: http-custom-headers (client Mcp-Param-* design unresolved), auth/enterprise-managed-authorization (out of scope), input-required-result-non-tool-request (MCPServer prompt pipeline not yet widened), tools-call-with-progress (modern stateless has no SSE response mode yet).
…ce timeout - find_invalid_x_mcp_header: a JSON-Schema array type (e.g. ["string", "null"]) raised TypeError on the frozenset membership check, crashing list_tools() on modern sessions against untrusted server input. Guard with isinstance(str) so any non-scalar type cleanly maps to the not-primitive reason. Covers dict/list type values via two new parametrize cases. - conformance.yml: bump --timeout to 60s on the client legs. The harness runs all scenarios via unbounded Promise.all; 40 parallel python processes on a 2-core runner starve sse-retry (the only scenario with a real-time SSE reconnect wait) past the 30s default. - everything-server _unseal_state: normalise base64/utf-8 decode errors to the same INVALID_PARAMS as the HMAC mismatch path.
992b4da to
eb5519d
Compare
…ter, typed Context params, strip-IIR schema derivation Six review comments addressed in one pass: - Drop the server-side x-mcp-header tools/list filter (_drop_invalid_header_tools). The spec MUST is on the client only; the filter ran on one transport and masked author bugs inconsistently. ClientSession.list_tools carries the requirement. - find_invalid_x_mcp_header: walk nested schemas per the spec's 'statically reachable via a chain of properties keys' rule — an annotation under items/anyOf/allOf/oneOf/not/if/then/else/$ref/$defs/patternProperties makes the whole tool invalid, and uniqueness is over all annotations in the schema. Iterative walk over JSON Schema 2020-12 schema positions; instance- data keywords and $ref are not entered so termination is structural. Also drop 'number' from the admitted types (spec names integer/string/boolean only; conformance#344 tracks the harness's stale wording). - Context.input_responses/.request_state: read from the typed InputResponseRequestParams the runner already parsed instead of re-spelling wire aliases against the raw dict via a parallel TypeAdapter. - func_metadata: strip InputRequiredResult arms from a union return annotation and derive on the residual, so 'SomeModel | InputRequiredResult' keeps its output schema and 'CallToolResult | str | InputRequiredResult' raises InvalidSignature consistently with the two-arm case. One noqa:UP007 on the runtime Union[tuple] rebuild — PEP 604 has no runtime-sequence syntax.
There was a problem hiding this comment.
1 issue found across 8 files (changes from recent commits).
Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic | Re-trigger cubic
…t; fix migration.md call_tool return
Stripping InputRequiredResult arms rebound return_type_expr but not
inspected_return_ann, so a single-arm residual that was itself an
Annotated[...] (e.g. -> Annotated[CallToolResult, Model] | InputRequiredResult,
the documented pattern in README.v2.md) skipped the CallToolResult-with-
metadata special case: isinstance(Annotated[...], type) is False and
.metadata was still the union's empty tuple. It fell through to
_create_wrapped_model, advertising {result: CallToolResult} and raising
ValidationError on every successful return.
Re-run inspect_annotation on the residual so the post-strip state is
indistinguishable from a fresh call without InputRequiredResult — the
existing dispatch then handles every case unchanged. This also restores
Annotated metadata for the simpler Annotated[X, meta] | InputRequiredResult
case.
Also: docs/migration.md still said MCPServer.call_tool() 'always returns a
CallToolResult'; amend to note the InputRequiredResult arm.
Bumps the conformance pin to
0.2.0-alpha.7and clears 14 scenarios from the expected-failures baseline, leaving only the 4 that depend on unlanded design (clientMcp-Param-*emission, modern stateless SSE for progress, MCPServer prompt-pipelineInputRequiredResult, and the SEP-990 enterprise-auth waiver).Motivation and Context
#2967 widened the lowlevel server return types for
InputRequiredResultand #2968 gaveClient.call_tooltheallow_input_required/input_responses/request_statesurface, but neither was reachable through the conformance harness: the everything-server had no MRTR fixture tools, the client driver had no SEP-2322 handler, andMCPServer'sfunc_metadata.convert_result()still mangled anInputRequiredResultintoTextContent. Separately, the SEP-2243 boundary checks (Mcp-Method/Mcp-Namecross-check,x-mcp-headervalidity) were unimplemented on both sides. This closes those gaps.How Has This Been Tested?
All five conformance legs run green with
--expected-failuresat@modelcontextprotocol/conformance@0.2.0-alpha.7:--suite active--suite draft--suite all --spec-version 2026-07-28--suite all--suite all --spec-version 2026-07-28./scripts/test: 2776 passed, 100% coverage, strict-no-cover clean.pyright: 0 errors.Breaking Changes
None for v1-lineage callers. At 2026-07-28,
ClientSession.list_tools()now drops tools whosex-mcp-headerannotations are invalid (the spec MUST is on the client); handshake-era sessions are unaffected. The modern HTTP server entry now rejects requests whereMcp-Method/Mcp-Nameheaders contradict the body — handshake-era requests don't pass through that classifier rung.Types of changes
Checklist
Additional context
SDK changes
func_metadata.convert_result()passesInputRequiredResultthrough;MCPServer._handle_call_tool/call_toolreturn type widened toCallToolResult | InputRequiredResult.Contextgains.input_responses/.request_state/.client_capabilitiesread-only props so a tool body can read what the client supplied.shared/inbound.py:classify_inbound_requestrung 2 cross-checksMcp-Method==body.methodandMcp-Name==params.{name|uri}(newNAME_BEARING_METHODSmap +decode_header_value). Newfind_invalid_x_mcp_header()validates the annotation per RFC 9110 token rules + uniqueness + primitive-type._streamable_http_modern.pyfilterstools/listresults for invalidx-mcp-headerat the modern HTTP boundary (so the version gate is structural, not a conditional in handler code).ClientSession.list_toolsapplies the same filter when the negotiated version is modern._make_modern_stampnow derivesMcp-NamefromNAME_BEARING_METHODS(coversprompts/getandresources/read, not justtools/call);send_discoverstampsMcp-Methodso the auto-mode probe doesn't fail the cross-check it just introduced.Fixtures
test_input_required_result_*tools (elicitation/sampling/list_roots/request_state/multiple_inputs/multi_round/tampered_state/capabilities) +json_schema_2020_12_toolcarrying the 2020-12 keywords the scenario checks. Thetampered_statetool uses a fixture-localhmacto sign/verifyrequestState; an SDK-level helper is follow-up..github/actions/conformance/client.py: handlers forsep-2322-client-request-state(drives the four mock tools viaallow_input_required=True) andhttp-invalid-tool-headers(lists then calls every surfaced tool).Pin
CONFORMANCE_PKGmoves from thepkg.pr.new@65fcd39URL to the published@modelcontextprotocol/conformance@0.2.0-alpha.7; the URL fetch-and-verify steps are dropped now that npm's own integrity check applies. Baselined: alpha.7 introduces no new failures over the previous pin.Stacked on #2968.
AI Disclaimer