test(conformance): bump referee to 0.2.0-alpha.8; arm SEP-2575 diagnostic fixtures; fix post-dispatch -32021 HTTP status #2399

Open
felixweinberger wants to merge 1 commit into main from fweinberger/conformance-alpha8-bump

@felixweinberger

Contributor

Adopt conformance 0.2.0-alpha.8 and reconcile the expected-failures baselines.

Motivation and Context

alpha.8 carries conformance#372: checks whose prerequisite is missing now fail with "Not testable:" instead of being silently skipped. Adopting it surfaced two things here:

  1. Missing diagnostic fixtures. The conformance everythingServer never registered the three tools the server-stateless scenario hardcodes — test_missing_capability, test_streaming_elicitation, test_logging_tool — so four checks silently SKIPPED at alpha.7. The fixtures are now armed and three of the four pass.

  2. A real gap, reachable only once the fixture existed. A MissingRequiredClientCapabilityError (-32021) produced after dispatch — the input_required capability gate — surfaced in-band on HTTP 200, while the spec mandates 400 for this error with no origin condition ("For HTTP, the response status code MUST be 400 Bad Request"). The in-band status policy (httpStatusForErrorCode + the per-request transport) now carries that one code-keyed exception, applied only while the response is uncommitted: an exchange that already streamed, or one hosted with responseMode: 'sse' (which opens its stream at dispatch end), keeps its committed 200. A handler relaying a downstream peer's -32020/-32022 is not this server's spec error and deliberately keeps the origin-keyed in-band 200.
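
The status rule in item 2 can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the SDK's implementation: the function name, the `ResponseState` type, and its `committed` flag are hypothetical, and the real logic lives in httpStatusForErrorCode and the per-request transport.

```typescript
// Hypothetical sketch of the in-band status policy described above.
// Names and shapes here are illustrative assumptions, not the SDK's API.

// JSON-RPC code for MissingRequiredClientCapabilityError, as named in this PR.
const MISSING_REQUIRED_CLIENT_CAPABILITY = -32021;

interface ResponseState {
  // True once the status line is on the wire, e.g. an SSE stream is already open.
  committed: boolean;
}

function sketchStatusForInBandError(code: number, state: ResponseState): number {
  // A committed response cannot change its status; the error is delivered in-stream.
  if (state.committed) return 200;
  // The single code-keyed exception: the spec mandates 400 for -32021.
  if (code === MISSING_REQUIRED_CLIENT_CAPABILITY) return 400;
  // Every other handler-produced code keeps the origin-keyed in-band 200,
  // including a relayed downstream -32020 or -32022.
  return 200;
}
```

Keeping the exception keyed on a single code, and checking the committed state first, is what lets streamed and responseMode: 'sse' exchanges keep their 200 while still carrying the error as the terminal frame.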

The fourth check, sep-2575-server-rejects-undeclared-capability, is the only one of the scenario's 30 checks that still fails (29 pass). It fails on a referee-side assertion: requiredCapabilities is asserted as an array, whereas the schema (and the conformance repo's own spec-types and tasks scenario) defines it as a ClientCapabilities object; a fix is open as conformance#376. Baselines are keyed by scenario name, so server-stateless must be listed to excuse that one check; both baseline entries say so explicitly, and the entry burns down at the next conformance release. The tasks-* baseline is unchanged.
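
To make the array-versus-object mismatch concrete, here is a small sketch. The payload shape is taken from the wire example in this PR (data.requiredCapabilities: { "sampling": {} }); the type alias and the guard function are hypothetical names for illustration and are not the referee's or the SDK's code.

```typescript
// Illustrative stand-in for the schema's ClientCapabilities object (assumed shape).
type ClientCapabilitiesSketch = Record<string, unknown>;

// True only for a plain capabilities object, never for an array.
function isCapabilitiesObject(value: unknown): value is ClientCapabilitiesSketch {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Error data as this server emits it for -32021.
const errorData = { requiredCapabilities: { sampling: {} } };

// An array-shaped assertion such as Array.isArray(...) rejects this payload,
// while an object-shaped check accepts it.
const passesArrayCheck = Array.isArray(errorData.requiredCapabilities);
const passesObjectCheck = isCapabilitiesObject(errorData.requiredCapabilities);
```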

How Has This Been Tested?

All six conformance legs exit 0 against the new pin, zero stale entries:

| leg | alpha.7 | this PR |
|---|---|---|
| client:all / client:2026 / server | 438/0 · 374/0 · 42/0 | unchanged |
| server:draft | 81/0 | 84 / 1 expected |
| server:extensions | 136 / 27 expected | 139 / 31 expected |
| server:2026 | 110/0 | 113 / 1 expected |

(+3 newly passing checks per affected leg; every expected failure is reason-commented.)
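
For readers unfamiliar with scenario-keyed baselines, the reconciliation behind "exit 0, zero stale entries" can be sketched roughly as follows. This is a hypothetical model, not the referee's code: the types, the function name, and the assumption that a stale entry also yields a non-zero exit are all illustrative.

```typescript
// Hypothetical model of reconciling results against an expected-failures baseline.
interface ScenarioResult {
  scenario: string;
  failedChecks: string[];
}

interface Reconciliation {
  unexpected: string[]; // scenarios that fail but are not excused by the baseline
  stale: string[]; // baseline entries whose scenario no longer fails
  exitCode: number;
}

function reconcile(results: ScenarioResult[], baseline: string[]): Reconciliation {
  // The baseline is keyed by scenario name, so one failing check marks the whole scenario.
  const failing = results.filter((r) => r.failedChecks.length > 0).map((r) => r.scenario);
  const unexpected = failing.filter((s) => !baseline.includes(s));
  const stale = baseline.filter((s) => !failing.includes(s));
  // Assumption: both unexpected failures and stale entries fail the leg.
  const exitCode = unexpected.length === 0 && stale.length === 0 ? 0 : 1;
  return { unexpected, stale, exitCode };
}
```

Under this model, a single failing check in server-stateless requires the whole scenario to be listed, and the entry becomes stale as soon as that check passes.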

Full suite: typecheck:all, lint:all, pnpm -r test (e2e 2633 passed / 155 expected-fail, integration 348/348). The -32021 fix was verified on the wire: tools/call test_missing_capability with empty clientCapabilities answers HTTP/1.1 400 carrying data.requiredCapabilities: { "sampling": {} }; with sampling declared → 200; an unknown tool (-32602) → 200 (origin-keyed rule untouched). New transport tests pin the 400, the relay 200s, and the already-streamed case (the terminal error frame is asserted on the stream).

Breaking Changes

None for spec-conformant consumers. The -32021 HTTP-status change is a patch-level spec-compliance fix; changeset and a migration-guide note are included (earlier v2 alphas surfaced the post-dispatch -32021 on 200).

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Once conformance#376 ships in a release, a one-line pin bump removes server-stateless from both baselines.

…stic fixtures; fix post-dispatch -32021 HTTP status

conformance 0.2.0-alpha.8 fails checks whose prerequisite is missing instead
of silently skipping them (conformance#372). Adopting it surfaced two gaps:

- the conformance everythingServer never registered the three diagnostic
  tools the server-stateless scenario hardcodes; test_missing_capability,
  test_streaming_elicitation and test_logging_tool are now armed.
- arming test_missing_capability made a real defect reachable: a
  MissingRequiredClientCapabilityError (-32021) produced after dispatch (the
  input_required capability gate) surfaced in-band on HTTP 200, while the
  spec mandates 400 for this error with no origin condition. The in-band
  status policy now carries that one code-keyed exception
  (httpStatusForErrorCode + the per-request transport), applied only while
  the response is uncommitted: an exchange that already streamed, or one
  hosted with responseMode 'sse' (which opens its stream at dispatch end),
  keeps its committed 200. Every other handler-produced code - including a
  handler relaying a downstream peer's -32020/-32022 - keeps the
  origin-keyed in-band 200.

server-stateless now passes 29/30; the remaining check fails on a
referee-side assertion (requiredCapabilities asserted as an array where the
schema defines a ClientCapabilities object; fix proposed in
conformance#376), so the scenario sits in both expected-failures baselines
until a release containing that fix. All other legs are unchanged:
client:all 438/0, client:2026 374/0, server 42/0, and the three legs that
select draft scenarios each gain three newly passing checks
(draft 81->84, extensions 136->139, 2026 110->113).

@felixweinberger requested a review from a team as a code owner on June 30, 2026, 20:13

@changeset-bot

changeset-bot Bot commented Jun 30, 2026

🦋 Changeset detected

Latest commit: da2796a

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages

| Name | Type |
|---|---|
| @modelcontextprotocol/server | Patch |
| @modelcontextprotocol/core-internal | Patch |

@pkg-pr-new

pkg-pr-new Bot commented Jun 30, 2026

@modelcontextprotocol/client

npm i https://pkg.pr.new/@modelcontextprotocol/client@2399

@modelcontextprotocol/codemod

npm i https://pkg.pr.new/@modelcontextprotocol/codemod@2399

@modelcontextprotocol/core

npm i https://pkg.pr.new/@modelcontextprotocol/core@2399

@modelcontextprotocol/server

npm i https://pkg.pr.new/@modelcontextprotocol/server@2399

@modelcontextprotocol/server-legacy

npm i https://pkg.pr.new/@modelcontextprotocol/server-legacy@2399

@modelcontextprotocol/express

npm i https://pkg.pr.new/@modelcontextprotocol/express@2399

@modelcontextprotocol/fastify

npm i https://pkg.pr.new/@modelcontextprotocol/fastify@2399

@modelcontextprotocol/hono

npm i https://pkg.pr.new/@modelcontextprotocol/hono@2399

@modelcontextprotocol/node

npm i https://pkg.pr.new/@modelcontextprotocol/node@2399

commit: da2796a

@claude claude Bot left a comment

No bugs found, but this changes wire behavior of the modern HTTP path (post-dispatch -32021 now answers 400) and codifies a deliberate asymmetry — code-keyed mapping for -32021 vs origin-keyed for -32020/-32022, including a relayed downstream -32021 now also getting 400 — so a maintainer should confirm that spec reading and the committed-stream carve-out before merge.

Extended reasoning...

Overview

The PR bumps the conformance referee pin to 0.2.0-alpha.8, registers three SEP-2575 diagnostic tools in the conformance everythingServer, reconciles both expected-failures baselines (excusing the whole server-stateless scenario for one referee-side check, with burn-down notes), and — the substantive part — changes HTTP status mapping so a handler-/post-dispatch-produced -32021 (MissingRequiredClientCapability) answers HTTP 400 instead of the in-band 200, but only while the response is uncommitted. Touched production files: inboundClassification.ts (httpStatusForErrorCode), perRequestTransport.ts (terminal-error status arm), and a comment update in createMcpHandler.ts. Changeset and migration-guide note are included.

Security risks

None identified. The change affects only the HTTP status code attached to an already-emitted JSON-RPC error body on the modern per-request path; no auth, parsing, or data-exposure surface changes. The new conformance fixtures are test-only.

Level of scrutiny

Moderate-to-high for the transport change: it alters observable wire behavior of the 2026-07-28 serving path and introduces the first code-keyed exception to an otherwise origin-keyed status policy. The implementation looks correct — httpStatusForErrorCode('in-band', ...) has no production callers (only tests), the per-request transport applies the mapping only when no SSE stream is open so committed/forced-SSE exchanges keep their 200 and still deliver the error in-stream, and pre-dispatch ladder behavior is unchanged — but the design choice (relayed -32021 from a downstream peer also maps to 400, while relayed -32020/-32022 deliberately stay in-band) is a spec-interpretation call a maintainer should ratify, as is excusing the entire server-stateless scenario in the baselines for a single referee-side assertion (conformance#376).

Other factors

Test coverage is good: new transport tests pin the post-dispatch 400, the relay-200 cases, and the already-streamed terminal-frame case; the core-internal status-matrix tests are updated consistently. Docs (changeset, migration note) match the implemented behavior including the committed-stream exception. The conformance description claims all six legs exit 0 against the new pin, which I cannot verify here. No prior human review exists on this PR.
