diff --git a/.github/workflows/npm-publish.yml b/.github/workflows/npm-publish.yml index ed36c1553..9ece3d3f2 100644 --- a/.github/workflows/npm-publish.yml +++ b/.github/workflows/npm-publish.yml @@ -111,6 +111,7 @@ jobs: - budget-allocator-server - cohort-heatmap-server - customer-segmentation-server + - conformance-server - debug-server - lazy-auth-server - map-server diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml index 478c5b807..bdf2a1919 100644 --- a/.github/workflows/publish.yml +++ b/.github/workflows/publish.yml @@ -33,6 +33,7 @@ jobs: ./examples/budget-allocator-server \ ./examples/cohort-heatmap-server \ ./examples/customer-segmentation-server \ + ./examples/conformance-server \ ./examples/debug-server \ ./examples/lazy-auth-server \ ./examples/map-server \ diff --git a/examples/conformance-server/.gitignore b/examples/conformance-server/.gitignore new file mode 100644 index 000000000..b94707787 --- /dev/null +++ b/examples/conformance-server/.gitignore @@ -0,0 +1,2 @@ +node_modules/ +dist/ diff --git a/examples/conformance-server/README.md b/examples/conformance-server/README.md new file mode 100644 index 000000000..3f834593b --- /dev/null +++ b/examples/conformance-server/README.md @@ -0,0 +1,134 @@ +# Conformance server + +![MCP Apps Conformance screenshot](screenshot.png) + +A **host-conformance test server** for the MCP Apps spec ([SEP-1865 · `2026-01-26`](https://github.com/modelcontextprotocol/ext-apps/blob/main/specification/2026-01-26/apps.mdx), extension id `io.modelcontextprotocol/ui`), modeled on [web-platform-tests](https://web-platform-tests.org). + +It ships a single `ui://` test page that renders **inside the host's sandboxed iframe**, drives the `postMessage`/JSON-RPC bridge, asserts the host's behaviour against the spec, and shows `PASS`/`FAIL` right in the iframe. + +> **The host is the browser. The `ui://` page is the WPT test. 
The bridge is `testharness.js`.** + +## Run it against a host + +Connect a host to this server's `/mcp` endpoint, then prompt the host to call the `run_conformance` tool. The host renders the runner; click **Run conformance tests** to see results. + +Start it from the monorepo root: + +```bash +npm install +EXAMPLE=conformance-server npm run examples:start # serves http://localhost:31xx/mcp +``` + +The console prints the assigned port. Automatic `in-view` checks run on click; the human-in-the-loop (`· manual`) checks prompt you mid-run to take an action (toggle the theme, open a link, send a message) and confirm the outcome. + +## How it reads + +Each test carries a **vantage**, where the requirement is observable: + +- `in-view`, from inside the iframe (this runner asserts it directly) +- `host`, only by inspecting the host's own surface (rendered DOM, the host↔sandbox channel, or the conversation/model) from outside the view +- `server`, only the test server sees it + +`· manual` flags a check that needs a **human action** to trigger or verify. ⚠️ flags a measurement caveat in the row. Optional (`MAY`) checks may report an **`INFO`** signal ("does the host do it or not") instead of pass/fail. + +The catalogue below covers **only host-directed** normative requirements (the Sandbox proxy is host-side, so its requirements are included). App/View- and server-directed requirements are intentionally excluded. IDs are namespaced by the spec **capability area** (WPT-path style). `✅` = implemented, `⬜` = planned (id reserved). + +## Host conformance catalogue + +> **24 of 45 host requirements implemented**: `in-view` automatic checks plus six human-in-the-loop (`· manual`) ones: an **auto-detect** check (`context/context-changed`) and five **human-declaration** checks (`links/open-external`, `messages/add-to-conversation`, `messages/consent`, `visibility/app-tool-hidden`, `model-context/provide-future-turns`). 
`⬜` rows have reserved IDs and need host DOM/channel/log inspection or multi-turn setup. + +### `security/`: sandboxing & CSP · §Sandbox proxy, §Host Behavior, §Security Considerations + +| ID | Requirement | Clause | Vantage | Status | +| ------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ------------- | ------ | +| `security/iframe-sandboxed` | All View content is rendered in sandboxed iframes with restricted permissions. ⚠️ the view can't read its own iframe's `sandbox` attribute (cross-origin), verify by inspecting the host's DOM | MUST | host | ⬜ | +| `security/sandbox-proxy-required` | A web-page host wraps the View behind an intermediate Sandbox proxy. ⚠️ the view can infer frame nesting (`window.parent !== window.top`) but can't verify the proxy, inspect the host's DOM | MUST | host | ⬜ | +| `security/sandbox-distinct-origin` | Host and Sandbox have different origins, reading `window.parent.location` throws | MUST | in-view | ✅ | +| `security/sandbox-permissions` | Sandbox iframe uses exactly `allow-scripts allow-same-origin`. ⚠️ inferred: scripts run + `window.origin` not opaque | MUST | in-view | ✅ | +| `security/sandbox-proxy-ready` | Sandbox sends `ui/notifications/sandbox-proxy-ready` **to the Host** when ready. ⚠️ sandbox↔host message, never forwarded to the view, observe by instrumenting the host↔sandbox channel | MUST | host | ⬜ | +| `security/sandbox-resource-ready` | Host sends raw HTML via `ui/notifications/sandbox-resource-ready` once the sandbox is ready. 
⚠️ sandbox↔host message, not visible to the view | MUST | host | ⬜ | +| `security/sandbox-csp-enforced` | Sandbox loads HTML with CSP enforcing declared domains, `frame-src`, `base-uri`, `object-src 'none'`, restrictive defaults. ⚠️ the `connect-src` slice is covered in-view by `csp-allow-declared`/`csp-no-loosening`; verifying the full applied CSP needs inspecting the host's CSP header | MUST | host | ⬜ | +| `security/sandbox-message-forwarding` | Sandbox forwards Host↔View messages for any non-`ui/notifications/sandbox-` method. ⚠️ the view only sees its own end, confirming a message was forwarded and received needs the host to acknowledge (also transitively proven: if forwarding broke, `initialize` would never complete) | MUST | host | ⬜ | +| `security/sandbox-no-self-requests` | Sandbox does not originate its own requests. ⚠️ observe on the host↔sandbox channel, not from the view | SHOULD NOT | host | ⬜ | +| `security/csp-construct-from-domains` | Host constructs the CSP from the declared domains, verified by reading the **applied policy** (`<meta>` tag, or the `securitypolicyviolation` event's `originalPolicy`) and checking `connect-src` includes the declared domain. ⚠️ unreadable if the host uses a header-only CSP that never fires a violation | MUST | in-view | ✅ | +| `security/csp-default-deny` | With **no** `ui.csp`, host applies the restrictive default (`connect-src 'none'`, …). ⚠️ needs a dedicated **no-CSP** resource, the current runner declares a CSP, so this "omitted" path isn't exercised | MUST | in-view | ⬜ | +| `security/csp-allow-declared` | A declared `connectDomains` origin is permitted. The runner declares `connectDomains: ["https://modelcontextprotocol.io"]`. 
⚠️ this is a **positive control** for `csp-construct-from-domains`/`csp-no-loosening`, not an independent normative MUST, the spec lets a host **further restrict** declared domains (§UI Resource Format → No Loosening), so a compliant host could legitimately block this; a network failure also reads as "not allowed", so the origin must be reachable | MUST | in-view | ✅ | +| `security/csp-no-loosening` | Even with a CSP declared, an **undeclared** origin stays blocked. Backed by `csp-allow-declared` as the positive control, so the block is genuinely the CSP | MUST NOT | in-view | ✅ | +| `security/permissions-allow-attr` | Sandbox sets the inner iframe `allow` attribute from declared permissions. ⚠️ the `allow` attribute lives on the cross-origin parent's iframe, inspect the host's DOM (feature detection from the view is gesture-gated and doesn't confirm the attribute) | MAY | host | ⬜ | +| `security/csp-audit-log` | Host logs CSP configurations for security review. ⚠️ inspect the host's logs | SHOULD | host · manual | ⬜ | +| `security/external-domain-warning` | Host warns the user when the UI **requires external-domain network access**, a resource that declares `connectDomains` to a third-party origin (Security Considerations §CSP). ⚠️ tied to CSP/`connectDomains` at connection time, **not** to `ui/open-link` (which has no warning clause); needs a resource declaring an external connect domain + host-surface inspection | SHOULD | host · manual | ⬜ | +| `security/global-allowlist` | Host applies global domain allow/block lists. 
⚠️ configure host policy, then verify | MAY | host · manual | ⬜ | + +### `lifecycle/`: handshake & tool notifications · §Lifecycle, §Data Passing + +| ID | Requirement | Clause | Vantage | Status | +| ----------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------ | ---------------- | ------ | +| `lifecycle/initialize-capabilities` | Host responds to `ui/initialize` with `hostCapabilities` in `McpUiInitializeResult` | MUST | in-view | ✅ | +| `lifecycle/tool-input` | Host sends `ui/notifications/tool-input` with complete arguments after the View's initialize completes (via `ontoolinput`) | MUST | in-view | ✅ | +| `lifecycle/tool-input-partial` | Host may stream `ui/notifications/tool-input-partial` before `tool-input`. Reported as a capability **signal** (runtime `INFO`, not pass/fail). ⚠️ partials only appear when the agent streams tool args | MAY | in-view | ✅ | +| `lifecycle/tool-input-partial-stop` | Host stops sending `ui/notifications/tool-input-partial` once `tool-input` is sent. ⚠️ only catches a violation if the host streams partials (our launcher has none), so usually passes vacuously | MUST | in-view | ✅ | +| `lifecycle/tool-result` | Host sends `ui/notifications/tool-result` when execution completes (if the View is displayed; via `ontoolresult`) | MUST | in-view | ✅ | +| `lifecycle/tool-cancelled` | Host sends `ui/notifications/tool-cancelled` if execution is cancelled. Captured in-view via `ontoolcancelled`; the user must cancel a running tool | MUST | in-view · manual | ⬜ | +| `lifecycle/teardown-notify` | Host sends a teardown notification before tearing down the View. Captured in-view via `onteardown`; the user must close/replace the view | MUST | in-view · manual | ⬜ | +| `lifecycle/teardown-await` | Host waits for a response before tearing down (to prevent data loss). 
In-view via a delayed `onteardown` response; the user must trigger teardown | SHOULD | in-view · manual | ⬜ | + +### `tools/` & `visibility/`: proxying & tool exposure · §Resource Discovery, §Visibility + +| ID | Requirement | Clause | Vantage | Status | +| -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | ------------- | ------ | +| `tools/proxy-call` | Host proxies `tools/call` from the View to the server and returns the result. The spec only states the host **MAY** forward non-`ui/` messages to the server (§Sandbox proxy); proxying is a functional expectation **once the host advertises `serverTools`**. Also corroborated server-side | MAY | in-view | ✅ | +| `visibility/app-tool-hidden` | Host excludes tools lacking `"model"` visibility from the agent's `tools/list`. The app asks the agent (via `ui/message`) to enumerate the conformance server's tools; operator confirms the app-only `conformance_probe` is absent | MUST NOT | host · manual | ✅ | +| `visibility/app-tool-call-guard` | Host rejects `tools/call` from apps for tools that don't include `"app"` visibility | MUST | in-view | ✅ | + +### `resources/`: UI resource fetching · §Resource Discovery + +| ID | Requirement | Clause | Vantage | Status | +| --------------------------- | ----------------------------------------------------------------------------------------------------- | ------ | ------- | ------ | +| `resources/read-referenced` | Host fetches the referenced UI resource via `resources/read`. ⚠️ observed by the server, not the view | MUST | server | ⬜ | +| `resources/prefetch` | Host may prefetch/cache UI resource content. 
⚠️ server-observed | MAY | server | ⬜ | + +### `context/`: host context & change notifications · §Host Context, §Theming + +| ID | Requirement | Clause | Vantage | Status | +| -------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------ | ---------------- | ------ | +| `context/initialize-hostcontext` | Host includes `hostContext` in `McpUiInitializeResult`. ⚠️ SHOULD, a host may legitimately omit it | SHOULD | in-view | ✅ | +| `context/context-changed` | Host emits `ui/notifications/host-context-changed` when context fields change. Captured in-view via `onhostcontextchanged`; the user must change the theme/display mode | MAY | in-view · manual | ✅ | + +### `dimensions/`: sizing · §Container Dimensions + +| ID | Requirement | Clause | Vantage | Status | +| -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------ | ------- | ------ | +| `dimensions/listen-size-changed` | In flexible mode, host resizes the iframe on `ui/notifications/size-changed`. Observed by growing the content and watching the view's own `window.innerHeight` grow. 
⚠️ flexible mode only (INFO if the host pins a fixed height); relies on autoResize; host may clamp to maxHeight | MUST | in-view | ✅ | + +### `display/`: display modes · §Display Modes + +| ID | Requirement | Clause | Vantage | Status | +| ------------------------------------- | ------------------------------------------------------------------------------ | -------- | ------- | ------ | +| `display/no-undeclared-mode` | Host never switches the View to a mode absent from its `availableDisplayModes` | MUST NOT | in-view | ✅ | +| `display/return-resulting-mode` | Host returns the resulting mode in the `ui/request-display-mode` response | MUST | in-view | ✅ | +| `display/unavailable-returns-current` | If the requested mode is unavailable, host returns the current mode | SHOULD | in-view | ✅ | +| `display/decline-undeclared` | Host may decline mode requests for modes the View didn't declare | MAY | in-view | ⬜ | + +### `links/`, `messages/`, `model-context/`: View→Host requests · §MCP Apps Specific Messages + +| ID | Requirement | Clause | Vantage | Status | +| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | ------ | ------------- | ------ | +| `links/open-external` | Host opens a `ui/open-link` URL in the user's default browser or a new tab. ⚠️ side effect outside the iframe, operator confirms the opened tab | SHOULD | host · manual | ✅ | +| `messages/add-to-conversation` | Host adds a `ui/message` to the conversation context, preserving the role. App triggers `ui/message`; operator confirms it appeared | SHOULD | host · manual | ✅ | +| `messages/consent` | Host may request user consent for a `ui/message`. 
App triggers `ui/message`; operator reports whether a consent prompt showed (INFO, optional) | MAY | host · manual | ✅ | +| `model-context/provide-future-turns` | Host provides `ui/update-model-context` to the model in future turns. App seeds a secret code then asks the agent for it; operator confirms recall | SHOULD | host · manual | ✅ | +| `model-context/last-wins` | If several updates arrive before the next user message, host sends only the last. ⚠️ multi-turn | SHOULD | host · manual | ⬜ | +| `model-context/overwrite-defer-dedupe-display` | Host may overwrite / defer / dedupe / display context updates. ⚠️ multi-turn / UX | MAY | host · manual | ⬜ | + +### `capabilities/`: negotiation & forwarding · §Capability Negotiation, §Sandbox proxy + +| ID | Requirement | Clause | Vantage | Status | +| --------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------ | ------- | ------ | +| `capabilities/mimetypes-required` | Host's UI capability declaration includes `mimeTypes`. ⚠️ negotiation, server-observed | REQUIRED | server | ⬜ | +| `capabilities/server-passthrough` | Host forwards non-`ui/` MCP methods from the view to the server. Tested via `resources/list` (`listServerResources` → expects `ui://conformance/runner` back), distinct from `tools/proxy-call`. 
Gated on `serverResources` (INFO otherwise) | MAY · SHOULD | in-view | ✅ | + +## Layout + +- `server.ts`, the MCP server: one `ui://` runner resource + fixture tools (`run_conformance` launcher, app-only `conformance_probe`, model-only `model_only_probe`) +- `main.ts`, Streamable HTTP / stdio entry point +- `mcp-app.html` + `src/`, the React runner: `mcp-app.tsx` (UI) + `testharness.ts` (the `mcp_test()` harness) + `tests.ts` (the catalogue, in code) diff --git a/examples/conformance-server/grid-cell.png b/examples/conformance-server/grid-cell.png new file mode 100644 index 000000000..7ac3645d3 Binary files /dev/null and b/examples/conformance-server/grid-cell.png differ diff --git a/examples/conformance-server/main.ts b/examples/conformance-server/main.ts new file mode 100644 index 000000000..ec187b68a --- /dev/null +++ b/examples/conformance-server/main.ts @@ -0,0 +1,93 @@ +/** + * Entry point for running the MCP server. + * Run with: npx @modelcontextprotocol/server-conformance + * Or: node dist/index.js [--stdio] + */ + +import { createMcpExpressApp } from "@modelcontextprotocol/sdk/server/express.js"; +import type { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; +import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; +import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js"; +import cors from "cors"; +import type { Request, Response } from "express"; +import { createServer } from "./server.js"; + +/** + * Starts an MCP server with Streamable HTTP transport in stateless mode. + * + * @param createServer - Factory function that creates a new McpServer instance per request. + */ +export async function startStreamableHTTPServer( + createServer: () => McpServer, +): Promise<void> { + const port = parseInt(process.env.PORT ?? 
"3001", 10); + + const app = createMcpExpressApp({ host: "0.0.0.0" }); + app.use(cors()); + + app.all("/mcp", async (req: Request, res: Response) => { + const server = createServer(); + const transport = new StreamableHTTPServerTransport({ + sessionIdGenerator: undefined, + }); + + res.on("close", () => { + transport.close().catch(() => {}); + server.close().catch(() => {}); + }); + + try { + await server.connect(transport); + await transport.handleRequest(req, res, req.body); + } catch (error) { + console.error("MCP error:", error); + if (!res.headersSent) { + res.status(500).json({ + jsonrpc: "2.0", + error: { code: -32603, message: "Internal server error" }, + id: null, + }); + } + } + }); + + const httpServer = app.listen(port, (err) => { + if (err) { + console.error("Failed to start server:", err); + process.exit(1); + } + console.log(`MCP server listening on http://localhost:${port}/mcp`); + }); + + const shutdown = () => { + console.log("\nShutting down..."); + httpServer.close(() => process.exit(0)); + }; + + process.on("SIGINT", shutdown); + process.on("SIGTERM", shutdown); +} + +/** + * Starts an MCP server with stdio transport. + * + * @param createServer - Factory function that creates a new McpServer instance. 
+ */ +export async function startStdioServer( + createServer: () => McpServer, +): Promise<void> { + await createServer().connect(new StdioServerTransport()); +} + +async function main() { + if (process.argv.includes("--stdio")) { + await startStdioServer(createServer); + } else { + await startStreamableHTTPServer(createServer); + } +} + +main().catch((e) => { + console.error(e); + process.exit(1); +}); diff --git a/examples/conformance-server/mcp-app.html b/examples/conformance-server/mcp-app.html new file mode 100644 index 000000000..6a73265ef --- /dev/null +++ b/examples/conformance-server/mcp-app.html @@ -0,0 +1,13 @@ + + + + + + + MCP Apps Conformance Runner + + +
+ + + diff --git a/examples/conformance-server/package.json b/examples/conformance-server/package.json new file mode 100644 index 000000000..64a7eb582 --- /dev/null +++ b/examples/conformance-server/package.json @@ -0,0 +1,56 @@ +{ + "name": "@modelcontextprotocol/server-conformance", + "version": "1.0.0", + "type": "module", + "description": "Host-conformance test server for the MCP Apps spec: ships a ui:// runner that asserts host behaviour and reports PASS/FAIL inside the iframe", + "repository": { + "type": "git", + "url": "https://github.com/modelcontextprotocol/ext-apps", + "directory": "examples/conformance-server" + }, + "license": "MIT", + "main": "dist/server.js", + "types": "dist/server.d.ts", + "bin": { + "mcp-server-conformance": "dist/index.js" + }, + "files": [ + "dist" + ], + "exports": { + ".": { + "types": "./dist/server.d.ts", + "default": "./dist/server.js" + } + }, + "scripts": { + "build": "tsc --noEmit && cross-env INPUT=mcp-app.html vite build && tsc -p tsconfig.server.json && bun build server.ts --outdir dist --target node && bun build main.ts --outfile dist/index.js --target node --external \"./server.js\" --banner \"#!/usr/bin/env node\"", + "watch": "cross-env INPUT=mcp-app.html vite build --watch", + "serve": "bun --watch main.ts", + "start": "cross-env NODE_ENV=development npm run build && npm run serve", + "dev": "cross-env NODE_ENV=development concurrently \"npm run watch\" \"npm run serve\"", + "prepublishOnly": "npm run build" + }, + "dependencies": { + "@modelcontextprotocol/ext-apps": "^1.7.4", + "@modelcontextprotocol/sdk": "^1.24.0", + "cors": "^2.8.5", + "express": "^5.1.0", + "react": "^19.2.0", + "react-dom": "^19.2.0", + "zod": "^4.1.13" + }, + "devDependencies": { + "@types/cors": "^2.8.19", + "@types/express": "^5.0.0", + "@types/node": "22.10.0", + "@types/react": "^19.2.2", + "@types/react-dom": "^19.2.2", + "@vitejs/plugin-react": "^4.3.4", + "concurrently": "^9.2.1", + "cross-env": "^10.1.0", + "typescript": "^5.9.3", 
+ "vite": "^6.0.0", + "vite-plugin-singlefile": "^2.3.0" + } +} diff --git a/examples/conformance-server/screenshot.png b/examples/conformance-server/screenshot.png new file mode 100644 index 000000000..ab667b61c Binary files /dev/null and b/examples/conformance-server/screenshot.png differ diff --git a/examples/conformance-server/server.ts b/examples/conformance-server/server.ts new file mode 100644 index 000000000..45ffdc72d --- /dev/null +++ b/examples/conformance-server/server.ts @@ -0,0 +1,112 @@ +/** + * The reference conformance test server. + * + * Exposes one ui:// test page (the conformance runner) plus the fixture tools + * the in-iframe harness needs: a model-visible launcher, an app-only echo probe + * (for the tool-proxying test), and a model-only tool (for the visibility test). + * Point any MCP Apps host at this server's /mcp endpoint and run the suite. + * + * POC scope: the runner only, results are shown in the iframe, not persisted. + */ + +import { existsSync, readFileSync } from "node:fs"; +import path from "node:path"; +import { + RESOURCE_MIME_TYPE, + registerAppResource, + registerAppTool, +} from "@modelcontextprotocol/ext-apps/server"; +import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; +import type { CallToolResult } from "@modelcontextprotocol/sdk/types.js"; +import { z } from "zod"; + +const RUNNER_URI = "ui://conformance/runner"; +// Works both from source (server.ts) and compiled (dist/server.js) +const DIST_DIR = import.meta.filename.endsWith(".ts") + ? path.join(import.meta.dirname, "dist") + : import.meta.dirname; +const VIEW_HTML = path.join(DIST_DIR, "mcp-app.html"); + +// The runner declares a CSP so the suite can test both directions: this origin +// is ALLOWED (connectDomains), and any other origin must stay blocked. +const CSP_ALLOWED_ORIGIN = "https://modelcontextprotocol.io"; + +function loadRunnerHtml(): string { + if (existsSync(VIEW_HTML)) return readFileSync(VIEW_HTML, "utf-8"); + return ` +

Runner not built

Run npm run build first.

`; +} + +export function createServer(): McpServer { + const server = new McpServer({ + name: "MCP Apps Conformance Server", + version: "0.1.0", + }); + + const cspMeta = { ui: { csp: { connectDomains: [CSP_ALLOWED_ORIGIN] } } }; + registerAppResource( + server, + "Conformance Runner", + RUNNER_URI, + { + description: "Runs the MCP Apps conformance suite inside the host.", + _meta: cspMeta, + }, + () => ({ + contents: [ + { + uri: RUNNER_URI, + mimeType: RESOURCE_MIME_TYPE, + text: loadRunnerHtml(), + _meta: cspMeta, + }, + ], + }), + ); + + registerAppTool( + server, + "run_conformance", + { + description: "Run the MCP Apps conformance test suite against this host.", + _meta: { ui: { resourceUri: RUNNER_URI, visibility: ["model", "app"] } }, + }, + (): CallToolResult => ({ + content: [ + { type: "text", text: "Launching the MCP Apps conformance runner…" }, + ], + }), + ); + + registerAppTool( + server, + "conformance_probe", + { + description: + "Echo probe used by the conformance harness to verify tool proxying.", + inputSchema: { ping: z.string() }, + _meta: { ui: { visibility: ["app"] } }, + }, + ({ ping }): CallToolResult => ({ + content: [{ type: "text", text: `echo:${ping}` }], + }), + ); + + // Model-only fixture tool (NOT app-visible). The visibility test calls this + // from the view; a conformant host MUST reject that call. 
+ registerAppTool( + server, + "model_only_probe", + { + description: + "Model-only fixture; an app calling this MUST be rejected by the host.", + inputSchema: { ping: z.string() }, + _meta: { ui: { visibility: ["model"] } }, + }, + ({ ping }): CallToolResult => ({ + content: [{ type: "text", text: `model-only:${ping}` }], + }), + ); + + return server; +} diff --git a/examples/conformance-server/src/mcp-app.tsx b/examples/conformance-server/src/mcp-app.tsx new file mode 100644 index 000000000..fc58859c2 --- /dev/null +++ b/examples/conformance-server/src/mcp-app.tsx @@ -0,0 +1,302 @@ +/** + * The conformance runner View (React, via ext-apps' `useApp`). + * + * Tests run behind a **user-gesture button** rather than auto-running on + * connect: some hosts (e.g. ChatGPT) only allow display-mode / fullscreen + * changes under transient user activation, so a click is required for those + * tests to behave. `useApp`'s `onAppCreated` lets us capture host→view + * notifications (tool-input/tool-result) BEFORE connect. 
+ */ +import { useApp } from "@modelcontextprotocol/ext-apps/react"; +import { useCallback, useRef, useState } from "react"; +import { createRoot } from "react-dom/client"; +import { + captureHostSignals, + getRegistry, + type HostSignals, + type InteractionRequest, + runAll, + type SubtestResult, +} from "./testharness"; +import "./tests"; +import "./style.css"; + +type Row = Pick< + SubtestResult, + | "id" + | "name" + | "status" + | "clause" + | "vantage" + | "manual" + | "caveat" + | "message" +>; + +const freshRows = (): Row[] => + getRegistry().map((d) => ({ + id: d.id, + name: d.name, + status: "NOTRUN", + clause: d.clause, + vantage: d.vantage, + manual: d.manual, + caveat: d.caveat, + })); + +const statusClass = (s: string) => `st st-${s.toLowerCase()}`; +const toRow = (r: SubtestResult): Row => ({ + id: r.id, + name: r.name, + status: r.status, + clause: r.clause, + vantage: r.vantage, + manual: r.manual, + caveat: r.caveat, + message: r.message, +}); + +/** A pending interaction request plus the resolver that settles the test. */ +type PendingInteraction = { + req: InteractionRequest; + resolve: (v: boolean) => void; +}; + +function ConformanceRunner() { + const signalsRef = useRef<HostSignals | null>(null); + const [rows, setRows] = useState(freshRows); + const [runningId, setRunningId] = useState<string | null>(null); + const [running, setRunning] = useState(false); + const [ran, setRan] = useState(false); + const [interaction, setInteraction] = useState<PendingInteraction | null>( + null, + ); + + const { app, error } = useApp({ + appInfo: { name: "mcp-apps-conformance-runner", version: "0.1.0" }, + capabilities: { availableDisplayModes: ["inline", "fullscreen"] }, + autoResize: true, + onAppCreated: (created) => { + signalsRef.current = captureHostSignals(created); + created.onerror = (e) => console.error("[conformance] app error:", e); + }, + }); + + // POC scope: results are rendered in the iframe only, not reported anywhere. 
+ const run = useCallback(async () => { + if (!app) return; + setRan(false); + setRows(freshRows()); + setRunning(true); + // Automatic tests run inline (resize/dimension checks need flexible inline + // mode); reset to inline in case a previous run left us fullscreen. + try { + await app.requestDisplayMode({ mode: "inline" }); + } catch { + /* host may decline */ + } + + const results = await runAll(app, signalsRef.current ?? undefined, { + onStart: (id) => setRunningId(id), + onResult: (r) => + setRows((prev) => + prev.map((row) => (row.id === r.id ? toRow(r) : row)), + ), + // Switch to fullscreen for the interactive finale, best-effort (some + // hosts may gate display-mode changes on a fresh user gesture). + onEnterManual: async () => { + try { + await app.requestDisplayMode({ mode: "fullscreen" }); + } catch { + /* host may decline */ + } + }, + requestInteraction: (req) => + new Promise((resolve) => { + const settle = (v: boolean) => { + setInteraction(null); + resolve(v); + }; + // "await" mode: pass automatically the moment the test's signal settles + // (e.g. the host-context-changed notification arrives). + if (req.kind === "await" && req.signal) { + req.signal.then( + () => settle(true), + () => settle(false), + ); + } + setInteraction({ req, resolve: settle }); + }), + }); + + setRows(results.map(toRow)); + setRunningId(null); + setInteraction(null); + setRunning(false); + setRan(true); + }, [app]); + + const runTrigger = useCallback((req: InteractionRequest) => { + void Promise.resolve(req.trigger?.run()).catch((e) => + console.error("[conformance] trigger error:", e), + ); + }, []); + + const host = app?.getHostVersion(); + // INFO rows are capability signals, not pass/fail, exclude them from the score. 
+ const pass = rows.filter((r) => r.status === "PASS").length; + const failed = rows.filter( + (r) => r.status === "FAIL" || r.status === "TIMEOUT", + ).length; + const info = rows.filter((r) => r.status === "INFO").length; + const gradeable = rows.length - info; + const done = rows.filter((r) => r.status !== "NOTRUN").length; + const summaryText = `${pass}/${gradeable} passing${info ? ` · ${info} info` : ""}`; + const hostLabel = error + ? "error" + : app + ? `${host?.name ?? "unknown"}${host?.version ? ` v${host.version}` : ""}` + : "connecting…"; + + return ( +
+
+
+

MCP Apps Conformance

+

+ Host under test: {hostLabel} +

+
+
+ {ran && !running && ( + + {summaryText} + + )} + +
+
+ + {running && ( +
+
+
+ )} + + {error &&

connection error: {error.message}

} + + + + + + + + + + + + {rows.map((r) => ( + + + + + + + ))} + +
IDTestClauseStatus
{r.id} + {r.name} + {r.message &&
{r.message}
} + {r.caveat &&
⚠️ {r.caveat}
} +
+ {r.clause} + {r.vantage && {r.vantage}} + {r.manual && manual} + + {r.id === runningId ? ( + + + running… + + ) : ( + {r.status} + )} +
+ + {interaction && ( +
+
+ action needed +

{interaction.req.prompt}

+
+ {interaction.req.trigger && ( + + )} + {interaction.req.kind === "await" ? ( + <> + + detecting… + + + + ) : interaction.req.kind === "ack" ? ( + + ) : ( + <> + + + + )} +
+
+
+ )} +
+ ); +} + +createRoot(document.getElementById("root")!).render(); diff --git a/examples/conformance-server/src/style.css b/examples/conformance-server/src/style.css new file mode 100644 index 000000000..60c26a68f --- /dev/null +++ b/examples/conformance-server/src/style.css @@ -0,0 +1,296 @@ +:root { + /* Plain hex + a dark-mode media query rather than light-dark(): the CSS + minifier mangles light-dark() into an invalid value (e.g. "#fff#14171d"), + which made var(--bg)/var(--ink) resolve to transparent. */ + --bg: #ffffff; + --ink: #171a1f; + --muted: #5b6573; + --line: #e6e8ec; + --pass: #1a9c63; + --fail: #d6492f; + --timeout: #c98a1e; + --notrun: #8a93a3; + --accent: #3b82f6; + --mono: ui-monospace, "SF Mono", Menlo, monospace; + --sans: -apple-system, BlinkMacSystemFont, "Inter", "Segoe UI", sans-serif; +} +@media (prefers-color-scheme: dark) { + :root { + --bg: #14171d; + --ink: #e7ecf3; + --muted: #97a3b6; + --line: #262d39; + } +} +* { + box-sizing: border-box; +} +body { + margin: 0; + background: var(--bg); + color: var(--ink); + font-family: var(--sans); + font-size: 14px; +} +.wrap { + padding: 18px 20px; +} +.head { + display: flex; + justify-content: space-between; + align-items: flex-start; + gap: 16px; + margin-bottom: 14px; +} +h1 { + font-size: 17px; + margin: 0 0 2px; + letter-spacing: -0.01em; +} +.sub { + margin: 0; + color: var(--muted); + font-size: 12px; +} +.mono { + font-family: var(--mono); +} +.summary { + font-family: var(--mono); + font-size: 13px; + padding: 4px 12px; + border-radius: 8px; + border: 1px solid var(--line); + white-space: nowrap; +} +.summary.ok { + color: var(--pass); + border-color: var(--pass); +} +.summary.bad { + color: var(--fail); + border-color: var(--fail); +} +.head-actions { + display: flex; + align-items: center; + gap: 12px; + flex-shrink: 0; +} +.run-btn { + font-family: var(--sans); + font-size: 13px; + font-weight: 600; + padding: 7px 16px; + border-radius: 8px; + cursor: pointer; + color: #fff; + 
background: var(--accent); + border: 1px solid var(--accent); +} +.run-btn:disabled { + opacity: 0.5; + cursor: default; +} +.grid { + width: 100%; + border-collapse: collapse; +} +th, +td { + text-align: left; + padding: 9px 10px; + border-bottom: 1px solid var(--line); + vertical-align: top; +} +th { + font-size: 10px; + letter-spacing: 0.08em; + text-transform: uppercase; + color: var(--muted); + font-family: var(--mono); +} +.msg { + color: var(--fail); + font-family: var(--mono); + font-size: 11px; + margin-top: 4px; + white-space: pre-wrap; +} +.caveat { + color: var(--timeout); + font-size: 11px; + margin-top: 4px; + line-height: 1.4; +} +.vantage { + display: inline-block; + margin-left: 8px; + padding: 1px 6px; + border-radius: 5px; + border: 1px solid var(--line); + color: var(--muted); + font-size: 10px; +} +.st { + font-family: var(--mono); + font-size: 11px; + font-weight: 600; + padding: 2px 8px; + border-radius: 6px; +} +.st-pass { + color: var(--pass); + background: color-mix(in srgb, var(--pass) 14%, transparent); +} +.st-fail { + color: var(--fail); + background: color-mix(in srgb, var(--fail) 14%, transparent); +} +.st-timeout { + color: var(--timeout); + background: color-mix(in srgb, var(--timeout) 14%, transparent); +} +.st-notrun { + color: var(--notrun); + background: color-mix(in srgb, var(--notrun) 14%, transparent); +} +.st-info { + color: var(--accent); + background: color-mix(in srgb, var(--accent) 14%, transparent); +} +.st-running { + color: var(--accent); + background: color-mix(in srgb, var(--accent) 14%, transparent); + display: inline-flex; + align-items: center; + gap: 6px; +} + +/* live run progress + the currently-running row */ +.progress { + height: 4px; + border-radius: 999px; + background: var(--line); + overflow: hidden; + margin: 0 0 14px; +} +.progress-bar { + height: 100%; + background: var(--accent); + border-radius: 999px; + transition: width 0.25s ease; +} +tr.running { + background: color-mix(in srgb, var(--accent) 
7%, transparent); +} +.spinner { + width: 10px; + height: 10px; + border-radius: 50%; + border: 2px solid currentColor; + border-top-color: transparent; + display: inline-block; + animation: spin 0.6s linear infinite; +} +@keyframes spin { + to { + transform: rotate(360deg); + } +} + +/* human-in-the-loop interaction panel */ +.interaction-scrim { + position: fixed; + inset: 0; + display: flex; + align-items: center; + justify-content: center; + padding: 24px; + background: rgba(0, 0, 0, 0.22); + backdrop-filter: blur(6px); + -webkit-backdrop-filter: blur(6px); + z-index: 50; +} +.interaction-card { + width: 100%; + max-width: 460px; + background: var(--bg); + border: 1px solid var(--line); + border-radius: 14px; + padding: 22px; + box-shadow: 0 20px 60px rgba(0, 0, 0, 0.35); + animation: pop 0.16s ease-out; +} +@keyframes pop { + from { + transform: scale(0.97); + opacity: 0; + } + to { + transform: scale(1); + opacity: 1; + } +} +.interaction-tag { + font-family: var(--mono); + font-size: 10px; + letter-spacing: 0.08em; + text-transform: uppercase; + color: var(--accent); +} +.interaction-prompt { + margin: 8px 0 18px; + font-size: 15px; + line-height: 1.5; +} +.interaction-actions { + display: flex; + flex-wrap: wrap; + gap: 10px; +} +.trigger-btn { + font-family: var(--sans); + font-size: 14px; + font-weight: 600; + padding: 9px 16px; + border-radius: 9px; + cursor: pointer; + color: #fff; + background: var(--accent); + border: 1px solid var(--accent); +} +.verdict-btn { + font-family: var(--sans); + font-size: 14px; + font-weight: 600; + padding: 9px 16px; + border-radius: 9px; + cursor: pointer; + background: transparent; + border: 1px solid var(--line); + color: var(--ink); +} +.verdict-btn.ok { + color: var(--pass); + border-color: color-mix(in srgb, var(--pass) 55%, var(--line)); +} +.verdict-btn.no { + color: var(--fail); + border-color: color-mix(in srgb, var(--fail) 55%, var(--line)); +} +.verdict-btn:hover, +.trigger-btn:hover { + filter: 
brightness(1.05); +} +.interaction-actions .verdict-btn.ok { + margin-left: auto; +} +.awaiting { + display: inline-flex; + align-items: center; + gap: 8px; + color: var(--muted); + font-size: 13px; + margin-right: auto; +} diff --git a/examples/conformance-server/src/testharness.ts b/examples/conformance-server/src/testharness.ts new file mode 100644 index 000000000..49cb28980 --- /dev/null +++ b/examples/conformance-server/src/testharness.ts @@ -0,0 +1,457 @@ +/** + * mcp-app-testharness, the WPT-style assertion harness for MCP Apps. + * + * Each `mcp_test(...)` registers a subtest. `runAll(app)` runs them inside the + * host's sandboxed iframe (the App is already connected) and returns a + * WPT-shaped result array. This is the analog of testharness.js, except the + * "browser" is the MCP host and assertions drive the postMessage/JSON-RPC + * bridge exposed by the ext-apps `App`. + */ +import type { App } from "@modelcontextprotocol/ext-apps"; + +/** + * `INFO` = an optional (`MAY`) behaviour was observed and reported as a + * capability signal, neither pass nor fail. Use `t.info()` instead of asserting. + */ +export type Status = "PASS" | "FAIL" | "TIMEOUT" | "NOTRUN" | "INFO"; +export type Clause = + | "MUST" + | "MUST NOT" + | "SHOULD" + | "SHOULD NOT" + | "MAY" + | "REQUIRED"; +/** "core" = spec-mandated; "host-specific" = shim/extension behaviour not scored against strict hosts. */ +export type Tag = "core" | "host-specific"; +/** + * Where the requirement is observed (orthogonal to the `manual` flag below): + * - "in-view", measurable from inside the iframe (this harness) + * - "host" , only by inspecting the host's own surface (rendered DOM, the + * host↔sandbox channel, or the conversation/model) from outside + * the view; the view can't see its cross-origin container + * - "server" , only the test server sees it (proxied call, resources/read) + * + * The `manual` flag (on a test) marks requirements that need a human action to + * trigger or verify (e.g. 
change the theme, cancel a tool, read the conversation). + */ +export type Vantage = "in-view" | "host" | "server"; + +export interface SubtestResult { + id: string; + name: string; + status: Status; + tag: Tag; + vantage: Vantage; + /** Needs a human action to trigger or verify. */ + manual: boolean; + clause?: Clause; + /** Why this result may be unreliable / what it can't distinguish. */ + caveat?: string; + message?: string; + durationMs: number; +} + +export class AssertionError extends Error { + constructor(message: string) { + super(message); + this.name = "AssertionError"; + } +} + +/** + * Host→View notifications that fire around connect (before runAll runs), so we + * capture them as promises BEFORE `app.connect()` and let tests await them. + * A notification that never arrives makes its test TIMEOUT (the correct + * conformance signal that the host didn't send it). + */ +export interface HostSignals { + toolInput: Promise; + toolResult: Promise; + /** + * `ui/notifications/tool-input-partial` observations. Partials may arrive 0+ + * times BEFORE `tool-input`; the spec forbids any after it. `sawAfterToolInput` + * flips true if the host violates that. 
+ */ + partials: { count: number; last: unknown; sawAfterToolInput: boolean }; +} + +export function captureHostSignals(app: App): HostSignals { + let resolveInput!: (v: unknown) => void; + let resolveResult!: (v: unknown) => void; + let toolInputArrived = false; + const toolInput = new Promise((r) => { + resolveInput = r; + }); + const toolResult = new Promise((r) => { + resolveResult = r; + }); + const partials = { + count: 0, + last: undefined as unknown, + sawAfterToolInput: false, + }; + + app.ontoolinput = (params) => { + toolInputArrived = true; + resolveInput(params); + }; + app.ontoolinputpartial = (params) => { + partials.count += 1; + partials.last = params; + if (toolInputArrived) partials.sawAfterToolInput = true; // illegal: partial after tool-input + }; + app.ontoolresult = (result) => resolveResult(result); + return { toolInput, toolResult, partials }; +} + +/** Never-resolving signals, used when runAll is called without capture. */ +function pendingSignals(): HostSignals { + return { + toolInput: new Promise(() => {}), + toolResult: new Promise(() => {}), + partials: { count: 0, last: undefined, sawAfterToolInput: false }, + }; +} + +/** An optional button rendered alongside an interaction prompt that fires the action under test. */ +export interface InteractionTrigger { + label: string; + run: () => void | Promise; +} + +/** + * A request for human input, surfaced to the runner UI: + * - "ack" , the operator performs an action in the host, then clicks Done. + * The test captures and asserts the resulting state itself. + * - "confirm", the operator judges an outcome the view can't observe (e.g. a + * link opening in a new tab) and answers worked / didn't. + */ +export interface InteractionRequest { + kind: "ack" | "confirm" | "await"; + prompt: string; + trigger?: InteractionTrigger; + /** + * For kind "await": a promise the runner watches. When it resolves, the + * prompt auto-dismisses and the test passes, no confirmation click. 
The + * operator performs the action; the host's notification settles the promise. + */ + signal?: Promise; +} + +/** Resolves to the operator's verdict (always true for "ack", the answer for "confirm"). */ +export type RequestInteraction = (req: InteractionRequest) => Promise; + +export class TestContext { + constructor( + public readonly app: App, + public readonly signals: HostSignals, + requestInteraction?: RequestInteraction, + ) { + if (requestInteraction) this.requestInteraction = requestInteraction; + } + + /** Bridge to the runner UI for human-in-the-loop tests; injected by runAll. */ + requestInteraction: RequestInteraction = () => { + throw new AssertionError( + "this test needs a human, but no interaction channel was provided", + ); + }; + + /** + * Pause the run and ask the operator to do something in the host (e.g. change + * the theme), then continue once they click Done. The test then captures and + * asserts the resulting state itself. Pass a `trigger` to also render an + * action button (e.g. one that sends a notification). + */ + async requireUserAction( + prompt: string, + trigger?: InteractionTrigger, + ): Promise { + await this.requestInteraction({ kind: "ack", prompt, trigger }); + } + + /** + * Ask the operator to confirm an outcome the view can't observe (e.g. a link + * opened in a new tab). Returns their verdict. Pass a `trigger` to render a + * button that fires the action under test (e.g. sends ui/open-link). + */ + async confirmWithUser( + prompt: string, + trigger?: InteractionTrigger, + ): Promise { + return this.requestInteraction({ kind: "confirm", prompt, trigger }); + } + + /** + * Show a prompt and pass automatically when `signal` resolves, e.g. the host + * sends a notification after the operator acts, so no Done click is needed. + * The operator can still Skip, which fails the test. 
+ */ + async awaitUserAction( + prompt: string, + signal: Promise<unknown>, + trigger?: InteractionTrigger, + ): Promise<void> { + const detected = await this.requestInteraction({ + kind: "await", + prompt, + trigger, + signal, + }); + if (!detected) + throw new AssertionError( + "skipped before the expected host notification arrived", + ); + } + + /** + * Cleanups run after the test completes, pass OR fail, in reverse order. + * Register one to restore any host state the test mutated (e.g. display mode) + * so it can't leak into the next test. + */ + readonly cleanups: Array<() => void | Promise<void>> = []; + addCleanup(fn: () => void | Promise<void>): void { + this.cleanups.push(fn); + } + + /** + * Report a capability signal for an optional (`MAY`) behaviour instead of + * asserting. The result becomes `INFO` (not pass/fail) with this message. + */ + infoMessage?: string; + info(message: string): void { + this.infoMessage = message; + } + + assert(cond: unknown, msg: string): asserts cond { + if (!cond) throw new AssertionError(msg); + } + + assertEquals<T>(actual: T, expected: T, msg = "assertEquals"): void { + if (actual !== expected) { + throw new AssertionError( + `${msg}: expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`, + ); + } + } + + /** + * Returns true if a network request to `url` is blocked, either by the host's + * Content-Security-Policy (`connect-src`) or by the network layer. Used to + * prove the host enforces the spec's restrictive CSP default. + */ + async expectFetchBlocked(url: string): Promise<boolean> { + try { + await fetch(url, { mode: "no-cors", cache: "no-store" }); + return false; // request was allowed to leave → NOT blocked + } catch { + return true; // threw → blocked (CSP violation or network error) + } + } + + /** + * Returns true if a request to `url` is allowed out (CSP permits it). The + * positive control for declared `connectDomains`. ⚠️ a network error also + * reads as "not allowed", so point this at a reliably reachable origin. 
+ */ + async expectFetchAllowed(url: string): Promise<boolean> { + return !(await this.expectFetchBlocked(url)); + } + + /** + * Reads the CSP actually applied to this document. JS can't read its own + * response headers, so we use a `<meta>` CSP tag if present, otherwise trigger + * a `connect-src` violation and read the `securitypolicyviolation` event's + * `originalPolicy` (the full policy string). Returns null if neither yields it + * (e.g. header-only CSP that never fires a violation). + */ + async readAppliedCsp( + violationUrl = "https://blocked.invalid/", + ): Promise<string | null> { + const meta = document.querySelector( + 'meta[http-equiv="Content-Security-Policy" i]', + ) as HTMLMetaElement | null; + if (meta?.content) return meta.content; + return new Promise((resolve) => { + const onViolation = (e: SecurityPolicyViolationEvent) => { + clearTimeout(timer); + document.removeEventListener("securitypolicyviolation", onViolation); + resolve(e.originalPolicy || null); + }; + const timer = setTimeout(() => { + document.removeEventListener("securitypolicyviolation", onViolation); + resolve(null); + }, 1500); + document.addEventListener("securitypolicyviolation", onViolation); + void fetch(violationUrl, { mode: "no-cors", cache: "no-store" }).catch( + () => {}, + ); + }); + } + + /** + * Returns true if the host rejects a `tools/call` for `name` (e.g. the + * visibility guard rejecting an app's call to a model-only tool). 
+ */ + async expectToolRejected( + name: string, + args: Record = {}, + ): Promise { + try { + await this.app.callServerTool({ name, arguments: args }); + return false; // call succeeded → NOT rejected + } catch { + return true; // threw → rejected + } + } +} + +interface TestDef { + id: string; + name: string; + tag: Tag; + vantage: Vantage; + manual: boolean; + clause?: Clause; + caveat?: string; + timeoutMs: number; + fn: (t: TestContext) => void | Promise; +} + +const registry: TestDef[] = []; + +export interface TestOptions { + tag?: Tag; + vantage?: Vantage; + /** Requires a human action to trigger or verify (e.g. change theme, cancel a tool). */ + manual?: boolean; + clause?: Clause; + /** A warning about what this result can't distinguish or where it may mislead. */ + caveat?: string; + timeoutMs?: number; +} + +export function mcp_test( + id: string, + name: string, + fn: (t: TestContext) => void | Promise, + opts: TestOptions = {}, +): void { + registry.push({ + id, + name, + fn, + tag: opts.tag ?? "core", + vantage: opts.vantage ?? "in-view", + manual: opts.manual ?? false, + clause: opts.clause, + caveat: opts.caveat, + timeoutMs: opts.timeoutMs ?? 5000, + }); +} + +export function getRegistry(): ReadonlyArray> { + return registry; +} + +function withTimeout(p: Promise, ms: number): Promise { + // ms = 0 (or non-finite) disables the timeout, for human-in-the-loop tests + // that legitimately block on operator input for an unbounded time. + if (!ms || !Number.isFinite(ms)) return p; + return new Promise((resolve, reject) => { + const timer = setTimeout( + () => reject(new AssertionError(`timed out after ${ms}ms`)), + ms, + ); + p.then( + (v) => { + clearTimeout(timer); + resolve(v); + }, + (e) => { + clearTimeout(timer); + reject(e); + }, + ); + }); +} + +export interface RunHooks { + /** Fires just before a test runs (its row should show a running state). 
*/ + onStart?: (id: string) => void; + /** Fires as each test settles, for incremental/live UI updates. */ + onResult?: (result: SubtestResult) => void; + /** + * Fires once, just before the first manual (human-in-the-loop) test, the + * automatic batch is done. The UI uses this to switch to fullscreen for the + * interactive prompts (the auto tests run inline so resize tests work). + */ + onEnterManual?: () => void | Promise; + /** Channel the runner UI uses to collect human input for manual tests. */ + requestInteraction?: RequestInteraction; +} + +export async function runAll( + app: App, + signals: HostSignals = pendingSignals(), + hooks: RunHooks = {}, +): Promise { + // Run the automatic tests first so the grid fills quickly, then the + // human-in-the-loop (manual) ones, which pause the run for operator input. + const ordered = [...registry].sort( + (a, b) => Number(a.manual) - Number(b.manual), + ); + const results: SubtestResult[] = []; + let enteredManual = false; + for (const def of ordered) { + if (def.manual && !enteredManual) { + enteredManual = true; + await hooks.onEnterManual?.(); + } + hooks.onStart?.(def.id); + const t = new TestContext(app, signals, hooks.requestInteraction); // fresh context per test → isolated cleanups + const start = performance.now(); + let status: Status = "PASS"; + let message: string | undefined; + try { + await withTimeout(Promise.resolve(def.fn(t)), def.timeoutMs); + } catch (e) { + const err = e as Error; + status = + err instanceof AssertionError && /timed out/.test(err.message) + ? "TIMEOUT" + : "FAIL"; + message = err.message; + } finally { + // Restore any host state the test mutated (newest cleanup first), so it + // can't leak into the next test. Cleanup errors never fail the test. + for (const fn of [...t.cleanups].reverse()) { + try { + await fn(); + } catch (e) { + console.error("[conformance] cleanup error:", e); + } + } + } + // An optional-behaviour report (t.info) becomes INFO unless the test failed. 
+ if (status === "PASS" && t.infoMessage !== undefined) { + status = "INFO"; + message = t.infoMessage; + } + const result: SubtestResult = { + id: def.id, + name: def.name, + status, + tag: def.tag, + vantage: def.vantage, + manual: def.manual, + clause: def.clause, + caveat: def.caveat, + message, + durationMs: Math.round(performance.now() - start), + }; + results.push(result); + hooks.onResult?.(result); + } + return results; +} diff --git a/examples/conformance-server/src/tests.ts b/examples/conformance-server/src/tests.ts new file mode 100644 index 000000000..12301d902 --- /dev/null +++ b/examples/conformance-server/src/tests.ts @@ -0,0 +1,715 @@ +/** + * The conformance test catalogue (in-view slice). + * + * This platform certifies HOSTS, so every test is a host test, IDs are + * namespaced by spec capability area (lifecycle/, security/, tools/, …), + * WPT-path style. Each test carries a `vantage` (where it can be observed) and, + * where relevant, a `caveat` warning about what the result can't distinguish. + * + * Everything here is `vantage: "in-view"`, measurable from inside the iframe. + * Requirements needing the server's or the agent's vantage live in the README + * catalogue and will be covered by server-side judging / an agent harness. + */ +import { mcp_test, type TestContext } from "./testharness"; + +// ── lifecycle ────────────────────────────────────────────────────────────── +// After ui/initialize, the host MUST expose its capabilities. +mcp_test( + "lifecycle/initialize-capabilities", + "ui/initialize returns hostCapabilities", + (t: TestContext) => { + const caps = t.app.getHostCapabilities(); + t.assert( + caps != null, + "host must return hostCapabilities after the ui/initialize handshake", + ); + }, + { clause: "MUST", vantage: "in-view" }, +); + +// ── context ──────────────────────────────────────────────────────────────── +// The host SHOULD include hostContext in the ui/initialize result. 
+mcp_test( + "context/initialize-hostcontext", + "ui/initialize result carries hostContext", + (t: TestContext) => { + const ctx = t.app.getHostContext(); + t.assert( + ctx != null && typeof ctx === "object", + "host should provide hostContext", + ); + }, + { + clause: "SHOULD", + vantage: "in-view", + caveat: + "SHOULD, not MUST, a host may legitimately omit hostContext, which would FAIL here. We assert presence; a richer version would only validate shape when present.", + }, +); + +// ── tools ────────────────────────────────────────────────────────────────── +// The host MUST proxy tools/call from the view to the server and return the +// result (App → Host → Server → Host → App). +mcp_test( + "tools/proxy-call", + "host proxies tools/call to the server", + async (t: TestContext) => { + const res = await t.app.callServerTool({ + name: "conformance_probe", + arguments: { ping: "hello-from-view" }, + }); + const text = (res.content ?? []) + .map((c) => (c.type === "text" ? c.text : "")) + .join(""); + t.assert( + text.includes("hello-from-view"), + `expected the proxied tool result to echo the payload, got: ${JSON.stringify(text)}`, + ); + }, + { + clause: "MUST", + vantage: "in-view", + caveat: + "The server also sees this call directly, so it can be corroborated server-side.", + }, +); + +// ── visibility ───────────────────────────────────────────────────────────── +// The host MUST reject tools/call from an app for a tool that doesn't include +// "app" in its visibility. `model_only_probe` is a model-only fixture tool. +mcp_test( + "visibility/app-tool-call-guard", + "host rejects app call to a model-only tool", + async (t: TestContext) => { + const rejected = await t.expectToolRejected("model_only_probe", { + ping: "x", + }); + t.assert( + rejected, + 'host must reject an app\'s tools/call for a tool lacking "app" visibility', + ); + }, + { + clause: "MUST", + vantage: "in-view", + caveat: + "Covers the app→tool direction only. 
The complementary `visibility/app-tool-hidden` (tool absent from the *agent's* list) needs the agent vantage and isn't measurable here.", + }, +); + +// ── display ──────────────────────────────────────────────────────────────── +// The host MUST return the resulting mode in the ui/request-display-mode response. +mcp_test( + "display/return-resulting-mode", + "ui/request-display-mode returns the resulting mode", + async (t: TestContext) => { + const original = t.app.getHostContext()?.displayMode ?? "inline"; + t.addCleanup(async () => { + await t.app.requestDisplayMode({ mode: original }); + }); + const res = (await t.app.requestDisplayMode({ mode: "inline" })) as { + mode?: unknown; + }; + t.assert( + typeof res?.mode === "string" && + ["inline", "fullscreen", "pip"].includes(res.mode), + `host must return a valid resulting display mode, got: ${JSON.stringify(res?.mode)}`, + ); + }, + { + clause: "MUST", + vantage: "in-view", + caveat: + "Requests the current mode ('inline') to avoid a disruptive change, but a host may still re-render as a side effect.", + }, +); + +// ── batch A: more in-view tests ────────────────────────────────────────────── + +// security, the sandbox proxy MUST be a different origin from the host, so +// reading the parent's location throws a cross-origin SecurityError. +mcp_test( + "security/sandbox-distinct-origin", + "host and sandbox have different origins", + (t: TestContext) => { + let threw = false; + try { + void window.parent.location.href; + } catch { + threw = true; + } + t.assert( + threw, + "reading window.parent.location must throw (cross-origin sandbox proxy)", + ); + }, + { + clause: "MUST", + vantage: "in-view", + caveat: + "Inferred from a cross-origin SecurityError. If the page were opened top-level (no host), parent === self and this would FAIL, which is correct (it's not in a host).", + }, +); + +// security, the sandbox MUST grant allow-scripts + allow-same-origin. 
+mcp_test( + "security/sandbox-permissions", + "sandbox grants allow-scripts + allow-same-origin", + (t: TestContext) => { + // This code executing ⇒ allow-scripts. A non-opaque origin ⇒ allow-same-origin. + t.assert( + window.origin !== "null", + "sandbox must grant allow-same-origin (window.origin must not be the opaque 'null')", + ); + }, + { + clause: "MUST", + vantage: "in-view", + caveat: + "Inferred: scripts executing ⇒ allow-scripts; non-opaque window.origin ⇒ allow-same-origin.", + }, +); + +// lifecycle, host MUST send ui/notifications/tool-input after the View inits. +mcp_test( + "lifecycle/tool-input", + "host sends tool-input after initialize", + async (t: TestContext) => { + const params = await t.signals.toolInput; + t.assert( + params !== undefined, + "host must send a ui/notifications/tool-input with the tool arguments", + ); + }, + { + clause: "MUST", + vantage: "in-view", + timeoutMs: 4000, + caveat: + "Captured via the ontoolinput callback (registered before connect). TIMEOUT means the host never sent it for the launching tool.", + }, +); + +// lifecycle, host MUST send ui/notifications/tool-result on completion. +mcp_test( + "lifecycle/tool-result", + "host sends tool-result on completion", + async (t: TestContext) => { + const result = await t.signals.toolResult; + t.assert( + result !== undefined, + "host must send a ui/notifications/tool-result when the tool completes", + ); + }, + { + clause: "MUST", + vantage: "in-view", + timeoutMs: 4000, + caveat: + "Captured via ontoolresult. Some hosts may not replay tool-result for the tool that launched the view, TIMEOUT flags that.", + }, +); + +// lifecycle, host MUST stop sending tool-input-partial once tool-input is sent. +mcp_test( + "lifecycle/tool-input-partial-stop", + "host stops tool-input-partial once tool-input is sent", + async (t: TestContext) => { + // Wait (bounded) for tool-input so "after" is well-defined, then leave a + // brief window to catch any illegal late partial. 
+ await Promise.race([ + t.signals.toolInput, + new Promise((r) => setTimeout(r, 1500)), + ]); + await new Promise((r) => setTimeout(r, 300)); + t.assert( + !t.signals.partials.sawAfterToolInput, + `host must not send ui/notifications/tool-input-partial after tool-input (observed ${t.signals.partials.count} partial(s))`, + ); + }, + { + clause: "MUST", + vantage: "in-view", + timeoutMs: 4000, + caveat: + "Only catches a violation if the host actually streams partials; our launcher tool has no streamable args, so this usually passes vacuously (0 partials observed).", + }, +); + +// lifecycle, streaming tool-input-partial is OPTIONAL (MAY), so this reports a +// capability signal (INFO) rather than pass/fail: does the host stream or not? +mcp_test( + "lifecycle/tool-input-partial", + "host streams tool-input-partial (optional)", + async (t: TestContext) => { + await Promise.race([ + t.signals.toolInput, + new Promise((r) => setTimeout(r, 1500)), + ]); + const n = t.signals.partials.count; + t.info( + n > 0 + ? `streams tool-input-partial, ${n} observed` + : "does not stream tool-input-partial", + ); + }, + { + clause: "MAY", + vantage: "in-view", + timeoutMs: 4000, + caveat: + "MAY, reported as a capability signal (INFO), not pass/fail. Partials only appear when the agent streams tool arguments, which our launcher doesn't induce.", + }, +); + +// display, host MUST NOT switch to a mode absent from availableDisplayModes. +// The runner declares only inline/fullscreen, so 'pip' is undeclared. +mcp_test( + "display/no-undeclared-mode", + "host does not switch to an undeclared display mode", + async (t: TestContext) => { + const original = t.app.getHostContext()?.displayMode ?? 
"inline"; + t.addCleanup(async () => { + await t.app.requestDisplayMode({ mode: original }); + }); + const res = (await t.app.requestDisplayMode({ mode: "pip" })) as { + mode?: unknown; + }; + t.assert( + res?.mode !== "pip", + "host must not switch the view to a mode it didn't declare (pip)", + ); + }, + { + clause: "MUST NOT", + vantage: "in-view", + caveat: + "We declare only inline/fullscreen in appCapabilities, then request the undeclared 'pip'.", + }, +); + +// display, for an unavailable mode request, host SHOULD return the current mode. +mcp_test( + "display/unavailable-returns-current", + "unavailable mode request returns the current mode", + async (t: TestContext) => { + const current = t.app.getHostContext()?.displayMode ?? "inline"; + t.addCleanup(async () => { + await t.app.requestDisplayMode({ mode: current }); + }); + const res = (await t.app.requestDisplayMode({ mode: "pip" })) as { + mode?: unknown; + }; + t.assertEquals( + res?.mode, + current, + "host should return the current display mode for an unavailable request", + ); + }, + { + clause: "SHOULD", + vantage: "in-view", + caveat: + "SHOULD, assumes the current mode is stable between reading hostContext and the request.", + }, +); + +// ── security · CSP ─────────────────────────────────────────────────────────── +// The runner resource declares `_meta.ui.csp.connectDomains: ["…/modelcontextprotocol.io"]`, +// so these two form a positive/negative pair: the allowed origin proves +// connectivity works, so a block of the undeclared origin can only be the CSP +// (not a network failure). (The "omitted CSP → restrictive default" case, +// security/csp-default-deny, needs a no-CSP resource and is deferred.) +const CSP_ALLOWED = "https://modelcontextprotocol.io/"; +const CSP_UNDECLARED = "https://example.com/"; + +// security, a declared connectDomains origin MUST be permitted (positive control). 
+mcp_test( + "security/csp-allow-declared", + "declared connectDomains origin is allowed", + async (t: TestContext) => { + const allowed = await t.expectFetchAllowed(CSP_ALLOWED); + t.assert( + allowed, + `a fetch to the declared origin ${CSP_ALLOWED} must be allowed by the host's CSP`, + ); + }, + { + clause: "MUST", + vantage: "in-view", + caveat: `The runner declares connectDomains: ["${CSP_ALLOWED}"]. ⚠️ a network failure also reads as "not allowed", so the origin must be reachable.`, + }, +); + +// security, even with a CSP declared, an UNDECLARED origin MUST stay blocked. +mcp_test( + "security/csp-no-loosening", + "undeclared origin stays blocked when a CSP is declared", + async (t: TestContext) => { + const blocked = await t.expectFetchBlocked(CSP_UNDECLARED); + t.assert( + blocked, + `the host must not allow the undeclared origin ${CSP_UNDECLARED} (no loosening beyond declared domains)`, + ); + }, + { + clause: "MUST NOT", + vantage: "in-view", + caveat: `Backed by csp-allow-declared as the positive control: the declared origin works, so blocking this one is genuinely the CSP, not a blanket fetch failure.`, + }, +); + +// security, the host MUST build the CSP from the declared domains. Reads the +// actual applied policy (not just behaviour) via meta tag / securitypolicyviolation. +mcp_test( + "security/csp-construct-from-domains", + "host constructs the CSP from the declared domains", + async (t: TestContext) => { + const csp = await t.readAppliedCsp(); + t.assert( + csp !== null, + "could not read the applied CSP (no <meta> tag and no securitypolicyviolation fired)", + ); + t.assert( + /connect-src[^;]*modelcontextprotocol\.io/i.test(csp!), + `the constructed CSP's connect-src must include the declared domain; got: ${csp}`, + ); + }, + { + clause: "MUST", + vantage: "in-view", + caveat: + "Reads the applied CSP via a <meta> tag or the securitypolicyviolation event's originalPolicy.
⚠️ if the host delivers CSP only by HTTP header and no violation fires (or originalPolicy is redacted), it can't be read.", + }, +); + +// ── dimensions ─────────────────────────────────────────────────────────────── +// In flexible mode the host MUST resize the iframe when the view reports a new +// size. The view can't read its outer iframe, but the host's resize changes the +// view's own window.innerHeight (and fires a resize event). We grow the content +// (autoResize reports it) and watch our viewport grow. +mcp_test( + "dimensions/listen-size-changed", + "host resizes the iframe on size-changed (flexible mode)", + async (t: TestContext) => { + const dims = t.app.getHostContext()?.containerDimensions as + | { height?: number } + | undefined; + if (dims && typeof dims.height === "number") { + t.info( + `host pins a fixed height (${dims.height}px), flexible-mode resize not applicable`, + ); + return; + } + const before = window.innerHeight; + const spacer = document.createElement("div"); + spacer.style.height = "320px"; + spacer.setAttribute("aria-hidden", "true"); + document.body.appendChild(spacer); + t.addCleanup(() => spacer.remove()); + // Wait for autoResize → host resize → our viewport to grow. + await new Promise<void>((resolve) => { + const finish = () => { + window.removeEventListener("resize", onResize); + clearTimeout(timer); + resolve(); + }; + const onResize = () => { + if (window.innerHeight > before) finish(); + }; + const timer = setTimeout(finish, 2500); + window.addEventListener("resize", onResize); + }); + t.assert( + window.innerHeight > before, + `host must grow the iframe when the view reports a larger size (was ${before}px, now ${window.innerHeight}px)`, + ); + }, + { + clause: "MUST", + vantage: "in-view", + timeoutMs: 5000, + caveat: + "Flexible mode only (reports INFO if the host pins a fixed height).
Relies on autoResize reporting the taller content and the view's window.innerHeight reflecting the resize; the host may clamp to maxHeight.", + }, +); + +// ── capabilities ───────────────────────────────────────────────────────────── +// The host MAY forward non-ui/ MCP methods from the view to the server. We test +// this with resources/list (distinct from tools/proxy-call's tools/call): the +// view calls listServerResources → the host must forward it → we get the +// server's own ui:// resource back. +mcp_test( + "capabilities/server-passthrough", + "host forwards resources/list from the view to the server", + async (t: TestContext) => { + if (!t.app.getHostCapabilities()?.serverResources) { + t.info( + "host does not advertise serverResources, resource passthrough not supported", + ); + return; + } + const res = await t.app.listServerResources(); + const uris = (res.resources ?? []).map((r) => r.uri); + t.assert( + uris.includes("ui://conformance/runner"), + `host must forward resources/list to the server (expected ui://conformance/runner, got: ${JSON.stringify(uris)})`, + ); + }, + { + clause: "SHOULD", + vantage: "in-view", + caveat: + "Exercises resources/list passthrough (distinct from tools/proxy-call's tools/call). Gated on the host advertising serverResources; reports INFO otherwise.", + }, +); + +// ── interactive · manual (human action → capture) ──────────────────────────── +// The host emits ui/notifications/host-context-changed when context fields change. +// The operator changes the theme; we then capture hostContext in-view (updated +// via onhostcontextchanged) and assert it actually changed. +mcp_test( + "context/context-changed", + "host notifies the view when the user changes the theme", + async (t: TestContext) => { + const before = t.app.getHostContext()?.theme; + // Resolve as soon as a host-context-changed carrying a *different* theme + // arrives, the test then passes automatically (no confirmation click). 
+ const themeChanged = new Promise<void>((resolve) => { + const handler = (params: { theme?: unknown }) => { + const next = params?.theme ?? t.app.getHostContext()?.theme; + if (next !== undefined && next !== before) resolve(); + }; + t.app.addEventListener("hostcontextchanged", handler); + t.addCleanup(() => + t.app.removeEventListener("hostcontextchanged", handler), + ); + }); + await t.awaitUserAction( + `Toggle your host's theme (light ⇄ dark), I'll detect it automatically. Current: "${before ?? "unknown"}".`, + themeChanged, + ); + }, + { + clause: "MAY", + vantage: "in-view", + manual: true, + timeoutMs: 0, + caveat: + "Human-in-the-loop but auto-passing: the operator toggles the theme and the runner resolves on the host-context-changed notification (no confirmation click). Skip if the host doesn't emit it.", + }, +); + +// ── interactive · manual (human declaration) ───────────────────────────────── +// The host opens ui/open-link URLs in the user's browser / a new tab. The +// sandboxed view can't observe a new tab (host vantage), so the operator +// triggers the action and declares the outcome. +mcp_test( + "links/open-external", + "ui/open-link opens the URL (human-verified)", + async (t: TestContext) => { + const opened = await t.confirmWithUser( + "Click “Open link”. Your host should open https://modelcontextprotocol.io in a new tab. Did it open?", + { + label: "🔗 Open link", + run: () => t.app.openLink({ url: "https://modelcontextprotocol.io/" }), + }, + ); + t.assert( + opened, + "operator reported the host did not open the link (ui/open-link not honoured)", + ); + }, + { + clause: "SHOULD", + vantage: "host", + manual: true, + timeoutMs: 0, + caveat: + "Human-verified: the sandboxed view can't see the host open a tab, so the operator confirms the outcome after triggering ui/open-link.", + }, +); + +// messages, host adds a ui/message to the conversation. The view can't read the +// host's conversation, so the operator triggers it and confirms it appeared.
+mcp_test( + "messages/add-to-conversation", + "ui/message is added to the conversation (human-verified)", + async (t: TestContext) => { + if (!t.app.getHostCapabilities()?.message) { + t.info("host does not advertise ui/message support"); + return; + } + const added = await t.confirmWithUser( + "Click “Send message”, a message from this app should appear in your conversation. Did it?", + { + label: "💬 Send message", + run: () => + t.app.sendMessage({ + role: "user", + content: [ + { + type: "text", + text: "Conformance check: this message was sent by the MCP App via ui/message.", + }, + ], + }), + }, + ); + t.assert( + added, + "operator reported the ui/message was not added to the conversation", + ); + }, + { + clause: "SHOULD", + vantage: "host", + manual: true, + timeoutMs: 0, + caveat: + "Human-verified: the view can't read the host's conversation, so the operator confirms the message appeared (role preserved).", + }, +); + +// messages, the host MAY ask for consent before adding a ui/message. Optional, +// so this reports a capability signal (INFO) from the operator's observation. +mcp_test( + "messages/consent", + "host may request consent before adding a ui/message", + async (t: TestContext) => { + if (!t.app.getHostCapabilities()?.message) { + t.info("host does not advertise ui/message support"); + return; + } + const consented = await t.confirmWithUser( + "Click “Send message”. Your host MAY show a consent prompt first, did one appear?", + { + label: "💬 Send message", + run: () => + t.app.sendMessage({ + role: "user", + content: [ + { + type: "text", + text: "Conformance check: consent-prompt probe.", + }, + ], + }), + }, + ); + t.info( + consented + ? "host showed a consent prompt" + : "host added the message without a consent prompt", + ); + }, + { + clause: "MAY", + vantage: "host", + manual: true, + timeoutMs: 0, + caveat: + "MAY, reported as an INFO signal: consent is optional, so neither answer fails. 
The operator reports whether a prompt appeared.", + }, +); + +// visibility, app-only tools (visibility lacking "model") must be hidden from +// the agent's tool list. We make the app ask the agent to enumerate its tools +// (via ui/message), then the operator confirms the app-only tool is absent. +mcp_test( + "visibility/app-tool-hidden", + "host hides app-only tools from the agent (human-verified)", + async (t: TestContext) => { + if (!t.app.getHostCapabilities()?.message) { + t.info("host does not advertise ui/message support"); + return; + } + const hidden = await t.confirmWithUser( + "Click “Ask the agent”, then read its reply. The app-only tool `conformance_probe` MUST NOT appear in the agent's tool list, is it correctly absent?", + { + label: "🤖 Ask the agent", + run: () => + t.app.sendMessage({ + role: "user", + content: [ + { + type: "text", + text: "From the MCP Apps Conformance server specifically, list every tool you can call, by name (ignore tools from other connected servers).", + }, + ], + }), + }, + ); + t.assert( + hidden, + "operator reported the app-only tool `conformance_probe` was visible to the agent (must be hidden)", + ); + }, + { + clause: "MUST NOT", + vantage: "host", + manual: true, + timeoutMs: 0, + caveat: + 'Human-verified via the agent\'s own tool enumeration: `conformance_probe` is app-only (visibility ["app"]) so it must not be in the model-facing tools/list. Relies on the agent answering truthfully.', + }, +); + +// NOTE: security/external-domain-warning is intentionally NOT implemented here. +// The spec SHOULD ("Host warns users when UI requires external domain access", +// Security Considerations §CSP) is about the View making network/connect-src +// requests to declared external domains, NOT about ui/open-link. The previous +// open-link-based test validated no real requirement (ui/open-link has no +// warning/confirmation clause) and duplicated links/open-external. 
The row stays +// ⬜ in the catalogue until a CSP-connectDomains-based scenario is built. + +// model-context, context provided via ui/update-model-context must reach the +// model on a future turn. The app seeds a secret code, then asks the agent for +// it (via ui/message); the operator confirms the agent recalled it. +mcp_test( + "model-context/provide-future-turns", + "ui/update-model-context reaches the model next turn (human-verified)", + async (t: TestContext) => { + if (!t.app.getHostCapabilities()?.updateModelContext) { + t.info("host does not advertise ui/update-model-context support"); + return; + } + const recalled = await t.confirmWithUser( + "Click “Seed + ask”. The app sets a secret code via update-model-context, then asks the agent for it. Did the agent answer with “MCP-APP-7421”?", + { + label: "🧠 Seed + ask", + run: async () => { + await t.app.updateModelContext({ + content: [ + { + type: "text", + text: "The secret conformance code is MCP-APP-7421. Remember it for later.", + }, + ], + }); + await t.app.sendMessage({ + role: "user", + content: [ + { + type: "text", + text: "What is the secret conformance code I gave you?", + }, + ], + }); + }, + }, + ); + t.assert( + recalled, + "operator reported the agent did not receive the model context on the next turn", + ); + }, + { + clause: "SHOULD", + vantage: "host", + manual: true, + timeoutMs: 0, + caveat: + "Multi-turn, human-verified: seeds ui/update-model-context then asks the agent to recall it; confirms the host fed the context to the model on the following turn.", + }, +); diff --git a/examples/conformance-server/src/vite-env.d.ts b/examples/conformance-server/src/vite-env.d.ts new file mode 100644 index 000000000..11f02fe2a --- /dev/null +++ b/examples/conformance-server/src/vite-env.d.ts @@ -0,0 +1 @@ +/// <reference types="vite/client" /> diff --git a/examples/conformance-server/tsconfig.json b/examples/conformance-server/tsconfig.json new file mode 100644 index 000000000..fc3c2101f --- /dev/null +++
b/examples/conformance-server/tsconfig.json @@ -0,0 +1,20 @@ +{ + "compilerOptions": { + "target": "ESNext", + "lib": ["ESNext", "DOM", "DOM.Iterable"], + "module": "ESNext", + "moduleResolution": "bundler", + "allowImportingTsExtensions": true, + "resolveJsonModule": true, + "isolatedModules": true, + "verbatimModuleSyntax": true, + "noEmit": true, + "jsx": "react-jsx", + "strict": true, + "skipLibCheck": true, + "noUnusedLocals": true, + "noUnusedParameters": true, + "noFallthroughCasesInSwitch": true + }, + "include": ["src", "server.ts"] +} diff --git a/examples/conformance-server/tsconfig.server.json b/examples/conformance-server/tsconfig.server.json new file mode 100644 index 000000000..05ddd8ec4 --- /dev/null +++ b/examples/conformance-server/tsconfig.server.json @@ -0,0 +1,17 @@ +{ + "compilerOptions": { + "target": "ES2022", + "lib": ["ES2022"], + "module": "NodeNext", + "moduleResolution": "NodeNext", + "declaration": true, + "emitDeclarationOnly": true, + "outDir": "./dist", + "rootDir": ".", + "strict": true, + "skipLibCheck": true, + "esModuleInterop": true, + "resolveJsonModule": true + }, + "include": ["server.ts"] +} diff --git a/examples/conformance-server/vite.config.ts b/examples/conformance-server/vite.config.ts new file mode 100644 index 000000000..21f2a94ca --- /dev/null +++ b/examples/conformance-server/vite.config.ts @@ -0,0 +1,25 @@ +import react from "@vitejs/plugin-react"; +import { defineConfig } from "vite"; +import { viteSingleFile } from "vite-plugin-singlefile"; + +const INPUT = process.env.INPUT; +if (!INPUT) { + throw new Error("INPUT environment variable is not set"); +} + +const isDevelopment = process.env.NODE_ENV === "development"; + +export default defineConfig({ + plugins: [react(), viteSingleFile()], + build: { + sourcemap: isDevelopment ? 
"inline" : undefined, + cssMinify: !isDevelopment, + minify: !isDevelopment, + + rollupOptions: { + input: INPUT, + }, + outDir: "dist", + emptyOutDir: false, + }, +}); diff --git a/package-lock.json b/package-lock.json index 29e905875..ac962efea 100644 --- a/package-lock.json +++ b/package-lock.json @@ -89,23 +89,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/basic-host/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/basic-host/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/basic-server-preact": { "name": "@modelcontextprotocol/server-basic-preact", "version": "1.7.4", @@ -133,23 +116,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/basic-server-preact/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/basic-server-preact/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/basic-server-react": { "name": "@modelcontextprotocol/server-basic-react", "version": "1.7.4", @@ 
-180,23 +146,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/basic-server-react/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/basic-server-react/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/basic-server-solid": { "name": "@modelcontextprotocol/server-basic-solid", "version": "1.7.4", @@ -224,23 +173,6 @@ "vite-plugin-solid": "^2.11.12" } }, - "examples/basic-server-solid/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/basic-server-solid/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/basic-server-svelte": { "name": "@modelcontextprotocol/server-basic-svelte", "version": "1.7.4", @@ -268,23 +200,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/basic-server-svelte/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": 
"sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/basic-server-svelte/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/basic-server-vanillajs": { "name": "@modelcontextprotocol/server-basic-vanillajs", "version": "1.7.4", @@ -310,23 +225,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/basic-server-vanillajs/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/basic-server-vanillajs/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/basic-server-vue": { "name": "@modelcontextprotocol/server-basic-vue", "version": "1.7.4", @@ -354,23 +252,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/basic-server-vue/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/basic-server-vue/node_modules/undici-types": { - "version": "6.20.0", - "resolved": 
"https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/budget-allocator-server": { "name": "@modelcontextprotocol/server-budget-allocator", "version": "1.7.4", @@ -397,23 +278,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/budget-allocator-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/budget-allocator-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/cohort-heatmap-server": { "name": "@modelcontextprotocol/server-cohort-heatmap", "version": "1.7.4", @@ -444,7 +308,37 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/cohort-heatmap-server/node_modules/@types/node": { + "examples/conformance-server": { + "name": "@modelcontextprotocol/server-conformance", + "version": "1.0.0", + "license": "MIT", + "dependencies": { + "@modelcontextprotocol/ext-apps": "^1.7.4", + "@modelcontextprotocol/sdk": "^1.24.0", + "cors": "^2.8.5", + "express": "^5.1.0", + "react": "^19.2.0", + "react-dom": "^19.2.0", + "zod": "^4.1.13" + }, + "bin": { + "mcp-server-conformance": "dist/index.js" + }, + "devDependencies": { + "@types/cors": "^2.8.19", + "@types/express": "^5.0.0", + "@types/node": "22.10.0", + "@types/react": "^19.2.2", + "@types/react-dom": "^19.2.2", + "@vitejs/plugin-react": "^4.3.4", + "concurrently": "^9.2.1", + "cross-env": 
"^10.1.0", + "typescript": "^5.9.3", + "vite": "^6.0.0", + "vite-plugin-singlefile": "^2.3.0" + } + }, + "examples/conformance-server/node_modules/@types/node": { "version": "22.10.0", "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", @@ -454,7 +348,7 @@ "undici-types": "~6.20.0" } }, - "examples/cohort-heatmap-server/node_modules/undici-types": { + "examples/conformance-server/node_modules/undici-types": { "version": "6.20.0", "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", @@ -487,23 +381,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/customer-segmentation-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/customer-segmentation-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/debug-server": { "name": "@modelcontextprotocol/server-debug", "version": "1.7.4", @@ -529,23 +406,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/debug-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": 
{ - "undici-types": "~6.20.0" - } - }, - "examples/debug-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/integration-server": { "version": "1.7.4", "dependencies": { @@ -573,23 +433,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/integration-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/integration-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/lazy-auth-server": { "name": "@modelcontextprotocol/server-lazy-auth", "version": "1.7.4", @@ -615,23 +458,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/lazy-auth-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/lazy-auth-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - 
"license": "MIT" - }, "examples/map-server": { "name": "@modelcontextprotocol/server-map", "version": "1.7.4", @@ -657,23 +483,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/map-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/map-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/pdf-server": { "name": "@modelcontextprotocol/server-pdf", "version": "1.7.4", @@ -701,23 +510,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/pdf-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/pdf-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/qr-server": { "name": "@modelcontextprotocol/server-qr", "version": "1.7.4", @@ -748,16 +540,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/quickstart/node_modules/@types/node": { - "version": "22.19.5", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.19.5.tgz", - "integrity": 
"sha512-HfF8+mYcHPcPypui3w3mvzuIErlNOh2OAG+BCeBZCEwyiD5ls2SiCwEyT47OELtf7M3nHxBdu0FsmzdKxkN52Q==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.21.0" - } - }, "examples/say-server": { "name": "@modelcontextprotocol/server-say", "version": "1.7.4", @@ -798,23 +580,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/scenario-modeler-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/scenario-modeler-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/shadertoy-server": { "name": "@modelcontextprotocol/server-shadertoy", "version": "1.7.4", @@ -840,23 +605,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/shadertoy-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/shadertoy-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/sheet-music-server": { "name": "@modelcontextprotocol/server-sheet-music", "version": "1.7.4", @@ -883,23 
+631,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/sheet-music-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/sheet-music-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/system-monitor-server": { "name": "@modelcontextprotocol/server-system-monitor", "version": "1.7.4", @@ -927,23 +658,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/system-monitor-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/system-monitor-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/threejs-server": { "name": "@modelcontextprotocol/server-threejs", "version": "1.7.4", @@ -976,23 +690,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/threejs-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": 
"sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/threejs-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/transcript-server": { "name": "@modelcontextprotocol/server-transcript", "version": "1.7.4", @@ -1019,23 +716,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/transcript-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/transcript-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/video-resource-server": { "name": "@modelcontextprotocol/server-video-resource", "version": "1.7.4", @@ -1061,23 +741,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/video-resource-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/video-resource-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": 
"https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "examples/wiki-explorer-server": { "name": "@modelcontextprotocol/server-wiki-explorer", "version": "1.7.4", @@ -1105,23 +768,6 @@ "vite-plugin-singlefile": "^2.3.0" } }, - "examples/wiki-explorer-server/node_modules/@types/node": { - "version": "22.10.0", - "resolved": "https://registry.npmjs.org/@types/node/-/node-22.10.0.tgz", - "integrity": "sha512-XC70cRZVElFHfIUB40FgZOBbgJYFKKMa5nb9lxcwYstFG/Mi+/Y0bGS+rs6Dmhmkpq4pnNiLiuZAbc02YCOnmA==", - "dev": true, - "license": "MIT", - "dependencies": { - "undici-types": "~6.20.0" - } - }, - "examples/wiki-explorer-server/node_modules/undici-types": { - "version": "6.20.0", - "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.20.0.tgz", - "integrity": "sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==", - "dev": true, - "license": "MIT" - }, "node_modules/@babel/code-frame": { "version": "7.28.6", "resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.28.6.tgz", @@ -2577,9 +2223,9 @@ "license": "MIT" }, "node_modules/@modelcontextprotocol/ext-apps": { - "version": "1.7.1", - "resolved": "https://registry.npmjs.org/@modelcontextprotocol/ext-apps/-/ext-apps-1.7.1.tgz", - "integrity": "sha512-J3WdG1A4JSSKnSWKyU+895dBVYBV2Utgtf7fUsUK45mlkETm53a/1DR6Pm3hUGKqLLQthZLmpxOg8VPzJi/lyg==", + "version": "1.7.4", + "resolved": "https://registry.npmjs.org/@modelcontextprotocol/ext-apps/-/ext-apps-1.7.4.tgz", + "integrity": "sha512-QQqysE549cf/Y0VabBmAACXhj92EhB3t8yVct2BHbkWiPTFA1S91EqTVjYXXcZEefXU0pmHcdObhsNMcomJIOQ==", "license": "MIT", "workspaces": [ "examples/*" @@ -2685,6 +2331,10 @@ "resolved": "examples/cohort-heatmap-server", "link": true }, + "node_modules/@modelcontextprotocol/server-conformance": { + "resolved": 
"examples/conformance-server", + "link": true + }, "node_modules/@modelcontextprotocol/server-customer-segmentation": { "resolved": "examples/customer-segmentation-server", "link": true diff --git a/tests/e2e/generate-grid-screenshots.spec.ts b/tests/e2e/generate-grid-screenshots.spec.ts index bbe895927..17ae8453d 100644 --- a/tests/e2e/generate-grid-screenshots.spec.ts +++ b/tests/e2e/generate-grid-screenshots.spec.ts @@ -57,6 +57,11 @@ const ALL_SERVERS = [ name: "Customer Segmentation Server", dir: "customer-segmentation-server", }, + { + key: "conformance-server", + name: "MCP Apps Conformance Server", + dir: "conformance-server", + }, { key: "debug-server", name: "Debug MCP App Server", dir: "debug-server" }, { key: "map-server", name: "CesiumJS Map Server", dir: "map-server" }, { key: "pdf-server", name: "PDF Server", dir: "pdf-server" }, diff --git a/tests/e2e/servers.spec.ts b/tests/e2e/servers.spec.ts index a3c30d8a8..a5b3fe0d9 100644 --- a/tests/e2e/servers.spec.ts +++ b/tests/e2e/servers.spec.ts @@ -116,6 +116,11 @@ const ALL_SERVERS = [ name: "Customer Segmentation Server", dir: "customer-segmentation-server", }, + { + key: "conformance-server", + name: "MCP Apps Conformance Server", + dir: "conformance-server", + }, { key: "debug-server", name: "Debug MCP App Server", dir: "debug-server" }, { key: "map-server", name: "CesiumJS Map Server", dir: "map-server" }, { key: "pdf-server", name: "PDF Server", dir: "pdf-server" }, diff --git a/tests/e2e/servers.spec.ts-snapshots/conformance-server.png b/tests/e2e/servers.spec.ts-snapshots/conformance-server.png new file mode 100644 index 000000000..825af26da Binary files /dev/null and b/tests/e2e/servers.spec.ts-snapshots/conformance-server.png differ