Add a client-side response cache honoring SEP-2549 caching hints#3023
Conversation
…S from the method registry
…solution, eviction
…notification eviction
…wn-key error formatting
There was a problem hiding this comment.
4 issues found across 16 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="docs_src/caching/tutorial003.py">
<violation number="1" location="docs_src/caching/tutorial003.py:27">
P2: `tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</violation>
</file>
Tip: cubic can generate docs of your entire codebase and keep them up to date. Try it here.
Fix all with cubic | Re-trigger cubic
| async with Client(mcp) as client: | ||
| tools = await client.list_tools() | ||
| print(f"{len(tools.tools)} tools, fresh for {tools.ttl_ms / 1000:.0f}s, scope={tools.cache_scope}") | ||
| global now |
There was a problem hiding this comment.
P2: tutorial003.main() mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docs_src/caching/tutorial003.py, line 27:
<comment>`tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</comment>
<file context>
@@ -1,15 +1,34 @@
- async with Client(mcp) as client:
- tools = await client.list_tools()
- print(f"{len(tools.tools)} tools, fresh for {tools.ttl_ms / 1000:.0f}s, scope={tools.cache_scope}")
+ global now
+ async with Client(server, cache=CacheConfig(clock=lambda: now)) as client:
+ await client.list_tools() # fetch 1
</file context>
| global now | |
| global now, fetches | |
| now = 1_000_000.0 | |
| fetches = 0 |
There was a problem hiding this comment.
1 issue found across 7 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="docs_src/caching/tutorial003.py">
<violation number="1" location="docs_src/caching/tutorial003.py:27">
P2: `tutorial003.main()` mutates module-global cache demo state without resetting it, so repeated runs produce incorrect/non-deterministic output.</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Fix all with cubic | Re-trigger cubic
There was a problem hiding this comment.
The store set-failure issue I flagged earlier is now fixed at HEAD (the failed-set path purges the warm own-arm entry, with a covering test), and I found no new bugs in this revision — but this PR adds a whole client-side caching subsystem with multi-tenant sharing semantics, so it still needs a human reviewer.
Extended reasoning...
Overview
This PR implements the client half of SEP-2549: a new mcp.client.caching module (store contract, in-memory LRU store, and a ClientResponseCache coordinator), wiring into Client (per-call cache_mode, server-identity hashing, notification-driven eviction), an inbound ttlMs clamp in ClientSession, the move of CacheableMethod/CACHEABLE_METHODS into mcp_types.methods, server-side fixes for dict-shaped handler results, plus extensive docs and ~3,000 lines of new tests.
Status of prior findings
My earlier inline comment flagged that a failed store set during a refresh left the superseded warm entry serving. At the current head, ClientResponseCache.write checks the boolean from _set and purges the own arm on failure (src/mcp/client/caching.py:286-289), and tests/client/test_caching.py adds test_a_failed_set_purges_the_pre_existing_own_arm_entry covering the warm-entry case. The cubic doc-wording comments also appear addressed. This run's bug hunt surfaced no new issues.
Security risks
The cache introduces real cross-tenant sharing surface: partition/arm key construction, the share_public opt-in, server-identity derivation from URLs (userinfo stripping, sha256 before key material), and era scoping. The defaults look conservative (private-by-default, partition-scoped public entries, custom store requires explicit partition), but these are exactly the kinds of decisions a maintainer should sign off on, since a subtle keying mistake leaks one principal's responses to another.
Level of scrutiny
High. This touches the core Client request path for five verbs, changes default behavior (caching on by default for hint-bearing 2026 servers), and adds public API surface (CacheConfig, ResponseCacheStore, cache_mode) that will be hard to change later. Cancellation-shielded store cleanup and eviction-race generation tracking are also concurrency-sensitive.
Other factors
Test coverage is unusually thorough (store contract tests, fault-injecting stores, scripted legacy/modern peers, e2e eviction tests), which raises confidence, but the size, new public API, and multi-tenant semantics put this well outside the bar for shadow approval.
There was a problem hiding this comment.
Thanks for addressing the failed-set purge — the fix in ClientResponseCache.write (with the warm-entry regression test) resolves my earlier inline comment, and I found no new issues in this revision. That said, this is a large change to the client's core verbs with security-relevant cache partitioning semantics, so it deserves a maintainer's review rather than an automated approval.
Extended reasoning...
Overview
This PR adds the full client half of SEP-2549: a new mcp.client.caching module (store contract, in-memory LRU store, ClientResponseCache coordinator), wiring through Client (cache= kwarg, per-call cache_mode, server-identity derivation, notification-driven eviction), a ttlMs clamp at the session parse seam, tool-listing absorption/pruning changes in ClientSession, server-side fixes for dict-shaped handler results, plus extensive docs and ~3,000 lines of new tests.
Prior feedback status
My earlier inline comment (ignored _set return leaving a superseded warm entry servable after a failed store set) is fixed at HEAD: write() now purges the own arm when _set returns False, mirroring the failed-opposite-arm-delete path, and test_a_failed_set_purges_the_pre_existing_own_arm_entry covers the warm-entry case. The cubic findings on userinfo stripping and the docs qualifications were also addressed in subsequent commits. No new issues were identified in this pass.
Security risks
The cache introduces genuinely security-relevant surface: partition isolation of private entries in shared stores, the share_public cross-tenant opt-in, server-identity hashing (credentials stripped from URLs before keying), and era-scoping of entries. The implementation appears careful (partition-scoped public arm by default, sha256 of identity, constructor-only sharing widening, scope verification on read), and these properties are pinned by tests — but the consequences of a mistake here are cross-tenant data leakage, which is exactly the kind of judgment call a human maintainer should sign off on.
Level of scrutiny
High. This touches every cacheable Client verb's hot path, changes default behavior (cache on by default, even though hint-less servers see identical traffic), and adds a public API surface (CacheConfig, ResponseCacheStore, cache_mode) the project will need to maintain. There are also design decisions worth a maintainer's eye: deviating from the TypeScript SDK on public-entry partitioning, the meta-implies-refresh rule, and the no-coalescing/no-stale-if-error rulings.
Other factors
Test coverage is unusually thorough (store contract, fault-injecting stores, cancellation/shielding paths, e2e against scripted legacy/modern peers), which raises confidence in correctness, but the sheer size and the API/design decisions keep this firmly outside the scope of automated approval.
Implements the client half of SEP-2549:
Clientnow honors thettlMs/cacheScopehints servers stamp on the six cacheable results, serving fresh repeat calls from a response cache instead of refetching. Builds on #3015 (the server-side authoring surface); part of #2899.Surface
Client(cache=...):None(default) = on with a per-client in-memory store; aCacheConfigto customize (pluggable asyncResponseCacheStore, partition, server-identity override, default TTL, injectable clock);False= off entirely.cache_mode: "use" | "refresh" | "bypass"onlist_tools,list_prompts,list_resources,list_resource_templates,read_resource.mcp.clientexports the config and store-contract types;CacheableMethod/CACHEABLE_METHODSmoved tomcp_types.methods(derived from the method registry, re-exported frommcp.server.cachingunchanged).Semantics worth reviewing
ttlMs/cacheScopekeys on the wire (the 2025 surfaces validate-and-discard unknown keys), so wire presence is not an era signal — the cache gates on the negotiated version and treats legacy results as hint-absent.CacheConfig.target_id; neverserverInfo) plus the caller-supplied partition."public"entries are partition-scoped by default;CacheConfig.share_public(constructor-only) opts into cross-partition serving. A custom store requires an explicit partition; a custom store with no derivable server identity is a construction-time error.read_resourcecalls skip the cache entirely,input_requiredresults are never stored, and terminal results reached through driver rounds are not cached.INVALID_PARAMSrejection evicts the method's entry (spec SHOULD).*_list_changedevicts the matching list(s),resources/updatedevicts that URI (exact-URI, both arms).metaalways reach the server — a progress token or tracing key is never swallowed by a cache hit.tools/listcache hit re-applies the session's post-fetch pipeline (x-mcp-header MUST filter,Mcp-Param-*header maps, output schemas), so served and fetched listings behave identically on latertools/calls.ttlMsis now treated as 0 at the parse seams (spec SHOULD) instead of failing the call — this also fixesmode="auto"silently downgrading to the legacy handshake when a discover result carried a negative TTL.Server-side fixes that ride along
cache_hintsnow apply to mapping-shaped handler returns per-field (previously a dict return on a hinted method failed the request at 2026, since the surface requires the fields and the stamp only ran on model instances).validate_cache_hintsreports non-string unknown keys with the intendedValueErrorinstead of aTypeErrorfromsorted/join.Docs & tests
docs/advanced/caching.mdclient section rewritten around the real cache (was "no built-in response cache"), tutorial003 is now a runnable cache demo with an injected clock, migration note added. ~130 new tests (unit, e2e againstServer(cache_hints=...), scripted legacy/modern peers, fault-injecting stores), 100% branch coverage maintained.AI Disclaimer