applyCaching gate keyed on model-name substring, not capability — non-Anthropic cacheable models (openrouter/openai-compatible/copilot) never get cache_control #965

Description

@sumleo

Found via static analysis of prompt-cache anti-patterns (CacheLint), then confirmed by hand on main @ f0fb1e1.

ProviderTransform.message() decides whether to call applyCaching() using a hard-coded model-name gate (packages/opencode/src/provider/transform.ts:285-297):

if (
  (model.providerID === "anthropic" ||
    model.providerID === "google-vertex-anthropic" ||
    model.providerID === "altimate-backend" ||
    model.api.id.includes("anthropic") ||
    model.api.id.includes("claude") ||
    model.id.includes("anthropic") ||
    model.id.includes("claude") ||
    model.api.npm === "@ai-sdk/anthropic") &&
  model.api.npm !== "@ai-sdk/gateway"
) {
  msgs = applyCaching(msgs, model)
}

But applyCaching() itself already defines cache directives for five providers, not just Anthropic (transform.ts:196-210):

const providerOptions = {
  anthropic:        { cacheControl:         { type: "ephemeral" } },
  openrouter:       { cacheControl:         { type: "ephemeral" } },
  bedrock:          { cachePoint:           { type: "default"   } },
  openaiCompatible: { cache_control:        { type: "ephemeral" } },
  copilot:          { copilot_cache_control:{ type: "ephemeral" } },
}

So a cacheable model served through openrouter / openai-compatible / copilot whose id contains neither claude nor anthropic (e.g. a GPT / Gemini / Qwen / Kimi routed through those providers) never enters applyCaching() and never gets a cache breakpoint — even though the function clearly intends to cache it.

This is an under-claim: caching silently fails to engage. It is not cache-busting, and Anthropic-named models are unaffected.
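To make the skipped branch concrete, here is a minimal sketch that lifts the gate expression quoted above into a pure function and evaluates it for two models. `GateModel`, `passesNameGate`, and both sample models (including their ids and npm package names) are illustrative assumptions, not code from the repository.

```typescript
// Minimal stand-in for the model fields the gate reads; the real type lives in the repo.
interface GateModel {
  id: string
  providerID: string
  api: { id: string; npm: string }
}

// Same boolean expression as the gate quoted above, lifted into a pure function.
function passesNameGate(model: GateModel): boolean {
  return (
    (model.providerID === "anthropic" ||
      model.providerID === "google-vertex-anthropic" ||
      model.providerID === "altimate-backend" ||
      model.api.id.includes("anthropic") ||
      model.api.id.includes("claude") ||
      model.id.includes("anthropic") ||
      model.id.includes("claude") ||
      model.api.npm === "@ai-sdk/anthropic") &&
    model.api.npm !== "@ai-sdk/gateway"
  )
}

// Hypothetical models for illustration; ids and npm package names are assumptions.
const claudeViaOpenRouter: GateModel = {
  id: "anthropic/claude-sonnet-4",
  providerID: "openrouter",
  api: { id: "anthropic/claude-sonnet-4", npm: "@openrouter/ai-sdk-provider" },
}
const qwenViaOpenRouter: GateModel = {
  id: "qwen/qwen3-coder",
  providerID: "openrouter",
  api: { id: "qwen/qwen3-coder", npm: "@openrouter/ai-sdk-provider" },
}

console.log(passesNameGate(claudeViaOpenRouter)) // true: the id contains "claude"
console.log(passesNameGate(qwenViaOpenRouter)) // false: applyCaching() is never reached
```

Both models share the same provider, so the only thing separating them is the substring match on the id.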

Impact

For an affected model, the system prefix (system prompt + tool schema + earlier turns) is re-sent at full input price on every turn instead of being read from cache. On long agentic loops, each turn forgoes the provider's cached-read discount on that prefix and pays a higher time to first byte (TTFB), and this falls on exactly the self-hosted / BYO-LLM users the project targets. The blast radius is bounded to models without claude/anthropic in their id that genuinely support explicit cache_control via openrouter/openai-compatible/copilot.
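A rough way to size the forgone discount is sketched below. It assumes a fixed-size prefix (in practice it grows with earlier turns), ignores cache-write surcharges and cache expiry, and takes every rate as a caller-supplied parameter; `prefixCost` and the sample numbers are placeholders, not real provider pricing.

```typescript
// Inputs for the estimate. Every value is supplied by the caller; nothing here
// comes from a real provider price sheet.
interface PrefixCostInput {
  prefixTokens: number // tokens in the stable prefix (held constant for simplicity)
  turns: number // number of requests in the session
  inputPricePerToken: number // full input price, in any currency unit
  cachedReadMultiplier: number // fraction of the full price charged for a cache read
}

function prefixCost(input: PrefixCostInput) {
  const { prefixTokens, turns, inputPricePerToken, cachedReadMultiplier } = input
  const fullPrice = prefixTokens * inputPricePerToken
  // Gate skipped: every turn pays full price for the prefix.
  const uncached = fullPrice * turns
  // Gate taken: the first turn pays full price, later turns pay the cached-read rate.
  const cached = turns > 0 ? fullPrice * (1 + (turns - 1) * cachedReadMultiplier) : 0
  return { uncached, cached, forgone: uncached - cached }
}

// Placeholder numbers chosen only to make the arithmetic easy to follow.
const estimate = prefixCost({
  prefixTokens: 10000,
  turns: 20,
  inputPricePerToken: 1,
  cachedReadMultiplier: 0.25,
})
console.log(estimate)
```

The forgone amount scales with both prefix size and turn count, which is why long agentic loops are where the skipped gate costs the most.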

Steps to reproduce

  1. Configure a cacheable model through openrouter (or openai-compatible / copilot) whose id does not contain claude/anthropic (e.g. an OpenRouter-served model that honors cache_control).
  2. Run a multi-turn session.
  3. Observe that no cacheControl/cache_control provider option is stamped on the system/last-user blocks (the applyCaching branch is skipped), so the prefix is billed as fresh input every turn.
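Step 3 can be checked mechanically with a helper along these lines. This is a sketch: `MessageLike`, `PartLike`, and `hasCacheDirective` are hypothetical names with a simplified message shape, and because the exact placement of the directive (message level vs. content part) is not confirmed here, the helper checks both. The provider/key pairs are the five from `providerOptions` in `applyCaching()`.

```typescript
type ProviderOptions = Record<string, Record<string, unknown>>

// Simplified message shapes, assumed for illustration only.
interface PartLike {
  type: string
  providerOptions?: ProviderOptions
}
interface MessageLike {
  role: string
  content: string | PartLike[]
  providerOptions?: ProviderOptions
}

// Provider key -> directive key, copied from providerOptions in applyCaching().
const CACHE_DIRECTIVE_KEYS: Record<string, string> = {
  anthropic: "cacheControl",
  openrouter: "cacheControl",
  bedrock: "cachePoint",
  openaiCompatible: "cache_control",
  copilot: "copilot_cache_control",
}

// True if the options object carries any of the five known cache directives.
function carriesDirective(options?: ProviderOptions): boolean {
  if (!options) return false
  return Object.entries(CACHE_DIRECTIVE_KEYS).some(
    ([provider, key]) => options[provider]?.[key] !== undefined,
  )
}

// True if any message, or any content part of a message, carries a directive.
function hasCacheDirective(msgs: MessageLike[]): boolean {
  return msgs.some(
    (msg) =>
      carriesDirective(msg.providerOptions) ||
      (Array.isArray(msg.content) &&
        msg.content.some((part) => carriesDirective(part.providerOptions))),
  )
}
```

Running it on the messages returned by `ProviderTransform.message()` for an affected model would be expected to return `false`, matching the skipped branch.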

Suggested fix (for discussion)

Decouple the applyCaching gate from the model-name list and drive it off an explicit capability/provider-support signal, so the gate matches the set of providers applyCaching already knows how to cache:

  • introduce a capabilities.caching flag on the model (the capabilities schema in provider.ts:787 currently has no caching field), populated from the model registry; or
  • gate on a per-provider "supports prompt cache" set covering anthropic / openrouter / bedrock / openaiCompatible / copilot (the same five keys already in providerOptions).
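The two options could be combined roughly as follows. This is a sketch for discussion, not a proposed patch: `capabilities.caching` does not exist in the current schema, `shouldApplyCaching` is a hypothetical name, and how a model resolves to one of the five `providerOptions` keys is left to the caller as `providerKey`.

```typescript
// Stand-in model type. capabilities.caching is the hypothetical new field.
interface CapModel {
  id: string
  providerID: string
  api: { id: string; npm: string }
  capabilities?: { caching?: boolean }
}

// The same five keys that providerOptions in applyCaching() already defines.
const CACHE_CAPABLE_PROVIDERS = new Set([
  "anthropic",
  "openrouter",
  "bedrock",
  "openaiCompatible",
  "copilot",
])

function shouldApplyCaching(model: CapModel, providerKey: string): boolean {
  // Preserve the existing gateway exclusion from the current gate.
  if (model.api.npm === "@ai-sdk/gateway") return false
  // An explicit capability flag, when present, wins in either direction.
  if (model.capabilities?.caching !== undefined) return model.capabilities.caching
  // Otherwise fall back to the per-provider support set.
  return CACHE_CAPABLE_PROVIDERS.has(providerKey)
}
```

Letting the flag override the provider set means a registry entry can opt a specific model out even when its provider is in the set, which may matter for models that do not honor `cache_control`.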

I want to flag the design angle rather than send a drive-by PR: this is a hot path, and #891 was deliberately deferred for the same "needs design + careful testing, don't regress gateway cache-hit rates" reason. It also overlaps with #891's goal of having a single source of truth for the cache-control gate. I'm happy to open a PR if a maintainer confirms the preferred shape (capability flag vs provider set) and that emitting these directives for non-Anthropic providers is intended.

Caveat: confirmed present and unguarded on main @ f0fb1e1; line numbers may drift.
