
fix(auth): retry token exchange on transient failures#129

Merged
zfarrell merged 4 commits into main from fix/retry-token-exchange-113
Jun 27, 2026


@zfarrell

Contributor

Closes #113. The JWT token exchange (POST /v1/auth/jwt) now retries transient failures (5xx responses and transport errors) with a bounded budget of 3 attempts and exponential backoff plus jitter; 4xx responses stay fatal. The cached-JWT fast path is served lock-free, so a retrying mint can't block callers that don't need a mint.

Comment thread hotdata/_auth.py Outdated
_MAX_ATTEMPTS = 3 # one initial attempt + up to two retries
_BACKOFF_BASE = 0.1 # seconds -- first retry waits ~this, doubling thereafter
_BACKOFF_MAX = 2.0 # cap on a single backoff so a flapping host can't stall us
_BACKOFF_JITTER = 0.5 # +/- fraction of jitter added to spread retries out


super nit: the comment says +/- fraction but the jitter is purely additive — _backoff_delay computes base * (1 + _BACKOFF_JITTER * random.random()), giving a delay in [base, base * 1.5] (never below base). The _backoff_delay docstring describes it correctly as additive [0, _BACKOFF_JITTER * base]; only this constant's inline comment is off. Consider # +fraction of jitter added ... to match. (not blocking)
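For readers without the diff open, the additive-jitter behavior described in this comment can be sketched roughly as follows. This is an illustration, not the code from hotdata/_auth.py: the function name `backoff_delay`, its `retry_index` and `rng` parameters, and the choice to apply the cap before the jitter are all assumptions; only the constants and the `base * (1 + _BACKOFF_JITTER * random.random())` expression come from the thread.

```python
import random

_BACKOFF_BASE = 0.1    # seconds; first retry waits about this long
_BACKOFF_MAX = 2.0     # cap on the un-jittered backoff
_BACKOFF_JITTER = 0.5  # +fraction of jitter added on top of the base delay


def backoff_delay(retry_index, rng=random.random):
    """Return the sleep before retry number `retry_index` (0 = first retry).

    Hypothetical helper: the cap is applied before jitter here, which is an
    assumption; the real ordering is not shown in the thread.
    """
    base = min(_BACKOFF_BASE * (2 ** retry_index), _BACKOFF_MAX)
    # random.random() is in [0.0, 1.0), so the jitter only ever adds:
    # the result lies in [base, base * 1.5) and never drops below base.
    return base * (1 + _BACKOFF_JITTER * rng())
```

Because the jitter term is never negative, an inline comment reading "+/-" would overstate the spread, which is the point of the nit.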

claude Bot previously approved these changes Jun 27, 2026

@claude claude Bot left a comment


Retry logic is correct and thoroughly tested: retries=False makes the explicit loop the sole arbiter of the attempt budget, 4xx stays fatal, 5xx + transport errors retry with bounded backoff+jitter, and the lock-free fast path correctly avoids serializing cache hits behind an in-flight mint. One super-nit comment inline.
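The retry policy summarized in this review (bounded attempts, 5xx and transport errors retried, 4xx fatal) might look something like the sketch below. It is a minimal stand-in under stated assumptions: `exchange_with_retry`, `send`, `delay_for`, `TransportError`, and `TokenExchangeError` are hypothetical names, and the HTTP call is abstracted into a callable returning `(status, body)` rather than using the client library from the PR.

```python
import time
from typing import Callable, Tuple

_MAX_ATTEMPTS = 3  # one initial attempt + up to two retries


class TransportError(Exception):
    """Hypothetical stand-in for a connection or timeout failure."""


class TokenExchangeError(Exception):
    """Hypothetical error raised when the exchange cannot succeed."""


def exchange_with_retry(
    send: Callable[[], Tuple[int, str]],
    delay_for: Callable[[int], float] = lambda i: 0.1 * 2 ** i,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    last_error = None
    for attempt in range(_MAX_ATTEMPTS):
        try:
            status, body = send()
        except TransportError as exc:
            last_error = exc  # transient: eligible for retry
        else:
            if 200 <= status < 300:
                return body
            if status < 500:
                # 4xx (and anything else below 500) is fatal: no retry.
                raise TokenExchangeError(f"token exchange rejected: {status}")
            last_error = TokenExchangeError(f"server error: {status}")
        # Sleep only between attempts, never after the last one.
        if attempt < _MAX_ATTEMPTS - 1:
            sleep(delay_for(attempt))
    raise TokenExchangeError("token exchange failed after retries") from last_error
```

Keeping the loop as the only place that counts attempts is what the review means by the loop being the sole arbiter of the budget: if the underlying client also retried, the effective number of requests would multiply.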

@claude claude Bot left a comment


Retry logic is correct and the lock-free fast path is sound. The prior jitter-comment nit is resolved. Tests cover the transient cases (5xx/transport retry, budget exhaustion, 4xx no-retry, refresh fallback, lock-free fast path) meaningfully.
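The lock-free fast path mentioned in both reviews is commonly built as a double-checked cache read. The sketch below shows that general pattern only; the class `TokenCache`, its `mint` callable returning `(token, ttl_seconds)`, and the `leeway` parameter are hypothetical and not taken from the PR.

```python
import threading
import time


class TokenCache:
    """Hypothetical cache: fresh tokens are returned without taking the lock."""

    def __init__(self, mint, leeway=30.0, clock=time.monotonic):
        self._mint = mint        # callable returning (token, ttl_seconds)
        self._leeway = leeway    # refresh this many seconds before expiry
        self._clock = clock
        self._lock = threading.Lock()
        self._cached = None      # (token, expires_at), replaced as one tuple

    def _fresh(self):
        cached = self._cached    # single read of an immutable tuple
        if cached is not None and cached[1] - self._leeway > self._clock():
            return cached[0]
        return None

    def get(self):
        token = self._fresh()    # fast path: no lock held
        if token is not None:
            return token
        with self._lock:         # slow path: one minter at a time
            token = self._fresh()  # another thread may have minted already
            if token is not None:
                return token
            token, ttl = self._mint()
            self._cached = (token, self._clock() + ttl)
            return token
```

With this shape, a mint that is sleeping between retries holds the lock, but callers whose cached token is still fresh return from the first check and never wait on it.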

@zfarrell zfarrell merged commit d073681 into main Jun 27, 2026
4 checks passed
@zfarrell zfarrell deleted the fix/retry-token-exchange-113 branch June 27, 2026 21:05
Development

Successfully merging this pull request may close these issues.

Retry token exchange on transient server errors

1 participant