feat(webapp): billing limits — pause, reject, recovery, and settings UI by kathiekiwi · Pull Request #3996 · triggerdotdev/trigger.dev

kathiekiwi · 2026-06-19T13:09:43Z

Summary

Adds Billing Limits to the webapp.

Customers can set a monthly spend cap. When usage crosses the limit, billable environments enter a grace period. If the limit is not resolved before grace expires, new triggers are rejected until the organization increases or removes the limit.

The webapp consumes billing-limit state from the billing platform and enforces it across environments, queues, and trigger creation.

Depends on the matching cloud billing PR.

User-facing changes

Billing Limits settings

New /settings/billing-limits page replaces the standalone billing-alerts page.
Configure:
- plan limit
- custom limit
- no limit
Configure billing alerts and notification emails.
Resolve active billing limits by increasing or removing the limit.

Org-wide banners

Adds banners for:

grace period
rejected state
billing limits not configured
upgrade prompts

Usage page

Shows the configured billing limit on the spend chart.

Enforcement

Billable environments are paused when an org enters grace.
New triggers are rejected once grace expires.
Billing-limit pauses cannot be manually resumed.
New environments created while the org is in the grace or rejected state start paused.
Recovery supports:
- resuming queued runs
- cancelling queued runs and starting fresh
- optional cancellation of in-progress runs when a limit is reached
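
The rule that billing-limit pauses cannot be manually resumed can be pictured as a guard on the pause source. This is a minimal sketch, not the PR's code: `BILLING_LIMIT` is the pause source named in this PR, while `MANUAL`, `EnvironmentPauseInfo`, and `canManuallyResume` are hypothetical names chosen for illustration.

```typescript
// `BILLING_LIMIT` comes from the PR description; `MANUAL` is an assumed counterpart.
type EnvironmentPauseSource = "MANUAL" | "BILLING_LIMIT";

// Hypothetical shape: the real environment model is not shown in this PR page.
interface EnvironmentPauseInfo {
  paused: boolean;
  pauseSource: EnvironmentPauseSource | null;
}

type ResumeDecision = { ok: true } | { ok: false; reason: string };

// Decide whether a user-initiated resume is allowed for this environment.
function canManuallyResume(env: EnvironmentPauseInfo): ResumeDecision {
  if (!env.paused) {
    return { ok: false, reason: "Environment is not paused" };
  }
  if (env.pauseSource === "BILLING_LIMIT") {
    // Only resolving the billing limit may lift this pause.
    return { ok: false, reason: "Paused by billing limit; increase or remove the limit to resume" };
  }
  return { ok: true };
}
```

A pause or resume endpoint would call such a guard before touching queue concurrency, so the billing-limit pause stays in place until the limit itself is resolved.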

Infrastructure

Adds billing-limit workers and reconciliation.
Adds admin endpoints used by the billing platform.
Adds BILLING_LIMIT as an environment pause source.
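
Reconciliation of this kind is usually written as an idempotent "converge to desired state" step, so that a webhook and a periodic worker can both run it safely. The sketch below assumes that shape; `EnvironmentState`, `convergeEnvironment`, and the rule that manual pauses are left untouched are illustrative assumptions, not details confirmed by the PR.

```typescript
// "ok" appears in the PR's diff; "grace" and "rejected" mirror the states in the description.
type OrgBillingStatus = "ok" | "grace" | "rejected";
type PauseSource = "MANUAL" | "BILLING_LIMIT";

// Hypothetical in-memory view of an environment.
interface EnvironmentState {
  id: string;
  billable: boolean;
  paused: boolean;
  pauseSource: PauseSource | null;
}

// Compute the state one environment should be in for the given org status.
function convergeEnvironment(env: EnvironmentState, status: OrgBillingStatus): EnvironmentState {
  const shouldBePaused = env.billable && status !== "ok";
  if (shouldBePaused && !env.paused) {
    return { ...env, paused: true, pauseSource: "BILLING_LIMIT" };
  }
  // Assumption: only pauses owned by the billing limit are lifted; manual pauses stay.
  if (!shouldBePaused && env.paused && env.pauseSource === "BILLING_LIMIT") {
    return { ...env, paused: false, pauseSource: null };
  }
  return env;
}

// Applying this twice with the same status yields the same result (idempotent).
function convergeEnvironments(envs: EnvironmentState[], status: OrgBillingStatus): EnvironmentState[] {
  return envs.map((env) => convergeEnvironment(env, status));
}
```

Because the function only looks at the desired status and the current state, running it from both the webhook path and the reconciliation worker cannot double-apply a pause.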

Test plan

Configure limits, alerts, and emails.
Verify grace and rejected flows.
Verify trigger rejection after grace expiry.
Verify both recovery flows: resume queued runs (queue) and cancel queued runs (new_only).
Verify new environments created during grace start paused.
Verify billing-limit pauses cannot be manually resumed.
Verify billing limit marker on the usage chart.

Notes

isConfigured: false means no billing limit has been configured yet.
mode: "none" means the customer explicitly opted out.
Grace pauses execution but still accepts triggers.
Rejected blocks new triggers.
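
The four notes above amount to a small decision table. A minimal sketch, assuming a hypothetical `BillingLimitState` shape: only `isConfigured` and `mode: "none"` are named in this description, so the `status` field, the other `mode` values, and the assumption that execution stays paused in the rejected state are illustrative.

```typescript
// Assumed mode values, mirroring the "plan limit / custom limit / no limit" options.
type BillingLimitMode = "plan" | "custom" | "none";
// Assumed status values for the org's billing-limit lifecycle.
type BillingLimitStatus = "ok" | "grace" | "rejected";

interface BillingLimitState {
  isConfigured: boolean;
  mode: BillingLimitMode;
  status: BillingLimitStatus;
}

interface Enforcement {
  pauseExecution: boolean; // billable environments are paused
  acceptTriggers: boolean; // new triggers are still accepted
}

function enforcementFor(state: BillingLimitState): Enforcement {
  // Nothing configured yet, or the customer explicitly opted out: nothing to enforce.
  if (!state.isConfigured || state.mode === "none") {
    return { pauseExecution: false, acceptTriggers: true };
  }
  switch (state.status) {
    case "grace":
      // Grace pauses execution but still accepts triggers.
      return { pauseExecution: true, acceptTriggers: true };
    case "rejected":
      // Rejected blocks new triggers (execution assumed to remain paused).
      return { pauseExecution: true, acceptTriggers: false };
    case "ok":
      return { pauseExecution: false, acceptTriggers: true };
  }
}
```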

changeset-bot · 2026-06-19T13:09:48Z

⚠️ No Changeset found

Latest commit: 6608555

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

coderabbitai · 2026-06-19T13:10:03Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

This PR implements a billing limits feature that lets organizations configure a monthly compute spend cap. When the cap is reached, a billing platform webhook triggers a grace period during which billable environments are paused; new task triggers are rejected once the grace period ends. A recovery flow lets users increase or remove the limit and choose to resume queued runs or accept cancellation. A Redis-backed worker periodically reconciles environment pause state against billing platform data. The feature adds a new /settings/billing-limits settings page that consolidates limit configuration, alert thresholds, and the recovery panel, replacing the old /settings/billing-alerts route (which now redirects). The unified OrgBanner component replaces the prior UpgradePrompt and EnvironmentBanner components with a selector-driven switch over all banner states. Environment pause-source tracking now distinguishes billing-limit-enforced pauses from manual pauses, gating pause/resume operations accordingly.
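
A "selector-driven switch" over banner states could look roughly like the sketch below. The banner kinds come from the list in the PR description; the input shape, the priority order (rejected, then grace, then not configured, then upgrade), and the placeholder messages are assumptions, not the actual OrgBanner implementation.

```typescript
// Banner kinds taken from the PR description; "none" is added for the no-banner case.
type OrgBannerKind = "rejected" | "grace" | "not_configured" | "upgrade" | "none";

// Hypothetical selector input.
interface OrgBannerInput {
  status: "ok" | "grace" | "rejected";
  isConfigured: boolean;
  showUpgradePrompt: boolean;
}

// Assumed priority: the most urgent state wins, so at most one banner renders.
function selectOrgBanner(input: OrgBannerInput): OrgBannerKind {
  if (input.status === "rejected") return "rejected";
  if (input.status === "grace") return "grace";
  if (!input.isConfigured) return "not_configured";
  if (input.showUpgradePrompt) return "upgrade";
  return "none";
}

// Placeholder copy only; the real banner text is not part of this page.
function bannerMessage(kind: OrgBannerKind): string | null {
  switch (kind) {
    case "rejected":
      return "Billing limit reached: new triggers are being rejected.";
    case "grace":
      return "Billing limit reached: environments are paused during the grace period.";
    case "not_configured":
      return "Billing limits are not configured.";
    case "upgrade":
      return "Upgrade your plan.";
    case "none":
      return null;
  }
}
```

Keeping the selection in one pure function makes the precedence between states explicit and easy to unit test, independent of how the component renders.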

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 21.53% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title clearly and specifically summarizes the main changes: implementing billing limits with pause, reject, recovery, and settings UI for the webapp.
Description check	✅ Passed	The PR description is comprehensive and well-structured, covering summary, user-facing changes, enforcement, infrastructure, test plan, and implementation notes.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

pkg-pr-new · 2026-06-22T09:08:33Z

@trigger.dev/build

npm i https://pkg.pr.new/@trigger.dev/build@cb4df39

trigger.dev

npm i https://pkg.pr.new/trigger.dev@cb4df39

@trigger.dev/core

npm i https://pkg.pr.new/@trigger.dev/core@cb4df39

@trigger.dev/python

npm i https://pkg.pr.new/@trigger.dev/python@cb4df39

@trigger.dev/react-hooks

npm i https://pkg.pr.new/@trigger.dev/react-hooks@cb4df39

@trigger.dev/redis-worker

npm i https://pkg.pr.new/@trigger.dev/redis-worker@cb4df39

@trigger.dev/rsc

npm i https://pkg.pr.new/@trigger.dev/rsc@cb4df39

@trigger.dev/schema-to-json

npm i https://pkg.pr.new/@trigger.dev/schema-to-json@cb4df39

@trigger.dev/sdk

npm i https://pkg.pr.new/@trigger.dev/sdk@cb4df39

commit: cb4df39

Add the EnvironmentPauseSource enum and migration, plus the billing-limit platform client wrappers and schemas.

Configure a spend limit, manage billing alerts, and surface org-wide banners.

Converge billable environments to paused via webhook and a reconciliation worker; block manual resume.

Reject triggers with a 422 once entitlement reports no access, and bust the entitlement cache on state changes.

Recovery UI and durable resolve: cancel queued runs before unpausing, with reconciliation as a safety net.

Optionally cancel in-progress runs on limit hit via a deduplicated bulk-cancel job.
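
The trigger-rejection commit above pairs a 422 response with cache invalidation. The sketch below illustrates why the cache bust matters: without it, a cached "has access" answer would keep admitting triggers after the state changes. `EntitlementCache`, `checkTrigger`, and the synchronous fetcher are hypothetical simplifications; only the 422 status and the idea of busting the entitlement cache come from the commit message.

```typescript
interface Entitlement {
  hasAccess: boolean;
}

// Hypothetical TTL cache in front of an entitlement lookup (synchronous for brevity).
class EntitlementCache {
  private entries = new Map<string, { value: Entitlement; expiresAt: number }>();

  constructor(
    private fetchEntitlement: (orgId: string) => Entitlement,
    private ttlMs: number,
    private now: () => number = Date.now
  ) {}

  get(orgId: string): Entitlement {
    const hit = this.entries.get(orgId);
    if (hit && hit.expiresAt > this.now()) return hit.value;
    const value = this.fetchEntitlement(orgId);
    this.entries.set(orgId, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Called whenever billing-limit state changes so the next read is fresh.
  invalidate(orgId: string): void {
    this.entries.delete(orgId);
  }
}

// Returns the HTTP status a trigger request would receive in this sketch.
function checkTrigger(cache: EntitlementCache, orgId: string): { status: number; error?: string } {
  const entitlement = cache.get(orgId);
  if (!entitlement.hasAccess) {
    return { status: 422, error: "Billing limit reached" };
  }
  return { status: 200 };
}
```

In this model, a state-change handler would call `invalidate(orgId)` right after recording the new state, so the first trigger after grace expiry sees the rejection instead of a stale cached entry.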

devin-ai-integration

Devin Review found 1 new potential issue.

devin-ai-integration · 2026-06-25T21:10:34Z

+        const existing = await prismaClient.bulkActionGroup.findFirst({
+          where: {
+            environmentId: environment.id,
+            type: BulkActionType.CANCEL,
+            AND: [
+              {
+                params: {
+                  path: ["source"],
+                  equals: options.source,
+                },
+              },
+              {
+                params: {
+                  path: ["dedupeKey"],
+                  equals: options.dedupeKey,
+                },
+              },
+            ],
+          },
+          select: { id: true, friendlyId: true },
+        });


🚩 Bulk cancel dedupe query scans JSONB params without an index

The BillingLimitBulkCancelService at apps/webapp/app/v3/services/billingLimit/BillingLimitBulkCancelService.server.ts:114-134 deduplicates cancel actions by querying bulkActionGroup.params JSONB path filters (path: ["source"] and path: ["dedupeKey"]). Without a GIN index on the params column of BulkActionGroup, this requires a sequential scan of all cancel bulk actions for the environment. For orgs with many historical bulk actions, this could be slow during billing limit events. The query is scoped to a single environmentId and type: CANCEL, which limits the scan somewhat.
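
The dedupe semantics of the query above (one cancel group per environment, source, and dedupe key) can be modelled in memory as follows. This is only an illustration of the intended behaviour, not of Postgres or Prisma: `InMemoryBulkCancelStore` and `createOnce` are hypothetical, and the map keyed on the composite value merely stands in for whatever index or dedicated column a real fix might use.

```typescript
// Hypothetical request shape mirroring the fields the query filters on.
interface BulkCancelRequest {
  environmentId: string;
  source: string;
  dedupeKey: string;
}

class InMemoryBulkCancelStore {
  // Composite key -> bulk action group id.
  private groups = new Map<string, string>();
  private nextId = 1;

  // Create a cancel group unless one already exists for the same composite key.
  createOnce(req: BulkCancelRequest): { id: string; created: boolean } {
    // JSON-encode the tuple so that different field boundaries cannot collide.
    const key = JSON.stringify([req.environmentId, req.source, req.dedupeKey]);
    const existing = this.groups.get(key);
    if (existing !== undefined) {
      return { id: existing, created: false };
    }
    const id = `bulk_${this.nextId++}`;
    this.groups.set(key, id);
    return { id, created: true };
  }
}
```

A keyed lookup like this is constant time regardless of how many historical groups exist, which is the property the unindexed JSONB path filter lacks.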

… tests Add the usage-bar marker, documentation, and test coverage.

CI unit-test workers have no global Postgres/Redis on localhost (testcontainers use random ports). Two latent fragilities surface once new test files shift the shard layout:

- Modules build a Redis-backed singleton at import (auto-increment counter via triggerTask.server) and throw during collection when REDIS_HOST is unset.
- Shared background singletons (OrganizationDataStoresRegistry) poll the global database at startup and reject async, which vitest flags as unhandled.

Set harmless REDIS_HOST/PORT defaults, swallow only the Prisma P1001 "can't reach database" unhandled rejection (other rejections stay fatal), and inject a runs-repository stub in the dedupe unit test so it does not reach the production clickhouse factory.

Temporary infra workaround; owner: platform.
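
Swallowing exactly one class of rejection while keeping all others fatal can be sketched as a narrow predicate plus a rethrow. The function names below are hypothetical, and the sketch checks both a `code` and an `errorCode` property because which one carries `P1001` depends on the Prisma error class; how the handler is registered with the test runner is not shown in this commit message and is left out.

```typescript
// True only for errors that carry Prisma's P1001 ("can't reach database server") code.
// Assumption: the code may be exposed as either `code` or `errorCode`.
function isPrismaUnreachableError(reason: unknown): boolean {
  if (typeof reason !== "object" || reason === null) return false;
  const candidate = reason as { code?: unknown; errorCode?: unknown };
  return candidate.code === "P1001" || candidate.errorCode === "P1001";
}

// Hypothetical handler body: ignore the known-harmless rejection, rethrow everything else.
function handleUnhandledRejection(reason: unknown): void {
  if (isPrismaUnreachableError(reason)) {
    return; // background singleton could not reach the global database; safe in unit tests
  }
  throw reason; // any other rejection stays fatal
}
```

Matching on the error code rather than on the message text keeps the filter from accidentally hiding unrelated failures that happen to mention the database.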

devin-ai-integration

Devin Review found 1 new potential issue.

devin-ai-integration · 2026-06-25T22:57:33Z

+  if (resumeMode === "new_only") {
+    await BillingLimitBulkCancelService.cancelQueuedRuns(organizationId, {
+      dedupeKey: buildBillingLimitResolveDedupeKey(organizationId, resolvedAt),
+    });
+  }
+
+  await convergeBillingLimitEnvironmentsForOrg(organizationId, "ok");


🔴 Queued runs can start executing before cancellation completes when user chooses 'Cancel queued runs' during billing limit resolve

Environments are unpaused (convergeBillingLimitEnvironmentsForOrg at billingLimitConvergeResolve.server.ts:25) immediately after the cancel job is merely enqueued (cancelQueuedRuns at billingLimitConvergeResolve.server.ts:20), so queued runs can be dequeued and start executing before the bulk-cancel worker processes them.

Impact: When a user explicitly chooses "Cancel queued runs" during billing limit resolve, some of those queued runs may execute anyway, potentially incurring charges the user was trying to avoid.

Async bulk cancel is enqueued but environments are unpaused inline before it runs

The convergeBillingLimitResolve function handles the new_only resume mode (user chose to cancel queued runs):

1. BillingLimitBulkCancelService.cancelQueuedRuns at billingLimitConvergeResolve.server.ts:20-22 creates BulkActionGroup records and enqueues processBulkAction jobs via the common worker (BillingLimitBulkCancelService.server.ts:170). The await resolves when the job is enqueued, not when cancellation is complete.

2. Immediately after, convergeBillingLimitEnvironmentsForOrg(organizationId, "ok") at billingLimitConvergeResolve.server.ts:25 unpauses all billing-limit-paused environments, restoring their concurrency limits via updateEnvConcurrencyLimits (billingLimitConvergeEnvironments.server.ts:193).

3. The run queue can now dequeue PENDING runs from those environments.

4. The bulk cancel worker job hasn't executed yet — it searches for runs with QUEUED_STATUSES (BillingLimitBulkCancelService.server.ts:59), but runs that were already dequeued in step 3 have transitioned to DEQUEUED/EXECUTING and escape cancellation.

The invariant that should hold for resumeMode === "new_only" is: no queued runs from the billing-limit pause window should be dequeued. The current ordering violates this because concurrency is restored before the cancel job runs.

Prompt for agents

The problem is in convergeBillingLimitResolve (billingLimitConvergeResolve.server.ts:19-25). When resumeMode is 'new_only', the function enqueues bulk-cancel jobs for queued runs, then immediately unpauses environments. But the bulk-cancel is async (processed by the common worker later), so the run queue can dequeue runs before the cancel job executes.

The fix should ensure that queued runs are cancelled (or at least prevented from being dequeued) BEFORE environments are unpaused. Several approaches:

1. Process the cancel inline (synchronously) instead of via the async worker, then unpause environments. This is the most reliable but may be slow for large backlogs.
2. Use a two-phase approach: first cancel queued runs inline or wait for the bulk cancel to complete, then enqueue a separate job to unpause environments.
3. If inline cancel is too expensive, consider keeping environments paused until the bulk cancel job completes, and have the bulk cancel job trigger the unpause as its final step.

The key constraint is that environment concurrency must not be restored until the cancel has processed all queued runs, so the ordering must be: cancel first, then unpause.
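
The "cancel first, then unpause" ordering can be illustrated with a toy in-memory queue. This is a simplified model of approach 1 (inline cancel), not the webapp's run queue: `EnvironmentQueue`, `resolveNewOnly`, and the `CANCELED` status name are hypothetical.

```typescript
// Simplified run statuses; the real status names may differ.
type RunStatus = "QUEUED" | "EXECUTING" | "CANCELED";

interface Run {
  id: string;
  status: RunStatus;
}

// Toy model of one environment's queue while a billing-limit pause is active.
class EnvironmentQueue {
  paused = true;

  constructor(public runs: Run[]) {}

  // The queue only hands out work while the environment is unpaused.
  dequeueAll(): void {
    if (this.paused) return;
    for (const run of this.runs) {
      if (run.status === "QUEUED") run.status = "EXECUTING";
    }
  }

  // Cancel every run that is still queued; returns how many were cancelled.
  cancelQueued(): number {
    let cancelled = 0;
    for (const run of this.runs) {
      if (run.status === "QUEUED") {
        run.status = "CANCELED";
        cancelled++;
      }
    }
    return cancelled;
  }
}

// Resolve in "new_only" mode: finish the cancel while still paused, then unpause.
function resolveNewOnly(env: EnvironmentQueue): number {
  const cancelled = env.cancelQueued(); // step 1: nothing can be dequeued yet
  env.paused = false; // step 2: restore concurrency only after the cancel is done
  return cancelled;
}
```

With this ordering, a dequeue that races with the resolve either sees a paused environment or finds no queued runs left, so none of the runs from the pause window can start executing.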
