Brisket Runbook

Brisket is the Next.js 16 frontend (apps/brisket/). CI lives in .github/workflows/brisket.yml (lint, typecheck, test only — no deploy step). Deploy is performed by Railway, which builds from apps/brisket/Dockerfile and watches apps/brisket/** (apps/brisket/railway.json: builder: DOCKERFILE, watchPatterns: ["/apps/brisket/**"], healthcheckPath: /api/health). This runbook documents the common operational tasks.

CI Pipeline

.github/workflows/brisket.yml triggers on push and pull_request to main and release when apps/brisket/** or the shared ESLint configs change. Three jobs run in parallel on ubuntu-latest with Node 22 and pnpm 10.26.0:

Job               | Command                       | Working dir
brisket-lint      | pnpm lint                     | apps/brisket
brisket-typecheck | pnpm typecheck (tsc --noEmit) | apps/brisket
brisket-test      | pnpm test (vitest run)        | apps/brisket

All three must pass before merge. The build (next build --turbopack) is exercised by Railway on deploy, not by the GitHub workflow, so a green CI run does not guarantee a successful production build. Railway watches apps/brisket/** (railway.json) and rebuilds whenever a tracked path changes on the branch wired to a given environment. TODO(@marty): document the exact branch → Railway environment mapping (staging vs. production) and whether production requires a manual promote.

Deploy

flowchart LR
    PR[Pull Request] --> CI[brisket.yml<br/>lint/typecheck/test]
    CI --> Merge[Merge to main]
    Merge --> Stage[Railway: staging deploy]
    Stage --> Promote[Merge main → release]
    Promote --> Prod[Railway: production deploy]

  1. Staging: merging to main triggers Railway to build (pnpm --filter brisket build) and deploy to the staging environment. TODO(@marty): staging hostname (not encoded in the repo).
  2. Production: open a release PR from main to release; merging it updates release, and Railway deploys that branch to production. Watch the Railway deploy log for the new revision, then smoke-test /api/health and a logged-in /explore page.
  3. Env updates: any new env var must be added to the Railway environment before deploy, otherwise Zod validation in env.js aborts the build with Invalid environment variables.
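
The smoke test in step 2 can be scripted. The following is a minimal sketch, not an existing tool in the repo: the function names, the retry count, and the backoff values are assumptions, and the base URL is a placeholder because the staging and production hostnames are not recorded here. It polls /api/health until it returns 200, backing off between attempts while the new revision starts.

```typescript
// Shape of the fetch function the poller needs; injectable so it can be stubbed.
type FetchLike = (url: string) => Promise<{ status: number }>;

// Delays before each retry: baseMs, 2*baseMs, 4*baseMs, ...
function retryDelays(attempts: number, baseMs: number): number[] {
  const retries = Math.max(0, attempts - 1);
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls <baseUrl>/api/health until it answers 200 or the attempts run out.
// Always makes at least one attempt.
async function waitForHealthy(
  baseUrl: string,
  attempts = 5,
  baseMs = 1000,
  fetchImpl: FetchLike = (url) => fetch(url),
): Promise<boolean> {
  const healthUrl = new URL("/api/health", baseUrl).toString();
  // No wait before the first attempt, then exponential backoff.
  const delays = [0, ...retryDelays(attempts, baseMs)];
  for (const delay of delays) {
    if (delay > 0) await sleep(delay);
    try {
      const res = await fetchImpl(healthUrl);
      if (res.status === 200) return true;
    } catch {
      // Network errors are expected while the new revision is starting.
    }
  }
  return false;
}

// Usage (placeholder hostname):
// waitForHealthy("https://brisket.example.invalid").then((ok) => {
//   if (!ok) process.exitCode = 1;
// });
```

This only covers the health endpoint; the logged-in /explore check still needs a real session and is left as a manual step.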

Rollback

Rollback options, in order of preference:

  1. Railway “Redeploy previous”: pick the prior healthy revision in the Railway UI. Fastest; no code changes.
  2. Git revert: run git revert <sha> on release, push, and let Railway redeploy. Use when the bad change must be undone in the branch itself; revert adds a new commit and leaves history intact.
  3. Hotfix branch: branch from the last known-good commit on release, cherry-pick the fix, force-merge with maintainer approval. Reserve for urgent partial fixes.

Confirm rollback by tailing Railway logs and validating /api/health plus a tRPC round-trip (subscription.getCountryCode is cheap and protected — good canary).

Build / Cache Issues

Symptoms and fixes:

  • Invalid environment variables at build time → the variable is missing in Railway. Add it and redeploy. To unblock a hotfix, set SKIP_ENV_VALIDATION=1 on the build (Docker/Railway), but never run the runtime with it set.
  • Turbopack Module not found after dependency bump → stale .next/ or pnpm overrides drift. Locally: rm -rf apps/brisket/.next apps/brisket/node_modules && pnpm install. On Railway: clear build cache via the service settings.
  • React 19 type errors after dep bump → check the pnpm.overrides block in package.json for forced versions, and realign the bumped dependency with the override.
  • OOM during next build → Turbopack memory spike. Bump Railway build container memory or add NODE_OPTIONS=--max-old-space-size=4096 to the build env.
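
For the missing-variable case, a preflight check can name the gaps before a deploy is attempted. This is a dependency-free sketch and not the Zod schema in env.js: the REQUIRED list below reuses variable names mentioned in this runbook purely for illustration, and whether each is actually required is an assumption. The authoritative list is the schema itself.

```typescript
type Env = Record<string, string | undefined>;

// Returns the names in `required` that are unset or blank in `env`.
function missingEnvVars(required: readonly string[], env: Env): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Assumed list for illustration only; env.js is the source of truth.
const REQUIRED = [
  "CHARGE_BEE_API_KEY",
  "CLERK_WEBHOOK_SECRET",
  "BRISKET_STAGE",
] as const;

// Example snapshot, e.g. variables exported from a Railway environment.
const snapshot: Env = { CHARGE_BEE_API_KEY: "set", BRISKET_STAGE: "" };

const missing = missingEnvVars(REQUIRED, snapshot);
console.log(
  missing.length === 0 ? "env ok" : `missing: ${missing.join(", ")}`,
);
// prints "missing: CLERK_WEBHOOK_SECRET, BRISKET_STAGE"
```

Blank values are treated as missing here, which is a design choice of the sketch; the real schema may accept or reject empty strings differently.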

ISR / Data Cache Flush

Brisket’s only known long-TTL fetch is the Strapi FAQ section in server/api/routers/faq.ts with next: { revalidate: 3600 }.

To flush:

  • Full flush: deploy to bump the build id; all data caches are scoped per build.
  • Targeted: implement revalidateTag / revalidatePath in a server action triggered by a Strapi webhook. No such handler exists today (verified — no revalidateTag/revalidatePath calls and no Strapi webhook route under apps/brisket/src); FAQ updates currently wait up to one hour.
  • Browser SW / static assets: bump the deploy. Next.js writes content-hashed asset URLs, so users pick up new bundles after their next navigation.
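
The targeted option could look roughly like the sketch below. It is hypothetical, since no such handler exists in the repo today: the header name, the tag name "faq", and the model-to-tag mapping are all assumptions. The revalidation call is injected rather than imported so the sketch stays framework-free; in the app it would be a thin wrapper around revalidateTag from next/cache (check the signature for the installed Next.js version). Tag-based revalidation also assumes the FAQ fetch declares a matching tag, for example next: { revalidate: 3600, tags: ["faq"] }, which the current fetch does not.

```typescript
// Strapi webhook bodies include an `event` (e.g. "entry.publish") and a `model`.
interface StrapiWebhookPayload {
  event?: string;
  model?: string;
}

// Hypothetical mapping from Strapi model names to cache tags.
const TAGS_BY_MODEL = new Map<string, string[]>([["faq", ["faq"]]]);

// Only entry.* events (create, update, publish, ...) change cached content.
function tagsForPayload(payload: StrapiWebhookPayload): string[] {
  if (!payload.event?.startsWith("entry.") || !payload.model) return [];
  return TAGS_BY_MODEL.get(payload.model) ?? [];
}

// Builds a POST handler for a route such as app/api/webhook/strapi/route.ts
// (path assumed). `revalidate` is the injected cache-invalidation call.
function makeStrapiWebhookHandler(
  secret: string,
  revalidate: (tag: string) => void,
) {
  return async function POST(request: Request): Promise<Response> {
    // Strapi can send custom headers with a webhook; this header name is assumed.
    if (request.headers.get("x-webhook-secret") !== secret) {
      return new Response("Unauthorized", { status: 401 });
    }
    let payload: StrapiWebhookPayload;
    try {
      payload = (await request.json()) as StrapiWebhookPayload;
    } catch {
      return new Response("Bad Request", { status: 400 });
    }
    const tags = tagsForPayload(payload);
    for (const tag of tags) revalidate(tag);
    return new Response(JSON.stringify({ revalidated: tags }), {
      status: 200,
      headers: { "content-type": "application/json" },
    });
  };
}
```

Keeping the tag mapping in one place means adding another Strapi-backed section later is a one-line change, and unrelated webhook events fall through without touching the cache.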

Common Incidents

Sign-in failures across the board

  1. Check Clerk dashboard health and verify the publishable + secret key pair points to the same Clerk instance as the deploy stage (brisket-env).
  2. Confirm /api/webhook/clerk is reachable — Clerk webhook page shows recent successful deliveries.
  3. Inspect middleware redirects in Axiom (http.target=/sign-in) for loops.

Checkout broken

  1. Tail tRPC spans subscription.createPrimerCheckout in Axiom. Look for Security check failed (Turnstile) or downstream sirloin Connect errors.
  2. Confirm CHARGE_BEE_API_KEY and NEXT_PUBLIC_CHARGE_BEE_* align (publishable + site slug + server key must be the same Chargebee account).
  3. Verify Primer status if errors mention dunning / payment intent.
  4. See brisket-billing-flow for the full sequence.

Sirloin upstream flapping

Symptoms: bursty code: 14 (Unavailable) or code: 4 (DeadlineExceeded) on most procedures. The Connect transport refreshes itself every 30 min (server/api/sirloin-api.ts), but a sirloin-side outage requires escalation to the sirloin on-call. Brisket has no fallback path.
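
To decide whether a burst is worth paging the sirloin on-call, the error codes pulled from recent spans can be summarized. The sketch below is one possible heuristic, not an existing alert rule: the function name, the minimum sample size, and the 50% threshold are assumptions chosen to approximate "on most procedures".

```typescript
// Numeric status codes named in the symptoms above.
const DEADLINE_EXCEEDED = 4;
const UNAVAILABLE = 14;

interface UpstreamSummary {
  total: number;
  upstreamFailures: number;
  failureRatio: number;
  shouldEscalate: boolean;
}

// `codes` holds one entry per finished call: the error code, or null on success.
function summarizeUpstream(
  codes: readonly (number | null)[],
  options: { minSamples?: number; threshold?: number } = {},
): UpstreamSummary {
  const { minSamples = 20, threshold = 0.5 } = options;
  const total = codes.length;
  const upstreamFailures = codes.filter(
    (code) => code === UNAVAILABLE || code === DEADLINE_EXCEEDED,
  ).length;
  const failureRatio = total === 0 ? 0 : upstreamFailures / total;
  return {
    total,
    upstreamFailures,
    failureRatio,
    // Require enough samples so a single failed call does not page anyone.
    shouldEscalate: total >= minSamples && failureRatio >= threshold,
  };
}
```

Other error codes are deliberately not counted, since they more likely point at a brisket-side bug than at upstream availability.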

Webhook signature failures

POST /api/webhook/clerk returning 400 Invalid signature repeatedly: check CLERK_WEBHOOK_SECRET matches the value in the Clerk dashboard’s webhook config for that endpoint URL; rotating Clerk keys requires updating both sides simultaneously.

Build green, runtime broken

Indicates an env var that was present at build but is absent or different at runtime. Check Railway service vs. build environment scoping, and compare BRISKET_STAGE between build and runtime.
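
The comparison can be done mechanically once both environments are exported. This is a sketch under the assumption that build-time and runtime variables are available as plain key/value snapshots; how they are exported from Railway is not covered here, and the names in the example are illustrative. It reports variable names only, so secret values never reach the logs.

```typescript
type EnvSnapshot = Record<string, string | undefined>;

interface EnvDiff {
  missingAtRuntime: string[];
  missingAtBuild: string[];
  changed: string[];
}

// Reads only own properties, so keys like "constructor" are not misreported.
function getOwn(env: EnvSnapshot, name: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(env, name) ? env[name] : undefined;
}

// Compares two snapshots by name; values are compared but never returned.
function diffEnv(build: EnvSnapshot, runtime: EnvSnapshot): EnvDiff {
  const names = Array.from(
    new Set([...Object.keys(build), ...Object.keys(runtime)]),
  ).sort();
  const diff: EnvDiff = { missingAtRuntime: [], missingAtBuild: [], changed: [] };
  for (const name of names) {
    const atBuild = getOwn(build, name);
    const atRuntime = getOwn(runtime, name);
    if (atBuild !== undefined && atRuntime === undefined) {
      diff.missingAtRuntime.push(name);
    } else if (atBuild === undefined && atRuntime !== undefined) {
      diff.missingAtBuild.push(name);
    } else if (atBuild !== atRuntime) {
      diff.changed.push(name);
    }
  }
  return diff;
}

// Example with illustrative values.
const envDiff = diffEnv(
  { BRISKET_STAGE: "staging", EXAMPLE_FLAG: "1" },
  { BRISKET_STAGE: "production" },
);
console.log(envDiff);
```

In the example, BRISKET_STAGE lands in changed and EXAMPLE_FLAG (a made-up name) in missingAtRuntime.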

Reference

  • CI: .github/workflows/brisket.yml
  • Build: next build --turbopack (apps/brisket/package.json)
  • Local: see brisket-local-dev
  • Env: see brisket-env
  • Errors: see brisket-errors
  • Operations anchor: operations/observability, operations/testing-strategy