Brisket Runbook
Brisket is the Next.js 16 frontend (apps/brisket/). CI lives in .github/workflows/brisket.yml (lint, typecheck, test only — no deploy step). Deploy is performed by Railway, which builds from apps/brisket/Dockerfile and watches apps/brisket/** (apps/brisket/railway.json: builder: DOCKERFILE, watchPatterns: ["/apps/brisket/**"], healthcheckPath: /api/health). This runbook documents the common operational tasks.
CI Pipeline
.github/workflows/brisket.yml triggers on push and pull_request events targeting main and release when apps/brisket/** or the shared ESLint configs change. Three jobs run in parallel on ubuntu-latest with Node 22 and pnpm 10.26.0:
| Job | Command | Working dir |
|---|---|---|
| brisket-lint | pnpm lint | apps/brisket |
| brisket-typecheck | pnpm typecheck (tsc --noEmit) | apps/brisket |
| brisket-test | pnpm test (vitest run) | apps/brisket |
All three must pass before merge. Build (next build --turbopack) is exercised by Railway on deploy, not by the GitHub workflow — a green CI does not guarantee a successful production build. Railway watches apps/brisket/** (railway.json) and rebuilds whenever a tracked path changes on the branch wired to a given environment; TODO(@marty): document the exact branch → Railway environment mapping (staging vs. production) and whether production requires a manual promote.
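Because the workflow never runs the build, a local pre-merge pass that adds it can surface build-only failures before Railway does. The following is a minimal sketch, not part of the repo: `Runner` and `runChecks` are hypothetical names, and it assumes a `build` script exists alongside `lint`, `typecheck`, and `test` in apps/brisket/package.json. The command runner is injected, so the sketch makes no claim about how the commands are actually spawned.

```typescript
// Hypothetical helper: runs the three CI checks plus the build that CI skips.
type Runner = (command: string, cwd: string) => number; // returns an exit code

const WORKDIR = "apps/brisket";

// Same commands as the CI jobs, followed by the build Railway runs on deploy.
const CHECKS: readonly string[] = [
  "pnpm lint",
  "pnpm typecheck",
  "pnpm test",
  "pnpm build", // assumption: this script wraps `next build --turbopack`
];

// Runs each check in order and stops at the first non-zero exit code.
function runChecks(
  run: Runner,
  checks: readonly string[] = CHECKS,
): { ok: boolean; failed?: string } {
  for (const command of checks) {
    if (run(command, WORKDIR) !== 0) {
      return { ok: false, failed: command };
    }
  }
  return { ok: true };
}
```

In practice the runner could wrap Node's `child_process.spawnSync` and return its exit status; stopping at the first failure keeps the slow build from running after a lint or type error.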
Deploy
    flowchart LR
        PR[Pull Request] --> CI[brisket.yml<br/>lint/typecheck/test]
        CI --> Merge[Merge to main]
        Merge --> Stage[Railway: staging deploy]
        Stage --> Promote[Merge main → release]
        Promote --> Prod[Railway: production deploy]

- Staging: merging to main triggers Railway to build (pnpm --filter brisket build) and deploy to the staging environment. TODO(@marty): staging hostname (not encoded in the repo).
- Production: open a release PR main → release; the merge fast-forwards production. Watch the Railway deploy log for the new revision; smoke-test /api/health and a logged-in /explore page.
- Env updates: any new env var must be added to the Railway environment before deploy, otherwise Zod validation in env.js aborts the build with Invalid environment variables.
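The env-update rule above can be checked before a deploy rather than discovered when the build aborts. A minimal sketch, assuming the list of required names is maintained by hand: `missingEnvVars` is a hypothetical helper, not the Zod schema in env.js, and the variable names and values in the example are only illustrative (they appear elsewhere in this runbook, but whether env.js marks them required is an assumption).

```typescript
// Hypothetical pre-deploy check: which required variables are missing or blank?
type Env = Record<string, string | undefined>;

function missingEnvVars(required: readonly string[], env: Env): string[] {
  return required.filter((name) => {
    const value = env[name];
    // Treat undefined and whitespace-only values as missing.
    return value === undefined || value.trim() === "";
  });
}

// Illustrative names only; the authoritative list is the schema in env.js.
const required = ["CLERK_WEBHOOK_SECRET", "CHARGE_BEE_API_KEY", "BRISKET_STAGE"];
const railwayEnv: Env = {
  CLERK_WEBHOOK_SECRET: "whsec_example",
  BRISKET_STAGE: "staging",
};

const missing = missingEnvVars(required, railwayEnv);
console.log(missing.join(", ")); // prints "CHARGE_BEE_API_KEY"
```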
Rollback
Rollback options, in order of preference:
- Railway “Redeploy previous”: pick the prior healthy revision in the Railway UI. Fastest; no code changes.
- Git revert: git revert <sha> on release, push, let Railway redeploy. Use when the bad change must be undone on the branch itself. The revert adds a new commit; the original commit stays in history.
- Hotfix branch: branch from the last known-good commit on release, cherry-pick the fix, force-merge with maintainer approval. Reserve for urgent partial fixes.
Confirm rollback by tailing Railway logs and validating /api/health plus a tRPC round-trip (subscription.getCountryCode is cheap and protected — good canary).
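Choosing the revision for "Redeploy previous" amounts to finding the most recent healthy deployment older than the current one. A minimal sketch of that selection, assuming a hand-built list of deployments: the `Deployment` shape and `pickRollbackTarget` are hypothetical and do not mirror the Railway API or UI.

```typescript
// Hypothetical deployment record; this is not the Railway API shape.
interface Deployment {
  id: string;
  createdAt: number; // epoch milliseconds
  healthy: boolean; // e.g. /api/health passed while this revision was live
}

// Returns the most recent healthy deployment older than the current one, if any.
function pickRollbackTarget(
  deployments: readonly Deployment[],
  currentId: string,
): Deployment | undefined {
  const current = deployments.find((d) => d.id === currentId);
  if (!current) return undefined;
  return deployments
    .filter((d) => d.healthy && d.createdAt < current.createdAt)
    .sort((a, b) => b.createdAt - a.createdAt)[0]; // newest first
}
```

Skipping unhealthy revisions matters when two bad deploys land back to back: the immediately previous revision is not always the safe one.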
Build / Cache Issues
Symptoms and fixes:
- Invalid environment variables at build time → variable missing in Railway. Add it; redeploy. To unblock a hotfix, set SKIP_ENV_VALIDATION=1 on the build (Docker/Railway) but never run runtime with it set.
- Turbopack Module not found after dependency bump → stale .next/ or pnpm overrides drift. Locally: rm -rf apps/brisket/.next apps/brisket/node_modules && pnpm install. On Railway: clear build cache via the service settings.
- React 19 type errors after dep bump → check package.json pnpm.overrides block for forced versions. Realign with the override.
- OOM during next build → Turbopack memory spike. Bump Railway build container memory or add NODE_OPTIONS=--max-old-space-size=4096 to the build env.
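The symptom-to-fix mapping above can be applied mechanically to a pasted build log. A minimal sketch under stated assumptions: `diagnoseBuildLog` and the rule table are hypothetical, and the out-of-memory pattern matches Node's "JavaScript heap out of memory" message, which will not appear if the container is killed before Node can report it. React 19 type errors are left out because this runbook does not give a stable message for them.

```typescript
interface Diagnosis {
  symptom: string;
  fix: string;
}

// Hypothetical rule table derived from the list above.
const RULES: readonly { pattern: RegExp; diagnosis: Diagnosis }[] = [
  {
    pattern: /Invalid environment variables/,
    diagnosis: {
      symptom: "env validation failed",
      fix: "Add the missing variable in Railway, then redeploy.",
    },
  },
  {
    pattern: /Module not found/,
    diagnosis: {
      symptom: "stale .next/ or pnpm overrides drift",
      fix: "Clear the build cache (locally: remove .next and node_modules, then pnpm install).",
    },
  },
  {
    pattern: /JavaScript heap out of memory/,
    diagnosis: {
      symptom: "out of memory during next build",
      fix: "Raise build memory or set NODE_OPTIONS=--max-old-space-size=4096.",
    },
  },
];

// Returns every diagnosis whose pattern appears anywhere in the log text.
function diagnoseBuildLog(log: string): Diagnosis[] {
  return RULES.filter((rule) => rule.pattern.test(log)).map((rule) => rule.diagnosis);
}
```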
ISR / Data Cache Flush
Brisket’s only known long-TTL fetch is the Strapi FAQ section in server/api/routers/faq.ts with next: { revalidate: 3600 }.
To flush:
- Per-request: deploy to bump the build id; all data caches are scoped per build.
- Targeted: implement revalidateTag / revalidatePath in a Route Handler that a Strapi webhook calls (an external webhook needs an HTTP endpoint; it cannot invoke a server action directly). No such handler exists today (verified: no revalidateTag / revalidatePath calls and no Strapi webhook route under apps/brisket/src); FAQ updates currently wait up to one hour.
- Browser SW / static assets: bump the deploy. Next.js writes content-hashed asset URLs, so users pick up new bundles after their next navigation.
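Since no targeted flush exists yet, the decision logic such a handler would need can be sketched independently of Next.js. Everything here is an assumption for illustration: the `faq` model name, the `faq` cache tag (the fetch in faq.ts would also need that tag added next to revalidate: 3600), the shared-secret check, and the helper names. The `revalidate` callback is injected; in a real handler it would be backed by `revalidateTag` from `next/cache`, whose exact signature should be confirmed for the Next.js version in use.

```typescript
// Hypothetical subset of a Strapi webhook payload.
interface StrapiWebhookPayload {
  event: string; // e.g. "entry.update", "entry.publish"
  model: string; // content type that changed
}

type Revalidate = (tag: string) => void;

const FAQ_TAG = "faq"; // hypothetical tag; must match the tag on the FAQ fetch

// Decides whether a webhook call should flush the FAQ cache entry.
function handleStrapiWebhook(
  payload: StrapiWebhookPayload,
  providedSecret: string | null,
  expectedSecret: string,
  revalidate: Revalidate,
): { status: number; revalidated: boolean } {
  // Reject calls without the shared secret. A production handler would use a
  // constant-time comparison instead of !==.
  if (!expectedSecret || providedSecret !== expectedSecret) {
    return { status: 401, revalidated: false };
  }
  const isFaqChange = payload.model === "faq" && payload.event.startsWith("entry.");
  if (!isFaqChange) {
    return { status: 200, revalidated: false };
  }
  revalidate(FAQ_TAG);
  return { status: 200, revalidated: true };
}
```

Keeping the decision separate from the framework call makes the handler testable without a running Next.js server.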
Common Incidents
Sign-in failures across the board
- Check Clerk dashboard health and verify the publishable + secret key pair points to the same Clerk instance as the deploy stage (brisket-env).
- Confirm /api/webhook/clerk is reachable: the Clerk webhook page shows recent successful deliveries.
- Inspect middleware redirects in Axiom (http.target=/sign-in) for loops.
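One cheap part of the key-pair check can be automated: Clerk publishable and secret keys carry pk_test_ / pk_live_ and sk_test_ / sk_live_ prefixes, so a test/live mix-up is visible from the strings alone. A minimal sketch with hypothetical helper names; it only catches a development/production mismatch and cannot prove that two keys of the same mode belong to the same Clerk instance.

```typescript
type ClerkMode = "test" | "live";

// Extracts the mode from a pk_/sk_ prefixed key, or undefined if unrecognized.
function clerkKeyMode(key: string): ClerkMode | undefined {
  const match = /^(?:pk|sk)_(test|live)_/.exec(key);
  return match ? (match[1] as ClerkMode) : undefined;
}

// True only when a publishable key and a secret key share the same mode.
function clerkKeyModesMatch(publishableKey: string, secretKey: string): boolean {
  if (!publishableKey.startsWith("pk_") || !secretKey.startsWith("sk_")) {
    return false; // keys swapped or malformed
  }
  const mode = clerkKeyMode(publishableKey);
  return mode !== undefined && mode === clerkKeyMode(secretKey);
}
```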
Checkout broken
- Tail tRPC spans subscription.createPrimerCheckout in Axiom. Look for Security check failed (Turnstile) or downstream sirloin Connect errors.
- Confirm CHARGE_BEE_API_KEY and NEXT_PUBLIC_CHARGE_BEE_* align (publishable + site slug + server key must be the same Chargebee account).
- Verify Primer status if errors mention dunning / payment intent.
- See brisket-billing-flow for the full sequence.
Sirloin upstream flapping
Symptoms: bursty code: 14 (Unavailable) or code: 4 (DeadlineExceeded) on most procedures. The Connect transport refreshes itself every 30 min (server/api/sirloin-api.ts), but a sirloin-side outage requires escalation to the sirloin on-call. Brisket has no fallback path.
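Telling an upstream outage from scattered errors comes down to how large a share of recent calls failed with code 14 or code 4. A minimal sketch of that triage rule: the function name, the 0.5 default threshold, and the convention that 0 stands for a successful call are all assumptions of this example, not values taken from Brisket or sirloin.

```typescript
// Connect error codes named in the symptoms above.
const UNAVAILABLE = 14;
const DEADLINE_EXCEEDED = 4;

// codes: one entry per recent call; 0 marks success (assumption of this sketch).
// Returns true when the share of upstream-style failures reaches the threshold.
function looksLikeUpstreamOutage(codes: readonly number[], threshold = 0.5): boolean {
  if (codes.length === 0) return false;
  const upstreamFailures = codes.filter(
    (code) => code === UNAVAILABLE || code === DEADLINE_EXCEEDED,
  ).length;
  return upstreamFailures / codes.length >= threshold;
}
```

For example, a window of [14, 14, 4, 0] has three upstream failures out of four calls (0.75), which crosses the default threshold and points to escalation rather than a Brisket-side fix.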
Webhook signature failures
POST /api/webhook/clerk returning 400 Invalid signature repeatedly: check CLERK_WEBHOOK_SECRET matches the value in the Clerk dashboard’s webhook config for that endpoint URL; rotating Clerk keys requires updating both sides simultaneously.
Build green, runtime broken
Indicates env var present at build but absent or different at runtime. Check Railway service vs build environment scoping. Compare BRISKET_STAGE between build and runtime.
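Comparing the two environments is a plain diff over a list of variable names. A minimal sketch, assuming both environments have been captured as key/value snapshots: `envDrift` and the snapshot type are hypothetical, and for secrets it would be safer to compare hashes or presence flags than raw values.

```typescript
type EnvSnapshot = Record<string, string | undefined>;

interface EnvDrift {
  name: string;
  build: string | undefined;
  runtime: string | undefined;
}

// Lists every named variable whose value differs between build and runtime,
// including variables that exist on only one side.
function envDrift(
  names: readonly string[],
  build: EnvSnapshot,
  runtime: EnvSnapshot,
): EnvDrift[] {
  return names
    .filter((name) => build[name] !== runtime[name])
    .map((name) => ({ name, build: build[name], runtime: runtime[name] }));
}
```

Starting the comparison with BRISKET_STAGE follows the check described above; any other name passed in is reported the same way.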
Reference
- CI: .github/workflows/brisket.yml
- Build: next build --turbopack (apps/brisket/package.json)
- Local: see brisket-local-dev
- Env: see brisket-env
- Errors: see brisket-errors
- Operations anchor: operations/observability, operations/testing-strategy