Flank On-Call

This page is the on-call quick-reference for flank. For step-by-step procedures see flank-runbook. For error semantics see flank-errors.

What flank is and isn’t

  • Is: the visual workflow editor — a React/ReactFlow app on port 3100 where admins author the workflow graphs that brain executes. Stateless; all data lives in brain.
  • Isn’t: the workflow engine, or the source of truth for workflows, executions, adapters, or secrets — those live in brain. Flank does not run executions.

Blame propagation order: editor → flank server functions → brain. If brain is down, flank cannot read or write any workflow data, so the editor may load but workflow lists stay empty and saves fail.

Top alerts

| Alert | Likely cause | First action | Escalate to |
| --- | --- | --- | --- |
| beef-flank Railway healthcheck failing | Process crash or boot failure (port collision, build error) | Open Railway logs; look for server.started. Redeploy if last healthy commit is recent. | flank owner (@law) |
| Editor loads but workflow lists empty / saves fail | Flank can’t reach brain: BRAIN_API_URL wrong, brain down, or JWT rejected | Verify BRAIN_API_URL on beef-flank; check brain health; check Clerk session/ADMIN role. See runbook → “Editor can’t reach brain”. | brain on-call (brain down), platform/auth (Clerk) |
| Brain fetch errors flooding flank logs (brain … failed) | Brain unavailable or returning 5xx | Check brain health and BRAIN_API_URL. | brain on-call |
| Executions stuck or failing | Engine/storage problem in brain; flank only reads traces | Investigate in brain; brain re-enqueues RUNNING executions on boot. | brain on-call |
| 401/403 from brain on workflow operations | Expired Clerk session, missing ADMIN role, or Clerk keys rotated without redeploy | Verify operator has ADMIN in brain; check CLERK_* env on beef-flank. | platform/auth owner |
| 401/Unauthorized flood on UI server functions | Clerk outage or CLERK_SECRET_KEY rotated without redeploy | Check Clerk status; verify env on beef-flank. | platform/auth owner |
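
The brain-facing rows above reduce to a small mapping from the HTTP status flank sees to an escalation target. The sketch below is illustrative only: `triageBrainStatus`, `BrainTriage`, and `EscalationTarget` are hypothetical names, not part of flank or brain, and treating "no response at all" the same as a 5xx is an assumption based on the "brain unavailable" row.

```typescript
// Hypothetical triage helper; none of these names exist in flank or brain.
type EscalationTarget = "brain on-call" | "platform/auth owner" | "none";

interface BrainTriage {
  target: EscalationTarget;
  firstAction: string;
}

// `status` is the HTTP status flank got back from brain, or null when the
// request failed before any response arrived (for example a refused connection).
function triageBrainStatus(status: number | null): BrainTriage {
  // Assumption: an unreachable brain is handled like a 5xx from brain.
  if (status === null || (status >= 500 && status <= 599)) {
    return {
      target: "brain on-call",
      firstAction: "Check brain health and BRAIN_API_URL.",
    };
  }
  if (status === 401 || status === 403) {
    return {
      target: "platform/auth owner",
      firstAction: "Verify operator has ADMIN in brain; check CLERK_* env on beef-flank.",
    };
  }
  // Any other status is not covered by the alert table.
  return {
    target: "none",
    firstAction: "No matching alert; continue with the decision tree.",
  };
}
```

For example, `triageBrainStatus(403).target` evaluates to `"platform/auth owner"`, while `triageBrainStatus(null)` points at brain on-call. A wrong BRAIN_API_URL also produces the `null` case, so verify that value on beef-flank before paging.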

Decision tree

  1. Is brain healthy? If no, defer flank investigation — flank can’t read or write any workflow data without brain.
  2. Is the flank pod up? Check the Railway dashboard. If down, check last deploy and roll back if a recent change correlates.
  3. Can flank reach brain? Check BRAIN_API_URL and tail flank logs for brain … failed fetch errors.
  4. Is auth working? 401/403 from brain → Clerk session or ADMIN-role problem.
  5. Are executions the real issue? If executions are stuck or failing, that’s a brain incident — escalate to brain on-call.
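
The ordering of the five steps matters: each check is only worth making once the earlier ones pass. A minimal sketch of that ordering as a pure function follows. The `TriageSignals` fields and `walkDecisionTree` are hypothetical names chosen for illustration, and the signals are assumed to be gathered by hand (Railway dashboard, flank logs, brain health) rather than by any existing flank API.

```typescript
// Hypothetical encoding of the decision tree; all names are illustrative.
interface TriageSignals {
  brainHealthy: boolean;
  flankUp: boolean;
  flankReachesBrain: boolean;
  authWorking: boolean;
  executionsStuckOrFailing: boolean;
}

interface TriageResult {
  step: number; // decision-tree step that stopped the walk; 0 if none matched
  action: string;
}

// Walks the steps in order and stops at the first failing check.
function walkDecisionTree(s: TriageSignals): TriageResult {
  if (!s.brainHealthy) {
    return { step: 1, action: "Defer flank investigation until brain is healthy." };
  }
  if (!s.flankUp) {
    return { step: 2, action: "Check last deploy; roll back if a recent change correlates." };
  }
  if (!s.flankReachesBrain) {
    return { step: 3, action: "Check BRAIN_API_URL and tail flank logs for brain fetch errors." };
  }
  if (!s.authWorking) {
    return { step: 4, action: "Investigate Clerk session or ADMIN-role problem." };
  }
  if (s.executionsStuckOrFailing) {
    return { step: 5, action: "Brain incident: escalate to brain on-call." };
  }
  return { step: 0, action: "No step matched; see flank-runbook." };
}
```

With brain unhealthy the walk always stops at step 1 regardless of the other signals, which mirrors the instruction to defer flank investigation in that case.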

Escalation

| Domain | Owner | When to page |
| --- | --- | --- |
| Flank editor app, deploy, Railway service | @law | Default flank on-call |
| Brain workflow engine, executions, adapters, secrets | brain on-call | Executions stuck/failing, storage errors, adapter issues |
| Railway platform / networking | platform on-call | Healthcheck, internal DNS, deploy pipeline |
| Clerk / auth | platform/auth owner | UI auth flood, brain 401/403, key rotation |

When you’ve handled an incident, update the runbook with anything missing and bump last_reviewed on this page.