Brain Inference Providers
Brain delegates image and video generation to external inference providers. Each provider has one or more adapter classes that handle scheduling async tasks, polling for completion, and mapping provider-specific responses into a common status model.
Provider inventory
The InferenceProvider enum (apps/brain/src/modules/domain/generation/types/common.ts) defines all recognized providers:
| Provider | Enum value | Image edit | Image sequence | Image-to-video | Text-to-video |
|---|---|---|---|---|---|
| AtlasCloud | ATLAS | Seedream V4.5 edit | Seedream V4.5 edit-sequential | Seedance 2.0 | Seedance 2.0 |
| WaveSpeed | WAVESPEED | Seedream V4.5 edit | Seedream V4.5 edit-sequential | Custom i2v | Seedance 2.0 |
| RunPod | RUNPOD | Seedream V4.5 edit | — | — | — |
| FAL | FAL | — | — | Dedicated adapter | — |
Not every provider supports every generation type. Executors silently skip providers that have no adapter for the requested operation.
Adapter architecture
All provider adapters extend InferenceAdapter (apps/brain/src/modules/domain/generation/adapters/inference-adapter.base.ts). The base class provides:
- Input file resolution via pluggable strategies (SignedUrl, PublicUrl, Preupload) for converting internal storage paths to URLs the provider can fetch.
- Adapter identity tracking for the generation dashboard, recording which adapter, model URL, and payload were used.
- Storage access for downloading provider outputs and uploading them to R2.
Each provider has its own base adapter that adds provider-specific auth, HTTP client setup, and status mapping:
| Module path | Base adapter | Concrete adapters |
|---|---|---|
| apps/brain/src/modules/application/atlas/ | AtlasAdapter | AtlasSeedreamV4EditAdapter, AtlasSeedreamV4EditSequentialAdapter, AtlasImageToVideoAdapter, AtlasTextToVideoAdapter |
| apps/brain/src/modules/application/wavespeed/ | WaveSpeedAdapter | WaveSpeedSeedreamV4EditAdapter, WaveSpeedSeedreamV4EditSequentialAdapter, WaveSpeedImageToVideoAdapter, WaveSpeedTextToVideoAdapter, WaveSpeedFlux2KleinEditAdapter, WaveSpeedFlux2KleinEditLoraAdapter |
| apps/brain/src/modules/application/runpod/ | RunPodAdapter | RunPodSeedreamV4EditAdapter, RunPodImageToVideoAdapter |
All adapters follow the same two-phase pattern: schedule (POST a generation request, receive a task ID) then poll (GET the task status until completed or failed). The poll response is normalized to AsyncTaskStatus values (QUEUED, PROCESSING, COMPLETED, FAILED). Content moderation failures from providers are mapped to ContentModerationException.
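The two-phase pattern can be sketched roughly as follows. This is a minimal illustration, not the actual InferenceAdapter API: the SketchAdapter interface, the raw status strings (pending, running, succeeded, moderated), the ContentModerationError class, and the maxPolls limit are all assumptions made for the example, and the calls are synchronous only to keep it short.

```typescript
// Status values named in the text; everything else here is illustrative.
type AsyncTaskStatus = "QUEUED" | "PROCESSING" | "COMPLETED" | "FAILED";

// Hypothetical stand-in for ContentModerationException.
class ContentModerationError extends Error {}

// Maps hypothetical raw provider statuses to the common model.
// Real providers use their own vocabularies.
function normalizeStatus(raw: string): AsyncTaskStatus {
  switch (raw) {
    case "pending":
      return "QUEUED";
    case "running":
      return "PROCESSING";
    case "succeeded":
      return "COMPLETED";
    case "moderated":
      // Moderation is surfaced as an exception, not as a FAILED status.
      throw new ContentModerationError("provider rejected the content");
    default:
      // Assumption: unrecognized values are treated as failures.
      return "FAILED";
  }
}

// Hypothetical adapter surface; a real adapter would be asynchronous.
interface SketchAdapter {
  schedule(prompt: string): string; // phase 1: returns a task ID
  poll(taskId: string): string; // phase 2: returns the raw provider status
}

// Schedules once, then polls until a terminal status or maxPolls is reached.
function runTask(adapter: SketchAdapter, prompt: string, maxPolls: number): AsyncTaskStatus {
  const taskId = adapter.schedule(prompt);
  let status: AsyncTaskStatus = "QUEUED";
  for (let i = 0; i < maxPolls; i++) {
    status = normalizeStatus(adapter.poll(taskId));
    if (status === "COMPLETED" || status === "FAILED") {
      return status;
    }
  }
  return status; // still QUEUED or PROCESSING after maxPolls polls
}

// Fake adapter that replays a scripted list of raw statuses.
function fakeAdapter(script: string[]): SketchAdapter {
  let i = 0;
  return {
    schedule: () => "task-1",
    poll: () => script[Math.min(i++, script.length - 1)],
  };
}
```

Calling runTask(fakeAdapter(["pending", "running", "succeeded"]), "a cat", 10) walks through both phases and ends on COMPLETED. A real implementation would also wait between polls.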
Executor fallback chains
Each generation type has a dedicated executor that iterates an ordered list of provider attempts. When a provider returns FAILED at schedule time, the executor logs a warning and moves to the next provider. Thrown errors (network failures, content moderation) abort the chain immediately.
| Executor | Key | Default chain |
|---|---|---|
| ImageGenerationExecutor | image-generation | ATLAS, WAVESPEED, RUNPOD |
| ImageSequenceGenerationExecutor | image-sequence-generation | ATLAS, WAVESPEED |
| ImageToVideoExecutor | — | Determined by jobData.videoProcessor (single provider, no fallback chain) |
| ReferenceToVideoExecutor | — | Determined by jobData.videoProcessor; supports WAVESPEED and ATLAS |
Image and image-sequence executors support configurable chain ordering via the traffic split setting described below. Video executors use the provider specified in the job data and do not participate in traffic splitting.
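The fallback rule for the image executors can be sketched as below. The ProviderAttempt and ScheduleResult shapes and the runChain function are hypothetical names for illustration, the calls are synchronous for brevity, and returning null when every provider fails is an assumption, since this page does not describe that case.

```typescript
type Provider = "ATLAS" | "WAVESPEED" | "RUNPOD";
type ScheduleStatus = "QUEUED" | "FAILED";

interface ScheduleResult {
  provider: Provider;
  status: ScheduleStatus;
  taskId?: string;
}

// Hypothetical attempt shape; real attempts would schedule asynchronously.
interface ProviderAttempt {
  provider: Provider;
  schedule(): ScheduleResult;
}

// Tries each attempt in order and returns the first result that is not FAILED.
function runChain(
  attempts: ProviderAttempt[],
  warn: (message: string) => void
): ScheduleResult | null {
  for (const attempt of attempts) {
    // A thrown error is deliberately not caught here, so it aborts the chain.
    const result = attempt.schedule();
    if (result.status === "FAILED") {
      warn(attempt.provider + " failed at schedule time, trying next provider");
      continue;
    }
    return result;
  }
  // Assumption: the behavior when every provider fails is not documented here.
  return null;
}
```

The key design point is the asymmetry: a FAILED status is a soft failure that advances the chain, while an exception (network failure, content moderation) propagates to the caller without trying further providers.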
Traffic split setting
The inference-traffic-split application setting controls provider ordering for image and image-sequence executors. It allows progressive rollout of a new provider without code deploys.
Schema
Stored as a JSON document in the application_settings table with id inference-traffic-split. Validated by inferenceTrafficSplitDataSchema in apps/brain/src/modules/application/applicationSettings/application-settings.schemas.ts.

    {
      "image-generation": [
        { "order": ["ATLAS", "WAVESPEED", "RUNPOD"], "probability": 10 },
        { "order": ["WAVESPEED", "ATLAS", "RUNPOD"], "probability": 90 }
      ],
      "image-sequence-generation": [
        { "order": ["ATLAS", "WAVESPEED"], "probability": 100 },
        { "order": ["WAVESPEED", "ATLAS"], "probability": 0 }
      ]
    }

Rules:
- Each executor key maps to exactly 2 chains.
- Each chain has an order (array of InferenceProvider enum values) and a probability (integer from 0 to 100).
- The two probabilities must sum to 100.
- Providers listed in a chain's order that the executor does not recognize are silently skipped at runtime.
- If the setting is missing or the executor key is absent, the executor falls back to its hardcoded default chain.
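The structural rules above can be checked with a few lines of plain TypeScript. This is a hand-written sketch that mirrors the rules, not the inferenceTrafficSplitDataSchema Zod schema itself; the validateTrafficSplit name, the error messages, and the hardcoded PROVIDERS list (taken from the provider inventory table) are assumptions.

```typescript
// Enum values from the provider inventory table.
const PROVIDERS = ["ATLAS", "WAVESPEED", "RUNPOD", "FAL"];

interface Chain {
  order: string[];
  probability: number;
}

type TrafficSplit = { [executorKey: string]: Chain[] };

// Returns a list of human-readable problems; an empty list means the data is valid.
function validateTrafficSplit(data: TrafficSplit): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(data)) {
    const chains = data[key];
    if (chains.length !== 2) {
      errors.push(key + ": expected exactly 2 chains, got " + chains.length);
      continue;
    }
    let total = 0;
    for (const chain of chains) {
      const p = chain.probability;
      if (Math.floor(p) !== p || p < 0 || p > 100) {
        errors.push(key + ": probability must be an integer from 0 to 100");
      }
      for (const name of chain.order) {
        if (PROVIDERS.indexOf(name) === -1) {
          errors.push(key + ": unknown provider " + name);
        }
      }
      total += p;
    }
    if (total !== 100) {
      errors.push(key + ": probabilities must sum to 100, got " + total);
    }
  }
  return errors;
}
```

Note that this check only covers what is rejected up front. A valid enum value that a given executor has no adapter for (for example FAL in an image chain) passes validation and is skipped at runtime instead.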
How it works

    flowchart TD
      Execute["Executor.execute()"]
      ReadSetting["getSettingsEntry('inference-traffic-split')"]
      HasKey{Executor key exists?}
      DefaultChain["Use default chain"]
      Roll["Math.random() * 100"]
      ChainA["Chain A"]
      ChainB["Chain B"]
      BuildAttempts["Build attempts from selected chain, skip unknown providers"]
      Iterate["Try providers in order until success"]

      Execute --> ReadSetting
      ReadSetting --> HasKey
      HasKey -->|no| DefaultChain
      HasKey -->|yes| Roll
      Roll -->|"roll < A.probability"| ChainA
      Roll -->|else| ChainB
      ChainA --> BuildAttempts
      ChainB --> BuildAttempts
      DefaultChain --> BuildAttempts
      BuildAttempts --> Iterate

Each executor maps providers to attempt factories. The selected chain's order determines which factories are instantiated and in what sequence. The executor logs the selected chain and probability for observability.
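The selection flow can be sketched as two small functions. The names resolveOrder, buildAttempts, and AttemptFactory are hypothetical, the random source is injected so the roll can be controlled in tests, and attempts are reduced to plain strings for brevity.

```typescript
type Provider = "ATLAS" | "WAVESPEED" | "RUNPOD" | "FAL";

interface Chain {
  order: Provider[];
  probability: number;
}

type TrafficSplit = { [executorKey: string]: Chain[] | undefined };

// Picks chain A when roll < A.probability, otherwise chain B.
// Falls back to defaultOrder when the setting or the executor key is missing.
function resolveOrder(
  setting: TrafficSplit | undefined,
  executorKey: string,
  defaultOrder: Provider[],
  random: () => number
): Provider[] {
  const chains = setting ? setting[executorKey] : undefined;
  if (!chains) {
    return defaultOrder;
  }
  const roll = random() * 100;
  return (roll < chains[0].probability ? chains[0] : chains[1]).order;
}

// Hypothetical factory type; here an "attempt" is just a label.
type AttemptFactory = () => string;

// Instantiates attempts in chain order, silently skipping providers
// for which this executor has no factory.
function buildAttempts(
  order: Provider[],
  factories: Partial<Record<Provider, AttemptFactory>>
): string[] {
  const attempts: string[] = [];
  for (const provider of order) {
    const factory = factories[provider];
    if (factory === undefined) {
      continue;
    }
    attempts.push(factory());
  }
  return attempts;
}
```

In production the random argument would simply be Math.random. With probabilities of 10 and 90, roughly one request in ten takes chain A, and a probability of 0 for chain A means it is never selected because the roll is never negative.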
Administration
Update the setting via the Brain REST API:

    PUT /application-settings/inference-traffic-split
    Authorization: Bearer <clerk-token>
    Content-Type: application/json

    {
      "data": {
        "image-generation": [
          { "order": ["ATLAS", "WAVESPEED", "RUNPOD"], "probability": 50 },
          { "order": ["WAVESPEED", "ATLAS", "RUNPOD"], "probability": 50 }
        ],
        "image-sequence-generation": [
          { "order": ["ATLAS", "WAVESPEED"], "probability": 50 },
          { "order": ["WAVESPEED", "ATLAS"], "probability": 50 }
        ]
      }
    }

The endpoint is Clerk-authenticated and role-gated. Zod validation rejects payloads with invalid provider names, wrong chain count, or probabilities not summing to 100.
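For scripted updates, the request can be assembled with a small helper like the one below. This is a sketch under assumptions: buildUpdateRequest and SketchRequest are hypothetical names, and the base URL is whatever host your Brain instance is served from. Only the path, the headers, and the data wrapper come from the request shown above.

```typescript
interface Chain {
  order: string[];
  probability: number;
}

type TrafficSplit = { [executorKey: string]: Chain[] };

// Plain description of the HTTP request; hand it to any HTTP client.
interface SketchRequest {
  url: string;
  method: string;
  headers: { [name: string]: string };
  body: string;
}

function buildUpdateRequest(
  baseUrl: string, // assumption: host of the Brain API, not given on this page
  clerkToken: string,
  split: TrafficSplit
): SketchRequest {
  return {
    // Trim trailing slashes so the path is joined with exactly one slash.
    url: baseUrl.replace(/\/+$/, "") + "/application-settings/inference-traffic-split",
    method: "PUT",
    headers: {
      Authorization: "Bearer " + clerkToken,
      "Content-Type": "application/json",
    },
    // The endpoint expects the setting wrapped in a top-level "data" key.
    body: JSON.stringify({ data: split }),
  };
}
```

The resulting object maps directly onto fetch(request.url, { method, headers, body }) or an equivalent client call.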
Default seed
The migration at apps/brain/prisma/migrations/20260519120000_seed_inference_traffic_split/migration.sql seeds the setting with 100% Atlas-first, preserving the behavior established before the traffic split was introduced.
Adding a new provider
- Add the enum value to InferenceProvider in apps/brain/src/modules/domain/generation/types/common.ts.
- Create a provider module under apps/brain/src/modules/application/<provider>/ with a base adapter and concrete adapters for each supported generation type.
- Register the module in apps/brain/src/modules/domain/generation/generation.module.ts.
- Add entries to providerAttemptFactories in the relevant executors.
- Update the inference-traffic-split setting to include the new provider in chain orders.
- Add the provider API key to apps/brain/.env.example and document it in Brain Environment Variables.