Brain Inference Providers

Brain delegates image and video generation to external inference providers. Each provider has one or more adapter classes that handle scheduling async tasks, polling for completion, and mapping provider-specific responses into a common status model.

Provider inventory

The InferenceProvider enum (apps/brain/src/modules/domain/generation/types/common.ts) defines all recognized providers:

Provider	Enum value	Image edit	Image sequence	Image-to-video	Text-to-video
AtlasCloud	`ATLAS`	Seedream V4.5 edit	Seedream V4.5 edit-sequential	Seedance 2.0	Seedance 2.0
WaveSpeed	`WAVESPEED`	Seedream V4.5 edit	Seedream V4.5 edit-sequential	Custom i2v	Seedance 2.0
RunPod	`RUNPOD`	Seedream V4.5 edit	—	—	—
FAL	`FAL`	—	—	Dedicated adapter	—

Not every provider supports every generation type. Executors silently skip providers that have no adapter for the requested operation.

Adapter architecture

All provider adapters extend InferenceAdapter (apps/brain/src/modules/domain/generation/adapters/inference-adapter.base.ts). The base class provides:

Input file resolution via pluggable strategies (SignedUrl, PublicUrl, Preupload) for converting internal storage paths to URLs the provider can fetch.
Adapter identity tracking for the generation dashboard, recording which adapter, model URL, and payload were used.
Storage access for downloading provider outputs and uploading them to R2.

Each provider has its own base adapter that adds provider-specific auth, HTTP client setup, and status mapping:

Module path	Base adapter	Concrete adapters
`apps/brain/src/modules/application/atlas/`	`AtlasAdapter`	`AtlasSeedreamV4EditAdapter`, `AtlasSeedreamV4EditSequentialAdapter`, `AtlasImageToVideoAdapter`, `AtlasTextToVideoAdapter`
`apps/brain/src/modules/application/wavespeed/`	`WaveSpeedAdapter`	`WaveSpeedSeedreamV4EditAdapter`, `WaveSpeedSeedreamV4EditSequentialAdapter`, `WaveSpeedImageToVideoAdapter`, `WaveSpeedTextToVideoAdapter`, `WaveSpeedFlux2KleinEditAdapter`, `WaveSpeedFlux2KleinEditLoraAdapter`
`apps/brain/src/modules/application/runpod/`	`RunPodAdapter`	`RunPodSeedreamV4EditAdapter`, `RunPodImageToVideoAdapter`

All adapters follow the same two-phase pattern: schedule (POST a generation request, receive a task ID) then poll (GET the task status until completed or failed). The poll response is normalized to AsyncTaskStatus values (QUEUED, PROCESSING, COMPLETED, FAILED). Content moderation failures from providers are mapped to ContentModerationException.
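The two-phase pattern can be sketched as follows. This is a minimal illustration, not the InferenceAdapter API: the TwoPhaseAdapter interface, the provider status strings in STATUS_MAP, the poll budget, and the choice to treat unknown statuses and timeouts as FAILED are all assumptions made for the example.

```typescript
// Common status model named in the text.
type AsyncTaskStatus = "QUEUED" | "PROCESSING" | "COMPLETED" | "FAILED";

// Hypothetical provider status strings; each real provider has its own vocabulary.
const STATUS_MAP = new Map<string, AsyncTaskStatus>([
  ["created", "QUEUED"],
  ["pending", "QUEUED"],
  ["running", "PROCESSING"],
  ["succeeded", "COMPLETED"],
  ["error", "FAILED"],
]);

// Unknown statuses are treated as FAILED in this sketch (an assumption).
function normalizeStatus(raw: string): AsyncTaskStatus {
  return STATUS_MAP.get(raw.toLowerCase()) ?? "FAILED";
}

// Hypothetical minimal adapter surface, not the real base class.
interface TwoPhaseAdapter {
  schedule(payload: unknown): Promise<string>; // resolves to a task ID
  poll(taskId: string): Promise<string>; // resolves to a raw provider status
}

// Schedule once, then poll until a terminal status or until the poll budget runs out.
async function runToCompletion(
  adapter: TwoPhaseAdapter,
  payload: unknown,
  maxPolls = 60,
  delayMs = 1000,
): Promise<AsyncTaskStatus> {
  const taskId = await adapter.schedule(payload);
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const status = normalizeStatus(await adapter.poll(taskId));
    if (status === "COMPLETED" || status === "FAILED") return status;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return "FAILED"; // poll budget exhausted; treated as a failure here
}
```

Keeping the status mapping in one pure function means each provider base adapter only has to supply its own vocabulary, while the polling loop stays identical across providers.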

Executor fallback chains

Each generation type has a dedicated executor. Executors with a fallback chain iterate an ordered list of provider attempts: when a provider returns FAILED at schedule time, the executor logs a warning and moves to the next provider. Thrown errors (network failures, content moderation) abort the chain immediately instead of falling through to the next provider.
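A rough sketch of that fallback behavior, under stated assumptions: ProviderAttempt, ScheduleResult, ScheduledTask, and runFallbackChain are hypothetical names, the provider enum is modeled as a string union, and returning null when every provider fails is a placeholder because the text does not say how the real executors report that case.

```typescript
type InferenceProvider = "ATLAS" | "WAVESPEED" | "RUNPOD" | "FAL";

// Hypothetical schedule outcome: a task was queued, or the provider reported FAILED.
type ScheduleResult =
  | { status: "QUEUED"; taskId: string }
  | { status: "FAILED" };

interface ProviderAttempt {
  provider: InferenceProvider;
  schedule(): Promise<ScheduleResult>;
}

interface ScheduledTask {
  provider: InferenceProvider;
  taskId: string;
}

async function runFallbackChain(
  attempts: ProviderAttempt[],
  warn: (message: string) => void = console.warn,
): Promise<ScheduledTask | null> {
  for (const attempt of attempts) {
    // No try/catch on purpose: a thrown error propagates and aborts the chain.
    const result = await attempt.schedule();
    if (result.status === "QUEUED") {
      return { provider: attempt.provider, taskId: result.taskId };
    }
    warn(`${attempt.provider} returned FAILED at schedule time; trying next provider`);
  }
  return null; // every provider failed; the real handling is not specified in the text
}
```

The distinction matters for content moderation: because a ContentModerationException is thrown rather than returned as FAILED, a rejected prompt is not retried against the remaining providers.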

Executor	Key	Default chain
`ImageGenerationExecutor`	`image-generation`	ATLAS, WAVESPEED, RUNPOD
`ImageSequenceGenerationExecutor`	`image-sequence-generation`	ATLAS, WAVESPEED
`ImageToVideoExecutor`	—	Determined by `jobData.videoProcessor` (single provider, no fallback chain)
`ReferenceToVideoExecutor`	—	Determined by `jobData.videoProcessor`; supports WAVESPEED and ATLAS

Image and image-sequence executors support configurable chain ordering via the traffic split setting described below. Video executors use the provider specified in the job data and do not participate in traffic splitting.

Traffic split setting

The inference-traffic-split application setting controls provider ordering for image and image-sequence executors. It allows progressive rollout of a new provider without code deploys.

Schema

The setting is stored as a JSON document in the application_settings table with id inference-traffic-split. It is validated by inferenceTrafficSplitDataSchema in apps/brain/src/modules/application/applicationSettings/application-settings.schemas.ts.

{
  "image-generation": [
    { "order": ["ATLAS", "WAVESPEED", "RUNPOD"], "probability": 10 },
    { "order": ["WAVESPEED", "ATLAS", "RUNPOD"], "probability": 90 }
  ],
  "image-sequence-generation": [
    { "order": ["ATLAS", "WAVESPEED"], "probability": 100 },
    { "order": ["WAVESPEED", "ATLAS"], "probability": 0 }
  ]
}

Rules:

Each executor key maps to exactly 2 chains.
Each chain has an order (array of InferenceProvider enum values) and a probability (integer from 0 to 100).
The two probabilities must sum to 100.
Providers listed in a chain's order array that the executor does not recognize are silently skipped at runtime.
If the setting is missing or the executor key is absent, the executor falls back to its hardcoded default chain.
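To make the structural rules concrete, here is a dependency-free sketch of the checks for a single executor key. It is not inferenceTrafficSplitDataSchema, which the text says is a Zod schema; validateChains, KNOWN_PROVIDERS, and the error messages are hypothetical, and the provider list is copied from the inventory table.

```typescript
// Provider names taken from the inventory table; the real source of truth is the enum.
const KNOWN_PROVIDERS = new Set<string>(["ATLAS", "WAVESPEED", "RUNPOD", "FAL"]);

// Returns the rule violations for one executor key; an empty list means valid.
function validateChains(chains: unknown): string[] {
  if (!Array.isArray(chains) || chains.length !== 2) {
    return ["expected exactly 2 chains"];
  }
  const errors: string[] = [];
  let total = 0;
  chains.forEach((chain: unknown, index: number) => {
    const { order, probability } = (chain ?? {}) as {
      order?: unknown;
      probability?: unknown;
    };
    const orderOk =
      Array.isArray(order) &&
      order.every((p: unknown) => typeof p === "string" && KNOWN_PROVIDERS.has(p));
    if (!orderOk) {
      errors.push(`chain ${index}: order must be an array of known providers`);
    }
    if (
      typeof probability === "number" &&
      Number.isInteger(probability) &&
      probability >= 0 &&
      probability <= 100
    ) {
      total += probability;
    } else {
      errors.push(`chain ${index}: probability must be an integer from 0 to 100`);
    }
  });
  // Only check the sum once both chains are individually well formed.
  if (errors.length === 0 && total !== 100) {
    errors.push(`probabilities sum to ${total}, expected 100`);
  }
  return errors;
}
```

Note that a 100/0 split, as in the image-sequence-generation example above, passes these checks: the second chain is present but never selected.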

How it works

flowchart TD
    Execute["Executor.execute()"]
    ReadSetting["getSettingsEntry('inference-traffic-split')"]
    HasKey{Executor key exists?}
    DefaultChain["Use default chain"]
    Roll["Math.random() * 100"]
    ChainA["Chain A"]
    ChainB["Chain B"]
    BuildAttempts["Build attempts from selected chain, skip unknown providers"]
    Iterate["Try providers in order until success"]

    Execute --> ReadSetting
    ReadSetting --> HasKey
    HasKey -->|no| DefaultChain
    HasKey -->|yes| Roll
    Roll -->|"roll < A.probability"| ChainA
    Roll -->|else| ChainB
    ChainA --> BuildAttempts
    ChainB --> BuildAttempts
    DefaultChain --> BuildAttempts
    BuildAttempts --> Iterate

Each executor maps providers to attempt factories. The selected chain’s order determines which factories are instantiated and in what sequence. The executor logs the selected chain and probability for observability.
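The selection and attempt-building steps in the flowchart can be approximated as below. The names selectOrder and buildAttempts, the injectable random parameter, and the generic factory shape are assumptions for illustration; only the roll comparison (roll < A.probability) and the skipping of unknown providers come from the text.

```typescript
type InferenceProvider = "ATLAS" | "WAVESPEED" | "RUNPOD" | "FAL";

interface Chain {
  order: InferenceProvider[];
  probability: number; // integer from 0 to 100
}

// Pick chain A when the roll falls below A's probability, otherwise chain B.
// With no configured chains, fall back to the executor's default order.
function selectOrder(
  chains: readonly [Chain, Chain] | undefined,
  defaultOrder: InferenceProvider[],
  random: () => number = Math.random, // injectable so the choice can be tested
): InferenceProvider[] {
  if (!chains) return defaultOrder;
  const [chainA, chainB] = chains;
  const roll = random() * 100; // in [0, 100)
  return roll < chainA.probability ? chainA.order : chainB.order;
}

// Instantiate attempts in chain order, skipping providers the executor has no factory for.
function buildAttempts<T>(
  order: InferenceProvider[],
  factories: Partial<Record<InferenceProvider, () => T>>,
): T[] {
  const attempts: T[] = [];
  for (const provider of order) {
    const factory = factories[provider];
    if (factory) attempts.push(factory());
  }
  return attempts;
}
```

Because Math.random() returns a value in [0, 1), the roll is always below 100, so a chain with probability 100 is always selected and a chain with probability 0 never is.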

Administration

Update the setting via the Brain REST API:

PUT /application-settings/inference-traffic-split
Authorization: Bearer <clerk-token>
Content-Type: application/json

{
  "data": {
    "image-generation": [
      { "order": ["ATLAS", "WAVESPEED", "RUNPOD"], "probability": 50 },
      { "order": ["WAVESPEED", "ATLAS", "RUNPOD"], "probability": 50 }
    ],
    "image-sequence-generation": [
      { "order": ["ATLAS", "WAVESPEED"], "probability": 50 },
      { "order": ["WAVESPEED", "ATLAS"], "probability": 50 }
    ]
  }
}

The endpoint is Clerk-authenticated and role-gated. Zod validation rejects payloads with invalid provider names, wrong chain count, or probabilities not summing to 100.

Default seed

The migration at apps/brain/prisma/migrations/20260519120000_seed_inference_traffic_split/migration.sql seeds the setting with 100% Atlas-first, preserving the behavior established before the traffic split was introduced.

Adding a new provider

Add the enum value to InferenceProvider in apps/brain/src/modules/domain/generation/types/common.ts.
Create a provider module under apps/brain/src/modules/application/<provider>/ with a base adapter and concrete adapters for each supported generation type.
Register the module in apps/brain/src/modules/domain/generation/generation.module.ts.
Add entries to providerAttemptFactories in the relevant executors.
Update the inference-traffic-split setting to include the new provider in chain orders.
Add the provider API key to apps/brain/.env.example and document it in Brain Environment Variables.