Brain Inference Providers

Brain delegates image and video generation to external inference providers. Each provider has one or more adapter classes that handle scheduling async tasks, polling for completion, and mapping provider-specific responses into a common status model.

Provider inventory

The InferenceProvider enum (apps/brain/src/modules/domain/generation/types/common.ts) defines all recognized providers:

| Provider | Enum value | Image edit | Image sequence | Image-to-video | Text-to-video |
| --- | --- | --- | --- | --- | --- |
| AtlasCloud | ATLAS | Seedream V4.5 edit | Seedream V4.5 edit-sequential | Seedance 2.0 | Seedance 2.0 |
| WaveSpeed | WAVESPEED | Seedream V4.5 edit | Seedream V4.5 edit-sequential | Custom i2v | Seedance 2.0 |
| RunPod | RUNPOD | Seedream V4.5 edit | | | |

FAL (enum value FAL) uses a dedicated adapter.

Not every provider supports every generation type. Executors silently skip providers that have no adapter for the requested operation.

Adapter architecture

All provider adapters extend InferenceAdapter (apps/brain/src/modules/domain/generation/adapters/inference-adapter.base.ts). The base class provides:

  • Input file resolution via pluggable strategies (SignedUrl, PublicUrl, Preupload) for converting internal storage paths to URLs the provider can fetch.
  • Adapter identity tracking for the generation dashboard, recording which adapter, model URL, and payload were used.
  • Storage access for downloading provider outputs and uploading them to R2.
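
The pluggable input-file strategies can be pictured as interchangeable functions that turn a storage path into a fetchable URL. The sketch below is illustrative only: the type `ResolveInputFile`, the three factory functions, the injected `sign`, `download`, and `uploadToProvider` callbacks, and the 900-second TTL are all assumptions, since the actual interface in inference-adapter.base.ts is not shown on this page.

```typescript
// Hypothetical shape: the real strategy interface in InferenceAdapter is not documented here.
type ResolveInputFile = (storagePath: string) => Promise<string>;

// PublicUrl: the object is already publicly readable, so only a base URL is prepended.
function publicUrlStrategy(publicBaseUrl: string): ResolveInputFile {
  return async (storagePath) =>
    `${publicBaseUrl.replace(/\/+$/, "")}/${storagePath.replace(/^\/+/, "")}`;
}

// SignedUrl: delegate to a signer (for example a storage client) that returns a time-limited URL.
// The default TTL is an arbitrary value chosen for this sketch.
function signedUrlStrategy(
  sign: (storagePath: string, ttlSeconds: number) => Promise<string>,
  ttlSeconds = 900,
): ResolveInputFile {
  return (storagePath) => sign(storagePath, ttlSeconds);
}

// Preupload: push the file to the provider first and use the URL the provider hands back.
function preuploadStrategy(
  download: (storagePath: string) => Promise<Uint8Array>,
  uploadToProvider: (bytes: Uint8Array) => Promise<string>,
): ResolveInputFile {
  return async (storagePath) => uploadToProvider(await download(storagePath));
}

// An adapter would resolve every input path with whichever strategy it was configured with.
async function resolveInputs(paths: string[], resolve: ResolveInputFile): Promise<string[]> {
  return Promise.all(paths.map((path) => resolve(path)));
}
```

Because each strategy has the same signature, an adapter can swap one for another without changing how it builds the provider payload.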

Each provider has its own base adapter that adds provider-specific auth, HTTP client setup, and status mapping:

| Module path | Base adapter | Concrete adapters |
| --- | --- | --- |
| apps/brain/src/modules/application/atlas/ | AtlasAdapter | AtlasSeedreamV4EditAdapter, AtlasSeedreamV4EditSequentialAdapter, AtlasImageToVideoAdapter, AtlasTextToVideoAdapter |
| apps/brain/src/modules/application/wavespeed/ | WaveSpeedAdapter | WaveSpeedSeedreamV4EditAdapter, WaveSpeedSeedreamV4EditSequentialAdapter, WaveSpeedImageToVideoAdapter, WaveSpeedTextToVideoAdapter, WaveSpeedFlux2KleinEditAdapter, WaveSpeedFlux2KleinEditLoraAdapter |
| apps/brain/src/modules/application/runpod/ | RunPodAdapter | RunPodSeedreamV4EditAdapter, RunPodImageToVideoAdapter |

All adapters follow the same two-phase pattern: schedule (POST a generation request, receive a task ID) then poll (GET the task status until completed or failed). The poll response is normalized to AsyncTaskStatus values (QUEUED, PROCESSING, COMPLETED, FAILED). Content moderation failures from providers are mapped to ContentModerationException.
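
As a rough illustration of the schedule-then-poll pattern, the sketch below normalizes raw provider statuses and loops until a terminal state. Only the four `AsyncTaskStatus` values and the `ContentModerationException` name come from this page. The `ProviderClient` interface, the raw status strings, the in-process polling loop, and the timeout behavior are assumptions made for the example; the real adapters may poll through a job queue instead.

```typescript
// Status names come from the text above; everything else is a hypothetical sketch.
enum AsyncTaskStatus {
  QUEUED = "QUEUED",
  PROCESSING = "PROCESSING",
  COMPLETED = "COMPLETED",
  FAILED = "FAILED",
}

// Stand-in for the real exception class, whose constructor is not shown on this page.
class ContentModerationException extends Error {}

// Assumed provider vocabulary; each real provider has its own raw status strings.
function normalizeStatus(raw: string): AsyncTaskStatus {
  switch (raw.toLowerCase()) {
    case "created":
    case "pending":
    case "queued":
      return AsyncTaskStatus.QUEUED;
    case "running":
    case "processing":
      return AsyncTaskStatus.PROCESSING;
    case "succeeded":
    case "completed":
      return AsyncTaskStatus.COMPLETED;
    case "content_moderated":
      throw new ContentModerationException("provider rejected the content");
    default:
      // This sketch treats anything unrecognized as a failure.
      return AsyncTaskStatus.FAILED;
  }
}

interface ProviderClient {
  schedule(payload: unknown): Promise<string>; // POST, resolves to a task ID
  fetchRawStatus(taskId: string): Promise<string>; // GET, resolves to the raw provider status
}

async function runTask(
  client: ProviderClient,
  payload: unknown,
  intervalMs = 1000,
  maxPolls = 60,
): Promise<AsyncTaskStatus> {
  const taskId = await client.schedule(payload); // phase 1: schedule
  for (let i = 0; i < maxPolls; i++) {
    // phase 2: poll until the normalized status is terminal
    const status = normalizeStatus(await client.fetchRawStatus(taskId));
    if (status === AsyncTaskStatus.COMPLETED || status === AsyncTaskStatus.FAILED) {
      return status;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return AsyncTaskStatus.FAILED; // a poll timeout is reported as a failure in this sketch
}
```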

Executor fallback chains

Each generation type has a dedicated executor that iterates an ordered list of provider attempts. When a provider returns FAILED at schedule time, the executor logs a warning and moves to the next provider. Thrown errors (network failures, content moderation) abort the chain immediately.
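
The fallback rule above can be sketched as a plain loop. The names `ProviderAttempt`, `ScheduleResult`, and `scheduleWithFallback` are hypothetical, and throwing once every provider has returned FAILED is an assumption; this page does not say what the real executors do when the chain is exhausted.

```typescript
// Hypothetical result shape for the schedule phase.
type ScheduleResult =
  | { status: "SCHEDULED"; taskId: string }
  | { status: "FAILED"; reason: string };

interface ProviderAttempt {
  provider: string;
  schedule(): Promise<ScheduleResult>;
}

async function scheduleWithFallback(
  attempts: ProviderAttempt[],
  warn: (message: string) => void = console.warn,
): Promise<{ provider: string; taskId: string }> {
  for (const attempt of attempts) {
    // No try/catch on purpose: a thrown error (network failure, content moderation)
    // propagates to the caller, so the remaining providers are never tried.
    const result = await attempt.schedule();
    if (result.status === "SCHEDULED") {
      return { provider: attempt.provider, taskId: result.taskId };
    }
    // A FAILED result is logged and the loop moves on to the next provider.
    warn(`${attempt.provider} failed at schedule time (${result.reason}); trying next provider`);
  }
  // Assumption: an exhausted chain is reported as an error.
  throw new Error("all providers in the chain failed to schedule");
}
```

The distinction matters for content moderation in particular: a request rejected by one provider's moderation is not retried against the next provider in the chain.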

| Executor | Key | Default chain |
| --- | --- | --- |
| ImageGenerationExecutor | image-generation | ATLAS, WAVESPEED, RUNPOD |
| ImageSequenceGenerationExecutor | image-sequence-generation | ATLAS, WAVESPEED |
| ImageToVideoExecutor | | Determined by jobData.videoProcessor (single provider, no fallback chain) |
| ReferenceToVideoExecutor | | Determined by jobData.videoProcessor; supports WAVESPEED and ATLAS |

Image and image-sequence executors support configurable chain ordering via the traffic split setting described below. Video executors use the provider specified in the job data and do not participate in traffic splitting.

Traffic split setting

The inference-traffic-split application setting controls provider ordering for image and image-sequence executors. It allows progressive rollout of a new provider without code deploys.

Schema

Stored as a JSON document in the application_settings table with id inference-traffic-split. Validated by inferenceTrafficSplitDataSchema in apps/brain/src/modules/application/applicationSettings/application-settings.schemas.ts.

{
  "image-generation": [
    { "order": ["ATLAS", "WAVESPEED", "RUNPOD"], "probability": 10 },
    { "order": ["WAVESPEED", "ATLAS", "RUNPOD"], "probability": 90 }
  ],
  "image-sequence-generation": [
    { "order": ["ATLAS", "WAVESPEED"], "probability": 100 },
    { "order": ["WAVESPEED", "ATLAS"], "probability": 0 }
  ]
}

Rules:

  • Each executor key maps to exactly 2 chains.
  • Each chain has an order (array of InferenceProvider enum values) and a probability (an integer from 0 to 100).
  • The two probabilities must sum to 100.
  • Providers listed in order that the executor has no attempt factory for are silently skipped at runtime.
  • If the setting is missing or the executor key is absent, the executor falls back to its hardcoded default chain.
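
The structural rules can be expressed as a small validator. The real check is the Zod schema inferenceTrafficSplitDataSchema, which is not reproduced on this page; the dependency-free stand-in below only mirrors the rules listed above. The function name `validateTrafficSplit`, the error messages, and the hardcoded `KNOWN_PROVIDERS` list (copied from the provider inventory) are assumptions.

```typescript
// Enum values taken from the provider inventory; hardcoded here for the sketch.
const KNOWN_PROVIDERS: readonly string[] = ["ATLAS", "WAVESPEED", "RUNPOD", "FAL"];

interface Chain {
  order: string[];
  probability: number;
}

// Returns a list of human-readable problems; an empty list means the document is valid.
function validateTrafficSplit(data: Record<string, Chain[]>): string[] {
  const errors: string[] = [];
  for (const [key, chains] of Object.entries(data)) {
    // Rule: each executor key maps to exactly 2 chains.
    if (chains.length !== 2) {
      errors.push(`${key}: expected exactly 2 chains, got ${chains.length}`);
      continue;
    }
    for (const chain of chains) {
      // Rule: probability is an integer from 0 to 100.
      if (!Number.isInteger(chain.probability) || chain.probability < 0 || chain.probability > 100) {
        errors.push(`${key}: probability must be an integer from 0 to 100`);
      }
      // Rule: order contains only InferenceProvider enum values.
      for (const provider of chain.order) {
        if (!KNOWN_PROVIDERS.includes(provider)) {
          errors.push(`${key}: unknown provider ${provider}`);
        }
      }
    }
    // Rule: the two probabilities sum to 100.
    const total = chains.reduce((sum, chain) => sum + chain.probability, 0);
    if (total !== 100) {
      errors.push(`${key}: probabilities must sum to 100, got ${total}`);
    }
  }
  return errors;
}
```

Note the two layers: a name outside the enum is rejected when the setting is written, whereas a valid enum value that a given executor has no factory for is accepted and simply skipped at runtime.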

How it works

flowchart TD
    Execute["Executor.execute()"]
    ReadSetting["getSettingsEntry('inference-traffic-split')"]
    HasKey{Executor key exists?}
    DefaultChain["Use default chain"]
    Roll["Math.random() * 100"]
    ChainA["Chain A"]
    ChainB["Chain B"]
    BuildAttempts["Build attempts from selected chain, skip unknown providers"]
    Iterate["Try providers in order until success"]
    Execute --> ReadSetting
    ReadSetting --> HasKey
    HasKey -->|no| DefaultChain
    HasKey -->|yes| Roll
    Roll -->|"roll < A.probability"| ChainA
    Roll -->|else| ChainB
    ChainA --> BuildAttempts
    ChainB --> BuildAttempts
    DefaultChain --> BuildAttempts
    BuildAttempts --> Iterate

Each executor maps providers to attempt factories. The selected chain’s order determines which factories are instantiated and in what sequence. The executor logs the selected chain and probability for observability.
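
The selection step in the flowchart can be sketched in a few lines. The comparison `roll < A.probability` and the `Math.random() * 100` roll come from the diagram; the function names `selectChain` and `buildAttempts`, the injectable `roll` parameter, and the `Map` of factories are assumptions for illustration and are not the executors' actual code.

```typescript
interface Chain {
  order: string[];
  probability: number;
}

// roll is injectable so the behavior is reproducible; by default it mirrors Math.random() * 100.
function selectChain(chains: [Chain, Chain], roll: number = Math.random() * 100): Chain {
  const [chainA, chainB] = chains;
  return roll < chainA.probability ? chainA : chainB;
}

// Keep only the providers this executor has a factory for, preserving the chain's order.
function buildAttempts<T>(order: string[], factories: ReadonlyMap<string, () => T>): T[] {
  const attempts: T[] = [];
  for (const provider of order) {
    const factory = factories.get(provider);
    if (factory) attempts.push(factory()); // unknown providers are skipped silently
  }
  return attempts;
}
```

Since `Math.random()` returns a value in [0, 1), the roll lies in [0, 100). A chain with probability 0 in the first position is therefore never selected, and one with probability 100 always is.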

Administration

Update the setting via the Brain REST API:

PUT /application-settings/inference-traffic-split
Authorization: Bearer <clerk-token>
Content-Type: application/json

{
  "data": {
    "image-generation": [
      { "order": ["ATLAS", "WAVESPEED", "RUNPOD"], "probability": 50 },
      { "order": ["WAVESPEED", "ATLAS", "RUNPOD"], "probability": 50 }
    ],
    "image-sequence-generation": [
      { "order": ["ATLAS", "WAVESPEED"], "probability": 50 },
      { "order": ["WAVESPEED", "ATLAS"], "probability": 50 }
    ]
  }
}

The endpoint is Clerk-authenticated and role-gated. Zod validation rejects payloads with invalid provider names, wrong chain count, or probabilities not summing to 100.

Default seed

The migration at apps/brain/prisma/migrations/20260519120000_seed_inference_traffic_split/migration.sql seeds the setting with 100% Atlas-first, preserving the behavior established before the traffic split was introduced.

Adding a new provider

  1. Add the enum value to InferenceProvider in apps/brain/src/modules/domain/generation/types/common.ts.
  2. Create a provider module under apps/brain/src/modules/application/<provider>/ with a base adapter and concrete adapters for each supported generation type.
  3. Register the module in apps/brain/src/modules/domain/generation/generation.module.ts.
  4. Add entries to providerAttemptFactories in the relevant executors.
  5. Update the inference-traffic-split setting to include the new provider in chain orders.
  6. Add the provider API key to apps/brain/.env.example and document it in Brain Environment Variables.