Round Environment

This page is the source of truth for Round’s runtime configuration: process env vars, model URLs, GPU/CPU posture, and how the variables flow from .env.example into the running container.

Process variables (consumed by the binary)

apps/round/internal/pkg/config/config.go reads these on boot. The binary uses bare names with no ROUND_ prefix, for example getEnv("GRPC_PORT", …) and getEnv("HOST", …) (see Load in the same file). The ROUND_* names visible in .env.example are the docker-compose and Railway-side names; the orchestrator either rewrites them or passes them through. TODO(@law): document the exact rewrite path on Railway. The in-cluster Docker Compose case forwards values one-to-one with the bare names, but the Railway service config is not in this repo (no Procfile, nixpacks.toml, or env-mapping file is checked in beyond apps/round/railway.json).

Process var	`.env.example` key	Default	Validation	Purpose
`GRPC_PORT`	`ROUND_HTTP_PORT`	`8080`	`1..65535`	gRPC + `/health` HTTP listener port (single port, h2c-multiplexed).
`HOST`	`ROUND_HOST`	`0.0.0.0`	non-empty	Bind address.
`MODEL_CACHE_DIR`	`ROUND_MODEL_CACHE_DIR`	`/opt/models`	non-empty	Directory holding ONNX models and tokenizers. Persistent volume in production.
`MAX_TEXT_LENGTH`	`ROUND_MAX_TEXT_LENGTH`	`10240` (10 KiB)	`>0`	Max bytes for `text` input on `Infer`.
`MAX_BINARY_SIZE`	`ROUND_MAX_BINARY_SIZE`	`10485760` (10 MiB)	`>0`	Max bytes for the decoded image on `Infer`. The wire-level gRPC limit is 15 MiB to absorb base64 overhead.
`LOG_LEVEL`	`ROUND_LOG_LEVEL`	`info`	one of `debug`, `info`, `warn`, `error`	zerolog level.
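
The 15 MiB wire limit in the `MAX_BINARY_SIZE` row can be checked with a little arithmetic. The sketch below assumes padded standard base64 (the page does not say which variant callers use) and shows that a 10 MiB image encodes to roughly 13.3 MiB, which leaves headroom under the wire limit. The constant and function names are illustrative.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

const (
	maxBinarySize = 10 << 20 // 10 MiB, the MAX_BINARY_SIZE default
	wireLimit     = 15 << 20 // 15 MiB, the wire-level gRPC limit from the table
)

// encodedSize returns the padded standard-base64 length of n raw bytes.
// Assumption: callers use the standard padded alphabet.
func encodedSize(n int) int {
	return base64.StdEncoding.EncodedLen(n)
}

func main() {
	enc := encodedSize(maxBinarySize)
	fmt.Printf("decoded=%d encoded=%d headroom=%d\n", maxBinarySize, enc, wireLimit-enc)
}
```

Base64 expands every 3 input bytes to 4 output bytes, so raising `MAX_BINARY_SIZE` above roughly 11 MiB would need a matching increase in the wire limit.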

If validation fails, the process exits with a fatal log line. A value that is set but invalid never falls back to its default.
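
The boot-time behaviour described above can be sketched as follows. This is a minimal illustration, not the contents of config.go: the `Config` struct, its field names, `getEnvInt`, and the treatment of empty values are assumptions. Only the variable names, defaults, and validation rules come from the table.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config holds the six process variables from the table above.
// The struct and its field names are illustrative, not copied from config.go.
type Config struct {
	GRPCPort      int
	Host          string
	ModelCacheDir string
	MaxTextLength int
	MaxBinarySize int
	LogLevel      string
}

// getEnv returns the value of key, or def when the variable is unset.
// Assumption: a variable set to "" is passed through so Validate can reject it.
func getEnv(key, def string) string {
	if v, ok := os.LookupEnv(key); ok {
		return v
	}
	return def
}

// getEnvInt parses an integer variable. A value that is set but malformed
// is an error; it does not silently become the default.
func getEnvInt(key string, def int) (int, error) {
	raw := getEnv(key, strconv.Itoa(def))
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("%s=%q is not an integer", key, raw)
	}
	return n, nil
}

// Load reads the bare (unprefixed) names with the documented defaults.
func Load() (Config, error) {
	c := Config{
		Host:          getEnv("HOST", "0.0.0.0"),
		ModelCacheDir: getEnv("MODEL_CACHE_DIR", "/opt/models"),
		LogLevel:      getEnv("LOG_LEVEL", "info"),
	}
	var err error
	if c.GRPCPort, err = getEnvInt("GRPC_PORT", 8080); err != nil {
		return c, err
	}
	if c.MaxTextLength, err = getEnvInt("MAX_TEXT_LENGTH", 10240); err != nil {
		return c, err
	}
	if c.MaxBinarySize, err = getEnvInt("MAX_BINARY_SIZE", 10485760); err != nil {
		return c, err
	}
	return c, nil
}

// Validate applies the rules from the Validation column.
func (c Config) Validate() error {
	switch {
	case c.GRPCPort < 1 || c.GRPCPort > 65535:
		return fmt.Errorf("GRPC_PORT %d outside 1..65535", c.GRPCPort)
	case c.Host == "":
		return fmt.Errorf("HOST must be non-empty")
	case c.ModelCacheDir == "":
		return fmt.Errorf("MODEL_CACHE_DIR must be non-empty")
	case c.MaxTextLength <= 0:
		return fmt.Errorf("MAX_TEXT_LENGTH must be >0, got %d", c.MaxTextLength)
	case c.MaxBinarySize <= 0:
		return fmt.Errorf("MAX_BINARY_SIZE must be >0, got %d", c.MaxBinarySize)
	}
	switch c.LogLevel {
	case "debug", "info", "warn", "error":
		return nil
	}
	return fmt.Errorf("LOG_LEVEL %q is not one of debug, info, warn, error", c.LogLevel)
}

func main() {
	cfg, err := Load()
	if err == nil {
		err = cfg.Validate()
	}
	if err != nil {
		// Mirrors the documented behaviour: fatal exit, no fallback.
		fmt.Fprintln(os.Stderr, "fatal:", err)
		os.Exit(1)
	}
	fmt.Printf("listening on %s:%d\n", cfg.Host, cfg.GRPCPort)
}
```

Keeping parsing in `Load` and range checks in `Validate` means a malformed value and an out-of-range value both end in the same fatal path.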

Caller-side host vars

Round itself does not read these — they belong to its callers but live next to ROUND_* in .env.example:

Var	Default	Consumer
`SIRLOIN_ROUND_HOST`	`round:8080`	sirloin gRPC client (`apps/sirloin/internal/pkg/pb/round/`).
`BRAIN_ROUND_HOST`	`round:8080`	brain gRPC client.

Keep these in sync with GRPC_PORT if you change the listener port.
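
One way to catch drift between the caller-side hosts and the listener port is a small pre-deploy check. The sketch below is hypothetical (no such check is described in the repo); `checkCallerHost` is an illustrative name, and it assumes the caller vars are plain `host:port` strings as in the defaults above.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// checkCallerHost returns an error when a caller-side host:port value
// does not target the port Round listens on.
func checkCallerHost(name, hostPort string, grpcPort int) error {
	_, portStr, err := net.SplitHostPort(hostPort)
	if err != nil {
		return fmt.Errorf("%s=%q: %w", name, hostPort, err)
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return fmt.Errorf("%s=%q: port is not numeric", name, hostPort)
	}
	if port != grpcPort {
		return fmt.Errorf("%s targets port %d, but Round listens on %d", name, port, grpcPort)
	}
	return nil
}

func main() {
	callers := map[string]string{
		"SIRLOIN_ROUND_HOST": "round:8080",
		"BRAIN_ROUND_HOST":   "round:8080",
	}
	for name, v := range callers {
		if err := checkCallerHost(name, v, 8080); err != nil {
			fmt.Println("mismatch:", err)
		}
	}
}
```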

Model URL overrides

Round downloads ONNX models at Docker build time and at first runtime startup. The defaults point at the public R2 bucket https://round-models.sexty.dev (see apps/round/Dockerfile and apps/round/setup-local-dev.sh). Each can be overridden:

Override	Default source	Stage
`RETINAFACE_MODEL_URL`	`${MODELS_BASE_URL}/retinaface/retinaface_mv1_0.25.onnx`	Docker build (build-arg).
`LVFACE_MODEL_URL`	`${MODELS_BASE_URL}/lvface/LVFace-B_Glint360K.onnx`	Docker build (build-arg).
`MODELS_BASE_URL`	`https://round-models.sexty.dev`	Docker build (build-arg) — bulk-overrides both face URLs.
`ROUND_RETINAFACE_URL`	(same as build-arg)	Runtime, used by code that fetches models on demand.
`ROUND_LVFACE_URL`	(same as build-arg)	Runtime.
`ROUND_EMBEDDINGS_MODEL_URL`	`https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx?download=true`	Runtime (HuggingFace).
`ROUND_EMBEDDINGS_TOKENIZER_URL`	`https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json?download=true`	Runtime (HuggingFace).
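
The precedence implied by the table (a per-model override wins, otherwise the URL is built from `MODELS_BASE_URL`) can be expressed in Go for illustration. The Dockerfile resolves this with build-args, so `resolveModelURL` below is a hypothetical helper that only mirrors the rule.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultModelsBaseURL is the documented MODELS_BASE_URL default.
const defaultModelsBaseURL = "https://round-models.sexty.dev"

// resolveModelURL returns override when it is set; otherwise it joins
// base (or the default base when base is empty) with relPath.
func resolveModelURL(override, base, relPath string) string {
	if override != "" {
		return override
	}
	if base == "" {
		base = defaultModelsBaseURL
	}
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(relPath, "/")
}

func main() {
	// No override and no custom base: falls back to the public bucket.
	fmt.Println(resolveModelURL("", "", "retinaface/retinaface_mv1_0.25.onnx"))
	// Custom base (hypothetical mirror host) bulk-overrides the face URLs.
	fmt.Println(resolveModelURL("", "https://mirror.example/models/", "lvface/LVFace-B_Glint360K.onnx"))
}
```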

The Dockerfile fails the build loudly if any download produces a 0-byte file or a git-LFS pointer stub. Do not bypass this guard. Local files in apps/round/models/ are git-LFS pointers and must not be copied into the runtime image.
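
The same guard is easy to reproduce outside the Dockerfile, for example before copying a model into a cache directory. The sketch below is a Go rendering of the idea, not the Dockerfile's actual check: it relies on the fact that git-LFS pointer files begin with the line `version https://git-lfs.github.com/spec/v1`. `isStub` and `checkModelFile` are illustrative names.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"os"
)

// lfsPrefix is the first line of every git-LFS pointer file.
var lfsPrefix = []byte("version https://git-lfs.github.com/spec/v1")

// isStub reports whether data looks like a failed download:
// either empty or the start of a git-LFS pointer file.
func isStub(data []byte) bool {
	return len(data) == 0 || bytes.HasPrefix(data, lfsPrefix)
}

// checkModelFile reads only the head of the file, so it stays cheap
// even for large ONNX models.
func checkModelFile(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	head := make([]byte, len(lfsPrefix))
	n, err := io.ReadFull(f, head)
	// Short or empty files are expected here; other read errors are not.
	if err != nil && !errors.Is(err, io.EOF) && !errors.Is(err, io.ErrUnexpectedEOF) {
		return err
	}
	if isStub(head[:n]) {
		return fmt.Errorf("%s is empty or a git-LFS pointer, not a model", path)
	}
	return nil
}

func main() {
	// Usage: pass model paths as arguments; exits non-zero on the first stub.
	for _, p := range os.Args[1:] {
		if err := checkModelFile(p); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
	}
}
```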

For full per-model details (input shape, output shape, version provenance) see services/round-models.

GPU vs CPU

Round is CPU-only today. The apps/round/internal/pkg/onnxrt initialiser does not register the CUDA, TensorRT, or DirectML execution providers, and the Dockerfile installs the CPU build of ONNX Runtime (see ONNX_VERSION=1.23.1). There are no ROUND_GPU_* flags. If a CUDA build is ever needed, both the Dockerfile (different onnxruntime-linux-* archive) and the runtime initialiser will need to change.

Implications for ops:

No GPU node provisioning is required. Round runs on Railway’s standard us-east4-eqdc4a region with a single replica (apps/round/railway.json).
CPU sizing is the relevant lever. Heap is bounded by the resident model files; ONNX session memory dominates RSS.
Concurrency is gated by MaxConcurrentStreams = 100 at the gRPC server, not by GPU memory.

Native library paths (CGO)

CGO_ENABLED=1 is required because both ONNX Runtime and daulet/tokenizers are native libraries. The Docker image places shared libraries at standard system paths (/usr/lib, /usr/local/lib) and refreshes the linker cache with ldconfig. For local dev, setup-local-dev.sh writes .env.local with the right LD_LIBRARY_PATH and CGO_* flags; see services/round-local-development.

Variable flow

flowchart LR
  envExample[".env.example<br/>ROUND_*"] --> compose["docker-compose / Railway service<br/>env injection"]
  compose --> rewrite["bare names<br/>GRPC_PORT, HOST, MODEL_CACHE_DIR, ..."]
  rewrite --> config["config.Load()"]
  config --> validate["config.Validate()"]
  validate -->|ok| run[Round process]
  validate -->|err| fatal[fatal exit]
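
The rewrite step in the flow above amounts to a name mapping taken from the process-variable table. The sketch below only illustrates that mapping. It is not how docker-compose or Railway performs the injection (the Railway path is still undocumented), and `bareName` and `rewrite` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// bareName maps each .env.example key to the name the binary reads,
// as listed in the process-variable table.
var bareName = map[string]string{
	"ROUND_HTTP_PORT":       "GRPC_PORT",
	"ROUND_HOST":            "HOST",
	"ROUND_MODEL_CACHE_DIR": "MODEL_CACHE_DIR",
	"ROUND_MAX_TEXT_LENGTH": "MAX_TEXT_LENGTH",
	"ROUND_MAX_BINARY_SIZE": "MAX_BINARY_SIZE",
	"ROUND_LOG_LEVEL":       "LOG_LEVEL",
}

// rewrite returns the bare-name view of in. Keys without a mapping
// (for example the caller-side *_ROUND_HOST vars) are dropped.
func rewrite(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		if bare, ok := bareName[k]; ok {
			out[bare] = v
		}
	}
	return out
}

func main() {
	out := rewrite(map[string]string{
		"ROUND_HTTP_PORT":    "8080",
		"ROUND_LOG_LEVEL":    "info",
		"SIRLOIN_ROUND_HOST": "round:8080",
	})
	keys := make([]string, 0, len(out))
	for k := range out {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable output order for display
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, out[k])
	}
}
```

Note that `ROUND_HTTP_PORT` maps to `GRPC_PORT`, the one pair where the names differ by more than the prefix.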

Defaults summary

For copy-paste into a fresh service definition:

ROUND_HOST=0.0.0.0
ROUND_HTTP_PORT=8080
ROUND_MODEL_CACHE_DIR=/opt/models
ROUND_MAX_TEXT_LENGTH=10240
ROUND_MAX_BINARY_SIZE=10485760
ROUND_LOG_LEVEL=info
SIRLOIN_ROUND_HOST=round:8080
BRAIN_ROUND_HOST=round:8080

See standards/deployment-env for the cross-service Railway conventions and services/round-runbook for deploys and rollbacks.