Round Environment
This page is the source of truth for round’s runtime configuration: process env vars, model URLs, GPU/CPU posture, and how the variables flow from .env.example into the running container.
Process variables (consumed by the binary)
apps/round/internal/pkg/config/config.go reads these on boot. The binary uses bare names with no ROUND_ prefix: getEnv("GRPC_PORT", …), getEnv("HOST", …), and so on (see Load in that file). The ROUND_* names visible in .env.example are the docker-compose / Railway-side names, which the orchestrator either rewrites or passes through. TODO(@law): document the exact rewrite path on Railway. The in-cluster Docker Compose case forwards values one-to-one with the bare names, but the Railway service config is not in this repo (no Procfile, nixpacks.toml, or env-mapping file is checked in beyond apps/round/railway.json).
| Process var | .env.example key | Default | Validation | Purpose |
|---|---|---|---|---|
| GRPC_PORT | ROUND_HTTP_PORT | 8080 | 1..65535 | gRPC + /health HTTP listener port (single port, h2c-multiplexed). |
| HOST | ROUND_HOST | 0.0.0.0 | non-empty | Bind address. |
| MODEL_CACHE_DIR | ROUND_MODEL_CACHE_DIR | /opt/models | non-empty | Directory holding ONNX models and tokenizers. Persistent volume in production. |
| MAX_TEXT_LENGTH | ROUND_MAX_TEXT_LENGTH | 10240 (10 KiB) | >0 | Max bytes for text input on Infer. |
| MAX_BINARY_SIZE | ROUND_MAX_BINARY_SIZE | 10485760 (10 MiB) | >0 | Max bytes for the decoded image on Infer. The wire-level gRPC limit is 15 MiB to absorb base64 overhead. |
| LOG_LEVEL | ROUND_LOG_LEVEL | info | one of debug, info, warn, error | zerolog level. |
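The 15 MiB wire limit in the MAX_BINARY_SIZE row can be checked with a little arithmetic. The sketch below assumes standard padded base64 (the page does not say which variant callers use), and the constant and function names are illustrative only. Under that assumption a 10 MiB image encodes to 13,981,016 bytes (about 13.33 MiB), which leaves roughly 1.67 MiB for the rest of the message.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

const (
	maxBinarySize = 10 << 20 // MAX_BINARY_SIZE default: 10 MiB of decoded image bytes
	wireLimit     = 15 << 20 // wire-level gRPC limit quoted in the table: 15 MiB
)

// encodedSize returns how many bytes n raw bytes occupy after padded
// standard base64 encoding (4 output bytes per 3 input bytes, rounded up).
func encodedSize(n int) int {
	return base64.StdEncoding.EncodedLen(n)
}

func main() {
	enc := encodedSize(maxBinarySize)
	fmt.Printf("base64 size of a 10 MiB image: %d bytes\n", enc)
	fmt.Printf("headroom under the 15 MiB limit: %d bytes\n", wireLimit-enc)
}
```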
If validation fails the process exits with a fatal log line — there is no fallback to defaults once a value is set but invalid.
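As a rough illustration of the load-then-validate behaviour described above, here is a minimal sketch using only the standard library. It is not the contents of config.go: the Config struct, getEnvInt, and the error wording are assumptions, the real getEnv may treat empty values differently, and the real binary logs the failure through zerolog rather than printing to stderr.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// Config holds the six process variables from the table (hypothetical struct;
// the real one lives in config.go and may differ).
type Config struct {
	GRPCPort      int
	Host          string
	ModelCacheDir string
	MaxTextLength int
	MaxBinarySize int
	LogLevel      string
}

// getEnv returns the variable's value when it is set (even to ""), else def.
func getEnv(key, def string) string {
	if v, ok := os.LookupEnv(key); ok {
		return v
	}
	return def
}

// getEnvInt parses an integer variable. A set-but-malformed value is an
// error rather than a silent fallback to the default.
func getEnvInt(key string, def int) (int, error) {
	raw := getEnv(key, strconv.Itoa(def))
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("%s=%q is not an integer", key, raw)
	}
	return n, nil
}

// Load reads the bare-named variables and applies the documented defaults.
func Load() (Config, error) {
	c := Config{
		Host:          getEnv("HOST", "0.0.0.0"),
		ModelCacheDir: getEnv("MODEL_CACHE_DIR", "/opt/models"),
		LogLevel:      getEnv("LOG_LEVEL", "info"),
	}
	ints := []struct {
		key string
		def int
		dst *int
	}{
		{"GRPC_PORT", 8080, &c.GRPCPort},
		{"MAX_TEXT_LENGTH", 10240, &c.MaxTextLength},
		{"MAX_BINARY_SIZE", 10485760, &c.MaxBinarySize},
	}
	for _, f := range ints {
		n, err := getEnvInt(f.key, f.def)
		if err != nil {
			return c, err
		}
		*f.dst = n
	}
	return c, nil
}

// Validate applies the rules from the Validation column of the table.
func (c Config) Validate() error {
	switch {
	case c.GRPCPort < 1 || c.GRPCPort > 65535:
		return fmt.Errorf("GRPC_PORT %d is outside 1..65535", c.GRPCPort)
	case c.Host == "":
		return errors.New("HOST must be non-empty")
	case c.ModelCacheDir == "":
		return errors.New("MODEL_CACHE_DIR must be non-empty")
	case c.MaxTextLength <= 0:
		return errors.New("MAX_TEXT_LENGTH must be > 0")
	case c.MaxBinarySize <= 0:
		return errors.New("MAX_BINARY_SIZE must be > 0")
	}
	switch c.LogLevel {
	case "debug", "info", "warn", "error":
		return nil
	}
	return fmt.Errorf("LOG_LEVEL %q is not one of debug, info, warn, error", c.LogLevel)
}

func main() {
	cfg, err := Load()
	if err == nil {
		err = cfg.Validate()
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "fatal:", err) // stand-in for the fatal log line
		os.Exit(1)
	}
	fmt.Printf("config ok: %s:%d\n", cfg.Host, cfg.GRPCPort)
}
```

Using os.LookupEnv here is a deliberate choice for the sketch: a variable that is set to an empty string reaches Validate and fails the non-empty rule instead of quietly picking up the default.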
Caller-side host vars
Round itself does not read these — they belong to its callers but live next to ROUND_* in .env.example:
| Var | Default | Consumer |
|---|---|---|
| SIRLOIN_ROUND_HOST | round:8080 | sirloin gRPC client (apps/sirloin/internal/pkg/pb/round/). |
| BRAIN_ROUND_HOST | round:8080 | brain gRPC client. |
Keep these in sync with GRPC_PORT if you change the listener port.
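One way to catch drift between the listener port and the caller-side hosts is a small check in CI or a deploy script. The helper below is a hypothetical sketch, not something that exists in the repo; it only assumes the caller values keep the host:port shape shown in the table.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// portMatches reports whether a caller-side host:port value (for example
// SIRLOIN_ROUND_HOST) targets the given listener port.
func portMatches(hostPort string, listenerPort int) (bool, error) {
	_, port, err := net.SplitHostPort(hostPort)
	if err != nil {
		return false, fmt.Errorf("parse %q: %w", hostPort, err)
	}
	p, err := strconv.Atoi(port)
	if err != nil {
		return false, fmt.Errorf("port in %q is not numeric: %w", hostPort, err)
	}
	return p == listenerPort, nil
}

func main() {
	const grpcPort = 8080 // GRPC_PORT default from the process-variable table
	for _, target := range []string{"round:8080", "round:9090"} {
		ok, err := portMatches(target, grpcPort)
		fmt.Printf("%s matches=%v err=%v\n", target, ok, err)
	}
}
```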
Model URL overrides
Round downloads ONNX models at Docker build time and at first runtime startup. The defaults point at the public R2 bucket https://round-models.sexty.dev (see apps/round/Dockerfile and apps/round/setup-local-dev.sh). Each can be overridden:
| Override | Default source | Stage |
|---|---|---|
| RETINAFACE_MODEL_URL | ${MODELS_BASE_URL}/retinaface/retinaface_mv1_0.25.onnx | Docker build (build-arg). |
| LVFACE_MODEL_URL | ${MODELS_BASE_URL}/lvface/LVFace-B_Glint360K.onnx | Docker build (build-arg). |
| MODELS_BASE_URL | https://round-models.sexty.dev | Docker build (build-arg); bulk-overrides both face URLs. |
| ROUND_RETINAFACE_URL | (same as build-arg) | Runtime, used by code that fetches models on demand. |
| ROUND_LVFACE_URL | (same as build-arg) | Runtime. |
| ROUND_EMBEDDINGS_MODEL_URL | https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx?download=true | Runtime (HuggingFace). |
| ROUND_EMBEDDINGS_TOKENIZER_URL | https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json?download=true | Runtime (HuggingFace). |
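The precedence implied by the table (a per-model URL wins, otherwise the base URL is joined with the model path) can be sketched as follows. The Dockerfile expresses this with build-args rather than Go, so resolveModelURL and the mirror hostname below are purely illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultModelsBaseURL is the MODELS_BASE_URL default from the table.
const defaultModelsBaseURL = "https://round-models.sexty.dev"

// resolveModelURL returns override when it is set. Otherwise it joins base
// and path, falling back to the default base when base is empty.
func resolveModelURL(override, base, path string) string {
	if override != "" {
		return override
	}
	if base == "" {
		base = defaultModelsBaseURL
	}
	return strings.TrimRight(base, "/") + "/" + strings.TrimLeft(path, "/")
}

func main() {
	const retinaPath = "retinaface/retinaface_mv1_0.25.onnx"

	// No overrides: the public bucket default.
	fmt.Println(resolveModelURL("", "", retinaPath))
	// Base override only (hypothetical mirror host).
	fmt.Println(resolveModelURL("", "https://mirror.example/models/", retinaPath))
	// Per-model override takes precedence over any base.
	fmt.Println(resolveModelURL("https://mirror.example/custom.onnx", "https://ignored.example", retinaPath))
}
```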
The Dockerfile fails the build loudly if any download produced a 0-byte or LFS pointer stub — do not bypass this guard. Local files in apps/round/models/ are git-LFS pointers and must not be copied into the runtime image.
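For readers who want the same guard outside the Docker build, the check can be approximated in Go. This is a sketch of the idea, not the Dockerfile's actual logic (which this page does not show); checkModelBytes is a hypothetical name. It relies on the fact that Git LFS pointer files begin with a fixed version line.

```go
package main

import (
	"bytes"
	"fmt"
)

// lfsPointerPrefix is the first line of a Git LFS pointer file.
var lfsPointerPrefix = []byte("version https://git-lfs.github.com/spec/v1")

// checkModelBytes rejects content that is empty or that is an LFS pointer
// stub instead of real model weights.
func checkModelBytes(name string, data []byte) error {
	if len(data) == 0 {
		return fmt.Errorf("%s: download is empty (0 bytes)", name)
	}
	if bytes.HasPrefix(data, lfsPointerPrefix) {
		return fmt.Errorf("%s: file is a Git LFS pointer, not model weights", name)
	}
	return nil
}

func main() {
	pointer := []byte("version https://git-lfs.github.com/spec/v1\noid sha256:abc\nsize 123\n")
	fmt.Println(checkModelBytes("empty.onnx", nil))
	fmt.Println(checkModelBytes("stub.onnx", pointer))
	fmt.Println(checkModelBytes("real.onnx", []byte{0x08, 0x07, 0x12}))
}
```

In practice only the first few dozen bytes of the file need to be read for the prefix test, so the check stays cheap even for large models.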
For full per-model details (input shape, output shape, version provenance) see services/round-models.
GPU vs CPU
Round is CPU-only today. The apps/round/internal/pkg/onnxrt initialiser does not register the CUDA, TensorRT, or DirectML execution providers, and the Dockerfile installs the CPU build of ONNX Runtime (see ONNX_VERSION=1.23.1). There are no ROUND_GPU_* flags. If a CUDA build is ever needed, both the Dockerfile (different onnxruntime-linux-* archive) and the runtime initialiser will need to change.
Implications for ops:
- No GPU node provisioning is required. Round runs on Railway’s standard us-east4-eqdc4a region with a single replica (apps/round/railway.json).
- CPU sizing is the relevant lever. Heap is bounded by the resident model files; ONNX session memory dominates RSS.
- Concurrency is gated by MaxConcurrentStreams = 100 at the gRPC server, not by GPU memory.
Native library paths (CGO)
CGO_ENABLED=1 is required because both ONNX Runtime and daulet/tokenizers are native libraries. The Docker image places shared libraries at standard system paths (/usr/lib, /usr/local/lib) and runs ldconfig to refresh the linker cache. For local dev, setup-local-dev.sh writes .env.local with the right LD_LIBRARY_PATH and CGO_* flags; see services/round-local-development.
Variable flow

    flowchart LR
      envExample[".env.example<br/>ROUND_*"] --> compose["docker-compose / Railway service<br/>env injection"]
      compose --> rewrite["bare names<br/>GRPC_PORT, HOST, MODEL_CACHE_DIR, ..."]
      rewrite --> config["config.Load()"]
      config --> validate["config.Validate()"]
      validate -->|ok| run[Round process]
      validate -->|err| fatal[fatal exit]

Defaults summary
For copy-paste into a fresh service definition:

    ROUND_HOST=0.0.0.0
    ROUND_HTTP_PORT=8080
    ROUND_MODEL_CACHE_DIR=/opt/models
    ROUND_MAX_TEXT_LENGTH=10240
    ROUND_MAX_BINARY_SIZE=10485760
    ROUND_LOG_LEVEL=info
    SIRLOIN_ROUND_HOST=round:8080
    BRAIN_ROUND_HOST=round:8080

See standards/deployment-env for the cross-service Railway conventions and services/round-runbook for deploys and rollbacks.
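To connect the ROUND_* defaults above to the bare names the binary reads, the name mapping from the process-variable table can be written out as data. This is only an illustration of the mapping: how Railway actually performs the rewrite is still undocumented (see the TODO earlier on this page), and the rewrite function and bareName map are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// bareName maps each .env.example key to the process variable the binary
// reads, taken from the process-variable table on this page.
var bareName = map[string]string{
	"ROUND_HTTP_PORT":       "GRPC_PORT",
	"ROUND_HOST":            "HOST",
	"ROUND_MODEL_CACHE_DIR": "MODEL_CACHE_DIR",
	"ROUND_MAX_TEXT_LENGTH": "MAX_TEXT_LENGTH",
	"ROUND_MAX_BINARY_SIZE": "MAX_BINARY_SIZE",
	"ROUND_LOG_LEVEL":       "LOG_LEVEL",
}

// rewrite renames orchestrator-side keys to bare names and drops keys that
// Round does not read, such as the caller-side *_ROUND_HOST variables.
func rewrite(env map[string]string) map[string]string {
	out := make(map[string]string, len(env))
	for k, v := range env {
		if bare, ok := bareName[k]; ok {
			out[bare] = v
		}
	}
	return out
}

func main() {
	in := map[string]string{
		"ROUND_HTTP_PORT":    "8080",
		"ROUND_HOST":         "0.0.0.0",
		"ROUND_LOG_LEVEL":    "info",
		"SIRLOIN_ROUND_HOST": "round:8080", // caller-side, not consumed by Round
	}
	out := rewrite(in)

	// Sort keys so the printed order is stable.
	keys := make([]string, 0, len(out))
	for k := range out {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, out[k])
	}
}
```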