Load Testing
Comprehensive k6 load testing suite for the BEEF platform, testing service limits across the Sirloin, Round, Brain, Cinder, and Hive APIs.
Overview
This suite includes 11 load test scenarios designed to find performance limits of various platform services:
| Scenario | Service | Target QPS | Description |
|---|---|---|---|
| 01 | Cinder | 50 | Onboarding nudity detection (30-60s response) |
| 02 | Cinder | 800 | Generation nudity detection (30-60s response) |
| 03 | Sirloin/Brain | 800 | Image generation submission + polling |
| 04 | Sirloin/Brain | 100 | Video generation submission + polling |
| 05 | Sirloin | 10,000 | Explore list (pagination) |
| 06 | Sirloin | 1,000 | Explore search queries |
| 07 | Sirloin | 1,000 | Explore filtering |
| 08 | Round | 1,000 | Text embeddings generation |
| 09 | Round | 100 | Face detection (base64 images) |
| 10 | Hive | 50 | Celebrity recognition (standalone, not in run-all) |
| 11 | Hive | 50 | Content moderation (standalone, not in run-all) |
Test Families
This page covers two different test surfaces:
- load-tests/: k6 load and performance scenarios for service-limit discovery.
- tests/: Playwright API and e2e tests configured by tests/playwright.config.ts.
Use [Testing](/operations/testing/) for the per-service unit, lint, and typecheck command matrix.
Playwright API And E2E Tests
Playwright tests live under tests/ and are configured by tests/playwright.config.ts. They exercise API and browser flows rather than sustained load. Install dependencies from the tests directory and use the package scripts there:

    cd tests
    npm install
    npx playwright install
    npm run test:api

Use Playwright when validating request/response behavior, user journeys, browser compatibility, traces, screenshots, or API regression coverage. Use k6 load tests when measuring throughput, latency, saturation, or service-limit behavior.
Prerequisites
Required Software
- k6 - Load testing tool

      # macOS
      brew install k6

      # Linux
      sudo gpg -k
      sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
        --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
      echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | \
        sudo tee /etc/apt/sources.list.d/k6.list
      sudo apt-get update
      sudo apt-get install k6

- Node.js 18+ - For utility scripts

      node --version  # Should be 18 or higher

- ts-proto - For proto code generation

      npm install -g ts-proto

Required Access
- API Keys: Brain API key, Cinder API key, Hive API keys (scenarios 10 and 11 only)
- R2 Credentials: Access to Cloudflare R2 bucket (or S3-compatible storage)
- Service Access: Network access to staging/dev environment
- Test Data: Valid user IDs and character IDs from target environment
Setup
1. Install Dependencies

    cd load-tests
    npm install

2. Generate Proto Code
From the repository root:

    make generate-proto

This generates gRPC client code in load-tests/generated/ for both Sirloin and Round services.
3. Prepare Dataset
Download and prepare ~1,000 images for testing:

    npm run setup

This script offers three options:
- Option 1: CelebA-HQ (manual download required)
- Option 2: LFW dataset (auto-download)
- Option 3: Use your own images
Images will be placed in data/images/.
4. Upload Images to R2
First, configure your environment (see Configuration section below), then:

    npm run upload-images

This uploads all images from data/images/ to your configured R2 bucket under load-test-images/YYYY-MM-DD/.
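
To make the dated upload path concrete, here is a minimal sketch of how an object key under load-test-images/YYYY-MM-DD/ could be built. The function names and the file name are hypothetical, and taking the date in UTC is an assumption; the actual upload script may behave differently.

```javascript
// Hypothetical helper: builds the dated key prefix described above.
// Assumption: the date is taken in UTC; the real script may use local time.
function uploadPrefix(date) {
  const day = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return `load-test-images/${day}/`;
}

// Hypothetical helper: full object key for one image file.
function objectKey(date, fileName) {
  return uploadPrefix(date) + fileName;
}

console.log(objectKey(new Date(Date.UTC(2025, 0, 15)), 'face-0001.jpg'));
// prints "load-test-images/2025-01-15/face-0001.jpg"
```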
5. Generate Presigned URLs

    npm run generate-urls

This creates data/presigned-urls.json with presigned URLs for all uploaded images. Valid for 7 days.
IMPORTANT: This file is gitignored and should never be committed. Regenerate weekly for ongoing testing.
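
A staleness check along the following lines can flag an expired URL file before a run starts. This is only a sketch: the function names are hypothetical, and how the suite records the generation time is not documented here, so the timestamp is passed in by the caller.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const SEVEN_DAYS_MS = 7 * DAY_MS;

// True when URLs generated at `generatedAtMs` are no longer valid at `nowMs`.
function urlsExpired(generatedAtMs, nowMs = Date.now()) {
  return nowMs - generatedAtMs >= SEVEN_DAYS_MS;
}

// Days of validity left, floored at zero.
function daysRemaining(generatedAtMs, nowMs = Date.now()) {
  const remainingMs = generatedAtMs + SEVEN_DAYS_MS - nowMs;
  return Math.max(0, remainingMs / DAY_MS);
}

// Example: a file generated three days ago still has four days left.
const generatedAt = Date.UTC(2025, 0, 1);
const now = Date.UTC(2025, 0, 4);
console.log(urlsExpired(generatedAt, now), daysRemaining(generatedAt, now));
```

One option, assuming the file is only rewritten by generate-urls, is to use the modification time of data/presigned-urls.json (for example `fs.statSync(path).mtimeMs` in Node) as the generation timestamp.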
Configuration
Create .env File

    cp .env.example .env

Required Configuration
Edit .env and fill in all values:

    # Target Environment
    SIRLOIN_GRPC_HOST=staging.example.com:9920
    ROUND_GRPC_HOST=staging.example.com:8080
    BRAIN_API_KEY=your_brain_api_key
    CINDER_API_KEY=your_cinder_api_key

    # R2 Storage
    R2_BUCKET_NAME=your-bucket-name
    R2_ENDPOINT=https://account-id.r2.cloudflarestorage.com
    R2_ACCESS_KEY=your_access_key
    R2_SECRET_KEY=your_secret_key
    R2_PUBLIC_URL=https://your-r2-public-domain.com

    # Test Data (from staging environment)
    TEST_USER_IDS=user_2abc123,user_2def456
    TEST_CHARACTER_IDS=uuid1,uuid2,uuid3

    # Load Test Settings (optional, defaults shown)
    RAMP_UP_DURATION=2m
    SUSTAIN_DURATION=15m
    RAMP_DOWN_DURATION=2m

    # Hive API (scenarios 10 and 11 only)
    HIVE_CELEBRITY_API_KEY=your_hive_api_key
    HIVE_MODERATION_API_KEY=your_hive_moderation_api_key
    # HIVE_CELEBRITY_QPS=50   # Target QPS (default 50)
    # HIVE_MODERATION_QPS=50  # Target QPS (default 50)

Validate Configuration

    npm run validate-config

Running Tests
Run All Scenarios

    npm test
    # or
    node run-all.js

This runs scenarios 01-09 sequentially and generates an aggregate report. Scenarios 10 and 11 are standalone and must be run individually.
Run Single Scenario

    k6 run scenarios/01-cinder-onboarding.js
    k6 run scenarios/05-explore-list.js
    k6 run scenarios/10-hive-celebrity.js   # Hive API - requires HIVE_CELEBRITY_API_KEY
    k6 run scenarios/11-hive-moderation.js  # Hive API - requires HIVE_MODERATION_API_KEY
    # etc.

Run Selected Scenarios

    node run-all.js --scenarios 01,02,05
    # Runs only scenarios 01, 02, and 05

Custom Duration
Override test duration via environment variables:

    SUSTAIN_DURATION=5m k6 run scenarios/01-cinder-onboarding.js

Or for all tests:

    SUSTAIN_DURATION=5m node run-all.js

Interpreting Results
During Test Execution
k6 displays real-time metrics:

    running (15m00s), 2250/2250 VUs, 180000 complete and 0 interrupted iterations
    ✓ cinder: status is 200
    ✗ cinder: response has body

- VUs: Virtual users (concurrent simulated users)
- Iterations: Completed requests
- ✓/✗: Check pass/fail status
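
The VU count is tied to the target QPS by Little's law: the number of requests in flight is roughly the arrival rate times the response time. The sample summary for scenario 01 on this page reports 2250 VUs at a 50 QPS target, which matches 50 x 45 s. The sketch below shows that arithmetic; the function name is hypothetical, and whether the suite sizes its VU pool exactly this way is an assumption.

```javascript
// Little's law: concurrency = arrival rate x time in system.
// Hypothetical helper; rounds up so the pool is never undersized.
function estimateVUs(targetQps, avgResponseSeconds) {
  return Math.ceil(targetQps * avgResponseSeconds);
}

console.log(estimateVUs(50, 45));     // prints 2250 (scenario 01 target)
console.log(estimateVUs(1000, 0.2));  // prints 200 (a fast endpoint needs far fewer VUs)
```

Slow endpoints such as Cinder therefore need many more VUs per unit of QPS than fast read-only endpoints.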
After Test Completion
Each scenario outputs a summary:

    ========================================
    Scenario 01: Cinder Onboarding
    ========================================
    Target QPS: 50
    Virtual Users: 2250
    Total Requests: 45000
    Avg Response Time: 45.23s
    p95 Response Time: 58.12s
    p99 Response Time: 63.45s
    Error Rate: 0.12%
    Nudity Detected: 2.34%
    ========================================

Aggregate Report
After running all scenarios:

    Scenario              Status    Requests    p95       Errors
    ────────────────────────────────────────────────────────────────────────────
    Cinder Onboarding     ✓ PASS    45,000      58.12s    0.12%
    Cinder Generation     ✓ PASS    720,000     62.34s    0.23%
    ...

Results are saved to:
- Individual: results/01-cinder-onboarding.json, etc.
- Aggregate: results/aggregate-report.json
Success Criteria
A scenario passes if:
- ✅ Target QPS sustained during 15-minute steady state
- ✅ Error rate < 1%
- ✅ p95 latency within defined thresholds
- ✅ No client-side crashes
A scenario fails if:
- ❌ Error rate ≥ 1%
- ❌ p95 latency exceeds threshold
- ❌ k6 errors or crashes
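
The criteria above can be expressed as a small check over a scenario's summary numbers. This is a sketch only: the result field names and the 60 s p95 threshold in the example are hypothetical, and the real thresholds live in the suite's own configuration.

```javascript
// Hypothetical result shape: { targetQps, achievedQps, errorRate, p95Seconds }.
// errorRate is a fraction (0.01 means 1%).
function evaluateScenario(result, p95ThresholdSeconds) {
  const failures = [];
  if (result.achievedQps < result.targetQps) failures.push('target QPS not sustained');
  if (result.errorRate >= 0.01) failures.push('error rate >= 1%');
  if (result.p95Seconds > p95ThresholdSeconds) failures.push('p95 latency exceeds threshold');
  return { passed: failures.length === 0, failures };
}

// Numbers taken from the sample scenario 01 summary; the threshold is illustrative.
const sample = { targetQps: 50, achievedQps: 50, errorRate: 0.0012, p95Seconds: 58.12 };
console.log(evaluateScenario(sample, 60));
```

Collecting every failed criterion, rather than returning at the first one, makes it easier to see whether a scenario failed on latency, errors, or both.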
Test Scenarios Deep Dive
01-02: Cinder Nudity Detection
Purpose: Test Cinder API capacity for synchronous nudity detection
Key Metrics:
- Response time (30-60s expected)
- Nudity detection rate
- HTTP error rate
Notes:
- Each request uses a unique presigned URL
- Response time includes full ML processing
- Workflow types: user_onboarding vs generation.response
03-04: Image/Video Generation
Purpose: Test Sirloin GenerateMedia submission capacity
Key Metrics:
- Submission latency (<2s expected)
- gRPC error rate
Notes:
- Tests submission rate, not completion rate
- Actual generation takes 60-120s (images) or 600-1200s (videos)
- Polls for 10s to catch fast completions
- Most generations won’t complete during test - this is expected
05-07: Explore Endpoints
Purpose: Test Sirloin read-only endpoints for discovery/search
Key Metrics:
- Response time (p95 < 200-500ms)
- Results returned
- gRPC error rate
Notes:
- High QPS (1k-10k)
- Read-only, no side effects
- Tests pagination, search, and filtering
10: Hive Celebrity Recognition
Purpose: Test Hive API capacity for celebrity recognition (standalone scenario, not in run-all)
Key Metrics:
- Response time (p95 < 35s)
- HTTP error rate (< 1%)
Notes:
- Requires HIVE_CELEBRITY_API_KEY (source .env before k6 or use -e HIVE_CELEBRITY_API_KEY=xxx)
- Requires data/presigned-urls.json (run npm run setup, upload-images, generate-urls)
- Override QPS: HIVE_CELEBRITY_QPS=25
11: Hive Content Moderation
Purpose: Test Hive API capacity for content moderation (standalone scenario, not in run-all)
Key Metrics:
- Response time (p95 < 35s)
- HTTP error rate (< 1%)
Notes:
- Requires HIVE_MODERATION_API_KEY (source .env before k6 or use -e HIVE_MODERATION_API_KEY=xxx)
- Requires data/presigned-urls.json (run npm run setup, upload-images, generate-urls)
- Override QPS: HIVE_MODERATION_QPS=25
- Run: pnpm run test:hive-moderation
08-09: Round Service
Purpose: Test Round inference endpoints
Key Metrics:
- Inference time (p95 < 500ms-2s)
- Embedding dimensions / faces detected
- gRPC error rate
Notes:
- Embeddings: Varying text lengths (10-1000 chars)
- Face Detection: Base64-encoded images (~15MB max)
- Direct gRPC to Round (not via Sirloin)
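
As an illustration of the varying-length embedding inputs, the following sketch generates text between 10 and 1000 characters. The helper names and alphabet are hypothetical; the suite may instead draw its inputs from data/test-data.json.

```javascript
// Integer in [min, max]; `rand` is injectable so the bounds can be tested.
function randomInt(min, max, rand = Math.random) {
  return min + Math.floor(rand() * (max - min + 1));
}

// Hypothetical generator for embedding inputs of 10-1000 characters.
function randomText(minLen = 10, maxLen = 1000, rand = Math.random) {
  const alphabet = 'abcdefghijklmnopqrstuvwxyz ';
  const length = randomInt(minLen, maxLen, rand);
  let text = '';
  for (let i = 0; i < length; i++) {
    text += alphabet[randomInt(0, alphabet.length - 1, rand)];
  }
  return text;
}

console.log(randomText().length); // some value between 10 and 1000
```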
Troubleshooting
gRPC Connection Errors

    WARN[0001] Request Failed error="rpc error: code = Unavailable"

Solutions:
- Check SIRLOIN_GRPC_HOST / ROUND_GRPC_HOST in .env
- Verify network connectivity: telnet staging.example.com 9920
- Check firewall rules
- Ensure services are running
Authentication Failures

    error="rpc error: code = Unauthenticated desc = invalid token"

Solutions:
- Verify BRAIN_API_KEY / CINDER_API_KEY in .env
- Check API key expiry
- Ensure API keys are for the correct environment
Presigned URL Errors

    Error: No presigned URLs available

Solutions:
- Run npm run generate-urls
- Check URLs haven’t expired (7-day limit)
- Verify data/presigned-urls.json exists
- Regenerate if older than 7 days
Image Loading Errors

    Failed to load images: ENOENT: no such file or directory

Solutions:
- Run npm run setup to download images
- Manually place images in data/images/
- Verify at least 100 images exist
Out of Memory (OOM)

    FATAL: JavaScript heap out of memory

Solutions:
- Reduce SUSTAIN_DURATION (try 5m instead of 15m)
- Run scenarios individually instead of all at once
- Reduce target QPS via environment variables
- Increase Node.js memory: NODE_OPTIONS="--max-old-space-size=4096" node run-all.js
High Error Rates
If error rate > 1%:
- Check service logs for backend errors
- Reduce QPS to find sustainable rate
- Verify test data (user IDs, character IDs) are valid
- Check database connection limits
- Monitor CPU/memory on backend services
Advanced Usage
Custom QPS Targets
Override any scenario’s target QPS:

    CINDER_ONBOARDING_QPS=25 k6 run scenarios/01-cinder-onboarding.js
    IMAGE_GENERATION_QPS=400 k6 run scenarios/03-image-generation.js

Custom Timeouts

    IMAGE_GENERATION_TIMEOUT=300 k6 run scenarios/03-image-generation.js

Debug Mode

    DEBUG=true k6 run scenarios/01-cinder-onboarding.js

Cloud Execution
Run tests from k6 Cloud for distributed load:

    k6 cloud scenarios/05-explore-list.js

Maintenance
Weekly Tasks
- Regenerate presigned URLs: npm run generate-urls (URLs expire after 7 days)
As Needed
- Update test data: Edit data/test-data.json with new search queries, prompts, etc.
- Refresh dataset: Re-run npm run setup if images change
- Update proto code: Run make generate-proto from repo root after proto changes
Architecture
Directory Structure

    load-tests/
    ├── data/                    # Test data (gitignored)
    │   ├── images/              # Local image dataset
    │   ├── presigned-urls.json  # Generated presigned URLs
    │   └── test-data.json       # Search queries, prompts, etc.
    ├── generated/               # Generated proto code (gitignored)
    │   ├── sirloin/             # Sirloin gRPC types
    │   └── round/               # Round gRPC types
    ├── results/                 # Test results (gitignored)
    ├── scenarios/               # k6 test scenarios
    ├── scripts/                 # Setup scripts
    ├── utils/                   # Shared utilities
    ├── .env                     # Environment config (gitignored)
    ├── .env.example             # Example configuration
    ├── package.json             # Dependencies
    └── run-all.js               # Master test runner

Load Profile
All scenarios use gradual ramp-up:

    VUs
     |
     |        ┌─────────────────┐
     |       /                   \
     |      /                     \
     |     /                       \
     |____/                         \____
     └────────────────────────────────── Time
          2m          15m          2m
        ramp-up     sustain     ramp-down

This prevents sudden traffic spikes and provides more realistic load patterns.
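
The ramp-up, sustain, and ramp-down profile maps naturally onto a k6 stages array. The sketch below builds one from the duration settings documented in the Configuration section; the function name is hypothetical, and whether the scenarios use VU-based stages or an arrival-rate executor is an assumption.

```javascript
// Hypothetical helper: builds a three-stage profile from the documented
// duration variables, falling back to the documented defaults.
function buildStages(targetVUs, env = {}) {
  const rampUp = env.RAMP_UP_DURATION || '2m';
  const sustain = env.SUSTAIN_DURATION || '15m';
  const rampDown = env.RAMP_DOWN_DURATION || '2m';
  return [
    { duration: rampUp, target: targetVUs },   // climb from 0 to the target
    { duration: sustain, target: targetVUs },  // hold steady state
    { duration: rampDown, target: 0 },         // drain back to zero
  ];
}

console.log(buildStages(2250, { SUSTAIN_DURATION: '5m' }));
```

Inside a k6 script, the result could be passed as `export const options = { stages: buildStages(2250, __ENV) }`, since k6 exposes environment variables through `__ENV`.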
Contributing
When adding new scenarios:
- Create scenario file in scenarios/
- Define thresholds in utils/metrics.js
- Add to scenarios array in run-all.js
- Update README with scenario description
- Test individually before adding to suite
Support
For issues or questions:
- Check troubleshooting section above
- Review k6 documentation: https://k6.io/docs/
- Check service logs for backend errors
- Verify configuration with npm run validate-config