
Load Testing

Comprehensive k6 load testing suite for the BEEF platform, testing service limits across Sirloin, Round, Brain, and Cinder APIs.

Overview

This suite includes 11 load test scenarios designed to find performance limits of various platform services:

Scenario   Service        Target QPS   Description
01         Cinder         50           Onboarding nudity detection (30-60s response)
02         Cinder         800          Generation nudity detection (30-60s response)
03         Sirloin/Brain  800          Image generation submission + polling
04         Sirloin/Brain  100          Video generation submission + polling
05         Sirloin        10,000       Explore list (pagination)
06         Sirloin        1,000        Explore search queries
07         Sirloin        1,000        Explore filtering
08         Round          1,000        Text embeddings generation
09         Round          100          Face detection (base64 images)
10         Hive           50           Celebrity recognition (standalone, not in run-all)
11         Hive           50           Content moderation (standalone, not in run-all)

Test Families

This page covers two different test surfaces:

  • load-tests/: k6 load and performance scenarios for service-limit discovery.
  • tests/: Playwright API and e2e tests configured by tests/playwright.config.ts.

See [Testing](/operations/testing/) for the per-service unit, lint, and typecheck command matrix.

Playwright API And E2E Tests

Playwright tests live under tests/ and are configured by tests/playwright.config.ts. They exercise API and browser flows rather than sustained load. Install dependencies from the tests directory and use the package scripts there:

Terminal window
cd tests
npm install
npx playwright install
npm run test:api

Use Playwright when validating request/response behavior, user journeys, browser compatibility, traces, screenshots, or API regression coverage. Use k6 load tests when measuring throughput, latency, saturation, or service-limit behavior.

Prerequisites

Required Software

  • k6 - Load testing tool

    Terminal window
    # macOS
    brew install k6
    # Linux
    sudo gpg -k
    sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
    --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
    echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | \
    sudo tee /etc/apt/sources.list.d/k6.list
    sudo apt-get update
    sudo apt-get install k6
  • Node.js 18+ - For utility scripts

    Terminal window
    node --version # Should be 18 or higher
  • ts-proto - For proto code generation

    Terminal window
    npm install -g ts-proto

Required Access

  • API Keys: Brain API key, Cinder API key, Hive API keys (scenarios 10 and 11 only)
  • R2 Credentials: Access to Cloudflare R2 bucket (or S3-compatible storage)
  • Service Access: Network access to staging/dev environment
  • Test Data: Valid user IDs and character IDs from target environment

Setup

1. Install Dependencies

Terminal window
cd load-tests
npm install

2. Generate Proto Code

From the repository root:

Terminal window
make generate-proto

This generates gRPC client code in load-tests/generated/ for both Sirloin and Round services.

3. Prepare Dataset

Download and prepare ~1,000 images for testing:

Terminal window
npm run setup

This script offers three options:

  • Option 1: CelebA-HQ (manual download required)
  • Option 2: LFW dataset (auto-download)
  • Option 3: Use your own images

Images will be placed in data/images/.

4. Upload Images to R2

First, configure your environment (see Configuration section below), then:

Terminal window
npm run upload-images

This uploads all images from data/images/ to your configured R2 bucket under load-test-images/YYYY-MM-DD/.
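The date-partitioned key layout above can be sketched as a small helper (the function name is hypothetical; the path shape is taken from the upload prefix shown):

```javascript
// Sketch: build the R2 object key used for uploaded test images (helper name is hypothetical).
function r2Key(filename, date = new Date()) {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
  return `load-test-images/${day}/${filename}`;
}

// Example with a fixed date:
console.log(r2Key('face-0001.jpg', new Date('2024-01-02T12:00:00Z')));
// → load-test-images/2024-01-02/face-0001.jpg
```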

5. Generate Presigned URLs

Terminal window
npm run generate-urls

This creates data/presigned-urls.json with presigned URLs for all uploaded images. Valid for 7 days.

IMPORTANT: This file is gitignored and should never be committed. Regenerate weekly for ongoing testing.

Configuration

Create .env File

Terminal window
cp .env.example .env

Required Configuration

Edit .env and fill in all values:

Terminal window
# Target Environment
SIRLOIN_GRPC_HOST=staging.example.com:9920
ROUND_GRPC_HOST=staging.example.com:8080
BRAIN_API_KEY=your_brain_api_key
CINDER_API_KEY=your_cinder_api_key
# R2 Storage
R2_BUCKET_NAME=your-bucket-name
R2_ENDPOINT=https://account-id.r2.cloudflarestorage.com
R2_ACCESS_KEY=your_access_key
R2_SECRET_KEY=your_secret_key
R2_PUBLIC_URL=https://your-r2-public-domain.com
# Test Data (from staging environment)
TEST_USER_IDS=user_2abc123,user_2def456
TEST_CHARACTER_IDS=uuid1,uuid2,uuid3
# Load Test Settings (optional, defaults shown)
RAMP_UP_DURATION=2m
SUSTAIN_DURATION=15m
RAMP_DOWN_DURATION=2m
# Hive API (scenarios 10 and 11 only)
HIVE_CELEBRITY_API_KEY=your_hive_api_key
HIVE_MODERATION_API_KEY=your_hive_moderation_api_key
# HIVE_CELEBRITY_QPS=50 # Target QPS (default 50)
# HIVE_MODERATION_QPS=50 # Target QPS (default 50)

Validate Configuration

Terminal window
npm run validate-config
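The exact checks performed by validate-config are not shown here; a minimal sketch of the idea, using required variable names from the .env example above:

```javascript
// Sketch: report which required .env variables are unset or blank
// (variable list taken from the example above; the real script may check more).
function missingVars(env, required = [
  'SIRLOIN_GRPC_HOST', 'ROUND_GRPC_HOST',
  'BRAIN_API_KEY', 'CINDER_API_KEY',
  'R2_BUCKET_NAME', 'R2_ENDPOINT', 'R2_ACCESS_KEY', 'R2_SECRET_KEY',
]) {
  return required.filter((name) => !env[name] || env[name].trim() === '');
}
```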

Running Tests

Run All Scenarios

Terminal window
npm test
# or
node run-all.js

This runs the 9 run-all scenarios (01-09) sequentially and generates an aggregate report. Hive scenarios 10 and 11 are standalone and must be run individually.

Run Single Scenario

Terminal window
k6 run scenarios/01-cinder-onboarding.js
k6 run scenarios/05-explore-list.js
k6 run scenarios/10-hive-celebrity.js # Hive API - requires HIVE_CELEBRITY_API_KEY
k6 run scenarios/11-hive-moderation.js # Hive API - requires HIVE_MODERATION_API_KEY
# etc.

Run Selected Scenarios

Terminal window
node run-all.js --scenarios 01,02,05
# Runs only scenarios 01, 02, and 05
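The --scenarios flag presumably filters the scenario list before execution; a sketch of that selection (run-all.js internals assumed, not verified):

```javascript
// Sketch: filter scenarios by the comma-separated --scenarios value (internals assumed).
function selectScenarios(all, flag) {
  if (!flag) return all; // no flag: run everything
  const wanted = new Set(flag.split(','));
  return all.filter((s) => wanted.has(s.id));
}

const all = [{ id: '01' }, { id: '02' }, { id: '05' }, { id: '06' }];
console.log(selectScenarios(all, '01,02,05').map((s) => s.id)); // [ '01', '02', '05' ]
```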

Custom Duration

Override test duration via environment variables:

Terminal window
SUSTAIN_DURATION=5m k6 run scenarios/01-cinder-onboarding.js

Or for all tests:

Terminal window
SUSTAIN_DURATION=5m node run-all.js

Interpreting Results

During Test Execution

k6 displays real-time metrics:

running (15m00s), 2250/2250 VUs, 180000 complete and 0 interrupted iterations
✓ cinder: status is 200
✗ cinder: response has body
  • VUs: Virtual users (concurrent simulated users)
  • Iterations: Completed requests
  • ✓/✗: Check pass/fail status

After Test Completion

Each scenario outputs a summary:

========================================
Scenario 01: Cinder Onboarding
========================================
Target QPS: 50
Virtual Users: 2250
Total Requests: 45000
Avg Response Time: 45.23s
p95 Response Time: 58.12s
p99 Response Time: 63.45s
Error Rate: 0.12%
Nudity Detected: 2.34%
========================================

Aggregate Report

After running all scenarios:

Scenario            Status   Requests   p95      Errors
────────────────────────────────────────────────────────
Cinder Onboarding   ✓ PASS   45,000     58.12s   0.12%
Cinder Generation   ✓ PASS   720,000    62.34s   0.23%
...

Results are saved to:

  • Individual: results/01-cinder-onboarding.json, etc.
  • Aggregate: results/aggregate-report.json

Success Criteria

A scenario passes if:

  • ✅ Target QPS sustained during 15-minute steady state
  • ✅ Error rate < 1%
  • ✅ p95 latency within defined thresholds
  • ✅ No client-side crashes

A scenario fails if:

  • ❌ Error rate ≥ 1%
  • ❌ p95 latency exceeds threshold
  • ❌ k6 errors or crashes
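The criteria above can be collapsed into a single pass/fail predicate per scenario summary; a sketch (field names are hypothetical, not the actual report schema):

```javascript
// Sketch: pass/fail predicate over one scenario summary (field names are hypothetical).
function passes(summary, p95LimitMs) {
  return summary.errorRate < 0.01     // error rate < 1%
      && summary.p95Ms <= p95LimitMs  // p95 latency within threshold
      && !summary.crashed;            // no k6 errors or crashes
}
```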

Test Scenarios Deep Dive

01-02: Cinder Nudity Detection

Purpose: Test Cinder API capacity for synchronous nudity detection

Key Metrics:

  • Response time (30-60s expected)
  • Nudity detection rate
  • HTTP error rate

Notes:

  • Each request uses a unique presigned URL
  • Response time includes full ML processing
  • Workflow types: user_onboarding vs generation.response

03-04: Image/Video Generation

Purpose: Test Sirloin GenerateMedia submission capacity

Key Metrics:

  • Submission latency (<2s expected)
  • gRPC error rate

Notes:

  • Tests submission rate, not completion rate
  • Actual generation takes 60-120s (images) or 600-1200s (videos)
  • Polls for 10s to catch fast completions
  • Most generations won’t complete during the test; this is expected
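The bounded 10-second poll can be sketched as a loop with a deadline (function names hypothetical; a real k6 scenario would use k6's sleep rather than an injected clock):

```javascript
// Sketch: poll a status function until done or a 10s budget is spent.
// The clock is injected so the logic is testable without real waiting.
function pollBriefly(checkStatus, clock, budgetMs = 10000, intervalMs = 1000) {
  const start = clock.now();
  while (clock.now() - start < budgetMs) {
    if (checkStatus() === 'completed') return true; // fast completion caught
    clock.sleep(intervalMs);                        // k6 would call sleep(1)
  }
  return false; // still generating: expected for most requests
}
```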

05-07: Explore Endpoints

Purpose: Test Sirloin read-only endpoints for discovery/search

Key Metrics:

  • Response time (p95 < 200-500ms)
  • Results returned
  • gRPC error rate

Notes:

  • High QPS (1k-10k)
  • Read-only, no side effects
  • Tests pagination, search, and filtering

10: Hive Celebrity Recognition

Purpose: Test Hive API capacity for celebrity recognition (standalone scenario, not in run-all)

Key Metrics:

  • Response time (p95 < 35s)
  • HTTP error rate (< 1%)

Notes:

  • Requires HIVE_CELEBRITY_API_KEY (source .env before k6 or use -e HIVE_CELEBRITY_API_KEY=xxx)
  • Requires data/presigned-urls.json (run npm run setup, upload-images, generate-urls)
  • Override QPS: HIVE_CELEBRITY_QPS=25

11: Hive Content Moderation

Purpose: Test Hive API capacity for content moderation (standalone scenario, not in run-all)

Key Metrics:

  • Response time (p95 < 35s)
  • HTTP error rate (< 1%)

Notes:

  • Requires HIVE_MODERATION_API_KEY (source .env before k6 or use -e HIVE_MODERATION_API_KEY=xxx)
  • Requires data/presigned-urls.json (run npm run setup, upload-images, generate-urls)
  • Override QPS: HIVE_MODERATION_QPS=25
  • Run: pnpm run test:hive-moderation

08-09: Round Service

Purpose: Test Round inference endpoints

Key Metrics:

  • Inference time (p95 < 500ms-2s)
  • Embedding dimensions / faces detected
  • gRPC error rate

Notes:

  • Embeddings: Varying text lengths (10-1000 chars)
  • Face Detection: Base64-encoded images (~15MB max)
  • Direct gRPC to Round (not via Sirloin)
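Because base64 inflates payloads by roughly 4/3, it is worth estimating the encoded size before approaching the ~15MB limit; a quick sketch:

```javascript
// Sketch: base64 encodes every 3 raw bytes as 4 output characters (padded).
function base64Size(rawBytes) {
  return 4 * Math.ceil(rawBytes / 3);
}

// An ~11MB raw image stays under the ~15MB limit once encoded:
console.log(base64Size(11 * 1024 * 1024)); // 15379116
```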

Troubleshooting

gRPC Connection Errors

WARN[0001] Request Failed error="rpc error: code = Unavailable"

Solutions:

  • Check SIRLOIN_GRPC_HOST / ROUND_GRPC_HOST in .env
  • Verify network connectivity: telnet staging.example.com 9920
  • Check firewall rules
  • Ensure services are running

Authentication Failures

error="rpc error: code = Unauthenticated desc = invalid token"

Solutions:

  • Verify BRAIN_API_KEY / CINDER_API_KEY in .env
  • Check API key expiry
  • Ensure API keys are for correct environment

Presigned URL Errors

Error: No presigned URLs available

Solutions:

  • Run npm run generate-urls
  • Check URLs haven’t expired (7-day limit)
  • Verify data/presigned-urls.json exists
  • Regenerate if older than 7 days
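A staleness check against the 7-day window is easy to script; a sketch (the generation timestamp and where it is stored are assumptions, not the actual JSON schema):

```javascript
// Sketch: has a presigned-URL batch passed its 7-day validity window?
// (generatedAtIso is an assumed field; check your presigned-urls.json for the real one.)
function urlsExpired(generatedAtIso, nowMs = Date.now()) {
  const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;
  return nowMs - Date.parse(generatedAtIso) > SEVEN_DAYS_MS;
}
```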

Image Loading Errors

Failed to load images: ENOENT: no such file or directory

Solutions:

  • Run npm run setup to download images
  • Manually place images in data/images/
  • Verify at least 100 images exist

Out of Memory (OOM)

FATAL: JavaScript heap out of memory

Solutions:

  • Reduce SUSTAIN_DURATION (try 5m instead of 15m)
  • Run scenarios individually instead of all at once
  • Reduce target QPS via environment variables
  • Increase Node.js memory: NODE_OPTIONS="--max-old-space-size=4096" node run-all.js

High Error Rates

If error rate > 1%:

  • Check service logs for backend errors
  • Reduce QPS to find sustainable rate
  • Verify test data (user IDs, character IDs) are valid
  • Check database connection limits
  • Monitor CPU/memory on backend services

Advanced Usage

Custom QPS Targets

Override any scenario’s target QPS:

Terminal window
CINDER_ONBOARDING_QPS=25 k6 run scenarios/01-cinder-onboarding.js
IMAGE_GENERATION_QPS=400 k6 run scenarios/03-image-generation.js

Custom Timeouts

Terminal window
IMAGE_GENERATION_TIMEOUT=300 k6 run scenarios/03-image-generation.js

Debug Mode

Terminal window
DEBUG=true k6 run scenarios/01-cinder-onboarding.js

Cloud Execution

Run tests from k6 Cloud for distributed load:

Terminal window
k6 cloud scenarios/05-explore-list.js

Maintenance

Weekly Tasks

  • Regenerate presigned URLs: npm run generate-urls (URLs expire after 7 days)

As Needed

  • Update test data: Edit data/test-data.json with new search queries, prompts, etc.
  • Refresh dataset: Re-run npm run setup if images change
  • Update proto code: Run make generate-proto from repo root after proto changes

Architecture

Directory Structure

load-tests/
├── data/                    # Test data (gitignored)
│   ├── images/              # Local image dataset
│   ├── presigned-urls.json  # Generated presigned URLs
│   └── test-data.json       # Search queries, prompts, etc.
├── generated/               # Generated proto code (gitignored)
│   ├── sirloin/             # Sirloin gRPC types
│   └── round/               # Round gRPC types
├── results/                 # Test results (gitignored)
├── scenarios/               # k6 test scenarios
├── scripts/                 # Setup scripts
├── utils/                   # Shared utilities
├── .env                     # Environment config (gitignored)
├── .env.example             # Example configuration
├── package.json             # Dependencies
└── run-all.js               # Master test runner

Load Profile

All scenarios use gradual ramp-up:

VUs
 |         ┌─────────────────┐
 |        /                   \
 |       /                     \
 |      /                       \
 |_____/                         \_____
 └───────────────────────────────────── Time
    2m           15m             2m
 ramp-up       sustain       ramp-down

This prevents sudden traffic spikes and provides more realistic load patterns.
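In k6 terms, this trapezoid corresponds to a three-stage profile driven by the duration variables from Configuration; a sketch (the peak VU count is illustrative, and each scenario sets its own):

```javascript
// Sketch: the ramp-up / sustain / ramp-down profile as k6-style stages.
function rampStages(env = {}, peakVUs = 100) {
  return [
    { duration: env.RAMP_UP_DURATION   || '2m',  target: peakVUs }, // ramp-up
    { duration: env.SUSTAIN_DURATION   || '15m', target: peakVUs }, // sustain
    { duration: env.RAMP_DOWN_DURATION || '2m',  target: 0 },       // ramp-down
  ];
}
```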

Contributing

When adding new scenarios:

  1. Create scenario file in scenarios/
  2. Define thresholds in utils/metrics.js
  3. Add to scenarios array in run-all.js
  4. Update README with scenario description
  5. Test individually before adding to suite

Support

For issues or questions:

  • Check troubleshooting section above
  • Review k6 documentation: https://k6.io/docs/
  • Check service logs for backend errors
  • Verify configuration with npm run validate-config