Three-Server Validation Runbook
Roles
Server A: Registry and image/build source
Server B: Public control plane, app, auth, teams, scoreboard, admin
Server C: VPN and routing plane for per-principal / per-team access
Preflight
Run python cli.py env validate role-a --app-env production --json
Run python cli.py env validate role-b --app-env production --json
Run python cli.py env validate role-c --app-env production --json
Run python cli.py db bootstrap --app-env production --json
Run python cli.py health all --app-env production --json
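The preflight sequence above could be scripted so that it stops at the first failing step. This is a minimal sketch, assuming cli.py exits non-zero on failure and prints one JSON document on stdout when --json is passed; this runbook confirms neither behavior. The helper names preflight_commands and run_preflight are hypothetical.

```python
import json
import subprocess

APP_ENV = "production"


def preflight_commands(app_env=APP_ENV):
    """Return the preflight commands from this runbook, in order."""
    steps = [
        ["env", "validate", "role-a"],
        ["env", "validate", "role-b"],
        ["env", "validate", "role-c"],
        ["db", "bootstrap"],
        ["health", "all"],
    ]
    return [["python", "cli.py", *step, "--app-env", app_env, "--json"] for step in steps]


def run_preflight(commands, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Assumption: a non-zero exit code means the step failed.
    Returns a list of (command, returncode, parsed_json_or_None).
    """
    results = []
    for cmd in commands:
        proc = runner(cmd, capture_output=True, text=True)
        try:
            payload = json.loads(proc.stdout)
        except (json.JSONDecodeError, TypeError):
            payload = None  # stdout was not JSON; rely on the exit code alone
        results.append((cmd, proc.returncode, payload))
        if proc.returncode != 0:
            break
    return results
```

Calling run_preflight(preflight_commands()) from the directory that holds cli.py returns one tuple per executed step, so the last tuple identifies the step that failed, if any.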
Functional Staging
Server A
Verify registry host, credentials, and local image availability with python cli.py registry check --json
Build and push challenge images with python cli.py challenge build <slug> then python cli.py challenge push <slug>
Confirm that a spawn request for a missing image fails with a clear error before enabling fresh spawns
Server B
Bootstrap the schema and admin user
Validate registration, login, reset, solo principal creation, team create/join, challenge access, scoreboard, admin pages, and supported appearance presets
Validate local or remote instance lifecycle with:
python cli.py instance spawn <slug> --principal-id <id>
python cli.py instance reset <slug> --principal-id <id>
python cli.py instance destroy <slug> --principal-id <id>
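The spawn, reset, destroy sequence above could be driven from a small script. This is a sketch under the assumption that each command exits non-zero on failure; lifecycle_commands and run_lifecycle are hypothetical helper names, and the slug and principal id are placeholders supplied by the caller.

```python
import subprocess

LIFECYCLE_ACTIONS = ("spawn", "reset", "destroy")


def lifecycle_commands(slug, principal_id):
    """Build the three instance commands from this runbook, in order."""
    return [
        ["python", "cli.py", "instance", action, slug, "--principal-id", str(principal_id)]
        for action in LIFECYCLE_ACTIONS
    ]


def run_lifecycle(slug, principal_id, runner=subprocess.run):
    """Run spawn, then reset, then destroy.

    Assumption: a non-zero exit code signals failure.
    If spawn fails, reset and destroy are skipped (recorded as None).
    If spawn succeeds, destroy is always attempted so no instance is left behind.
    """
    spawn, reset, destroy = lifecycle_commands(slug, principal_id)
    codes = {"spawn": None, "reset": None, "destroy": None}
    codes["spawn"] = runner(spawn).returncode
    if codes["spawn"] == 0:
        try:
            codes["reset"] = runner(reset).returncode
        finally:
            codes["destroy"] = runner(destroy).returncode
    return codes
```

Attempting destroy even when reset fails is a design choice of this sketch, intended to keep a failed validation run from leaking instances.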
Server C
Validate WireGuard access and peer visibility with:
python cli.py vpn status --json
python cli.py vpn test --json
python cli.py vpn reconcile --json
Confirm strict routing: each team or principal can reach only its own assigned subnet
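The strict-routing rule can be expressed as a check over observed reachability. This is a minimal sketch, assuming each team or principal is assigned exactly one subnet; the function name routing_violations and the data shapes are hypothetical, and collecting the reachability data (for example by probing from each VPN client) is outside the sketch.

```python
import ipaddress


def routing_violations(assignments, reachable):
    """Return (owner, address) pairs where an owner reached a foreign subnet.

    assignments: dict owner -> CIDR string (hypothetical shape).
    reachable:   dict owner -> iterable of IP strings that owner could reach.
    """
    nets = {owner: ipaddress.ip_network(cidr) for owner, cidr in assignments.items()}
    violations = []
    for owner, addresses in reachable.items():
        own = nets[owner]
        for addr in addresses:
            ip = ipaddress.ip_address(addr)
            if ip in own:
                continue  # reaching your own subnet is expected
            # Flag only addresses that fall inside another owner's subnet.
            if any(ip in net for other, net in nets.items() if other != owner):
                violations.append((owner, addr))
    return violations
```

An empty result means no owner reached an address inside a subnet assigned to someone else; any returned pair names the owner and the foreign address it reached.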
Load and Soak
Target 50-100 teams and 200-400 concurrent VPN clients
Blend principals and traffic (each mix sums to 100%):
Principal mix: 25% solo principals, 75% team principals
Traffic mix: 60% control-plane browsing, 30% active challenge traffic, 10% spawn/reset/reconnect churn
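The percentages above form two separate splits, one over principals and one over traffic, and each sums to 100%. The sketch below turns a split into whole-number counts for a load generator. Applying the traffic split to the concurrent VPN client count is an assumption, as are the function name split_by_percent and the largest-remainder rounding policy.

```python
def split_by_percent(total, percents):
    """Split an integer total by integer percentages that sum to 100.

    Each part gets the floor of its share; leftover units go to the
    parts with the largest remainders so the parts always sum to total.
    """
    if sum(percents.values()) != 100:
        raise ValueError("percentages must sum to 100")
    raw = {name: total * pct for name, pct in percents.items()}  # in hundredths
    parts = {name: value // 100 for name, value in raw.items()}
    leftover = total - sum(parts.values())
    by_remainder = sorted(raw, key=lambda name: raw[name] % 100, reverse=True)
    for name in by_remainder[:leftover]:
        parts[name] += 1
    return parts


PRINCIPAL_MIX = {"solo": 25, "team": 75}
TRAFFIC_MIX = {"browse": 60, "challenge": 30, "churn": 10}
```

For the upper target of 400 concurrent clients, split_by_percent(400, TRAFFIC_MIX) yields 240 browsing, 120 challenge, and 40 churn clients.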
Observe:
login latency
spawn latency
active WireGuard peers
container start time
Redis and database latency
CPU, memory, and network saturation on B and C
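Latency metrics such as login and spawn latency are usually easier to compare across runs as percentiles than as averages. The sketch below uses the nearest-rank method; the choice of p50 and p95 and the helper names are assumptions, since this runbook does not state latency thresholds.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct percent of the
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("no samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def summarize(samples):
    """Summarize one latency series (any consistent unit, e.g. milliseconds)."""
    return {
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "max": max(samples),
    }
```

The same helper can be applied to each observed series separately, which keeps a slow tail in one series from being hidden by another.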
Failure Drills
Stop registry access on Server A and verify fresh spawn failures are clear and recoverable
Restart Server B and verify that sessions persist and that no duplicate instances are created
Restart Server C and verify peer reconciliation and route restoration
Force one stale WireGuard peer, one expired lease, and one wrong registry credential, then confirm cleanup and audit visibility for each
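For the Server B restart drill, the "no duplicate instance" condition can be checked against a listing of instances taken after the restart. This is a sketch only: this runbook does not document an instance listing command, so the record shape (dicts with principal_id and slug keys) is hypothetical, as is the assumption that a principal should hold at most one instance per challenge.

```python
from collections import Counter


def duplicate_instances(records):
    """Return (principal_id, slug) keys that appear more than once.

    records: iterable of dicts with hypothetical keys 'principal_id' and 'slug'.
    """
    counts = Counter((r["principal_id"], r["slug"]) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)
```

An empty list after the restart indicates that no principal ended up with two instances of the same challenge.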
Acceptance Criteria
Solo users receive one personal principal and private challenge access
Teams never exceed 4 active members
Lease cleanup occurs at the configured TTL boundary
No cross-team subnet reachability exists
Supported appearance presets render correctly on public, player, and admin pages
CLI commands remain deterministic, automation-friendly, and JSON-capable
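Two of the criteria above, the team size limit and lease cleanup at the TTL boundary, lend themselves to automated checks. The sketch below assumes membership and lease data have already been exported into plain dictionaries; those shapes and the function names are hypothetical, and only the limit of 4 active members comes from this runbook.

```python
from datetime import datetime, timedelta, timezone

MAX_TEAM_MEMBERS = 4  # from the acceptance criteria


def oversized_teams(active_members):
    """Return names of teams with more than MAX_TEAM_MEMBERS active members.

    active_members: dict team name -> list of active member ids (hypothetical shape).
    """
    return sorted(team for team, members in active_members.items()
                  if len(members) > MAX_TEAM_MEMBERS)


def overdue_leases(leases, ttl, now):
    """Return ids of leases that still exist after created_at + ttl.

    leases: dict lease id -> timezone-aware creation datetime (hypothetical shape).
    ttl:    the configured TTL as a timedelta.
    now:    the timezone-aware time of the check.
    """
    return sorted(lease_id for lease_id, created in leases.items()
                  if now > created + ttl)
```

Both functions return an empty list when the criterion holds, which makes them straightforward to assert on in an automated acceptance run.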