TAB Production Deployment Playbook

Version: 2.0 | Last Updated: February 2026 | Classification: Internal

Phase 1: Pre-Deployment Checklist

API keys rotated and stored in secrets manager
CORS origins restricted to production domains
Rate limiting configured per tier (Free: 10/min, Pro: 100/min, Enterprise: 1000/min)
Spider-Sense screening enabled and patterns reviewed (config/spider_sense_patterns.json)
Tool Gateway validation active with JSONL audit logging
JWT secret key generated (min 256-bit)
Admin panel access restricted to allowlisted IPs
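
The JWT secret item above can be satisfied with the standard library. The following is a minimal sketch, not the project's actual tooling; the function names and the hex encoding are assumptions chosen for illustration.

```python
import secrets

MIN_SECRET_BITS = 256  # minimum from the checklist


def generate_jwt_secret(bits: int = MIN_SECRET_BITS) -> str:
    """Return a hex-encoded random secret with at least `bits` bits of entropy."""
    if bits < MIN_SECRET_BITS:
        raise ValueError(f"JWT secret must be at least {MIN_SECRET_BITS} bits")
    # token_hex(n) draws n random bytes and returns 2*n hex characters
    return secrets.token_hex((bits + 7) // 8)


def is_strong_enough(secret_hex: str) -> bool:
    """Check that a hex-encoded secret decodes to at least 256 bits."""
    try:
        return len(bytes.fromhex(secret_hex)) * 8 >= MIN_SECRET_BITS
    except ValueError:  # not valid hex
        return False


if __name__ == "__main__":
    # Print once and store the value in the secrets manager, not in source control
    print(generate_jwt_secret())
```

A check like is_strong_enough could run at startup so that a short or malformed secret fails fast instead of reaching production.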

OpenAI API key configured (GPT-4.1, GPT-4.1-Mini, GPT-4.1-Nano, O3, O3-Mini, O4-Mini)
Anthropic API key configured (Claude Sonnet 4, Claude Opus 4)
Provider fallback chain tested
Cost budgets and alerts configured per-user and per-agent
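
One way to exercise the provider fallback chain is to run it against stub providers where the first one fails. This sketch assumes each provider is wrapped as a plain callable; complete_with_fallback, AllProvidersFailed, and the stub names are hypothetical and do not reflect TAB's real provider interface.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, response) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real chain would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stub providers standing in for real API clients (hypothetical)
def failing_primary(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def healthy_secondary(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    chain = [("primary", failing_primary), ("secondary", healthy_secondary)]
    print(complete_with_fallback("ping", chain))
```

Collecting the per-provider errors before raising keeps the failure reason for each hop visible when the whole chain is down.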

Phase 2: Database & Migrations

Run all pending migrations: python -m alembic upgrade head
Verify spider_sense_events table exists with 11 columns
Verify that all 88 models (48 active, 40 deprecated) are registered in the ORM; 400+ models are also available via OpenRouter
Seed benchmark data: python seed_benchmarks.py
Verify marketplace categories populated
Create admin user account
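
The spider_sense_events check above can be scripted. The sketch below reads column metadata through the DB-API cursor and is demonstrated against SQLite only; the production database engine is not stated in this playbook, so the connection handling and the caught exception type are assumptions to adapt.

```python
import sqlite3

EXPECTED_COLUMNS = 11  # from the checklist


def column_names(conn, table: str) -> list[str]:
    """Return the column names of `table` using DB-API cursor metadata."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cur = conn.cursor()
    # WHERE 1 = 0 returns no rows but still populates cursor.description
    cur.execute(f"SELECT * FROM {table} WHERE 1 = 0")
    return [col[0] for col in cur.description]


def verify_spider_sense_events(conn) -> bool:
    """True if the table exists and has exactly the expected number of columns."""
    try:
        return len(column_names(conn, "spider_sense_events")) == EXPECTED_COLUMNS
    except sqlite3.Error:  # SQLite raises OperationalError for a missing table
        return False


if __name__ == "__main__":
    # In-memory database used only to demonstrate the call
    demo = sqlite3.connect(":memory:")
    print(verify_spider_sense_events(demo))
```

With another driver, replace sqlite3.Error with that driver's base error class.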

Phase 3: Application Deployment

Verify unified benchmark executor loads all categories
Run smoke test: 3 test cases on ToolMaster Pro agent
Verify citation-grounding benchmarks (ALCE, AttrScore, HAGRID) available
Verify function-calling benchmarks (BFCL, K-Interop, K-Interop-Enterprise) available
Confirm Trust Seal computation completes within 30s
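
The 30s Trust Seal limit can be checked with a small timing helper. This is a sketch under the assumption that the computation can be invoked as a zero-argument callable; timed and within_budget are hypothetical names, and the helper measures a call without interrupting it.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")
TRUST_SEAL_BUDGET_S = 30.0  # limit from the checklist


def timed(fn: Callable[[], T]) -> tuple[T, float]:
    """Run fn and return (result, elapsed seconds) measured with perf_counter."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def within_budget(fn: Callable[[], object], budget_s: float = TRUST_SEAL_BUDGET_S) -> bool:
    """True if fn finishes in at most budget_s seconds. A slow call is not cancelled."""
    _, elapsed = timed(fn)
    return elapsed <= budget_s
```

Usage would look like within_budget(lambda: compute_trust_seal(agent_id)), where compute_trust_seal stands in for whatever entry point the application exposes.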

Phase 4: External Agent Support

Phase 5: Monitoring & Observability

Structured logging to stdout (JSON format)
JSONL audit logs rotated daily, retained 90 days
Error alerting configured (PagerDuty/Slack)
Dashboard endpoints verified:
- Security Dashboard: /static/security-dashboard.html
- Spider-Sense: /static/spider-sense-dashboard.html
- Tool Gateway: /static/tool-gateway-dashboard.html
- Analytics: /static/analytics-dashboard.html
Uptime monitoring on /api/health (interval: 60s)
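
The logging items above (JSON to stdout, daily rotation with 90-day retention) map onto the standard logging module. The following is a minimal sketch; the field names in the JSON payload and the helper names are assumptions, not the format TAB actually emits.

```python
import json
import logging
import sys
from datetime import datetime, timezone
from logging.handlers import TimedRotatingFileHandler


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def stdout_handler() -> logging.Handler:
    """Structured application logs to stdout."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    return handler


def audit_handler(path: str) -> logging.Handler:
    """Audit log that rotates at midnight (UTC) and keeps 90 rotated files."""
    handler = TimedRotatingFileHandler(
        path, when="midnight", backupCount=90, utc=True, delay=True
    )
    handler.setFormatter(JsonFormatter())
    return handler


if __name__ == "__main__":
    log = logging.getLogger("tab")
    log.setLevel(logging.INFO)
    log.addHandler(stdout_handler())
    log.info("deployment check %s", "started")
```

Because one file is produced per day, backupCount=90 corresponds to roughly 90 days of retention on top of the current file.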

Phase 6: Go-Live Checklist

Setting	Value	Notes
Spider-Sense L2 threshold	`50`	Suspicion score triggering L2 escalation
Spider-Sense sliding window	`50 actions`	Per-agent behavioral window
Rate limit (Free tier)	`10 req/min`	Per-user API rate limit
Rate limit (Pro tier)	`100 req/min`	Per-user API rate limit
JWT expiry	`24h`	Access token lifetime
Max upload size	`50 MB`	Agent code upload limit
LLM timeout	`30s`	Spider-Sense L3 / semantic validation
Reproducibility threshold	`≥ 85`	Score needed for Reproducible badge
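
The per-user rate limits in this playbook (Free 10, Pro 100, Enterprise 1000 requests per minute) could be enforced with a sliding-window counter. This is an illustrative in-memory sketch for a single process, not TAB's implementation; the tier keys, class name, and injectable clock are assumptions.

```python
import time
from collections import defaultdict, deque

# Requests per minute per tier, as listed in this playbook
TIER_LIMITS = {"free": 10, "pro": 100, "enterprise": 1000}
WINDOW_S = 60.0


class SlidingWindowLimiter:
    """Per-user sliding-window limiter held in process memory."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id: str, tier: str) -> bool:
        now = self._clock()
        hits = self._hits[user_id]
        # Drop timestamps that have left the 60-second window
        while hits and now - hits[0] >= WINDOW_S:
            hits.popleft()
        if len(hits) >= TIER_LIMITS[tier]:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()
    results = [limiter.allow("user-1", "free") for _ in range(11)]
    print(results.count(True), results.count(False))
```

A multi-instance deployment would need shared state (for example a central store) in place of the in-process dictionary, since each process here counts independently.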
