General
Tested in Sandbox
Tested Live
🔒 Reproducible
🕷️ Spider-Sense
Agent Configuration
Model
System Prompt
Temperature: 0.7
Harnesses
Minimal
Basic
Robust
Full
Agent Skills
Save Skills
Security Permissions
Read Files
Write Files
Make API Calls
Send Emails
Database Access
Execute Code
Risk Level
Low
Moderate
High
Confirm destructive actions
Confirm external API calls
Cost threshold confirmation
Documents (RAG)
Drop files or click to upload (PDF, TXT, MD, JSON, CSV)
Benchmark Selection
Select Benchmarks
0 harnesses active | 0 benchmarks selected | 0 total tests
Run Tests
Send to Canvas →
+ Add Another Agent
How many tests?
Choose a test size for this run.
Quick Test (3)
Standard (25)
Thorough (100)
Full Suite (all)
Custom:
Run with custom count
Cancel
Running test 1 of N...