Agent Auth Compliance Benchmark

50 Tests · 5 Categories · Agent Auth Compliance Index (AACI)

50 tests · 5 categories · $0.10 cost / test · ~25 min runtime
What this benchmark measures

This benchmark tests whether AI agents correctly implement authentication, authorization, and identity management. It is based on concepts from the Agent Auth Protocol v1.0-draft: Ed25519 keypairs, scoped capabilities, lifecycle states, and TTL clocks.
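To make the concepts above concrete, here is a minimal sketch of how scoped capabilities, lifecycle states, and a TTL clock might combine into a single access check. It is an illustration only: the `Grant` class, the `State` names, the `is_allowed` helper, and the `files:read` scope string are all hypothetical and are not taken from the Agent Auth Protocol draft. Ed25519 signature verification is left out so the example runs on the standard library alone.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Hypothetical lifecycle states; the draft's actual state names may differ.
    ACTIVE = "active"
    EXPIRED = "expired"
    REVOKED = "revoked"


@dataclass
class Grant:
    agent_id: str
    scopes: frozenset      # capabilities granted, e.g. {"files:read"}
    issued_at: float       # seconds since epoch
    ttl_seconds: float     # lifetime of the grant
    revoked: bool = False

    def state(self, now: float) -> State:
        # Revocation takes priority over expiry.
        if self.revoked:
            return State.REVOKED
        if now >= self.issued_at + self.ttl_seconds:
            return State.EXPIRED
        return State.ACTIVE


def is_allowed(grant: Grant, scope: str, now: float) -> bool:
    """Deny by default: allow only an active grant that lists the exact scope."""
    return grant.state(now) is State.ACTIVE and scope in grant.scopes


grant = Grant("agent-1", frozenset({"files:read"}), issued_at=1000.0, ttl_seconds=300.0)
print(is_allowed(grant, "files:read", now=1100.0))   # granted scope, within TTL
print(is_allowed(grant, "files:write", now=1100.0))  # scope was never granted
print(is_allowed(grant, "files:read", now=1300.0))   # TTL has elapsed
```

Passing `now` explicitly instead of reading the system clock keeps the check deterministic, which is convenient when testing expiry behavior.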

Categories
Identity (25%)
identity_verification

Keypairs, identity binding, challenge–response, and proof-of-possession flows aligned with Ed25519-style agent identities.

10 tests
Scope (25%)
scope_permission

Scoped capabilities, least-privilege enforcement, resource checks, and denial when permissions are missing or expired.

10 tests
Lifecycle (15%)
lifecycle_session

Session lifecycle, TTL clocks, rotation, logout/revocation, and safe handling of stale credentials.

10 tests
Delegation (20%)
delegation_trust

Delegation chains, trust boundaries, sub-agent constraints, and preventing privilege expansion across hops.

10 tests
Autonomous (15%)
autonomous_supervised

Human-in-the-loop gates, supervised vs autonomous modes, and escalation when high-risk auth decisions are required.

10 tests
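The category weights above sum to 100%, which suggests the Agent Auth Compliance Index (AACI) is a weighted combination of per-category results. The page does not state the scoring formula, so the sketch below makes an assumption: each category's score is its pass rate (tests passed out of 10), and the index is the weighted sum scaled to 0-100. The function name `aaci` and the input format are hypothetical.

```python
# Category identifiers and weights as listed on this page.
WEIGHTS = {
    "identity_verification": 0.25,
    "scope_permission": 0.25,
    "lifecycle_session": 0.15,
    "delegation_trust": 0.20,
    "autonomous_supervised": 0.15,
}
TESTS_PER_CATEGORY = 10


def aaci(passed: dict) -> float:
    """Weighted pass rate on a 0-100 scale (assumed formula, not the official one)."""
    total = 0.0
    for category, weight in WEIGHTS.items():
        # A category missing from the input counts as zero tests passed.
        pass_rate = passed.get(category, 0) / TESTS_PER_CATEGORY
        total += weight * pass_rate
    return round(100 * total, 2)


example = {
    "identity_verification": 10,
    "scope_permission": 8,
    "lifecycle_session": 6,
    "delegation_trust": 5,
    "autonomous_supervised": 10,
}
print(aaci(example))
```

Under this assumed formula, the example agent scores about 79: the perfect Identity and Autonomous results are offset by the weaker Delegation result, which carries a 20% weight.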