Explainability Benchmarks

Evaluates how well AI agents explain their reasoning, cite sources accurately, acknowledge uncertainty, justify decisions, and diagnose errors. The suite contains 50 tests across 5 categories, and each test is scored on multiple dimensions.
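
To make "multi-dimensional scoring per test" concrete, here is a minimal Python sketch of one way such scores could be rolled up. It is not the benchmark's actual scoring code: the `ScoredTest` class, the category and dimension names, the [0, 1] score range, the unweighted averaging, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ScoredTest:
    """One test result (hypothetical structure, not the benchmark's schema)."""
    test_id: str
    category: str
    # Hypothetical dimension names mapped to scores assumed to lie in [0, 1].
    dimensions: dict[str, float]

    def score(self) -> float:
        # Assumption: a test's score is the unweighted mean of its dimensions.
        return mean(self.dimensions.values())


def category_scores(results: list[ScoredTest]) -> dict[str, float]:
    """Average the per-test scores within each category."""
    grouped: dict[str, list[float]] = {}
    for result in results:
        grouped.setdefault(result.category, []).append(result.score())
    return {category: mean(scores) for category, scores in grouped.items()}


# Made-up sample data, purely illustrative.
results = [
    ScoredTest("cite-01", "source_citation", {"accuracy": 1.0, "completeness": 0.5}),
    ScoredTest("cite-02", "source_citation", {"accuracy": 0.5, "completeness": 0.5}),
    ScoredTest("unc-01", "uncertainty", {"accuracy": 1.0, "completeness": 1.0}),
]
summary = category_scores(results)
```

Keeping the per-dimension scores on each result, rather than storing only the aggregate, is what would let a results view break a single test down by dimension. A real scorer might weight dimensions differently.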
