🧠

Memory Hallucination Detection (HaluMem)

Operation-level evaluation of memory systems: does your agent remember correctly, or does it hallucinate, omit, or corrupt memories? 80 tests across 3 task categories, with stage-level blame attribution.
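To make the failure modes named above concrete, here is a minimal sketch of labeling stored memories against a reference set. It is an illustration only, not HaluMem's actual scoring: the function name `classify_memories`, the key/value memory format, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def classify_memories(gold: dict[str, str], stored: dict[str, str]) -> dict[str, str]:
    """Label each memory key by comparing stored values to reference values.

    Hypothetical helper for illustration; not part of any HaluMem API.
    """
    labels: dict[str, str] = {}
    for key, expected in gold.items():
        if key not in stored:
            labels[key] = "omitted"        # reference fact never stored
        elif stored[key] == expected:
            labels[key] = "correct"        # stored value matches reference
        else:
            labels[key] = "corrupted"      # stored, but with a wrong value
    for key in stored:
        if key not in gold:
            labels[key] = "hallucinated"   # stored fact with no reference
    return labels


# Made-up sample data, one case per label.
gold = {"city": "Lisbon", "pet": "cat", "job": "nurse"}
stored = {"city": "Lisbon", "pet": "dog", "hobby": "chess"}

labels = classify_memories(gold, stored)
counts = Counter(labels.values())
print(labels)
print(counts)
```

Exact string matching keeps the sketch short. A real evaluation would likely need fuzzier comparison (for example, an LLM judge) to decide whether two memories express the same fact.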

Total Tests: 80
Task Types: 3
User Personas: 8
Metrics: 11