🏥

What is Agent Health?

Agent Health is a single score from 0 to 100 that shows how ready your AI agent is for real-world use. Think of it like a credit score, but for AI agents.

Health Score Scale

🟢 90–100 Excellent: Production-ready, fully verified, easy to deploy
🔵 75–89 Good: Solid agent with minor improvements possible
🟡 60–74 Fair: Usable but has notable gaps to address
🟠 40–59 Needs Work: Significant issues to address before production use
🔴 0–39 Critical: Major problems, not recommended for production
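
For developers who want to label scores in their own tooling, the scale above can be sketched as a small lookup. This is an illustrative sketch, not TAB's implementation: the function name health_band is hypothetical, and it assumes each band's lower number is an inclusive threshold (so a fractional score such as 89.5 would fall under Good).

```python
def health_band(score: float) -> str:
    """Map a 0-100 health score to its band name from the scale above.

    Assumption: each band's lower bound is inclusive, so fractional
    scores fall into the band whose lower bound they have reached.
    """
    if not 0 <= score <= 100:
        raise ValueError("health score must be between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 75:
        return "Good"
    if score >= 60:
        return "Fair"
    if score >= 40:
        return "Needs Work"
    return "Critical"


if __name__ == "__main__":
    for s in (95, 80, 60, 41, 12):
        print(s, health_band(s))
```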

How is this different from the Trust Seal?

The Trust Seal measures benchmark performance only — how well your agent answers questions and completes tasks. Agent Health is broader — it also considers security, freshness, ease of deployment, quality harnesses, protocol compliance, and output quality. An agent can have a great Trust Seal but poor health if it fails security screening.

The 7 Components

Your health score is a weighted combination of these seven factors. Each one measures a different aspect of agent readiness, but security is a gate: serious security failures cap the final score.

🔒

Security 30% of score + hard floor

Security screening measures whether your agent resists PII leakage, prompt injection, data exfiltration, and unsafe tool behavior. Security is weighted at least twice as heavily as any other single component and cannot be averaged away by strong performance elsewhere.

Security floor: below 50 caps health at 60; below 30 caps health at 40; below 20 caps health at 25. Agents below 50 show a Security Critical flag.
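
The floor rules can be read as a simple lookup followed by a minimum. The sketch below illustrates that reading. The function names are hypothetical, and it assumes the cap is applied by taking the lower of the uncapped health score and the cap, which the page implies but does not spell out.

```python
from typing import Optional, Tuple


def security_cap(security_score: float) -> Optional[float]:
    """Return the maximum health allowed for a security score, or None if uncapped."""
    if security_score < 20:
        return 25.0
    if security_score < 30:
        return 40.0
    if security_score < 50:
        return 60.0
    return None


def apply_security_floor(raw_health: float, security_score: float) -> Tuple[float, bool]:
    """Apply the cap and report whether the Security Critical flag is shown.

    Assumption: capping means min(raw_health, cap); a raw score already
    below the cap is left unchanged.
    """
    cap = security_cap(security_score)
    capped = raw_health if cap is None else min(raw_health, cap)
    security_critical = security_score < 50
    return capped, security_critical


if __name__ == "__main__":
    print(apply_security_floor(85, 45))  # capped at 60, flagged
    print(apply_security_floor(85, 50))  # not capped, not flagged
```

Under this reading, an agent with an otherwise strong profile and a security score of 45 cannot exceed a health score of 60, which puts it at the bottom of the Fair band at best.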

How to improve: Run the free security screening and fix security failures before optimizing any other benchmark.

🎯

Performance 15% of score

How well does your agent actually perform on benchmarks? This component directly uses your Trust Seal score. Strong benchmark performance matters, but it cannot compensate for unsafe security behavior.

How to improve: Run more benchmarks. Fix failures using the Failure Diagnosis reports. Improve your system prompt.

⏱️

Freshness 15% of score

How recently was your agent verified? An agent verified yesterday is more trustworthy than one verified 3 months ago. If you've changed your agent since the last benchmark, freshness drops significantly because the scores may no longer be accurate.

How to improve: Re-run benchmarks after any changes to your agent. Aim to verify at least monthly.

🔧

Deployment Ease 10% of score

How easy is your agent to set up and use? Agents that need fewer API keys, less configuration, and simpler infrastructure score higher. Buyers prefer agents they can start using quickly. This component is based on your AICI (Agent Integration Complexity Index) score.

How to improve: Reduce dependencies. Offer a free model edition. Simplify configuration requirements.

🛡️

Harness Coverage 10% of score

How many quality-improvement harnesses are attached to your agent? Harnesses are modules that improve your agent's security, accuracy, and reliability — like seatbelts for AI. More harnesses = more protection for buyers.

How to improve: Add recommended harnesses from the Harness Efficacy dashboard. Start with sycophancy_resistance (recommended for all agents).

🔌

Protocol Compliance 10% of score

Does your agent follow the MCP (Model Context Protocol) standard? MCP is how AI agents connect to tools and services. Good compliance means reliable connections. If your agent doesn't use MCP, you get a neutral score (70/100) — you're not penalized.

How to improve: Run the MCP Compliance benchmark. Fix any protocol violations. Add the MCP Compliance harness.

📦

Output Quality 10% of score

Are your agent's produced files and outputs production-ready? Code that compiles, JSON that's valid, configs that work. Buyers pay for usable outputs, not just correct answers. If not tested, you get a neutral score (50/100).

How to improve: Run the Artifact Output benchmark. Add the artifact_quality harness. Ensure your agent produces complete, valid outputs.
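
Putting the seven components together, the calculation can be sketched as a weighted sum followed by the security cap. This is a rough illustration under stated assumptions, not TAB's actual code: the component keys and function name are hypothetical, every component is assumed to be scored 0 to 100, None stands for "doesn't use MCP" or "not tested", and the result is rounded to one decimal place for display.

```python
from typing import Dict, Optional

# Weights in percent, taken from the component descriptions above (sum to 100).
WEIGHTS = {
    "security": 30,
    "performance": 15,
    "freshness": 15,
    "deployment_ease": 10,
    "harness_coverage": 10,
    "protocol_compliance": 10,
    "output_quality": 10,
}

# Neutral scores stated on this page for components that do not apply or are untested.
NEUTRAL = {"protocol_compliance": 70.0, "output_quality": 50.0}


def health_score(components: Dict[str, Optional[float]]) -> float:
    """Weighted combination of the seven components, then the security cap."""
    weighted_total = 0.0
    for name, weight in WEIGHTS.items():
        value = components.get(name)
        if value is None:
            if name not in NEUTRAL:
                raise ValueError(f"missing score for {name}")
            value = NEUTRAL[name]
        weighted_total += weight * value
    score = weighted_total / 100

    # Security floor: serious security failures cap the final score.
    security = components["security"]
    if security < 20:
        score = min(score, 25.0)
    elif security < 30:
        score = min(score, 40.0)
    elif security < 50:
        score = min(score, 60.0)
    return round(score, 1)


if __name__ == "__main__":
    agent = {
        "security": 45, "performance": 90, "freshness": 90,
        "deployment_ease": 80, "harness_coverage": 80,
        "protocol_compliance": None,  # agent does not use MCP
        "output_quality": None,       # not yet tested
    }
    print(health_score(agent))  # weighted sum is 68.5, capped to 60.0
```

In this example, raising security from 45 to 80 lifts the score to 79.0: the cap disappears and the 30% security weight contributes more, which is why security is worth fixing first.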

5 Steps to Improve Your Health Score

  1. Run and pass security screening first: security is 30% of health and applies hard caps when the security score is below 50
  2. Run benchmarks regularly: improves both Performance and Freshness after security issues are under control
  3. Fix failures using Diagnosis Reports: directly improves Performance (15%)
  4. Simplify your agent's requirements: improves Deployment Ease (10%)
  5. Run MCP Compliance tests: improves Protocol Compliance (10%)

Frequently Asked Questions

Is a high health score required to sell on TAB?

No. However, agents that score below 50 on security are flagged as Security Critical and have their health score capped, so they should not appear among the top healthy agents until the security issues are fixed.

How often is the health score recalculated?

Automatically after every benchmark run, agent update, or harness change. You can also manually recalculate from your developer portal.

My agent doesn't use MCP. Am I penalized?

No. Agents that don't use MCP get a neutral score (70/100) on protocol compliance. You're not penalized for features you don't need.

What's the difference between Health Score and Trust Seal?

Trust Seal measures benchmark performance only (how well your agent answers questions and completes tasks). Health Score is broader — it includes security, freshness, deployment ease, harness coverage, protocol compliance, and output quality. An agent can have a great Trust Seal but poor health if it has serious security failures.

Can my health score go down?

Yes. If your security screening score drops below 50, hard caps apply immediately. Health can also fall when benchmarks become stale, harness coverage drops, or output quality regresses.

Ready to improve your agent's health?

Check your current scores and see exactly what to improve.

Go to Developer Portal
TAB Platform — The Verification Layer for AI Agents