
SLAs and Uptime Guarantees: Why 99.9% Isn't Enough for Agent Readiness

D8 Reliability carries a 0.13 weight in the Agent Readiness Score, the second-highest of any dimension. But “99.9% uptime” on a pricing page scores almost nothing. Agents need machine-readable SLAs, real-time uptime data, and programmatic incident history. That is the difference between saying “we are reliable” and proving it.

AgentHermes Research

April 15, 2026 · 13 min read

The Reliability Paradox

Almost every SaaS company claims high uptime. “99.9% availability” appears on pricing pages, in sales decks, and in enterprise contracts. For human buyers, this works — they read the number, check a few reviews, and make a trust decision based on brand reputation.

AI agents do not trust brands. They trust data. When an agent evaluates whether to route traffic through your service, it looks for evidence — not claims. A “99.9% uptime” statement in prose is worth almost nothing because the agent cannot verify it, cannot monitor it, and cannot hold you accountable programmatically if it turns out to be false.

This creates a paradox: the most reliable services often score poorly on D8 because they prove their reliability to humans (reputation, case studies, testimonials) rather than to machines (structured data, APIs, monitoring endpoints). A service with 99.99% actual uptime but no status API scores lower than a service with 99.5% uptime and a comprehensive reliability infrastructure.

D8 Reliability weight: 0.13
Businesses with no SLA: 80%
Businesses with any status page: 15%
Businesses with structured SLA data: <3%

Four Levels of SLA Maturity for Agents

Not all reliability infrastructure is created equal. Here is how AgentHermes evaluates SLA maturity, from invisible to fully agent-ready.

Level 0: No SLA

0/100 on D8

No uptime commitment anywhere. No status page. The business might be up, might be down — the agent has no way to know until a request fails. This is where 80% of businesses sit.

Level 1: Marketing SLA

15/100 on D8

"99.9% uptime guaranteed" on the pricing page. No machine-readable format. No historical data. No incident log. The agent reads this as prose and cannot verify or act on it. Better than nothing, but barely.

Level 2: Status Page

35/100 on D8

A status page (StatusPage.io, Instatus, custom) showing current system status. Maybe an RSS feed. The agent can poll the page, but the format varies wildly. Partial credit for trying.

Level 3: Structured SLA

85/100 on D8

Machine-readable SLA document (JSON or YAML). Real-time uptime percentage endpoint. Incident history API with timestamps, severity, and resolution. Planned maintenance calendar. Compensation terms for downtime. This is agent-ready reliability.

Five Components of an Agent-Ready SLA

A fully agent-ready reliability infrastructure includes five machine-readable components. Together, they let agents make informed routing decisions without trusting marketing claims.

Machine-Readable SLA Document

A JSON or YAML file at a well-known path (e.g., /.well-known/sla.json) that declares uptime commitment, response time guarantees, support hours, escalation procedures, and compensation terms. Agents parse this before deciding to depend on a service.

Example: { "uptime_sla": "99.95%", "response_time_p99_ms": 200, "support_hours": "24/7", "compensation": { "below_99_9": "10% credit", "below_99_5": "25% credit" } }
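One thing an agent might do with such a document is turn the uptime commitment into a concrete downtime budget. The sketch below is illustrative only: it assumes the field names from the example above, and the helper names `parse_uptime_target` and `allowed_downtime_minutes` are hypothetical, not part of any published schema.

```python
import json

# The example SLA document from above, as an agent might receive it.
SLA_JSON = """
{
  "uptime_sla": "99.95%",
  "response_time_p99_ms": 200,
  "support_hours": "24/7",
  "compensation": {"below_99_9": "10% credit", "below_99_5": "25% credit"}
}
"""


def parse_uptime_target(sla: dict) -> float:
    """Turn a string such as "99.95%" into the float 99.95."""
    return float(sla["uptime_sla"].rstrip("%"))


def allowed_downtime_minutes(target_pct: float, window_days: int = 30) -> float:
    """Downtime budget (in minutes) implied by an uptime target over a window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (100.0 - target_pct) / 100.0


sla = json.loads(SLA_JSON)
target = parse_uptime_target(sla)
budget = allowed_downtime_minutes(target)
print(f"{target}% over 30 days allows about {budget:.1f} minutes of downtime")
```

At 99.95% over a 30-day window, the budget works out to roughly 21.6 minutes, which is a far more actionable number for an agent than the percentage alone.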

Real-Time Uptime Endpoint

An API endpoint that returns current uptime percentage over 30/90/365 day windows. Not a status page that says "Operational" — an endpoint that returns { "uptime_30d": 99.97, "uptime_90d": 99.94 }. Agents use this to make real-time routing decisions.

Example: GET /api/status/uptime → { "uptime_30d": 99.97, "uptime_90d": 99.94, "last_incident": "2026-04-01T..." }
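To make the routing use case concrete, here is a minimal sketch of how an agent could choose between providers using payloads shaped like the one above. The provider names, the 99.9 threshold, and the `pick_provider` function are assumptions for illustration; real orchestrators may weigh many more signals.

```python
from typing import Dict, Optional


def pick_provider(candidates: Dict[str, dict],
                  min_uptime_30d: float = 99.9) -> Optional[str]:
    """Return the provider with the best 30-day uptime that meets the
    threshold, using 90-day uptime as a tie-breaker. None if nobody qualifies."""
    eligible = {
        name: payload
        for name, payload in candidates.items()
        if payload.get("uptime_30d", 0.0) >= min_uptime_30d
    }
    if not eligible:
        return None
    return max(
        eligible,
        key=lambda name: (eligible[name]["uptime_30d"],
                          eligible[name].get("uptime_90d", 0.0)),
    )


# Hypothetical responses from two providers' uptime endpoints.
payloads = {
    "provider_a": {"uptime_30d": 99.97, "uptime_90d": 99.94},
    "provider_b": {"uptime_30d": 99.50, "uptime_90d": 99.80},
}
print(pick_provider(payloads))
```

A provider that publishes no uptime data at all is treated as 0.0 here, which mirrors the article's point: unverifiable reliability is indistinguishable from none.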

Incident History API

A structured log of past incidents with timestamps, severity levels, affected services, root causes, and resolution times. Agents analyze this to assess reliability trends — is the service getting more reliable or less? Are incidents clustered around certain times?

Example: GET /api/status/incidents → [{ "id": "INC-042", "severity": "major", "started": "...", "resolved": "...", "services": ["api", "webhooks"] }]
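A trend analysis can start with something as simple as mean time to resolution. The following sketch assumes incidents shaped like the example above, with ISO 8601 timestamps in `started` and `resolved`; the sample incidents and their timestamps are made up for illustration.

```python
from datetime import datetime
from typing import List, Optional


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def mttr_minutes(incidents: List[dict]) -> Optional[float]:
    """Mean time to resolution in minutes over resolved incidents only."""
    durations = [
        (parse_ts(i["resolved"]) - parse_ts(i["started"])).total_seconds() / 60
        for i in incidents
        if i.get("resolved")
    ]
    return sum(durations) / len(durations) if durations else None


# Illustrative data only; not taken from any real incident log.
sample_incidents = [
    {"id": "INC-041", "severity": "minor",
     "started": "2026-03-10T14:00:00Z", "resolved": "2026-03-10T14:30:00Z",
     "services": ["webhooks"]},
    {"id": "INC-042", "severity": "major",
     "started": "2026-04-01T09:00:00Z", "resolved": "2026-04-01T10:30:00Z",
     "services": ["api", "webhooks"]},
]
print(mttr_minutes(sample_incidents))
```

Unresolved incidents are skipped rather than counted as zero, so an ongoing outage does not artificially lower the average.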

Planned Maintenance Calendar

An endpoint or calendar feed (iCal/JSON) listing scheduled maintenance windows. Agents use this to avoid routing traffic during maintenance, schedule jobs around downtime, and set user expectations proactively.

Example: GET /api/status/maintenance → [{ "window": "2026-04-20T02:00Z/2026-04-20T04:00Z", "services": ["api"], "impact": "partial" }]
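Avoiding a maintenance window comes down to an interval check. This sketch assumes the `window` field is an ISO 8601 interval in the "start/end" form shown in the example, and that `services` lists the affected components; the function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def parse_window(window: str) -> Tuple[datetime, datetime]:
    """Split a "start/end" interval string into two datetimes."""
    start, end = window.split("/")
    return parse_utc(start), parse_utc(end)


def in_maintenance(now: datetime, windows: List[dict], service: str) -> bool:
    """True if `service` is inside any scheduled window at time `now`."""
    for entry in windows:
        if service not in entry["services"]:
            continue
        start, end = parse_window(entry["window"])
        if start <= now < end:
            return True
    return False


# The example response from above.
windows = [{"window": "2026-04-20T02:00Z/2026-04-20T04:00Z",
            "services": ["api"], "impact": "partial"}]
check_time = datetime(2026, 4, 20, 3, 0, tzinfo=timezone.utc)
print(in_maintenance(check_time, windows, "api"))
```

An agent could run the same check against a job's planned start time and defer the job until after the window closes.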

Compensation Terms API

Machine-readable compensation rules: what happens when the SLA is violated? Agents need to know if they can claim credits automatically, what the threshold is, and how to initiate a claim. This moves SLA enforcement from legal documents to programmatic verification.

Example: GET /api/sla/compensation → { "eligible": true, "current_uptime": 99.89, "sla_target": 99.95, "credit_percentage": 10 }
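The credit calculation itself is small enough to sketch. The code below assumes the tier format from the SLA document example earlier ("below_99_9": "10% credit"), which is an illustrative convention rather than a standard; `parse_tiers` and `credit_for` are hypothetical names.

```python
from typing import Dict, List, Tuple


def parse_tiers(compensation: Dict[str, str]) -> List[Tuple[float, int]]:
    """Turn {"below_99_9": "10% credit"} into [(99.9, 10)], sorted so the
    lowest (most severe) threshold comes first."""
    tiers = []
    for key, value in compensation.items():
        threshold = float(key[len("below_"):].replace("_", "."))
        credit_pct = int(value.split("%")[0])
        tiers.append((threshold, credit_pct))
    return sorted(tiers)


def credit_for(measured_uptime: float, compensation: Dict[str, str]) -> int:
    """Credit percentage owed for a measured uptime, 0 if no tier applies."""
    for threshold, credit_pct in parse_tiers(compensation):
        if measured_uptime < threshold:
            return credit_pct  # first match is the most severe applicable tier
    return 0


compensation = {"below_99_9": "10% credit", "below_99_5": "25% credit"}
print(credit_for(99.89, compensation))
```

With a measured uptime of 99.89, only the "below 99.9" tier applies, which matches the 10% credit in the example response above.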

Human Trust vs Agent Trust: Side-by-Side

Humans and agents evaluate reliability through completely different lenses. Here is how the same reliability information is consumed by each.

| Aspect | Human Evaluation | Agent Evaluation |
| --- | --- | --- |
| Uptime Claim | "99.9% uptime" on pricing page | JSON SLA at /.well-known/sla.json with exact terms |
| Current Status | Green dot on status page | GET /api/status → { "status": "operational", "uptime_30d": 99.97 } |
| Incident Info | Blog post about the outage | GET /api/incidents → structured log with timestamps and severity |
| Maintenance | Email 24 hours before | Calendar feed at /api/maintenance with machine-readable windows |
| Compensation | Email support, cite SLA, wait | GET /api/sla/compensation → auto-calculated credit |
| Reliability Score | Trust based on reputation | Computed from uptime data, incident frequency, MTTR |

The pattern is clear: agents need structured, verifiable data at every point where humans use judgment. A human reads a post-mortem blog post and decides if the company handled the outage well. An agent reads an incident history API and computes mean time to resolution. Both are evaluating reliability — but only the agent approach scales to thousands of decisions per second.

What AgentHermes Checks for D8 Reliability

The AgentHermes scanner evaluates D8 Reliability by checking for evidence of reliability infrastructure — not by monitoring your uptime directly. Here is what contributes to your D8 score.

Status Page Presence

Up to 20 points

Any status page at /status, status.yourdomain.com, or linked from your footer. HTML status pages score less than JSON/API status endpoints.

SLA Documentation

Up to 25 points

Documented SLA terms anywhere on the site — pricing page, legal page, or dedicated SLA page. Machine-readable format (JSON) scores significantly higher than prose.

Incident History

Up to 20 points

Historical incident data accessible via any format. API endpoint with structured data scores highest. A changelog or blog with outage reports scores partial credit.

Response Time Evidence

Up to 15 points

Server-Timing headers, response time documentation, or performance monitoring integration. Shows you measure and care about latency, not just uptime.

Maintenance Communication

Up to 10 points

Any evidence of planned maintenance communication — calendar, RSS, email list, or status page maintenance schedule. Proactive communication scores higher.

Monitoring Integration

Up to 10 points

Evidence of third-party monitoring (Datadog, PagerDuty, Better Uptime badges). Shows professional infrastructure investment in reliability.

The scoring principle: AgentHermes checks for SLA documentation, not actual uptime. We are not a monitoring service — we evaluate whether your reliability infrastructure is agent-accessible. A business with 100% uptime but zero documentation scores lower than a business with 99.5% uptime and comprehensive reliability APIs. The documentation is what makes the reliability useful to agents.
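The six maximums listed above add up to 100. As a rough illustration of how such a rubric combines, the sketch below sums partial credit per check. The point caps come from the list above, but expressing partial credit as a fraction between 0 and 1 is an assumption made for this example, not a description of the scanner's internals.

```python
from typing import Dict

# Maximum points per check, as listed above.
D8_MAX_POINTS = {
    "status_page": 20,
    "sla_documentation": 25,
    "incident_history": 20,
    "response_time_evidence": 15,
    "maintenance_communication": 10,
    "monitoring_integration": 10,
}


def d8_score(credit: Dict[str, float]) -> float:
    """Sum points across checks. `credit` maps a check name to the fraction
    of its maximum that was earned (hypothetical representation); missing
    checks earn nothing and fractions are clamped to [0, 1]."""
    total = 0.0
    for check, max_points in D8_MAX_POINTS.items():
        fraction = min(max(credit.get(check, 0.0), 0.0), 1.0)
        total += fraction * max_points
    return total


# A business with a full-credit status page and a prose-only SLA at half credit.
print(d8_score({"status_page": 1.0, "sla_documentation": 0.5}))
```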

Who Does Reliability Infrastructure Well?

In our 500-business scan, the highest D8 scores came from developer infrastructure companies. This makes sense — their customers are engineers who demand measurable reliability, not marketing promises.

Stripe scores well on D8: public status page with JSON API, documented SLA for enterprise, incident history, and Server-Timing headers on API responses. Vercel has a real-time status page with component-level status and incident history. Supabase publishes uptime data and has a status page with RSS feed.

The gap is massive when you move outside developer tools. E-commerce platforms, local services, and most SaaS products have zero reliability infrastructure visible to agents. They might be perfectly reliable — but agents cannot tell. This is the difference between being reliable and being provably reliable.

Frequently Asked Questions

Why does D8 Reliability have a 0.13 weight — higher than API quality?

Because an unreliable service is worse than no service. If an agent integrates with your API and it goes down, the agent fails in front of the user. Agents will preferentially route to services with proven reliability. A service with a 60-score API and 90-score reliability beats a service with a 90-score API and 30-score reliability every time.

Is a status page enough for agent readiness?

A status page gets you to about 35/100 on D8. It shows intent but lacks the structured data agents need. StatusPage.io does have a JSON API, which helps — but most status pages are HTML-only. The gap between "we have a status page" and "we have machine-readable reliability infrastructure" is the gap between 35 and 85 on D8.

What is the difference between claiming 99.9% and proving it?

Claiming it means putting text on a webpage. Proving it means exposing a real-time uptime endpoint that returns actual measured data over rolling windows. Any business can claim 99.9%. Only businesses with monitoring infrastructure can prove it. Agents only trust what they can verify programmatically.
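For a sense of what "measured data over rolling windows" involves, here is a minimal sketch that derives an uptime percentage from a list of recorded outages. It assumes outages are stored as non-overlapping (start, end) pairs in UTC; the data and the `uptime_pct` function are illustrative, and a production system would typically compute this from monitoring probes instead.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

Outage = Tuple[datetime, datetime]


def uptime_pct(outages: List[Outage], now: datetime, window_days: int) -> float:
    """Uptime percentage over the last `window_days`, counting only the part
    of each outage that falls inside the window. Assumes outages do not
    overlap each other."""
    window_start = now - timedelta(days=window_days)
    window_seconds = (now - window_start).total_seconds()
    down_seconds = 0.0
    for start, end in outages:
        overlap_start = max(start, window_start)
        overlap_end = min(end, now)
        if overlap_end > overlap_start:
            down_seconds += (overlap_end - overlap_start).total_seconds()
    return 100.0 * (1.0 - down_seconds / window_seconds)


utc = timezone.utc
now = datetime(2026, 4, 15, tzinfo=utc)
# Illustrative outages: one 90-minute outage in window, one 60 days back.
outages = [
    (datetime(2026, 4, 1, 9, 0, tzinfo=utc), datetime(2026, 4, 1, 10, 30, tzinfo=utc)),
    (datetime(2026, 2, 14, 0, 0, tzinfo=utc), datetime(2026, 2, 14, 1, 0, tzinfo=utc)),
]
print(round(uptime_pct(outages, now, 30), 2), round(uptime_pct(outages, now, 90), 2))
```

The same outage log yields different numbers for different windows, which is why an endpoint that reports several windows says more than a single headline figure.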

Do agents actually check SLA data before making routing decisions?

Increasingly, yes. Orchestration frameworks such as LangChain and CrewAI let developers plug in their own tool-selection logic, and reliability scoring is a natural input to it. When an agent has multiple tools that can accomplish the same task, it can route to the most reliable one. Exposing your reliability data makes you the preferred choice in these routing decisions.

How does AgentHermes check for SLA documentation?

The scanner looks for: status page presence (any format), JSON status endpoints, /.well-known/sla.json, incident history pages or APIs, response time headers (Server-Timing), and documented SLA terms on pricing or legal pages. Each element contributes to the D8 Reliability score. The more structured and machine-readable, the higher the score.
