Technical Deep Dive | 20% Score Impact

Error Handling for AI Agents: Why Your 500 Page Matters More Than Your Homepage

When an AI agent hits an error on your API, it needs structured guidance — not a pretty HTML page with a sad robot illustration. The difference between <!DOCTYPE html><title>Something went wrong</title> and {"error":"Internal Error","code":"INTERNAL","request_id":"abc123","retry_after":5} is 20% of the Agent Readiness Score.

AgentHermes Research

April 16, 2026 | 12 min read

Agents Cannot Read Your Error Page

When a human hits a 500 error, they see a branded page, maybe read a message, and either refresh or come back later. The experience is frustrating but manageable. When an AI agent hits a 500 error, it receives an HTML document where it expected JSON. It cannot parse it reliably. It does not know whether to retry, wait, change parameters, or give up. It crashes — or worse, it retries in an infinite loop, burning API quota and degrading your service.

This is not an edge case. Agents encounter errors on every integration. Authentication expires. Rate limits trigger. Parameters validate incorrectly. Endpoints go down. The question is not whether agents will hit your errors — it is whether your errors will help them recover or leave them stranded.

Error handling directly impacts two dimensions that together carry 20% of the Agent Readiness Score: D6 Data Quality (10%) and D9 Agent Experience (10%). A third dimension, D8 Reliability (13%), rewards graceful degradation patterns. That means up to 33% of your score is influenced by how you fail, not just how you succeed.

20%: Direct score impact

33%: Total influenced

Error codes that matter

10 LOC: Middleware to fix it

The 5 Error Patterns Every Agent-Ready API Needs

Every agent-facing API must handle five error codes with structured JSON responses. Not HTML. Not plain text. Not a redirect. Structured JSON with machine-readable fields that tell the agent exactly what went wrong and exactly what to do next.
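As a rough illustration of why the machine-readable code field matters, here is a minimal sketch of how an agent might pick a recovery strategy from an error response. The code values are the ones used in this article's examples; the action names and the plan_recovery function are hypothetical, not part of any real agent framework.

```python
import json

# Hypothetical mapping from the article's example codes to recovery strategies.
RECOVERY = {
    "INVALID_PARAMS": "fix_fields_and_retry",
    "AUTH_EXPIRED": "refresh_token_and_retry",
    "FORBIDDEN": "request_scope_or_report",
    "NOT_FOUND": "follow_suggestion",
    "RATE_LIMITED": "wait_then_retry",
}


def plan_recovery(content_type: str, raw_body: str) -> str:
    """Pick a recovery strategy, or give up when the error is not structured."""
    if "json" not in content_type.lower():
        return "give_up_unstructured"  # e.g. an HTML error page
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return "give_up_unstructured"
    if not isinstance(body, dict):
        return "give_up_unstructured"
    code = body.get("code")
    if not isinstance(code, str):
        return "report_to_user"
    # Unknown codes are surfaced to the user rather than retried blindly.
    return RECOVERY.get(code, "report_to_user")
```

With an HTML body the function has nothing to branch on, which is the failure mode described above.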

400 Bad Request

The agent sent invalid parameters. The response must tell the agent exactly which field failed and what is expected.

Agent-ready response

{"error":"Validation failed","code":"INVALID_PARAMS","details":[{"field":"email","message":"Must be a valid email address","received":"not-an-email"}],"request_id":"req_abc123"}

What agents actually get

<html><body><h1>400 Bad Request</h1></body></html>

Agent recovery action: Agent reads the details array, fixes the specific field, and retries immediately.
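On the server side, producing that response could look something like the sketch below. The validate_signup function, the single email rule, and the deliberately loose regular expression are illustrative assumptions; only the response shape comes from the example above.

```python
import re

# Deliberately loose pattern, for illustration only; not a full email validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_signup(payload: dict, request_id: str) -> tuple[int, dict]:
    """Return (status, body); a 400 carries one details entry per failed field."""
    details = []
    email = payload.get("email")
    if not isinstance(email, str) or not EMAIL_RE.match(email):
        details.append({
            "field": "email",
            "message": "Must be a valid email address",
            "received": email,
        })
    if details:
        return 400, {
            "error": "Validation failed",
            "code": "INVALID_PARAMS",
            "details": details,
            "request_id": request_id,
        }
    return 200, {"ok": True, "request_id": request_id}


status, body = validate_signup({"email": "not-an-email"}, "req_abc123")
```

Collecting every failed field into details before responding lets the agent fix all problems in one retry instead of discovering them one at a time.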

401 Unauthorized

The agent's credentials are missing, expired, or invalid. The response must indicate which auth method is expected and how to refresh.

Agent-ready response

{"error":"Token expired","code":"AUTH_EXPIRED","auth_type":"Bearer","refresh_url":"/oauth/token","request_id":"req_def456"}

What agents actually get

Redirects to /login HTML page

Agent recovery action: Agent calls refresh_url with its refresh token, gets a new access token, and retries the original request.
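The recovery flow might be sketched on the agent side as follows. The transport is passed in as a plain function so the logic stays independent of any HTTP library. The refresh request body (grant_type and refresh_token) and the access_token response field are assumptions modeled on common OAuth practice, not something the API above specifies.

```python
from typing import Callable

# A transport takes (method, path, headers, body) and returns (status, json_body).
Transport = Callable[[str, str, dict, dict], tuple[int, dict]]


def call_with_refresh(send: Transport, path: str, tokens: dict) -> tuple[int, dict]:
    """Call `path`; on a structured AUTH_EXPIRED 401, refresh once and retry."""
    headers = {"Authorization": f"Bearer {tokens['access']}"}
    status, body = send("GET", path, headers, {})
    recoverable = (
        status == 401
        and body.get("code") == "AUTH_EXPIRED"
        and "refresh_url" in body
    )
    if not recoverable:
        return status, body

    # Assumed refresh contract: POST the refresh token, receive access_token.
    refresh_body = {"grant_type": "refresh_token", "refresh_token": tokens["refresh"]}
    r_status, r_body = send("POST", body["refresh_url"], {}, refresh_body)
    if r_status != 200 or "access_token" not in r_body:
        return status, body  # refresh failed; surface the original 401

    tokens["access"] = r_body["access_token"]
    headers = {"Authorization": f"Bearer {tokens['access']}"}
    return send("GET", path, headers, {})  # single retry, never a loop
```

The retry happens exactly once, so a server that keeps answering 401 cannot trap the agent in a refresh loop.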

403 Forbidden

The agent is authenticated but lacks permission. The response must say which permission is required and how to request it.

Agent-ready response

{"error":"Insufficient scope","code":"FORBIDDEN","required_scope":"write:deployments","current_scopes":["read:deployments"],"request_id":"req_ghi789"}

What agents actually get

<html><body><h1>Access Denied</h1><p>You do not have permission.</p></body></html>

Agent recovery action: Agent compares current_scopes to required_scope, requests elevated permissions or reports the gap to the user.
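The comparison itself is a small set difference. A minimal sketch, assuming required_scope may arrive either as a single string (as in the example above) or as a list; the missing_scopes helper is hypothetical:

```python
def missing_scopes(error_body: dict) -> list[str]:
    """Return the required scopes that the agent does not currently hold."""
    required = error_body.get("required_scope")
    # Accept a single scope string or a list of scopes.
    required_list = [required] if isinstance(required, str) else list(required or [])
    current = set(error_body.get("current_scopes", []))
    return [scope for scope in required_list if scope not in current]


body = {
    "error": "Insufficient scope",
    "code": "FORBIDDEN",
    "required_scope": "write:deployments",
    "current_scopes": ["read:deployments"],
    "request_id": "req_ghi789",
}
gap = missing_scopes(body)  # ['write:deployments']
```

An empty result would mean the scopes already match and the 403 has some other cause, which is worth reporting rather than retrying.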

404 Not Found

The requested resource does not exist. The response should confirm the resource type and suggest alternatives.

Agent-ready response

{"error":"Product not found","code":"NOT_FOUND","resource_type":"product","resource_id":"prod_xyz","suggestion":"Use GET /products to list available products","request_id":"req_jkl012"}

What agents actually get

<html><body><h1>404</h1><p>Page not found</p><a href="/">Go home</a></body></html>

Agent recovery action: Agent reads the suggestion field, calls the alternative endpoint, and discovers the correct resource ID.
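A helper that builds this body could be as short as the sketch below. It assumes, purely for illustration, that each collection lives at the pluralized resource name (product maps to /products); real routing may differ.

```python
def not_found(resource_type: str, resource_id: str, request_id: str) -> dict:
    """Build a 404 body that names the resource and points at its listing."""
    plural = f"{resource_type}s"  # naive pluralization, an assumption
    return {
        "error": f"{resource_type.capitalize()} not found",
        "code": "NOT_FOUND",
        "resource_type": resource_type,
        "resource_id": resource_id,
        "suggestion": f"Use GET /{plural} to list available {plural}",
        "request_id": request_id,
    }
```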

429 Too Many Requests

The agent exceeded rate limits. The response must include retry timing so the agent can schedule the next attempt precisely.

Agent-ready response

{"error":"Rate limit exceeded","code":"RATE_LIMITED","retry_after":5,"limit":100,"remaining":0,"reset":"2026-04-16T10:05:00Z","request_id":"req_mno345"}

What agents actually get

<html><body><h1>Too Many Requests</h1><p>Please try again later.</p></body></html>

Agent recovery action: Agent reads retry_after, waits exactly 5 seconds, then retries. No guessing, no exponential backoff needed.
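A minimal sketch of that wait-and-retry behavior on the agent side. The request and sleep functions are injected so the logic can be exercised without a network or real delays. The max_attempts cap is an added assumption that keeps a misbehaving server from causing the infinite retry loop described earlier.

```python
import time
from typing import Callable


def call_with_rate_limit(
    request: Callable[[], tuple[int, dict]],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, dict]:
    """Retry on 429, waiting exactly as long as the response body asks."""
    status, body = request()
    for _ in range(max_attempts - 1):
        if status != 429:
            break
        delay = body.get("retry_after")
        if not isinstance(delay, (int, float)) or delay < 0:
            break  # no usable hint; leave the decision to the caller
        sleep(delay)
        status, body = request()
    return status, body
```

If the 429 carries no retry_after, the function returns the 429 as is instead of guessing a delay.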

Exact Score Impact: D6 + D9 + D8 = Up to 33%

Error handling is not a single-dimension feature. It ripples across three dimensions of the Agent Readiness Score. Understanding where the points come from makes the ROI obvious — and as detailed in our D6 Data Quality analysis and D9 Agent Experience deep dive, these dimensions are often the easiest to improve.

D6 Data Quality (10%)

Up to 3 points from error handling alone

Structured error responses directly contribute to D6. Consistent JSON envelopes with typed error codes, machine-readable details, and request IDs demonstrate data quality even in failure states.

D9 Agent Experience (10%)

Up to 4 points from error handling alone

Request IDs for debugging, actionable error messages, Retry-After headers, and structured details arrays are all D9 signals. The difference between "Something went wrong" and a structured error object is the difference between an agent that crashes and one that self-heals.

D8 Reliability (13%)

Up to 2 points from error handling patterns

How gracefully you fail is a reliability signal. Structured 500 errors with request IDs and incident references demonstrate operational maturity. A raw HTML stack trace demonstrates the opposite.
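The headline figures can be checked with simple arithmetic. The sketch below assumes the per-dimension "up to N points" values sit on the same 100-point scale as the dimension weights, which the article implies but does not state outright.

```python
# Dimension weights as percentages of the total score.
WEIGHTS = {"D6": 10, "D9": 10, "D8": 13}
# Maximum points attributed to error handling within each dimension.
ERROR_POINTS = {"D6": 3, "D9": 4, "D8": 2}

direct = WEIGHTS["D6"] + WEIGHTS["D9"]   # 20: dimensions directly impacted
influenced = sum(WEIGHTS.values())        # 33: adding D8 Reliability
max_points = sum(ERROR_POINTS.values())   # 9: ceiling from error handling alone
```

So the 20% and 33% figures are the combined weights of the affected dimensions, while the points attributable to error handling itself top out at 9.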

Real-World Error Handling: From Stripe to Cash App

The correlation between error handling quality and overall agent readiness score is nearly linear. Companies that return structured errors score Silver. Companies that return HTML errors score under 20.

| Company | Score | Tier | Error Quality |
|---|---|---|---|
| Stripe | 68 | Silver | Perfect JSON on every error code. Type, code, message, param, doc_url. The gold standard. |
| GitHub | | Silver | Structured JSON with message, documentation_url, and status. Missing detailed field-level validation. |
| Vercel | | Silver | JSON errors with error.code, error.message, and request ID. Clean and consistent. |
| Cash App | 12 | Not Scored | HTML error pages on every failure. No API, no structured responses, no recovery path. |
| Local business (avg) | | Not Scored | Custom 404 page with "Go back to homepage" link. No API endpoints exist to error on. |

The pattern is clear: Stripe returns perfect error JSON on every response and scores 68. Cash App returns HTML on every failure and scores 12. The difference is not API surface area or feature count; it is whether the failure path is as structured as the success path. For a deeper look at what makes Stripe's errors exemplary, see our rate limiting analysis.

The 10-Line Middleware Fix

The fix is not a rewrite. It is a single middleware function that wraps your existing error handling. Every request gets a request ID. Every error response gets a JSON envelope. Every 429 gets a Retry-After header. Every 401 gets an auth_type indicator.

Error response envelope standard:

{
  "error": "Human-readable error message",
  "code": "MACHINE_READABLE_CODE",
  "request_id": "req_unique_identifier",
  "details": [
    {
      "field": "specific_field",
      "message": "What is wrong",
      "received": "What was sent"
    }
  ],
  "retry_after": 5,
  "doc_url": "https://docs.example.com/errors/CODE"
}
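One way to keep every error on this envelope is a single builder that all handlers call. A minimal sketch; the error_envelope name and the choice to omit optional fields when they do not apply are assumptions, not part of the standard above.

```python
from typing import Optional


def error_envelope(
    message: str,
    code: str,
    request_id: str,
    details: Optional[list] = None,
    retry_after: Optional[int] = None,
    doc_url: Optional[str] = None,
) -> dict:
    """Build the error body; optional fields appear only when they apply."""
    body = {"error": message, "code": code, "request_id": request_id}
    if details:
        body["details"] = details
    if retry_after is not None:
        body["retry_after"] = retry_after
    if doc_url:
        body["doc_url"] = doc_url
    return body


body = error_envelope("Internal Error", "INTERNAL", "abc123", retry_after=5)
```

The three required arguments mirror the fields that every error carries, so a handler cannot produce an error without a code or a request ID.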

Every response gets a request ID

Generate a UUID for every request. Include it in the X-Request-ID response header AND in the error JSON body. Agents use this for debugging, logging, and support escalation.
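A small sketch of that rule using Python's standard uuid module. The req_ prefix follows the examples in this article, and the tag_error helper is hypothetical.

```python
import uuid


def new_request_id() -> str:
    """Generate a unique ID; the req_ prefix mirrors this article's examples."""
    return f"req_{uuid.uuid4().hex}"


def tag_error(headers: dict, body: dict) -> str:
    """Put the same request ID in the response header and the JSON body."""
    request_id = new_request_id()
    headers["X-Request-ID"] = request_id
    body["request_id"] = request_id
    return request_id
```

Using one value for both places means a log search on the header value also finds the error an agent reported from the body.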

Every error is JSON, never HTML

Catch all errors at the middleware level. If the Accept header indicates JSON or the request is to an API path, always return JSON. Never let framework-default HTML error pages reach API consumers.
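The decision described here can be sketched as one predicate. The /api/ prefix is an assumed convention, and the Accept parsing is simplified: it ignores q-values and wildcards, which a production implementation would want to honor.

```python
def wants_json(accept: str, path: str, api_prefix: str = "/api/") -> bool:
    """True when an error for this request should be rendered as JSON."""
    if path.startswith(api_prefix):
        return True  # API paths always get JSON errors
    # Strip parameters such as ";q=0.5" and compare bare media types.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return any(
        media_type == "application/json" or media_type.endswith("+json")
        for media_type in media_types
    )
```

Checking the path first means an agent that forgets to send an Accept header still gets JSON from API routes.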

Every 429 includes retry timing

The Retry-After header is a standard HTTP header. Include it on every 429 response, plus retry_after in the JSON body. Agents use the more specific value — JSON body if available, header as fallback.
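On the agent side, the body-first, header-fallback rule might look like the sketch below. Retry-After may carry either a number of seconds or an HTTP date, so both forms are handled. The retry_delay helper is hypothetical and assumes the headers dict uses the exact key Retry-After.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay(body: dict, headers: dict, now: Optional[datetime] = None) -> Optional[float]:
    """Seconds to wait: JSON retry_after if present, else the Retry-After header."""
    value = body.get("retry_after")
    if isinstance(value, (int, float)) and value >= 0:
        return float(value)

    header = headers.get("Retry-After")
    if header is None:
        return None
    header = header.strip()
    if header.isdigit():  # delay-seconds form
        return float(header)
    try:
        when = parsedate_to_datetime(header)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A return value of None means the server gave no usable hint, and the caller has to fall back to its own policy.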

Every 401 includes auth guidance

Include auth_type (Bearer, API Key, OAuth) and refresh_url (if applicable) in 401 responses. This lets agents self-heal expired credentials without human intervention.
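A server-side sketch of a 401 that carries this guidance. The AUTH_REQUIRED code and its message are hypothetical additions for the missing-credentials case. The WWW-Authenticate header is the standard HTTP challenge header for 401 responses and is included here alongside the JSON fields.

```python
from typing import Optional


def unauthorized(
    request_id: str,
    auth_type: str = "Bearer",
    refresh_url: Optional[str] = None,
    expired: bool = False,
) -> tuple[int, dict, dict]:
    """Return (status, headers, body) for a 401 that an agent can act on."""
    body = {
        "error": "Token expired" if expired else "Authentication required",
        # AUTH_REQUIRED is a hypothetical code for missing credentials.
        "code": "AUTH_EXPIRED" if expired else "AUTH_REQUIRED",
        "auth_type": auth_type,
        "request_id": request_id,
    }
    if refresh_url:
        body["refresh_url"] = refresh_url
    headers = {"Content-Type": "application/json", "X-Request-ID": request_id}
    if auth_type == "Bearer":
        headers["WWW-Authenticate"] = "Bearer"
    return 401, headers, body
```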

This pattern is framework-agnostic. Express, Next.js, FastAPI, Rails, Django — every framework supports error middleware. The implementation takes 30 minutes. The score impact is 5-9 points across three dimensions. That is the highest ROI change in the entire Agent Readiness framework — more than adding an OpenAPI spec, more than publishing agent-card.json, more than building an MCP server.
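To make the idea concrete without tying it to one framework, here is a sketch written as plain WSGI middleware using only the standard library. It covers two of the four rules: a request ID on every response and a JSON body for uncaught exceptions. The json_errors name is hypothetical, and exceptions raised lazily while the response body is being iterated are not caught by this simplified version.

```python
import json
import sys
import uuid


def json_errors(app):
    """Wrap a WSGI app: tag responses with a request ID and turn uncaught
    exceptions into a JSON 500 instead of an HTML error page."""

    def wrapped(environ, start_response):
        request_id = f"req_{uuid.uuid4().hex}"

        def tagged_start(status, headers, exc_info=None):
            tagged = list(headers) + [("X-Request-ID", request_id)]
            return start_response(status, tagged, exc_info)

        try:
            return app(environ, tagged_start)
        except Exception:
            payload = {"error": "Internal Error", "code": "INTERNAL",
                       "request_id": request_id}
            body = json.dumps(payload).encode("utf-8")
            headers = [
                ("Content-Type", "application/json"),
                ("Content-Length", str(len(body))),
                ("X-Request-ID", request_id),
            ]
            # Passing exc_info lets the server replace headers it has not
            # sent yet, as the WSGI specification requires.
            start_response("500 Internal Server Error", headers, sys.exc_info())
            return [body]

    return wrapped
```

The same shape carries over to framework-specific hooks such as an Express error handler or a FastAPI exception handler. Status-specific fields like retry_after and auth_type would be added by the handlers that raise those errors.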

Frequently Asked Questions

Why does error handling matter so much for agent readiness?

Agents encounter errors constantly: invalid parameters, expired tokens, rate limits, server issues. A human sees an error page and adapts. An agent needs structured guidance to self-correct. If your errors return HTML, the agent crashes or makes random retry decisions. Structured errors enable self-healing: the agent reads the error code, understands what went wrong, and takes the correct next action. The two dimensions this affects most directly, D6 and D9, together carry 20% of the total score.

What is the minimum viable error response for agents?

At minimum, every error response should be a JSON object with three fields: error (human-readable message), code (machine-readable error type), and request_id (for debugging). This takes 10 lines of middleware to implement and immediately lifts D6 and D9 scores. Adding details (array of field-level errors) and retry_after (for 429s) pushes it to excellent.

Should I return JSON errors for browser requests too?

Yes — or use content negotiation. If the Accept header includes application/json, return JSON. If it includes text/html, return your pretty error page. Most API frameworks support this natively. The key insight is that your API endpoints should always return JSON errors, even if your server-rendered pages return HTML errors. Agents hit API endpoints, not page routes.

How does AgentHermes test error handling?

The scanner sends intentionally malformed requests and checks the response format. It sends requests with missing auth headers to test 401 handling. It sends invalid parameters to test 400 handling. It sends requests to nonexistent paths to test 404 handling. In each case, it checks whether the response is structured JSON with error codes, or unstructured HTML. The format of the error response — not just the status code — determines the score impact.
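The format check described here can be approximated locally before running a scan. The sketch below is not the AgentHermes scanner; it is a hypothetical classifier that applies the three-field minimum from the previous answer (error, code, request_id) to a captured response.

```python
import json

REQUIRED_FIELDS = ("error", "code", "request_id")


def classify_error_response(status: int, content_type: str, raw_body: str) -> str:
    """Label an error response as structured, partial, or unstructured."""
    if status < 400:
        return "not_an_error"
    if "json" not in content_type.lower():
        return "unstructured"  # HTML or plain text
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return "unstructured"
    if not isinstance(body, dict):
        return "unstructured"
    if all(field in body for field in REQUIRED_FIELDS):
        return "structured"
    return "partial"  # JSON, but missing one of the minimum fields
```

Feeding it the responses to a request without auth headers, a request with invalid parameters, and a request to a nonexistent path gives a quick picture of the 401, 400, and 404 paths.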


Test your error handling now

Our scanner sends intentionally malformed requests to test your error responses. See exactly how your API handles 400, 401, 403, 404, and 429 — and how it impacts your Agent Readiness Score.


Continue Reading

Data Quality: Why Structured Responses Win (D6 = 10%)

Agent Experience (D9): The Dimension That Measures Usability

Rate Limiting for AI Agents: Why X-RateLimit-Remaining Matters
