The CTO's Guide to Agent Readiness: Technical Decisions That Impact Your Score
Your Agent Readiness Score is not random. It is the direct result of 10 architectural decisions your engineering team made (or did not make). Each one maps to a specific dimension with a specific point impact. A CTO who reads this article can estimate their score before running a single scan.
Why Architecture Determines Agent Readiness
After scanning 500 businesses, one pattern is clear: agent readiness is an architecture outcome, not a marketing choice. The businesses that score Silver and Gold did not set out to be “agent-ready.” They made sound API architecture decisions that happen to be exactly what AI agents need.
Stripe scores 68. Resend scores 75. Vercel scores 70. None of them has an “agent readiness team.” They have engineering teams that chose API-first architecture, published OpenAPI specs, used Bearer auth, returned structured errors, and built status pages. These are CTO decisions.
Conversely, businesses that score under 20 made the opposite choices: website-first architecture, session cookies, HTML error pages, no API documentation. These are also CTO decisions — or decisions made by not having a CTO at all.
Here are the 10 decisions, each mapped to the dimension it impacts and the approximate points it contributes. Read through them, tally your answers, and you will have a reasonable estimate of your score before running a scan.
The 10 Decisions
#1: API-First vs Website-First Architecture
Website-first: build HTML pages, add API later (or never)
API-first: build the API, then build the website on top of it
API-first businesses score 45+ on D2 alone. Website-first businesses score 0-5. This is the single biggest architectural decision for agent readiness because D2 carries the highest weight of any dimension.
#2: OpenAPI Spec vs Ad-Hoc Documentation
Ad-hoc docs: Markdown pages, Notion wiki, or PDF guides
OpenAPI 3.0+ spec: machine-readable, auto-discoverable, generates client SDKs
An OpenAPI spec at /openapi.json or /swagger.json is the single most impactful file you can publish. Agents parse it instantly to understand every endpoint, parameter, and response shape. Ad-hoc docs require LLMs to interpret natural language — slower, less reliable, and prone to hallucinated endpoints.
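To make "machine-readable" concrete, here is a minimal sketch of an OpenAPI 3.0 document and the kind of walk an agent might do over it. The /v1/orders endpoint, its limit parameter, and the list_operations helper are hypothetical and used only for illustration; a real spec would be much larger.

```python
import json

# Minimal OpenAPI 3.0 document. The endpoint and parameter are hypothetical.
spec = {
    "openapi": "3.0.3",
    "info": {"title": "Example API", "version": "1.0.0"},
    "paths": {
        "/v1/orders": {
            "get": {
                "operationId": "listOrders",
                "summary": "List orders",
                "parameters": [
                    {
                        "name": "limit",
                        "in": "query",
                        "required": False,
                        "schema": {"type": "integer", "maximum": 100},
                    }
                ],
                "responses": {"200": {"description": "A page of orders"}},
            }
        }
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(document):
    """Return (METHOD, path, operationId) for every operation in the spec."""
    operations = []
    for path, item in document.get("paths", {}).items():
        for method, operation in item.items():
            if method in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("operationId")))
    return operations


# This string is what would be served at /openapi.json.
openapi_json = json.dumps(spec, indent=2)
```

An agent that fetches this file learns every callable operation without interpreting any prose, which is the gap between a spec and ad-hoc docs.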
#3: Bearer Token Auth vs Session Cookies
Session cookies: set-cookie header, CSRF tokens, browser-dependent state
Bearer tokens: Authorization: Bearer <token> header, stateless, machine-friendly
AI agents typically call APIs through plain HTTP clients, not browsers. Cookie storage, CSRF tokens, and session state are all built around a browser, and agents handle them unreliably if at all. Bearer token authentication is the auth pattern that works reliably for agent-to-API communication. OAuth 2.0 client_credentials flow is the gold standard.
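A minimal sketch of the server side of Bearer auth, assuming a simple in-memory token table. VALID_TOKENS, the token value, and the authenticate helper are hypothetical; a production service would validate against a database or a signed token instead.

```python
import hmac

# Hypothetical token store mapping tokens to client ids.
VALID_TOKENS = {"demo-token-123": "agent-42"}


def authenticate(headers):
    """Return the client id for a valid Bearer token, or None."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    for known, client_id in VALID_TOKENS.items():
        # compare_digest avoids leaking match position through timing.
        if hmac.compare_digest(token.encode(), known.encode()):
            return client_id
    return None
```

Every request carries everything needed to authenticate it, so there is no cookie jar or CSRF handshake for the agent to manage.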
#4: Structured JSON Errors vs HTML Error Pages
HTML 500 page: pretty for humans, meaningless to agents
JSON errors: { "error": "message", "code": "INVALID_PARAM", "request_id": "abc" }
When an agent hits an error, it needs three things: what went wrong (error), a machine-parseable code (code), and a way to reference the failure (request_id). HTML error pages provide none of these. Agents that receive HTML errors cannot self-correct — they either retry blindly or give up.
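The three fields above can be sketched as a small helper, together with the kind of decision an agent might make on the receiving end. The function names, the RATE_LIMITED code, and the retry rule are assumptions for illustration, not a prescribed format.

```python
import json
import uuid


def error_response(status, code, message, request_id=None):
    """Build (status, headers, body) for a structured JSON error."""
    body = {
        "error": message,
        "code": code,
        "request_id": request_id or uuid.uuid4().hex,
    }
    return status, {"Content-Type": "application/json"}, json.dumps(body)


def should_retry(status, body_text):
    """Agent-side sketch: retry only when the body is parseable and transient."""
    try:
        code = json.loads(body_text).get("code")
    except (ValueError, AttributeError):
        return False  # HTML or other non-JSON body: nothing to act on
    return status >= 500 or code == "RATE_LIMITED"
```

With a parseable code, the agent can tell a bad parameter (fix the request) from a transient failure (retry later). With an HTML body it can do neither.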
#5: Cursor Pagination vs Offset Pagination
Offset: ?page=2&limit=20 — breaks when data changes between pages
Cursor: ?after=abc123&limit=20 — stable, no skipped or duplicated records
Agents iterate through datasets automatically, often processing thousands of records. Offset pagination causes duplicate or skipped items when records are inserted or deleted during iteration. Cursor-based pagination is deterministic regardless of data changes. Agents trust cursor pagination; they work around offset pagination.
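The skipped-record problem is easy to reproduce. The sketch below uses a plain list of integer ids as a stand-in for a table sorted by id; both paging helpers are hypothetical simplifications of what a real API would do against a database.

```python
def page_by_offset(rows, page, limit):
    """Offset paging: position depends on how many rows precede the page."""
    start = (page - 1) * limit
    return rows[start:start + limit]


def page_by_cursor(rows, after, limit):
    """Cursor paging over ids sorted ascending: return ids greater than the cursor."""
    items = [r for r in rows if after is None or r > after][:limit]
    next_cursor = items[-1] if len(items) == limit else None
    return items, next_cursor


rows = [1, 2, 3, 4, 5, 6]

# The agent reads the first page with each strategy.
first_offset = page_by_offset(rows, page=1, limit=2)
first_cursor, cursor = page_by_cursor(rows, None, 2)

# A record is deleted before the agent asks for the next page.
rows.remove(1)

second_offset = page_by_offset(rows, page=2, limit=2)  # record 3 is never seen
second_cursor, _ = page_by_cursor(rows, cursor, 2)     # continues right after id 2
```

After the deletion, the offset reader jumps from [1, 2] to [4, 5] and silently skips record 3, while the cursor reader continues with [3, 4].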
#6: Webhook Events vs Polling
Polling: agents must call GET /resource every N seconds to check for changes
Webhooks: POST to agent endpoint when state changes, with HMAC signing and retry
Polling burns agent compute budget and misses events between intervals. Webhooks push state changes in real time. AgentHermes checks for webhook documentation, event catalog, HMAC signature verification, and retry logic. The top scorers (Stripe, GitHub, Slack) all publish comprehensive webhook systems.
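HMAC signing is the piece of this that is simplest to show in code. This is a minimal sketch using SHA-256 over the raw request body; the secret, the payload, and the idea of sending the hex digest in a header are assumptions. Real providers often sign a timestamp together with the body to limit replay, which this sketch leaves out.

```python
import hashlib
import hmac


def sign(secret: bytes, payload: bytes) -> str:
    """Sender side: hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(secret: bytes, payload: bytes, signature: str) -> bool:
    """Receiver side: recompute the signature and compare in constant time."""
    expected = sign(secret, payload)
    return hmac.compare_digest(expected.encode(), signature.encode())


secret = b"whsec_example"  # hypothetical shared secret
payload = b'{"event": "order.updated", "id": "evt_1"}'
signature = sign(secret, payload)
```

The receiver must verify against the exact bytes it received. Re-serializing the JSON before verifying can change the bytes and make a valid signature fail.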
#7: Sandbox Mode vs Production-Only
Production-only: agents must use real data and real money to test integrations
Sandbox mode: test credentials, fake data, same API surface, no real consequences
Agents will not risk real transactions while learning your API. Stripe test mode (sk_test_*) is the gold standard: identical API behavior with fake money. Without a sandbox, the agent integration cost is too high: one wrong API call with real data can be unrecoverable.
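One way to get "same API surface, no real consequences" is to route on the key prefix. The sketch below assumes Stripe-style sk_test_ and sk_live_ prefixes; the Ledger class and create_charge function are hypothetical stand-ins for a payment backend.

```python
class Ledger:
    """In-memory stand-in for a payment backend (hypothetical)."""

    def __init__(self):
        self.charges = []


LEDGERS = {"test": Ledger(), "live": Ledger()}


def key_mode(api_key):
    """Derive the environment from the key prefix."""
    if api_key.startswith("sk_test_"):
        return "test"
    if api_key.startswith("sk_live_"):
        return "live"
    raise ValueError("unrecognized API key prefix")


def create_charge(api_key, amount_cents):
    """Same code path for both modes; only the backing ledger differs."""
    mode = key_mode(api_key)
    charge = {"amount": amount_cents, "livemode": mode == "live"}
    LEDGERS[mode].charges.append(charge)
    return charge
```

Because both modes run the same handler, anything an agent learns in the sandbox carries over to production unchanged.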
#8: Public Status Page vs No Monitoring
No public monitoring: agents discover outages by hitting errors
Status page: status.domain.com or /status with uptime history and incident log
Before delegating work to your API, agents check if you are operational. A status page at a well-known URL (status.domain.com, /health, /status) lets agents make this check instantly. Without one, agents must infer reliability from error rates — and they learn quickly which APIs to avoid.
#9: Versioned APIs vs Breaking Changes
Unversioned: endpoints change behavior without notice, breaking agent integrations
Versioned: /v1/ prefix, Accept-Version header, 2-year backward compatibility
Agents hardcode API interaction patterns. A breaking change that renames a field or restructures a response causes agent failures that are never manually fixed — the agent just stops using your API. Stripe maintains backward compatibility for years. Unversioned APIs permanently lose agent traffic after the first breaking change.
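A minimal sketch of how versioning absorbs a field rename, assuming a /v1/ or /v2/ path prefix with an Accept-Version header as fallback. The customer record, the name to full_name rename, and both helper functions are hypothetical.

```python
SUPPORTED_VERSIONS = {"1", "2"}
DEFAULT_VERSION = "1"


def resolve_version(path, headers):
    """Prefer a /vN/ path prefix, then the Accept-Version header, then the default."""
    first = path.strip("/").split("/")[0]
    if len(first) > 1 and first[0] == "v" and first[1:].isdigit():
        version = first[1:]
    else:
        version = headers.get("Accept-Version", DEFAULT_VERSION)
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported API version: {version}")
    return version


def render_customer(record, version):
    """v2 renamed 'name' to 'full_name'; v1 callers keep the old shape."""
    if version == "1":
        return {"id": record["id"], "name": record["full_name"]}
    return {"id": record["id"], "full_name": record["full_name"]}


record = {"id": "cus_1", "full_name": "Ada Lovelace"}
```

An agent that hardcoded the v1 shape keeps working after the rename ships, because the old response is still served under the old version.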
#10: MCP Server + Agent Card vs Nothing
No agent-native infrastructure: rely entirely on OpenAPI discovery
MCP server with tools + agent-card.json at /.well-known/ for A2A discovery
The agent-native bonus is the newest scoring dimension. It rewards businesses that go beyond APIs to provide MCP servers (tool-calling protocol for agents), agent-card.json (A2A discovery), and llms.txt (LLM-readable business summary). Only 2 of 500 businesses scanned have any of these. Shipping all three is a small job, somewhere between 30 minutes and an afternoon of work, for +5-8 points.
Estimate Your Score
Count how many of the 10 decisions you have made correctly. Here is where you likely land:
Website-only, no API, session cookies, HTML errors. Completely invisible to agents.
Has an API but no spec. Basic auth. No sandbox, no status page, no versioning. Agents can find you but struggle to use you.
OpenAPI spec published. Bearer auth. Some structured errors. Missing webhooks, sandbox, and agent-native files.
Full OpenAPI + Bearer + structured errors + status page + versioned API + webhooks. Missing MCP server and agent-card.json.
All 10 decisions made correctly. MCP server + agent-card + llms.txt + sandbox + cursor pagination + HMAC webhooks.
Gold + x402 micropayments + sub-100ms p95 latency + automated onboarding + multi-protocol support. Nobody has achieved this yet.
The math: If you made 0-2 correct decisions, expect 0-25. Three to five correct decisions typically produce 30-50. Six to eight put you in 50-70. All 10 correct decisions push toward 75+. The exact score depends on implementation quality (not just presence), but the decisions themselves account for 70-80% of the variance across 500 scans.
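The tally can be written as a small lookup. This sketch encodes only the ranges stated above; estimate_range is a hypothetical helper, the open-ended "75+" is represented with None as the upper bound, and a count of 9 returns None because no range is given for it.

```python
def estimate_range(correct_decisions):
    """Map a count of correct decisions (0-10) to the estimated score range."""
    if not 0 <= correct_decisions <= 10:
        raise ValueError("count must be between 0 and 10")
    if correct_decisions <= 2:
        return (0, 25)
    if correct_decisions <= 5:
        return (30, 50)
    if correct_decisions <= 8:
        return (50, 70)
    if correct_decisions == 10:
        return (75, None)  # "75+": no upper bound is stated
    return None  # 9 correct decisions: no range is stated


# Example: a team with an API, a spec, Bearer auth, and structured errors.
answers = {1: True, 2: True, 3: True, 4: True, 5: False,
           6: False, 7: False, 8: False, 9: False, 10: False}
estimate = estimate_range(sum(answers.values()))
```

Treat the result as a starting band only, since implementation quality moves the score within and sometimes across these ranges.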
What the Top Scorers All Have in Common
Every business scoring Silver or above in our 500-business scan shares the same foundation: API-first architecture with published OpenAPI spec, Bearer token auth, structured JSON errors, a status page, and versioned endpoints. That is decisions 1, 2, 3, 4, 8, and 9 — accounting for 70% of the scoring weight.
The top scorer (Resend, 75 Gold) has all 10. The next tier (Vercel 70, Supabase 69, Stripe 68) has 8-9 of 10. The difference between Silver and Gold is always the agent-native files: MCP server, agent-card.json, and llms.txt. These take an afternoon to ship and represent the easiest path from Silver to Gold.
Meanwhile, the Fortune 500 averages 37 — below Bronze. Not because they lack engineering resources, but because their architecture was built for human-first web experiences. The decision to be website-first (Decision 1 wrong) caps everything else.
The Priority Order: What to Ship First
If you are starting from zero, the implementation order matters. Here is the sequence that produces the fastest score improvement based on weight-per-effort:
1. API-first architecture (Decision 1)
2. OpenAPI spec (Decision 2)
3. Bearer token auth (Decision 3)
4. Structured JSON errors (Decision 4)
5. Status page (Decision 8)
6. Agent-native files (Decision 10)
Frequently Asked Questions
How long does it take to implement all 10 decisions?
It depends on your starting point. If you already have an API, adding an OpenAPI spec (Decision 2), structured errors (Decision 4), and agent-card.json (Decision 10) fits in a single sprint, maybe 2-3 days of engineering time. If you are website-only (Decision 1), the API-first migration is the foundation and takes 2-6 months depending on complexity. The good news: once the API exists, the remaining decisions are largely independent of each other. You can ship them in any order and see incremental score improvements after each one.
Which decision should I prioritize first?
If you have no API: Decision 1 (API-first) is the prerequisite for everything else. If you have an API but no spec: Decision 2 (OpenAPI) is the highest-leverage single file you can ship. If you already have an OpenAPI spec: Decision 10 (MCP + agent-card) is the fastest path to the agent-native bonus that separates Silver from Gold.
How accurate is the score estimate in this article?
The estimates are based on scoring 500 businesses and identifying the patterns that separate each tier. Your actual score also depends on factors like response latency, documentation quality, and pricing transparency that are not covered by these 10 decisions. However, these 10 decisions account for roughly 70-80% of the variance between high and low scorers. Run a free scan at /audit to see your exact score.
Do I need to be API-first to score well?
Technically no, but practically yes. The highest-scoring non-API-first business in our 500-business scan scored 38, which is below Bronze. API-first architecture is not just one decision; it is the foundation that makes every other decision possible. A business with no callable endpoints at all fares worse still: there is nothing for agents to discover, authenticate against, or use, and the score caps at 29.
See your actual score
You have estimated it. Now verify it. Run a free Agent Readiness Scan and see exactly how your architecture maps to all 9 dimensions. 60 seconds. No signup.