Discoverability and Agent Readiness: Why 40% of Businesses Fail D1

D1 Discoverability carries a 0.12 weight in the Agent Readiness Score. Simple idea: if agents cannot find you, nothing else matters. Across 500 businesses scanned, 199 of them (40%) score so low on D1 they never escape the Unaudited tier — regardless of how strong their API, auth, or pricing happen to be.

AgentHermes Research
April 15, 2026 · 12 min read

The 500-Business D1 Data

AgentHermes has scanned 500 businesses across 27 verticals. The score distribution is heavily skewed toward the bottom: 1 Gold (Resend at 75), 52 Silver, 249 Bronze, and 199 below the Bronze threshold. Most of the 199 below Bronze share one trait: agents cannot find them.

Zero of 500 publish an agent-card.json. Two of 500 publish an MCP server. Fewer than 5% publish an llms.txt. The businesses in the Unaudited tier tend to lack all three, often combined with a broken sitemap or a Cloudflare challenge wall that rejects agent traffic outright. D1 can account for up to 12 points. When it contributes near zero, a business that is otherwise a solid 45 lands at 33 — below the Bronze threshold — and gets tagged Not Scored.
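To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the score is a simple sum in which D1 contributes up to 12 points, scaled linearly by the share of D1 credit earned. The function names and the linear scaling are illustrative assumptions, not the published AgentHermes formula. The Bronze threshold of 40 is the figure quoted in the FAQ.

```python
D1_MAX_POINTS = 12      # D1 weight of 0.12 on a 100-point scale (from the article)
BRONZE_THRESHOLD = 40   # Bronze cutoff quoted in the FAQ


def total_score(other_points: float, d1_fraction: float) -> float:
    """Add the D1 contribution to the points earned on all other dimensions.

    d1_fraction is the share of D1 credit earned, from 0.0 to 1.0.
    Linear scaling is an assumption made for this sketch.
    """
    if not 0.0 <= d1_fraction <= 1.0:
        raise ValueError("d1_fraction must be between 0 and 1")
    return other_points + D1_MAX_POINTS * d1_fraction


def clears_bronze(score: float) -> bool:
    """True when the score reaches the Bronze threshold."""
    return score >= BRONZE_THRESHOLD


# A business that would total 45 with full D1 credit earns 33 elsewhere.
other = 45 - D1_MAX_POINTS
for fraction in (0.0, 1.0):
    score = total_score(other, fraction)
    print(fraction, score, clears_bronze(score))
# 0.0 33.0 False
# 1.0 45.0 True
```

Under these assumptions, the same business sits below Bronze with no D1 credit and above it with full credit, which is the gap the paragraph above describes.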

500 businesses scanned
199 below Bronze (mostly D1)
0 publish agent-card.json
12% D1 weight in score

The Six D1 Checks

D1 Discoverability is not one check — it is a bundle. Each sub-check contributes to the D1 score, and the full 12% is only earned when an agent can actually find and understand the business through standard discovery paths.

DNS resolves cleanly

The domain has a valid A or AAAA record, responds within 2 seconds, and does not require a Cloudflare challenge page before agents can read anything. Broken DNS caps every downstream check.

robots.txt allows AI crawlers

The big three — GPTBot, anthropic-ai, Google-Extended — are either explicitly allowed or not explicitly blocked. Bonus credit for a declared crawl delay and sitemap reference.

sitemap.xml is present and valid

A sitemap at /sitemap.xml, or referenced from robots.txt, that returns 200 and lists at least a few URLs. Agents use sitemaps to enumerate crawlable surface without guessing.

/.well-known/agent-card.json exists

The A2A protocol discovery file that declares agent capabilities, skills, and endpoints. The single strongest positive signal for agent discovery — and exactly zero of the 500 businesses we scanned ship it.

llms.txt at the root

Markdown summary of what the business does, for LLM consumption. Lives at /llms.txt. Detected on fewer than 5% of scanned businesses. Cheap to add, meaningful score bump.

Structured Open Graph metadata

og:title, og:description, og:type, og:site_name on the homepage. Agents fall back to OG tags when richer discovery is missing — it is the lowest-effort signal to ship.

The first three checks (DNS, robots, sitemap) are SEO-era hygiene. Agents inherit them from the crawler lineage they grew out of. The last three (agent-card.json, llms.txt, structured OG) are agent-era additions. Most businesses nail the first three and fail every single one of the last three — that is the exact shape of the D1 gap.
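The robots.txt check can be approximated with Python's standard library. The sketch below is not the AgentHermes scanner. It parses a robots.txt body with urllib.robotparser and reports whether each of the three crawler names mentioned above may fetch the homepage. The sample file and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "anthropic-ai", "Google-Extended"]  # names from the article


def check_ai_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return, per crawler name, whether robots_txt allows it to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


# Placeholder robots.txt: blocks GPTBot, allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

report = check_ai_access(SAMPLE)
print(report)
# {'GPTBot': False, 'anthropic-ai': True, 'Google-Extended': True}
```

A real check would first download the file from the domain. Parsing a string keeps the example self-contained.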

Why D1 Gates Everything Else

In the six-step agent journey, FIND is step one. It is the only step where, if the agent fails, nothing downstream can save you. If the agent cannot find you, it does not matter that your OpenAPI spec is flawless, your D7 Security is Platinum-grade, or your D5 Payment supports x402. The agent moves on to a competitor and never knew you existed.

This is why D1 has a disproportionate impact at the low end of the score distribution. Businesses clustered in the 30-40 range tend to have decent APIs but terrible discovery. They are full-stack functional — but invisible. Fixing the D1 gap is usually a single afternoon of work and moves them from Not Scored to Bronze overnight.

The 12% floor rule: across the 500-business scan, businesses that scored at least 9 of 12 on D1 never landed in Unaudited. The D1 score is a near-perfect predictor of whether a business escapes Not Scored, independent of the other eight dimensions. Nail D1 and you clear the floor.

The Five Patterns That Keep Sinking D1

After running 500 scans, the same failures appear over and over. None of them are technically hard to fix. All of them cost real score points.

Blocking GPTBot in robots.txt

Some businesses added User-agent: GPTBot / Disallow: / after the AI-training panic of 2023. Whether or not that helps with training, the same rule applied to every AI user agent also blocks in-session retrieval by agents running Claude, ChatGPT, and Perplexity. You are blocking your customers.

No sitemap at all

Over half the businesses we scanned serve no sitemap. Agents cannot enumerate what exists. They fall back to guessing from the homepage, which is guaranteed to miss most of your content.

Cloudflare challenge walls for bots

Cloudflare Super Bot Fight Mode and the "Challenge" rule lump agent traffic in with scrapers. The agent hits a JavaScript challenge, cannot solve it, and gives up. You are invisible by default.

No agent-card.json, no llms.txt

0 of 500 have agent-card.json. Fewer than 5% have llms.txt. These are the two files agents explicitly look for, and almost nobody ships them. The first-mover advantage is sitting in /.well-known.

JavaScript-only homepage

SPAs that render nothing without client-side JS look empty to most crawlers. The agent sees an empty body tag and concludes there is nothing to index. Add SSR or a pre-rendered meta-tag fallback.

Notice the theme: these are not technical failures, they are policy failures. Someone at the business decided to block bots, or never bothered shipping a sitemap, or assumed JavaScript-only would be fine, or did not realize agents needed special discovery files. Each decision was rational in a pre-agent world. None of them survive contact with the agent economy.
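Of the five patterns, the missing or broken sitemap is the easiest to test for mechanically. The sketch below shows one way to do it, assuming the sitemap body has already been downloaded. It uses the standard sitemaps.org namespace. The helper names and the min_urls default are hypothetical, not AgentHermes scoring rules.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> URL in a sitemap or sitemap index.

    Raises xml.etree.ElementTree.ParseError if the body is not valid XML.
    """
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def has_usable_sitemap(xml_text: str, min_urls: int = 1) -> bool:
    """True when the body parses and lists at least min_urls URLs."""
    try:
        return len(sitemap_urls(xml_text)) >= min_urls
    except ET.ParseError:
        return False


# Placeholder sitemap with two URLs.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

print(sitemap_urls(SAMPLE))
# ['https://example.com/', 'https://example.com/pricing']
print(has_usable_sitemap("<html>"))
# False
```

An HTML error page served at /sitemap.xml with a 200 status fails this check, which is a case a status-code test alone would miss.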

The One-Afternoon D1 Fix

A complete D1 fix is genuinely an afternoon of work for a team that already owns the domain. Here is the ordered sequence that moves the most points per minute.

1. Audit robots.txt

Remove any Disallow rules for GPTBot, anthropic-ai, or Google-Extended. Add a Sitemap: line pointing to your sitemap. Test with curl, sending a bot user-agent string, to confirm access.

2. Verify sitemap.xml

Make sure /sitemap.xml exists, returns 200, and enumerates every important URL. Most platforms generate this automatically (Next.js, WordPress with Yoast, Shopify) — but verify it actually works.

3. Ship llms.txt at root

Plain markdown at /llms.txt describing your business, core products, pricing, API endpoints, and contact info. 200 words minimum. Fewer than 5% of scanned businesses have this — instant differentiation.

4. Ship agent-card.json at /.well-known/

The A2A discovery file. AgentHermes generates this for you in the /connect wizard, or you can hand-write one from the spec. Zero of 500 scanned businesses have this — total greenfield.

5. Disable Cloudflare bot challenges for AI agents

In Cloudflare Security settings, exempt known AI user-agents from Challenge. You can still block malicious scrapers — just do not accidentally block the customers you want.

6. Add Open Graph tags

og:title, og:description, og:type, og:site_name on every page. Cheap, universally supported, and the fallback signal agents use when nothing else is present.

7. Rescan

Run your domain through /audit again. Watch the D1 score climb from single digits to 9-11 out of 12. If your baseline was Not Scored, you are now firmly in Bronze.

None of these steps require a developer to refactor your stack. Steps 3 and 4 are static files. Steps 1, 2, and 6 are one-line edits. Step 5 is a Cloudflare checkbox. The total is a half-day of work for a payoff that moves most businesses from Not Scored into Bronze on the next scan.
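Because llms.txt is a static markdown file, a first draft can be generated from a handful of facts. The sketch below is one possible layout (a title, a one-line summary, and bulleted sections), not a required format. The business details are placeholders, and the 200-word minimum is the figure recommended in step 3.

```python
MIN_WORDS = 200  # minimum length the article recommends for llms.txt


def build_llms_txt(name: str, summary: str, sections: dict[str, list[str]]) -> str:
    """Render a title, a one-line summary, and bulleted sections as markdown."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, items in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def word_count(text: str) -> int:
    """Rough count: whitespace-separated tokens, markdown markers included."""
    return len(text.split())


# Placeholder business details, for illustration only.
draft = build_llms_txt(
    "Example Co",
    "Example Co sells scheduling software for dental clinics.",
    {
        "Products": ["Online booking", "Automated reminders"],
        "Pricing": ["Starter plan: see https://example.com/pricing"],
        "API": ["REST API docs: https://example.com/docs"],
        "Contact": ["hello@example.com"],
    },
)
print(draft)
print(f"{word_count(draft)} words; meets minimum: {word_count(draft) >= MIN_WORDS}")
```

This placeholder draft falls well short of 200 words, so the final line reports that it does not meet the minimum. A real file would expand each section with actual product, pricing, and endpoint descriptions.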

Frequently Asked Questions

Why is D1 Discoverability weighted at exactly 0.12?

D1 sits in Tier 2 of the Agent Readiness Score weighting model at 0.12 — high enough to matter, not high enough to dominate. The logic: if agents cannot find you, the other eight dimensions are moot, so D1 acts as a gate. But once you clear the discoverability bar, the remaining 88% of the score comes from things agents care about after they find you (API quality, security, reliability, data quality). Discoverability is necessary, not sufficient.

How do you know 199 of 500 fail D1?

Across the 500-business scan, 199 businesses scored below the Bronze threshold of 40 — most of them because D1 came in so low that no amount of strength on other dimensions could lift them out. When a site has no sitemap, no llms.txt, no agent-card.json, and a Cloudflare challenge wall, D1 contributes nearly zero points, and the 0.12 weight becomes a dead loss. The business is effectively invisible, regardless of what the rest of its infrastructure looks like.

Is blocking GPTBot really that bad for agent readiness?

It depends on what you are trying to protect. Blocking GPTBot keeps OpenAI from crawling your content for training, which is a legitimate choice. Retrieval during a user's session is a different system: OpenAI documents a separate ChatGPT-User agent for user-initiated fetches, and a blanket Disallow that covers every AI user agent blocks that as well. If a user asks ChatGPT about your business and retrieval is blocked, the agent returns "I cannot access this page" and recommends a competitor. Consider keeping the training opt-out while leaving the retrieval user agents allowed in robots.txt.

What does a minimal agent-card.json look like?

At its simplest: {"name": "Business Name", "description": "What you do in one sentence", "url": "https://yourdomain.com", "capabilities": ["get_info", "search", "check_availability"], "contact": {"email": "hello@yourdomain.com"}}. That is enough for agents to discover you and understand at a high level what you offer. You can iterate from there — add skills, endpoints, auth flow, pricing, support URL. The file lives at /.well-known/agent-card.json and follows the A2A protocol discovery spec.
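The same minimal card can be produced and sanity-checked with a few lines of Python. The field names below mirror the example in this answer and have not been checked against the A2A specification, which defines additional fields. Which fields count as required is an assumption made for this sketch, and the business details are placeholders.

```python
import json

# Fields treated as mandatory here are an assumption for this sketch.
REQUIRED_FIELDS = ("name", "description", "url")


def minimal_agent_card(name, description, url, capabilities, email):
    """Build the minimal card described in the article as a plain dict."""
    return {
        "name": name,
        "description": description,
        "url": url,
        "capabilities": list(capabilities),
        "contact": {"email": email},
    }


def missing_fields(card: dict) -> list:
    """Return the assumed-required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not card.get(field)]


card = minimal_agent_card(
    "Business Name",
    "What you do in one sentence",
    "https://yourdomain.com",
    ["get_info", "search", "check_availability"],
    "hello@yourdomain.com",
)
assert missing_fields(card) == []

# Save this output as /.well-known/agent-card.json on your domain.
print(json.dumps(card, indent=2))
```

Serializing with json.dumps guarantees the file is valid JSON, which rules out the most common hand-editing mistakes such as trailing commas.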

Will fixing D1 actually move me out of the Unaudited tier?

In most cases, yes — but only if the rest of your score is not also zero. If you have a working API, TLS, and a functioning pricing page, fixing D1 reliably moves Bronze-adjacent scores into Bronze. If your site is a static marketing page with no backend, fixing D1 alone will not get you to 40. But it is always the first dimension to fix, because it is the cheapest and it gates everything else.

Does AgentHermes auto-generate the D1 files?

Yes. When you run the /connect wizard, AgentHermes generates an llms.txt, an agent-card.json, and an agent-hermes.json tuned to your business details. You download them, upload them to your domain root and /.well-known/ path, and D1 jumps on the next scan. For businesses on platforms we adapt to (Shopify, WooCommerce, Square), we can install the files through the platform API directly. See /connect for details.


See your D1 score in 60 seconds

AgentHermes runs the full six-check D1 audit on your domain and tells you exactly which discovery files are missing. Fix what matters first, then rescan.

