How to Test Your Agent Readiness Score Before Going Live

Before you scan with AgentHermes, you can self-check with 10 curl commands from your terminal. These tests cover the signals that account for roughly 70% of your total score. Run them, fix the failures, then scan for the full 9-dimension breakdown.

AgentHermes Research

April 15, 2026 · 14 min read

Why Self-Test First?

The AgentHermes scanner checks over 40 signals across 9 weighted dimensions with vertical-specific scoring profiles. That level of detail is valuable, but you do not need it to identify the most obvious gaps. A terminal and curl will tell you whether you are in the invisible tier (0-19), struggling tier (20-39), or competitive tier (40+) in about five minutes.

More importantly, self-testing teaches you what agents actually look for when they evaluate your business. Each test below maps to a specific dimension and explains why that signal matters. Fix the failures, then run the full AgentHermes scan to see where you land across the complete scoring model.

10 curl commands · ~70% of score weight covered · 5 min to run all tests · 6 of 9 dimensions tested

The 10-Test Agent Readiness Self-Check

Replace yoursite.com with your actual domain. Run each command from any terminal with curl installed. Count your passes.

Test 1: Does your API return JSON?

D2 API Quality (15%)

curl -s https://yoursite.com/api | head -c 200

Pass: Returns valid JSON: {"status":"ok",...}

Fail: Returns HTML: <!DOCTYPE html>...

If your primary API endpoint returns HTML instead of JSON, agents cannot parse it. This check falls under D2 API Quality, the most heavily weighted dimension in the score.
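To take the guesswork out of the pass/fail call, the first bytes of the response can be classified automatically. This is a rough sketch, not a JSON validator: classify_body is a hypothetical helper name, and it only inspects the first non-whitespace character of the body.

```shell
# classify_body (hypothetical helper): reads a response body on stdin and
# prints "json", "html", or "unknown".
# Heuristic only: it looks at the first non-whitespace character and does
# not verify that the JSON is well formed.
classify_body() {
  first=$(tr -d ' \t\r\n' | head -c 1)
  case "$first" in
    '{'|'[') echo "json" ;;
    '<')     echo "html" ;;
    *)       echo "unknown" ;;
  esac
}
```

Pipe the Test 1 command into it: curl -s https://yoursite.com/api | head -c 200 | classify_body. For strict validation, a real JSON parser is the better tool.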

Test 2: Is GPTBot allowed in robots.txt?

D1 Discoverability (12%)

curl -s https://yoursite.com/robots.txt

Pass: No blanket Disallow for GPTBot, ClaudeBot, PerplexityBot

Fail: User-agent: * Disallow: / (blocks all bots including AI)

Blocking AI crawlers in robots.txt makes you invisible to AI models. Look for groups that name User-agent: GPTBot, ClaudeBot, PerplexityBot, or Google-Extended specifically. A blanket Disallow: / under User-agent: * blocks every crawler that does not have a group of its own.
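Reading robots.txt by eye gets error-prone once a file has several groups. The sketch below automates the blanket-block check for one crawler at a time. It is a simplification: robots_verdict is a hypothetical helper name, it only recognizes an exact Disallow: / rule, and it ignores path-level rules, wildcards, and Allow overrides.

```shell
# robots_verdict BOT (hypothetical helper): reads robots.txt on stdin and
# prints "blocked" if BOT is hit by a blanket "Disallow: /", else "allowed".
# A group that names BOT takes precedence over the "*" group.
robots_verdict() {
  awk -v bot="$1" '
    BEGIN { bot = tolower(bot); in_rules = 0 }
    {
      sub(/#.*/, "")                        # strip comments
      line = $0
      gsub(/^[ \t]+|[ \t\r]+$/, "", line)   # trim whitespace
      if (line == "") next
      colon = index(line, ":")
      if (colon == 0) next
      field = tolower(substr(line, 1, colon - 1))
      value = substr(line, colon + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      gsub(/[ \t]+$/, "", field)
      if (field == "user-agent") {
        # A User-agent line that follows rules starts a new group.
        if (in_rules) { agents_star = 0; agents_bot = 0; in_rules = 0 }
        if (value == "*") agents_star = 1
        if (tolower(value) == bot) { agents_bot = 1; seen_bot = 1 }
      } else if (field == "disallow" || field == "allow") {
        in_rules = 1
        if (field == "disallow" && value == "/") {
          if (agents_bot) bot_blocked = 1
          if (agents_star) star_blocked = 1
        }
      }
    }
    END {
      if (seen_bot) print (bot_blocked ? "blocked" : "allowed")
      else          print (star_blocked ? "blocked" : "allowed")
    }'
}
```

Run it once per crawler, for example: curl -s https://yoursite.com/robots.txt | robots_verdict GPTBot.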

Test 3: Do you have an agent-card.json?

D9 Agent Experience (10%)

curl -s -o /dev/null -w "%{http_code}" https://yoursite.com/.well-known/agent-card.json

Pass: Returns 200 with valid JSON agent card

Fail: Returns 404 (like 99.8% of businesses)

agent-card.json is the A2A protocol discovery file. It tells agents what your business does and how to interact with it. Fewer than 1% of 500 businesses scanned have one.

Test 4: Does content negotiation work?

D6 Data Quality (10%)

curl -s -H "Accept: application/json" https://yoursite.com/ -o /dev/null -w "%{http_code}\n%{content_type}"

Pass: Content-Type: application/json (returns JSON when asked)

Fail: Content-Type: text/html (ignores Accept header)

AI agents send Accept: application/json. Most websites ignore this header and return HTML anyway. If your server respects content negotiation, agents get structured data without a separate API.
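The Content-Type value usually carries extras such as a charset parameter, so a plain string comparison can misjudge a passing server. Here is a small sketch of a more forgiving check. negotiation_verdict is a hypothetical helper name, and treating any +json media type as a pass is an assumption of this sketch, not a documented scanner rule.

```shell
# negotiation_verdict CONTENT_TYPE (hypothetical helper): prints "pass" when
# the media type is application/json or an application/*+json variant,
# ignoring case and parameters such as "; charset=utf-8".
negotiation_verdict() {
  media_type=$(printf '%s' "$1" | cut -d';' -f1 | tr 'A-Z' 'a-z' | tr -d ' ')
  case "$media_type" in
    application/json|application/*+json) echo "pass" ;;
    *) echo "fail" ;;
  esac
}
```

Feed it the header value from curl: negotiation_verdict "$(curl -s -H "Accept: application/json" -o /dev/null -w "%{content_type}" https://yoursite.com/)".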

Test 5: Does /health exist?

D8 Reliability (13%)

curl -s -o /dev/null -w "%{http_code}" https://yoursite.com/health

Pass: Returns 200 with uptime/status information

Fail: Returns 404 or redirects to homepage

Agents check /health before delegating work to your API. No health endpoint means no confidence signal. Also check /api/health and /status. Even a simple {"status":"ok"} adds points.
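Since the health endpoint may live at any of several paths, it can help to probe them in one go. The following is a minimal sketch: http_status and first_live_path are hypothetical helper names, the 10-second timeout is an arbitrary choice, and STATUS_CMD exists only so the probe can be swapped for a stub.

```shell
# http_status URL: prints the HTTP status code (curl prints 000 if the
# request itself fails).
http_status() {
  curl -s -o /dev/null -w "%{http_code}" --max-time 10 "$1"
}

# first_live_path BASE PATH...: prints the first PATH that answers 200,
# or "none" if no candidate does.
# STATUS_CMD may name a replacement for http_status (for example a stub).
first_live_path() {
  base=$1
  shift
  for candidate in "$@"; do
    code=$("${STATUS_CMD:-http_status}" "$base$candidate") || true
    if [ "$code" = "200" ]; then
      echo "$candidate"
      return 0
    fi
  done
  echo "none"
}
```

Example: first_live_path https://yoursite.com /health /api/health /status. A 200 only shows that the path exists; you still need to look at the body to confirm it reports status information.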

Test 6: Is there an OpenAPI spec?

D2 API Quality (15%)

curl -s -o /dev/null -w "%{http_code}" https://yoursite.com/openapi.json

Pass: Returns 200 with valid OpenAPI 3.x spec

Fail: Returns 404 (no published spec)

OpenAPI specs let agents auto-discover every endpoint, parameter, and response type. Also check /swagger.json, /api-docs, and /api/openapi.json. This is the single biggest factor in D2.
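A 200 response does not confirm that the file is an OpenAPI 3.x document. Older Swagger 2.0 files carry a "swagger" field instead of "openapi". The sketch below checks the version field with a text match. openapi_verdict is a hypothetical helper name, it assumes a JSON spec (not YAML), and it is a heuristic, not a schema validation.

```shell
# openapi_verdict (hypothetical helper): reads a JSON spec on stdin and
# prints "pass (<version>)" when the first "openapi" field starts with 3.
openapi_verdict() {
  version=$(grep -o '"openapi"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | head -n 1 \
    | sed 's/.*"\([^"]*\)"$/\1/') || true
  case "$version" in
    3.*) echo "pass ($version)" ;;
    "")  echo "fail (no openapi field)" ;;
    *)   echo "fail ($version)" ;;
  esac
}
```

Usage: curl -s https://yoursite.com/openapi.json | openapi_verdict.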

Test 7: Does llms.txt exist?

D1 Discoverability (12%)

curl -s -o /dev/null -w "%{http_code}" https://yoursite.com/llms.txt

Pass: Returns 200 with markdown content for LLMs

Fail: Returns 404 (95% of businesses)

llms.txt is a markdown file that tells AI models what your business is and does. It is the cheapest, fastest way to boost D1 and D9. Takes 10 minutes to create.
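For a starting point, the sketch below writes a minimal llms.txt following the layout commonly used for the file: a title, a one-line summary in a blockquote, then sections of links. Everything in it is placeholder content. Example Co, the summary, and the linked paths are assumptions to replace with your own, and make_llms_txt is a hypothetical helper name.

```shell
# make_llms_txt (hypothetical helper): prints a minimal llms.txt skeleton.
# All names, summaries, and URLs below are placeholders.
make_llms_txt() {
  cat <<'EOF'
# Example Co

> Example Co offers online appointment booking for dental clinics.

## Docs

- [API reference](https://yoursite.com/openapi.json): machine-readable endpoint list
- [Pricing](https://yoursite.com/pricing): current plans

## Optional

- [Blog](https://yoursite.com/blog): background reading
EOF
}
```

Write it to your web root with make_llms_txt > llms.txt, then edit the text to describe your actual business.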

Test 8: Does your API return structured errors?

D6 Data Quality (10%)

curl -s https://yoursite.com/api/nonexistent-endpoint

Pass: Returns JSON: {"error":"not_found","code":404,"message":"..."}

Fail: Returns HTML 404 page or generic server error

When agents hit an error, they need structured JSON guidance to recover. HTML error pages are useless to agents. Check that 404, 400, and 500 responses all return JSON with error codes.
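The body alone does not tell the whole story: an endpoint that answers 200 with an error page fails this test too. One way to check both at once is to have curl append the status code on a final line and split it off afterwards. This is a sketch under that assumption. error_verdict is a hypothetical helper name, and the body check is the same first-character heuristic, not full JSON validation.

```shell
# error_verdict (hypothetical helper): reads "BODY<newline>STATUS" on stdin,
# as produced by: curl -s -w '\n%{http_code}' URL
# Prints "pass" when the status is 4xx/5xx and the body looks like a JSON object.
error_verdict() {
  response=$(cat)
  code=$(printf '%s\n' "$response" | tail -n 1)   # last line = status code
  body=$(printf '%s\n' "$response" | sed '$d')    # everything before it
  case "$code" in
    4??|5??) ;;
    *) echo "fail: expected a 4xx or 5xx status, got $code"; return 0 ;;
  esac
  first=$(printf '%s' "$body" | tr -d ' \t\r\n' | head -c 1)
  if [ "$first" = "{" ]; then
    echo "pass"
  else
    echo "fail: error body is not a JSON object"
  fi
}
```

Usage: curl -s -w '\n%{http_code}' https://yoursite.com/api/nonexistent-endpoint | error_verdict.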

Test 9: Is there a security.txt?

D7 Security (12%)

curl -s -o /dev/null -w "%{http_code}" https://yoursite.com/.well-known/security.txt

Pass: Returns 200 with RFC 9116 compliant content

Fail: Returns 404

security.txt signals API maturity to agents evaluating trustworthiness. 100% of Silver-tier businesses have one. Takes 2 minutes to create.
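RFC 9116 requires two fields in security.txt: Contact (a URI such as a mailto: or https: address) and Expires (a timestamp after which the file should be considered stale). Here is a minimal sketch that prints just those two fields. make_security_txt is a hypothetical helper name, and the address and date in the usage line are placeholders.

```shell
# make_security_txt CONTACT_URI EXPIRES (hypothetical helper): prints the two
# fields RFC 9116 requires. EXPIRES is a timestamp such as 2027-01-01T00:00:00Z.
make_security_txt() {
  printf 'Contact: %s\n' "$1"
  printf 'Expires: %s\n' "$2"
}
```

Usage: make_security_txt "mailto:security@yoursite.com" "2027-01-01T00:00:00Z" > security.txt, then serve the file at /.well-known/security.txt. Optional fields such as Policy or Preferred-Languages can be added by hand.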

Test 10: Does TLS work properly?

D7 Security (12%)

curl -sI https://yoursite.com | grep -i strict-transport

Pass: Returns Strict-Transport-Security header

Fail: No HSTS header or HTTPS redirect fails

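No TLS = hard cap at 39. This is non-negotiable. If curl exits with a certificate or connection error instead of printing headers, TLS itself is broken. Also check for the HSTS header, which signals an HTTPS-only commitment. Without valid TLS, nothing else matters.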
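Beyond confirming that the header exists, it can be useful to read its max-age value, since a very short max-age is a weaker commitment. The sketch below extracts it from a block of response headers. hsts_max_age is a hypothetical helper name, and how the scanner weighs the value is not stated in this guide.

```shell
# hsts_max_age (hypothetical helper): reads response headers on stdin and
# prints the Strict-Transport-Security max-age in seconds, or "missing".
hsts_max_age() {
  header=$(grep -i '^strict-transport-security:' | head -n 1 | tr -d '\r')
  [ -n "$header" ] || { echo "missing"; return 0; }
  age=$(printf '%s' "$header" | tr 'A-Z' 'a-z' \
    | sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p')
  echo "${age:-missing}"
}
```

Usage: curl -sI https://yoursite.com | hsts_max_age. A value of 31536000 corresponds to one year.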

Interpreting Your Results

Count how many of the 10 tests your site passes. Here is what each range predicts for your actual AgentHermes scan score.

Tests Passed    Predicted Range    Tier
0-2             0-19               Not Scored
3-4             20-39              Not Scored
5-6             40-54              Bronze
7-8             55-64              Bronze/Silver
9-10            60-75+             Silver/Gold
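If you script the 10 checks, the table above can be turned into a lookup. Here is a minimal sketch that mirrors the table row by row. predict_range is a hypothetical helper name.

```shell
# predict_range PASSES (hypothetical helper): maps a pass count (0-10) to the
# predicted score range and tier from the table in this guide.
predict_range() {
  case "$1" in
    0|1|2) echo "0-19 (Not Scored)" ;;
    3|4)   echo "20-39 (Not Scored)" ;;
    5|6)   echo "40-54 (Bronze)" ;;
    7|8)   echo "55-64 (Bronze/Silver)" ;;
    9|10)  echo "60-75+ (Silver/Gold)" ;;
    *)     echo "invalid pass count: $1" ;;
  esac
}
```

For example, predict_range 6 prints 40-54 (Bronze).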

Important caveat: These 10 tests cover 6 of 9 dimensions. They cannot check D3 Onboarding (self-service API key signup), D4 Pricing (structured pricing data), or D5 Payment (programmatic payment flow). Those three dimensions account for 21% of the score. A business that passes all 10 curl tests but has no pricing page and no self-service signup will still score lower than expected. The full AgentHermes scan covers all nine dimensions.

Priority Fix Order

If multiple tests fail, fix them in this order. Each step unlocks the maximum score improvement per hour of effort. This sequence is derived from the 30-signal checklist and real scan data from 500 businesses.

1. Fix TLS (Test 10): 30 min

Hard cap at 39 without it. If you are on any modern hosting (Vercel, Netlify, Cloudflare) you already have this. If not, Cloudflare free tier adds HTTPS in 15 minutes.

2. Unblock AI crawlers (Test 2): 5 min

Edit robots.txt. Allow GPTBot, ClaudeBot, PerplexityBot, Google-Extended. One file edit, immediate impact on D1 Discoverability.

3. Add llms.txt (Test 7): 15 min

Create a markdown file describing your business for AI models. Drop it at /llms.txt. Boosts D1 and D9 simultaneously.

4. Return JSON from your API (Test 1): 1-4 hours

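If you have no API, this is the biggest build. If you have one that returns HTML, change the handlers to serialize their responses as JSON and send Content-Type: application/json. Setting the header alone is not enough. Largest single score impact.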

5. Add /health endpoint (Test 5): 15 min

One endpoint returning {"status":"ok","timestamp":"..."}. Boosts D8 Reliability which carries 13% weight.

6. Create agent-card.json (Test 3): 30 min

Place at /.well-known/agent-card.json. Describes your business capabilities for A2A discovery. Use the AgentHermes generator at /connect to auto-create one.

Following this order, you can go from 0 passing tests to 6+ in a single afternoon. That moves a typical business from the invisible tier (0-19) to Bronze (40-54). From there, the step-by-step improvement guide covers the remaining dimensions including self-service onboarding, structured pricing, and payment processing that push you toward Silver.

Self-Test vs Full AgentHermes Scan

The self-test is a quick health check. The full AgentHermes scan is the comprehensive audit. Here is what the full scan adds beyond these 10 curl commands.

Vertical-specific scoring

AgentHermes uses 27 vertical scoring profiles that adjust dimension weights based on your industry. A restaurant gets different weights than a SaaS platform.

Schema.org markup detection

The scanner reads JSON-LD structured data on your pages to evaluate D6 Data Quality. This includes Product, Service, Organization, and LocalBusiness schemas.

Response time analysis

D8 Reliability measures actual response times. CDN-backed APIs under 100ms score highest. The scanner times real requests and evaluates consistency.

Auth pattern detection

D7 Security evaluates whether your API uses Bearer tokens, OAuth 2.0, API keys, or session cookies. The scanner detects auth-protected endpoints and evaluates the 401 response quality.

Payment flow analysis

D5 Payment checks for programmatic payment processing: Stripe integration, payment links, x402 protocol support. This requires deeper analysis than curl provides.

Platform detection

AgentHermes detects Shopify, WooCommerce, Square, and other platforms to apply platform-specific scoring bonuses. A WooCommerce store with Store API enabled gets credit automatically.

Frequently Asked Questions

Can I really predict my AgentHermes score with curl commands?

These 10 tests cover the signals that account for roughly 70% of the total score weight. They will not give you an exact number, but they will tell you whether you are in the 0-19 range (most tests fail), 20-39 (a few pass), 40-59 (most pass), or 60+ (all pass plus extras). The actual AgentHermes scan checks over 40 signals across 9 dimensions, including nuances like response time, Schema.org markup, and rate-limit headers.

Which test matters most?

TLS (test 10) is the most critical because it hard-caps your score at 39 if it fails. After that, having a callable API that returns JSON (test 1) and an OpenAPI spec (test 6) together account for the largest score contribution since D2 API Quality is weighted at 15%.

How do these tests map to the 9 dimensions?

D1 Discoverability: tests 2 and 7. D2 API Quality: tests 1 and 6. D6 Data Quality: tests 4 and 8. D7 Security: tests 9 and 10. D8 Reliability: test 5. D9 Agent Experience: test 3. Dimensions D3 Onboarding, D4 Pricing, and D5 Payment require more complex checks that curl alone cannot easily replicate.
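The "roughly 70%" figure can be checked by adding up the weights quoted next to each test. The sketch below does that arithmetic. It assumes the default weights shown in this guide, so a vertical-specific profile may give a different total.

```shell
# Weights (in percent) as quoted in this guide for the six dimensions
# that the 10 curl tests touch.
d1=12 d2=15 d6=10 d7=12 d8=13 d9=10
covered=$((d1 + d2 + d6 + d7 + d8 + d9))
echo "Weight covered by the self-test: ${covered}%"
```

The sum comes to 72, which is where the roughly 70% figure comes from.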

What if all 10 tests fail?

You are in the 0-19 range, which is ARL-0: Dark. This means your business is completely invisible to AI agents. The good news: fixing tests 10 (TLS), 2 (robots.txt), and 7 (llms.txt) takes under an hour and will move you into the 20-30 range immediately. That alone puts you ahead of 40% of the 500 businesses we have scanned.

Should I run these tests before or after the actual AgentHermes scan?

Run them before to identify obvious gaps you can fix immediately. Then run the AgentHermes scan to get the precise score across all 9 dimensions with vertical-specific weighting. After fixing the issues the scan identifies, run these curl commands again to verify the fixes landed correctly before rescanning.

Ready for the full scan?

The self-test covers 70% of the score. The full AgentHermes scan covers all 9 dimensions with vertical-specific weighting, platform detection, and actionable recommendations.
