Content Negotiation: Why Accept Headers Determine If AI Agents Get JSON or HTML
When an AI agent visits your website, it sends Accept: application/json. Most websites ignore this header entirely and return HTML anyway. The agent gets a sea of tags, scripts, and stylesheets instead of the structured data it needs. Honoring the header is a 5-minute middleware fix that directly impacts your D6 Data Quality score.
How the Accept Header Works
Every HTTP request includes headers that tell the server about the client. The Accept header is the client saying: “here are the formats I can understand, in order of preference.”
A web browser sends Accept: text/html because it renders HTML pages. An AI agent sends Accept: application/json because it processes structured data. The server is supposed to check this header and respond with the appropriate format. This mechanism is called content negotiation, defined in RFC 7231 (since superseded by RFC 9110, which keeps the same mechanism).
The problem is that most web servers and frameworks are configured to return HTML for every request, regardless of what the client asks for. They treat the Accept header as decoration. For human visitors, this is fine — browsers handle HTML. For AI agents, it is a dead end.
AI agent requesting structured data
Accept: application/json
Expected response: { "name": "Acme Co", "hours": "9-5", "services": [...] }
Browser requesting a web page
Accept: text/html,application/xhtml+xml
Expected response: <!DOCTYPE html><html>...rendered page...</html>
Agent accepting multiple formats
Accept: application/json, text/html;q=0.9
Expected response: Server should return JSON (higher priority), falling back to HTML
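The q-value in the last example is what makes the ordering explicit: a media type without a q parameter has weight 1, and text/html;q=0.9 ranks just below it. A minimal sketch of how a server could read those weights is below. The helper names (parseAccept, weightFor, pickFormat) are hypothetical and not part of any framework, and the sketch ignores media type parameters other than q.

```typescript
// Hypothetical helpers for illustration; not part of any framework.
interface MediaRange {
  type: string; // e.g. "application/json", "text/*", or "*/*"
  q: number; // quality weight from 0 to 1 (1 when omitted)
}

// Split an Accept header into media ranges with their q-values.
function parseAccept(header: string): MediaRange[] {
  const ranges: MediaRange[] = [];
  for (const part of header.split(",")) {
    const pieces = part.split(";").map((piece) => piece.trim().toLowerCase());
    const type = pieces[0] ?? "";
    if (type === "") continue;
    let q = 1;
    for (const param of pieces.slice(1)) {
      if (!param.startsWith("q=")) continue;
      const parsed = Number.parseFloat(param.slice(2));
      if (!Number.isNaN(parsed)) q = Math.min(Math.max(parsed, 0), 1);
    }
    ranges.push({ type, q });
  }
  return ranges;
}

// 2 = exact match, 1 = "type/*" match, 0 = "*/*" match, -1 = no match.
function specificity(range: string, offered: string): number {
  if (range === offered) return 2;
  if (range === `${offered.split("/")[0]}/*`) return 1;
  if (range === "*/*") return 0;
  return -1;
}

// Weight the client gave one offered type; the most specific matching range decides.
function weightFor(ranges: MediaRange[], offered: string): number {
  let bestSpecificity = -1;
  let weight = 0;
  for (const range of ranges) {
    const s = specificity(range.type, offered);
    if (s > bestSpecificity) {
      bestSpecificity = s;
      weight = range.q;
    }
  }
  return weight;
}

// Pick the offered type with the highest weight. Ties go to the earlier entry
// in `offered`; the result is null when nothing offered is acceptable.
function pickFormat(header: string | null, offered: string[]): string | null {
  if (header === null || header.trim() === "") return offered[0] ?? null;
  const ranges = parseAccept(header);
  let winner: string | null = null;
  let winnerWeight = 0;
  for (const candidate of offered) {
    const weight = weightFor(ranges, candidate);
    if (weight > winnerWeight) {
      winner = candidate;
      winnerWeight = weight;
    }
  }
  return winner;
}
```

With the server offering ["text/html", "application/json"], the agent header application/json, text/html;q=0.9 selects application/json, the browser header selects text/html, and a header such as image/png yields null, which is the case where a 406 response would apply.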
Impact on Your Agent Readiness Score
AgentHermes checks for content negotiation as part of D6 Data Quality (0.10 weight). When our scanner sends a request with Accept: application/json, it checks three things:
Does the response Content-Type match the request?
If you asked for JSON and got JSON, the server respects content negotiation. If you got HTML with Content-Type: text/html, the server ignores the Accept header.
Is the JSON response actually structured?
Returning Content-Type: application/json with an HTML body inside is worse than returning text/html. AgentHermes validates that the response body parses as valid JSON.
Does the response include a Vary: Accept header?
The Vary header tells caches (CDNs, proxies) that the response varies based on the Accept header. Without it, a CDN might cache the HTML version and serve it to every agent that requests JSON.
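The three checks above can be sketched as a single function over a response's headers and body. This is only an illustration of the checks as described here, not AgentHermes's actual scanner: the function name, the report shape, and the choice to count +json media types (such as application/ld+json) as JSON are all assumptions.

```typescript
// Hypothetical report shape; the names are illustrative only.
interface NegotiationReport {
  contentTypeIsJson: boolean; // check 1: Content-Type matches the JSON request
  bodyParsesAsJson: boolean; // check 2: body is actually valid JSON
  variesOnAccept: boolean; // check 3: Vary lists Accept
}

function evaluateJsonResponse(
  rawHeaders: Record<string, string>,
  body: string
): NegotiationReport {
  // Header names are case-insensitive, so normalize them first.
  const headers = new Map<string, string>();
  for (const name of Object.keys(rawHeaders)) {
    headers.set(name.toLowerCase(), rawHeaders[name] ?? "");
  }

  // Check 1: compare the media type only, ignoring parameters such as charset.
  const rawContentType = headers.get("content-type") ?? "";
  const contentType = (rawContentType.split(";")[0] ?? "").trim().toLowerCase();
  // Assumption: "+json" suffixed types are treated as JSON too.
  const contentTypeIsJson =
    contentType === "application/json" || contentType.endsWith("+json");

  // Check 2: a JSON Content-Type wrapped around an HTML body must fail here.
  let bodyParsesAsJson = true;
  try {
    JSON.parse(body);
  } catch {
    bodyParsesAsJson = false;
  }

  // Check 3: Vary is a comma-separated list; match the exact token "accept"
  // so that "Accept-Encoding" alone does not count.
  const variesOnAccept = (headers.get("vary") ?? "")
    .split(",")
    .map((token) => token.trim().toLowerCase())
    .some((token) => token === "accept");

  return { contentTypeIsJson, bodyParsesAsJson, variesOnAccept };
}
```

The header map can come from whatever HTTP client you use. Keeping the three results separate, rather than collapsing them into one pass or fail flag, makes it clear which of the three problems a given URL has.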
Combined impact: Content negotiation affects D6 Data Quality (0.10) directly. It also contributes to proper header behavior which overlaps with D9 Agent Experience (0.10). Together, these two dimensions account for 20% of your total Agent Readiness Score. A site that returns HTML for every request regardless of headers is signaling to agents: “I was not built with you in mind.”
Good vs Bad Content Negotiation Patterns
Here is what agent-ready content negotiation looks like compared to the anti-patterns AgentHermes flags during scans.
The most common anti-pattern is the simplest: ignoring the Accept header entirely. The web server returns the same HTML page no matter what format the client requests. This is the default behavior of most static site generators, WordPress installations, and even some modern frameworks when not explicitly configured for content negotiation.
The 5-Minute Middleware Fix
Adding content negotiation is a three-step middleware change. Here is how to implement it in any framework.
Read the Accept header from the incoming request
Check req.headers['accept'] or req.headers.get('accept') depending on your framework. Parse it to determine if the client prefers JSON over HTML.
const acceptsJson = req.headers.get('accept')?.includes('application/json')
Route to the appropriate response format
If the client accepts JSON, respond with structured data. If they accept HTML, respond with the rendered page. If neither matches, return 406 Not Acceptable.
if (acceptsJson) return Response.json(structuredData)
return renderHtml(page)
Set the correct Content-Type and Vary headers
Always set Content-Type to match the actual response body. Add Vary: Accept so CDNs and proxies cache JSON and HTML responses separately.
headers.set('Content-Type', 'application/json')
headers.set('Vary', 'Accept')
For the JSON response, extract the structured data that already exists on your page: business name, hours, services, pricing, contact information. You do not need to build a full API. Just return a JSON representation of the data that is already in your HTML.
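The three steps can be combined into one framework-agnostic function, including the 406 case that the short snippets leave out. This is a sketch under stated assumptions rather than a drop-in implementation: SimpleResponse, BusinessData, qualityOf, renderHtml, and respond are hypothetical names, a missing Accept header is treated as */*, and ties between JSON and HTML go to HTML so that browsers and generic clients keep their current behavior.

```typescript
// Framework-agnostic response shape (hypothetical); map it onto your
// framework's own response object.
interface SimpleResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical data model for the structured data already on the page.
interface BusinessData {
  name: string;
  hours: string;
  services: string[];
}

// Step 1: read the Accept header and return the q-value (0 to 1) the client
// assigned to one media type. The most specific matching range decides.
function qualityOf(accept: string, mediaType: string): number {
  const mainType = mediaType.split("/")[0] ?? "";
  let bestSpecificity = -1;
  let bestQ = 0;
  for (const part of accept.split(",")) {
    const pieces = part.split(";").map((piece) => piece.trim().toLowerCase());
    const range = pieces[0] ?? "";
    const specificity =
      range === mediaType ? 2 : range === `${mainType}/*` ? 1 : range === "*/*" ? 0 : -1;
    if (specificity <= bestSpecificity) continue;
    const qParam = pieces.slice(1).find((piece) => piece.startsWith("q="));
    const parsed = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
    bestSpecificity = specificity;
    bestQ = Number.isNaN(parsed) ? 1 : parsed;
  }
  return bestQ;
}

// Stand-in for your real page template. Real code must HTML-escape values.
function renderHtml(data: BusinessData): string {
  const items = data.services.map((service) => `<li>${service}</li>`).join("");
  return `<!DOCTYPE html><html><body><h1>${data.name}</h1><p>${data.hours}</p><ul>${items}</ul></body></html>`;
}

function respond(acceptHeader: string | undefined, data: BusinessData): SimpleResponse {
  // Assumption: a missing or empty Accept header means "anything is fine".
  const accept = acceptHeader && acceptHeader.trim() !== "" ? acceptHeader : "*/*";
  const jsonQ = qualityOf(accept, "application/json");
  const htmlQ = qualityOf(accept, "text/html");

  // Step 3 applies to every branch: Content-Type matches the body, and
  // Vary: Accept keeps caches from mixing the two representations.
  if (jsonQ === 0 && htmlQ === 0) {
    return {
      status: 406,
      headers: { "Content-Type": "application/json", "Vary": "Accept" },
      body: JSON.stringify({ error: "not_acceptable", supported: ["application/json", "text/html"] }),
    };
  }

  // Step 2: route to the format the client prefers; ties go to HTML.
  if (jsonQ > htmlQ) {
    return {
      status: 200,
      headers: { "Content-Type": "application/json", "Vary": "Accept" },
      body: JSON.stringify(data),
    };
  }
  return {
    status: 200,
    headers: { "Content-Type": "text/html; charset=utf-8", "Vary": "Accept" },
    body: renderHtml(data),
  };
}
```

Comparing q-values instead of calling includes('application/json') matters for headers such as text/html, application/json;q=0.1, where the client lists JSON but clearly prefers HTML.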
This approach pairs well with structured error handling. If an agent requests JSON from a URL that does not exist, return a JSON 404 response instead of an HTML error page. The combination of content negotiation and structured errors makes your entire site more agent-accessible.
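A structured 404 can follow the same pattern. The sketch below reuses the simple includes-style check from the middleware steps; the notFound name, the SimpleResponse shape, and the error field names are hypothetical, not a standard error format.

```typescript
// Framework-agnostic response shape (hypothetical).
interface SimpleResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Build a 404 in the format the client asked for.
function notFound(acceptHeader: string | undefined, path: string): SimpleResponse {
  // Simplified check, as in the middleware steps: it ignores q-values.
  const wantsJson = (acceptHeader ?? "").toLowerCase().indexOf("application/json") !== -1;

  if (wantsJson) {
    return {
      status: 404,
      headers: { "Content-Type": "application/json", "Vary": "Accept" },
      // Field names are illustrative; pick a shape and keep it consistent.
      body: JSON.stringify({ error: "not_found", message: `No resource at ${path}`, path }),
    };
  }

  // The path is deliberately not echoed into the HTML to avoid injection.
  return {
    status: 404,
    headers: { "Content-Type": "text/html; charset=utf-8", "Vary": "Accept" },
    body: "<!DOCTYPE html><html><body><h1>404 Not Found</h1></body></html>",
  };
}
```

An agent that receives the JSON variant can read the status and the error field directly instead of scraping an error page.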
Framework-specific notes: Next.js API routes usually return JSON already, so no middleware is needed for /api/* paths. For page routes, use middleware.ts to check the Accept header. Express has the res.format() method built in. Django gets content negotiation through Django REST Framework (DRF). Rails has respond_to. The mechanism exists in every major framework; it just needs to be wired up.
What We See in 500+ Scans
Across 500+ business scans, AgentHermes data shows a clear pattern: businesses that score above 60 (Silver tier) almost always handle the Accept header correctly on their API endpoints. Businesses below 40 (Bronze and below) almost never do. The correlation is not causation — better-engineered APIs tend to get content negotiation right alongside everything else — but it is a strong signal of overall API maturity.
API-first companies
Stripe, Resend, Supabase — all return JSON for Accept: application/json. Their APIs are designed for machines first, humans second.
Hybrid companies
Shopify returns JSON on /products.json but HTML on main pages. Partial content negotiation. Better than nothing but inconsistent.
Website-only businesses
Local businesses, restaurants, clinics — always return HTML. No Accept header handling whatsoever. Agents get raw markup.
Frequently Asked Questions
What is content negotiation in HTTP?
Content negotiation is the HTTP mechanism (defined in RFC 7231, since superseded by RFC 9110) where the client tells the server what response formats it can handle using the Accept header, and the server responds with the best available format. For example, a browser sends Accept: text/html and gets a web page. An AI agent sends Accept: application/json and gets structured data. The same URL can serve different representations of the same resource.
How does AgentHermes check for content negotiation?
AgentHermes sends requests with Accept: application/json to key URLs on your site and checks whether the response Content-Type is application/json. This is part of the D6 Data Quality dimension (0.10 weight). Sites that return JSON for JSON requests and HTML for HTML requests score higher than sites that always return HTML regardless of the Accept header.
Does every page need to support content negotiation?
No. Focus on pages that represent structured resources: your homepage (business info), services page, pricing page, product listings, and any API-like URLs. Blog posts and marketing pages can remain HTML-only. The key is that pages AI agents would query for actionable data should respond with JSON when asked.
What about GraphQL or dedicated API endpoints?
If you already have a dedicated API at /api/* that returns JSON, content negotiation on your main pages is a bonus rather than a requirement. But many businesses have no API at all — their website IS their only digital presence. For these businesses, content negotiation on existing pages is the fastest path to structured data without building a separate API.
Will content negotiation break my website for human visitors?
No. Browsers send Accept: text/html, so they will continue getting the HTML version exactly as before. Content negotiation only changes the response for clients that explicitly request a different format. Your website looks and works exactly the same for every human visitor.