How to Test Your MCP Server: Validation, Debugging, and Scoring Impact
You built your MCP server. Now how do you know it actually works? A broken MCP server is worse than no MCP server — agents will try to connect, fail, and mark your business as unreliable. This guide covers five validation methods, the six most common bugs, and how testing translates directly to your Agent Readiness Score.
Why MCP Testing Is Not Optional
A website with a broken contact form is annoying. An MCP server with a broken tool is catastrophic for agent trust. Here is why: when a human hits a broken form, they try again or call you. When an AI agent hits a broken tool, it marks your server as unreliable and deprioritizes you in future queries. There is no second chance — agents have perfect memory and zero patience.
Our scan data from 500+ businesses shows that 40% of deployed MCP servers have at least one broken tool. The most common failure: tools that work during development but break in production due to environment differences, missing auth, or transport configuration issues. Testing is not about quality — it is about agent trust.
5 Methods to Validate Your MCP Server
Test from the outside in. Start with visual inspection (MCP Inspector), then real-world agent testing (Claude Desktop), then protocol-level verification (curl), then automated regression (Jest), and finally score impact (AgentHermes scan).
MCP Inspector — Visual Validation
The MCP Inspector (npx @modelcontextprotocol/inspector) connects to your server and displays all tools, resources, and prompts in a browser UI. You can call tools interactively, see response schemas, and verify descriptions. This is your first test — if Inspector cannot connect or shows missing tools, agents cannot either.
npx @modelcontextprotocol/inspector
Claude Desktop: Real Agent Testing
Add your MCP server to Claude Desktop's configuration file (claude_desktop_config.json). Then ask Claude to use your tools naturally: "Check availability at my business" or "Get a quote for lawn care." Claude will discover your tools, call them, and show you the results. This is the closest simulation to how real agents will interact with your server.
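As a rough illustration of what goes into that file, the sketch below builds a single entry under the mcpServers key and prints it as JSON. The server name, command, script path, and environment variable are hypothetical placeholders for a local stdio server; adjust them to your own setup.

```typescript
// Sketch: build a claude_desktop_config.json entry for a local stdio MCP server.
// The server name, command, path, and env var below are hypothetical placeholders.
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface ClaudeDesktopConfig {
  mcpServers: Record<string, McpServerEntry>;
}

function buildConfig(name: string, entry: McpServerEntry): ClaudeDesktopConfig {
  return { mcpServers: { [name]: entry } };
}

const config = buildConfig("my-business", {
  command: "node",
  args: ["/absolute/path/to/server.js"],
  env: { BACKEND_API_KEY: "replace-me" },
});

// Paste the printed JSON into claude_desktop_config.json, then restart Claude Desktop.
console.log(JSON.stringify(config, null, 2));
```

If the file already lists other servers, merge the new entry into the existing mcpServers object rather than overwriting the file.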
Edit ~/Library/Application Support/Claude/claude_desktop_config.json
curl for JSON-RPC 2.0 Verification
MCP uses JSON-RPC 2.0 over stdio or an HTTP transport (HTTP with SSE in the original specification, Streamable HTTP in newer revisions). For HTTP servers, use curl to send raw JSON-RPC requests. Test the initialize handshake, tools/list, and individual tool calls. This verifies that your server handles the protocol correctly at the lowest level, with no client abstractions hiding bugs.
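The message sequence for such a manual session can be sketched as follows. This is a minimal illustration that only builds and prints the request bodies, so each printed line can be passed to curl with -d. The client name, the tool arguments, and the protocol version string are assumptions; use the version your server actually supports.

```typescript
// Sketch: build the JSON-RPC 2.0 messages for a minimal MCP session.
// The client name, tool arguments, and protocol version are illustrative assumptions.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

let nextId = 1;

function request(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const msg: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method };
  if (params !== undefined) msg.params = params;
  return msg;
}

// 1. Handshake: the client announces its protocol version and capabilities.
const initialize = request("initialize", {
  protocolVersion: "2024-11-05",
  capabilities: {},
  clientInfo: { name: "manual-test", version: "0.0.1" },
});

// 2. Notification (no id, so no response expected) confirming the handshake.
const initialized = { jsonrpc: "2.0", method: "notifications/initialized" };

// 3. Discover the tools, then call one by name.
const listTools = request("tools/list");
const callTool = request("tools/call", {
  name: "get_services",
  arguments: { location: "downtown" },
});

for (const msg of [initialize, initialized, listTools, callTool]) {
  console.log(JSON.stringify(msg));
}
```

Sending tools/list before initialize is a useful negative test as well: a server that enforces the handshake should reject it with a structured JSON-RPC error rather than crash.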
curl -X POST http://localhost:3000/mcp -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
Automated Test Suite with Jest
Write Jest tests that import your MCP server handlers directly. Test each tool with valid inputs, invalid inputs, missing required fields, and edge cases. Assert response schemas match your tool definitions. Run in CI so every commit validates the MCP contract. This catches regressions before agents hit them.
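To make the case list concrete, here is a framework-agnostic sketch: a hypothetical get_quote handler and a table of cases covering valid input, a missing field, a wrong type, and an edge case. The handler, its pricing rule, and the field names are invented for illustration. In a real suite each row would become one Jest test with an expect assertion.

```typescript
// Sketch: a hypothetical tool handler plus the cases a Jest suite would cover.
type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };

function errorResult(message: string): ToolResult {
  return { content: [{ type: "text", text: `Error: ${message}` }], isError: true };
}

// Hypothetical handler: quotes a service by area at a flat rate per square foot.
function getQuote(input: Record<string, unknown>): ToolResult {
  const service = input["service"];
  const area = input["area_sqft"];
  if (typeof service !== "string" || service === "") {
    return errorResult("'service' is required");
  }
  if (typeof area !== "number" || area <= 0) {
    return errorResult("'area_sqft' must be a positive number");
  }
  return { content: [{ type: "text", text: `${service}: $${(area * 0.05).toFixed(2)}` }] };
}

// Each row maps to one Jest test in a real suite.
const cases: { name: string; input: Record<string, unknown>; expectError: boolean }[] = [
  { name: "valid input", input: { service: "lawn care", area_sqft: 2000 }, expectError: false },
  { name: "missing required field", input: { area_sqft: 2000 }, expectError: true },
  { name: "wrong type", input: { service: "lawn care", area_sqft: "big" }, expectError: true },
  { name: "edge case: zero area", input: { service: "lawn care", area_sqft: 0 }, expectError: true },
];

const failures = cases.filter((c) => (getQuote(c.input).isError === true) !== c.expectError);
console.log(failures.length === 0 ? "all cases pass" : failures.map((f) => f.name));
```

Keeping the cases in a table makes it cheap to add one row whenever a production bug turns up, so the regression stays covered.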
npx jest --testPathPattern=mcp
AgentHermes Scan: D2 Scoring Impact
Run an AgentHermes scan on your domain after deploying the MCP server. The scanner detects MCP endpoints, checks tool count and quality, verifies SSE transport, and measures the impact on your D2 (API) dimension score. A working MCP server with 5+ well-described tools typically adds 15-25 points to your overall Agent Readiness Score.
Visit agenthermes.ai/audit and enter your domain
6 Most Common MCP Server Bugs
These are the bugs we see most frequently when scanning MCP servers. Each one silently degrades agent trust. Check your server against this list before deploying.
Wrong method names in tool definitions
Severity: Critical
Symptom
Inspector shows tools but agents get "method not found" errors when calling them
Fix
Tool name in the definition must exactly match the handler function name. Check for typos, underscores vs hyphens, and case sensitivity.
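A quick startup check can catch this class of mismatch before any agent does. The sketch below assumes your server keeps a list of advertised tool names and a map of handlers keyed by name, which is one common layout but not the only one. The tool names are hypothetical.

```typescript
// Sketch: flag advertised tools that have no handler registered under the exact same name.
function findUnhandledTools(definedNames: string[], handlers: Record<string, unknown>): string[] {
  return definedNames.filter(
    (name) => !Object.prototype.hasOwnProperty.call(handlers, name),
  );
}

// Hypothetical example: the definition uses an underscore, the handler key a hyphen.
const definitions = ["get_services", "check_availability"];
const handlers = {
  get_services: () => "services",
  "check-availability": () => "availability",
};

console.log(findUnhandledTools(definitions, handlers));
```

Running this check at server startup, and failing fast when the list is non-empty, turns a silent runtime error into an obvious deploy-time one.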
Missing error handling on tool calls
Severity: High
Symptom
Agent gets a raw stack trace or empty response instead of a structured error
Fix
Wrap every tool handler in try/catch. Return { content: [{ type: "text", text: "Error: descriptive message" }], isError: true } on failure. Never expose internal errors.
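One way to apply this consistently is a small wrapper around every handler, sketched below. The handler signature is an assumption (async, taking an arguments object), and the wording of the generic message is only an example.

```typescript
// Sketch: wrap a tool handler so failures become structured errors, not stack traces.
type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };
type ToolHandler = (args: Record<string, unknown>) => Promise<ToolResult>;

function errorResult(message: string): ToolResult {
  return { content: [{ type: "text", text: `Error: ${message}` }], isError: true };
}

function withErrorHandling(handler: ToolHandler): ToolHandler {
  return async (args) => {
    try {
      return await handler(args);
    } catch (err) {
      // Keep the details in server logs (stderr); the agent only sees a generic message.
      console.error("tool call failed:", err);
      return errorResult("the request could not be completed. Please try again later.");
    }
  };
}

// Hypothetical handler that always fails, to show the wrapped behavior.
const flaky = withErrorHandling(async () => {
  throw new Error("connection refused at 10.0.0.5:5432");
});

flaky({}).then((result) => console.log(JSON.stringify(result)));
```

Expected failures, such as bad input, are better returned from the handler itself with a specific message. The wrapper is the safety net for everything unexpected.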
Auth not forwarded from MCP to backend
Severity: High
Symptom
Tools work in local testing but return 401 in production. Inspector works but Claude Desktop fails.
Fix
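If your MCP tools call authenticated backend APIs, every hop needs credentials: the MCP client authenticates to your server, and your server authenticates to the backend. For the server-to-backend hop, use service credentials loaded from environment variables rather than user tokens, and confirm those variables are actually set in production.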
SSE transport not sending keep-alive
Severity: Medium
Symptom
Connection drops after 30-60 seconds of inactivity. Tools work for first call then fail.
Fix
Send SSE comments (": keep-alive\n\n") every 15-30 seconds. Most reverse proxies (nginx, Cloudflare) time out idle SSE connections. Set the proxy's idle timeout to at least 300 seconds.
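A minimal sketch of the keep-alive timer follows. It accepts anything with a write method, which Node's http.ServerResponse provides. The 20-second default is an assumption chosen from the middle of the range above.

```typescript
// Sketch: periodically write an SSE comment so proxies see traffic on an idle stream.
// Minimal shape of a writable response; Node's http.ServerResponse fits it.
interface SseWritable {
  write(chunk: string): unknown;
}

// A line starting with ":" is an SSE comment; clients ignore it.
const KEEP_ALIVE_FRAME = ": keep-alive\n\n";

// Returns a stop function; call it when the connection closes to avoid leaking timers.
function startKeepAlive(res: SseWritable, intervalMs: number = 20_000): () => void {
  const timer = setInterval(() => {
    res.write(KEEP_ALIVE_FRAME);
  }, intervalMs);
  return () => clearInterval(timer);
}
```

In a Node HTTP handler this would typically be wired up as const stop = startKeepAlive(res) followed by req.on("close", stop), so the timer ends with the connection.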
Tool input schemas missing required fields
Severity: High
Symptom
Agent sends partial data and tool returns garbage instead of a validation error
Fix
Define inputSchema with JSON Schema "required" array for every mandatory field. Validate inputs before processing. Return clear error messages listing which fields are missing.
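The required-field check can be sketched without any library, as below. The schema is a hypothetical one for a check_availability tool, and a production server would more likely run a full JSON Schema validator. This only illustrates the "list what is missing" behavior.

```typescript
// Sketch: report missing required fields before running the tool logic.
interface InputSchema {
  type: "object";
  properties: Record<string, { type: string; description?: string }>;
  required?: string[];
}

function missingRequired(schema: InputSchema, input: Record<string, unknown>): string[] {
  return (schema.required ?? []).filter(
    (field) => input[field] === undefined || input[field] === null,
  );
}

function validationMessage(missing: string[]): string {
  return `Error: missing required field(s): ${missing.join(", ")}`;
}

// Hypothetical schema for a check_availability tool.
const availabilitySchema: InputSchema = {
  type: "object",
  properties: {
    service: { type: "string", description: "Service to book, e.g. lawn care" },
    date: { type: "string", description: "Requested date in YYYY-MM-DD format" },
    notes: { type: "string" },
  },
  required: ["service", "date"],
};

const missing = missingRequired(availabilitySchema, { service: "lawn care" });
if (missing.length > 0) console.log(validationMessage(missing));
```

Naming the missing fields in the error text gives an agent enough information to retry with a corrected call.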
Tool descriptions too vague for agent discovery
Severity: Medium
Symptom
Agent has access to your tools but never calls them because it does not understand when to use them
Fix
Tool descriptions must answer: What does this tool do? When should an agent use it? What will it return? Include example use cases. Bad: "Get info." Good: "Returns business hours, address, phone number, and service area for the specified business location."
Debugging Techniques
When a tool fails and the error is not obvious, use these techniques to isolate the problem. The goal is always the same: see the exact JSON-RPC message the client sends and the exact response your server returns.
Enable verbose logging
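Turn on your framework's verbose or debug logging (for example, DEBUG=mcp:* if it uses namespaced debug output). Log every incoming JSON-RPC message and outgoing response. This shows exactly what the client sends and what your server returns, which is indispensable for protocol-level debugging. For stdio servers, write logs to stderr, since stdout carries the protocol messages.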
Check SSE transport headers
For HTTP/SSE servers, verify Content-Type is "text/event-stream", Cache-Control is "no-cache", and Connection is "keep-alive". Missing headers cause silent failures in some MCP clients.
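The header check is easy to automate. The sketch below takes a plain object of response headers, for example copied from curl -i output, and lists whatever deviates from the three values above. The function name and message wording are illustrative.

```typescript
// Sketch: report which of the three SSE response headers are missing or wrong.
function checkSseHeaders(headers: Record<string, string>): string[] {
  // HTTP header names are case-insensitive, so normalize before comparing.
  const actual = new Map<string, string>();
  for (const [name, value] of Object.entries(headers)) {
    actual.set(name.toLowerCase(), value.toLowerCase());
  }

  const problems: string[] = [];
  // Content-Type may carry a charset suffix, so only the media type prefix is compared.
  if (!(actual.get("content-type") ?? "").startsWith("text/event-stream")) {
    problems.push("Content-Type should be text/event-stream");
  }
  if (!(actual.get("cache-control") ?? "").includes("no-cache")) {
    problems.push("Cache-Control should include no-cache");
  }
  // Connection is an HTTP/1.1 header; HTTP/2 responses will not carry it.
  if (actual.get("connection") !== "keep-alive") {
    problems.push("Connection should be keep-alive");
  }
  return problems;
}

console.log(checkSseHeaders({ "Content-Type": "application/json", "Cache-Control": "no-cache" }));
```

Run the check against the production URL as well as localhost, because a proxy in front of the server can rewrite or drop headers.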
Validate tool schemas with ajv
Install ajv (JSON Schema validator) and validate your tool inputSchema definitions against the JSON Schema draft-07 spec. Invalid schemas silently break agent input validation.
Test with multiple MCP clients
If your server works in Inspector but not Claude Desktop, the bug is in transport or auth handling — not tool logic. Test with at least two different clients to isolate client-specific issues.
Monitor with structured logging
Log tool calls as JSON objects: { tool: "get_services", input: {...}, duration_ms: 142, success: true }. Aggregate these to find slow tools, high error rates, and unused tools that agents ignore.
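A wrapper along these lines produces that log shape for every call, whether it succeeds or throws. It is a sketch under assumptions: the function name and signature are invented, and the default sink writes to stderr so that a stdio server's protocol stream on stdout stays clean.

```typescript
// Sketch: time a tool call and emit one JSON log line, even when the call throws.
interface ToolCallLog {
  tool: string;
  input: unknown;
  duration_ms: number;
  success: boolean;
}

async function withLogging<T>(
  tool: string,
  input: unknown,
  run: () => Promise<T>,
  // Default to stderr: on stdio transports, stdout is reserved for protocol messages.
  sink: (line: string) => void = console.error,
): Promise<T> {
  const started = Date.now();
  let success = false;
  try {
    const result = await run();
    success = true;
    return result;
  } finally {
    const entry: ToolCallLog = { tool, input, duration_ms: Date.now() - started, success };
    sink(JSON.stringify(entry));
  }
}

// Hypothetical usage with a stand-in for the real tool logic.
withLogging("get_services", { location: "downtown" }, async () => ["mowing", "edging"]);
```

Because each line is a single JSON object, the logs can be filtered by tool name or aggregated by duration_ms with ordinary log tooling.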
The most important debugging insight: if your server works in MCP Inspector but fails with a real agent, the bug is almost always in transport or auth — not in your tool logic. Inspector often runs locally via stdio, while production agents connect via HTTP/SSE through proxies and load balancers that can interfere with the connection. Always test the full production path, not just local.
How Testing Impacts Your Agent Readiness Score
A working MCP server directly impacts three of the nine scoring dimensions. Here is the breakdown from our scoring methodology:
D2: API Quality
15% weight
MCP tools count as structured API endpoints. 5+ working tools with proper schemas = 70+ on this dimension.
D8: Reliability
13% weight
Consistent responses, proper error handling, and uptime. Broken tools that return 500s actively hurt this score.
D9: Agent Experience
10% weight
MCP is the gold standard for agent experience. Having an MCP server at all puts you in the top 1% of businesses.
The scoring math: A business with no MCP server scores 0 on D9 (Agent Experience). Adding a working MCP server with 5 tools raises D9 to 60-80. With D9 weighted at 10%, that is a direct 6-8 point boost to your total score. Combined with D2 and D8 improvements, expect a 15-25 point total increase from a properly tested MCP server. Run a free scan at agenthermes.ai/audit to see your before and after.
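The arithmetic can be written out as a short worked example. It assumes the total is a simple weighted sum of dimension scores, which is how the figures above read. The weights and scores come from this article, not from the scanner's code.

```typescript
// Sketch: points a dimension contributes to the total, assuming a plain weighted sum.
// Weights are whole percentages to keep the arithmetic exact.
function contribution(weightPct: number, dimensionScore: number): number {
  return (weightPct * dimensionScore) / 100;
}

// D9 (Agent Experience, 10% weight) moving from 0 to the 60-80 range.
const d9Low = contribution(10, 60);  // 6 points
const d9High = contribution(10, 80); // 8 points

// D2 (API Quality, 15% weight) at a score of 70.
const d2At70 = contribution(15, 70); // 10.5 points

console.log({ d9Low, d9High, d2At70 });
```

The D2 figure is that dimension's total contribution at a score of 70, so the actual gain depends on where D2 stood before the MCP server was added.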
Frequently Asked Questions
How do I test an MCP server that uses stdio transport?
For stdio-based MCP servers, the MCP Inspector is your primary testing tool — it handles stdio communication automatically. For automated testing, import your server module directly in Jest and call handler functions. You cannot use curl with stdio servers since they communicate via stdin/stdout, not HTTP. If you need HTTP-based testing, consider adding SSE transport as an alternative — most production deployments benefit from having both.
How many tools should my MCP server expose?
Quality matters more than quantity, but 5-8 tools is the sweet spot for most businesses. Too few (1-2) means agents cannot do much. Too many (20+) means agents struggle to pick the right tool. Our scan data shows the highest-scoring MCP servers have 5-10 well-described tools with clear use cases. Each tool should do one thing well with typed inputs and outputs.
Will testing my MCP server improve my Agent Readiness Score?
Testing itself does not directly change your score, but fixing the bugs you find does. A business whose MCP server returns errors will score lower than a business with no MCP server at all (the scanner detects failed endpoints). The D2 (API) dimension rewards working, well-documented endpoints. The D8 (Reliability) dimension rewards consistent uptime and proper error responses. Testing ensures both dimensions score well.
How often should I re-test my MCP server?
Run automated Jest tests on every commit in CI. Run an AgentHermes scan monthly or after any significant change to tools, schemas, or transport. Manual testing with MCP Inspector is most valuable when adding new tools or changing existing ones. The most common failure pattern is a code change that breaks an existing tool without anyone noticing — agents silently stop using it.
Test your MCP server with a free scan
See how your MCP server impacts your Agent Readiness Score across all 9 dimensions. The scanner detects MCP endpoints automatically and measures tool quality.