Nathaniel's blog

Six months with MCP: the good, the bad, the weird

Nathaniel Lin · August 10, 2025 · 8 min read

I've been building with Model Context Protocol since around early 2025, first integrating tools for an AI agent security platform, then building a multi-MCP gateway. Here's where I actually stand after six months.

What MCP gets right

The interface contract is clean. A tool is a name, a JSON Schema input definition, and a handler. That's it. The simplicity makes it straightforward to define new tools and understand what's available.
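
As a sketch of that contract (plain TypeScript shapes for illustration, not the official SDK's types), the whole thing fits in a few lines:

```typescript
// A minimal rendering of the MCP tool contract: a name, a JSON Schema
// describing the input, and a handler. "lookupAlert" is a made-up example.
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: object; // JSON Schema for the arguments
  handler: (args: Record<string, unknown>) => Promise<string>;
}

const lookupAlert: ToolDefinition = {
  name: "lookup_alert",
  description: "Fetch a security alert by ID",
  inputSchema: {
    type: "object",
    properties: { alertId: { type: "string" } },
    required: ["alertId"],
  },
  handler: async (args) => `alert ${args.alertId}: resolved`,
};
```

Everything a client needs to render, validate, and invoke the tool is in that one object, which is why discovery is so easy to build on top of it.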

The fact that it's a protocol rather than a library matters a lot for multi-vendor environments. We could write tools in Node.js, expose them over MCP, and have any MCP-compatible client consume them without caring about the implementation language.

Defining investigation tools as MCP servers meant the agent could be swapped out (different LLM, different orchestration) without touching the tool implementations. That separation of concerns is genuinely valuable.

The reliability problem at scale

Single MCP server? Fine. Multiple MCP servers with dependencies between them? Now you have distributed systems problems.

We learned this the hard way when building our multi-MCP gateway. If one MCP server in a multi-server configuration goes down during a deployment, a naive client loses its whole tool catalog, not just that server's tools. We spent significant time on fallback strategies: keeping connections to older healthy instances alive while new versions roll out.

class MCPConnectionPool {
  // per-server pools, kept ordered newest instance first
  private connections: Map<string, MCPConnection[]> = new Map();

  async getHealthyConnection(serverId: string): Promise<MCPConnection | null> {
    const pool = this.connections.get(serverId) ?? [];
    // pool is newest-first, so this prefers the newest healthy instance
    // and falls back to older ones while a new rollout is failing
    return pool.find(c => c.isHealthy()) ?? null;
  }
}

Availability-first tool discovery — routing to any connectable instance rather than requiring the latest version — was the key insight. The tool API being stable helps here; old and new instances serve the same tool surface during rollouts.
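
A sketch of what availability-first routing means in practice (the names here are illustrative, not our gateway's actual API): prefer the newest reachable instance, but accept any reachable one rather than failing when the latest version is down.

```typescript
// Hypothetical instance record tracked by a gateway during rollouts.
interface Instance {
  version: number;
  reachable: boolean;
}

function pickInstance(instances: Instance[]): Instance | null {
  const reachable = instances.filter(i => i.reachable);
  if (reachable.length === 0) return null;
  // Newest first; an older instance is acceptable because old and new
  // versions serve the same stable tool surface during a rollout.
  reachable.sort((a, b) => b.version - a.version);
  return reachable[0];
}
```

The key property is the fallback: a deployment that briefly kills version N+1 routes traffic to version N instead of returning an empty tool catalog.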

OAuth is painful

Tool servers that need third-party auth are where the real friction lives. OAuth token management across multiple MCP servers with different authorization scopes, TTLs, and refresh flows is genuinely complex.

We built a dedicated credentials service that handles token storage, scope management, and refresh. All MCP servers that need third-party auth delegate to it rather than handling OAuth themselves.

// Instead of each MCP server managing its own tokens
const token = await credentialsService.getToken({
  userId,
  provider: 'github',
  scopes: ['repo:read'],
});

Centralizing credentials also makes auditing and revoking access easier, which is non-trivial when tokens are scattered across services.
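
The refresh logic is the fiddly part of such a service. A minimal sketch of refresh-on-expiry caching, where the shapes and the refresh callback are assumptions for illustration, not our actual service's API:

```typescript
interface CachedToken {
  value: string;
  expiresAt: number; // epoch milliseconds
}

class TokenCache {
  private cache = new Map<string, CachedToken>();

  constructor(
    // Caller-supplied refresh callback (hypothetical signature); in a real
    // service this would run the provider's OAuth refresh flow.
    private refresh: (key: string) => Promise<CachedToken>,
    private skewMs = 30_000, // refresh slightly before actual expiry
  ) {}

  async get(key: string): Promise<string> {
    const hit = this.cache.get(key);
    if (hit && hit.expiresAt - this.skewMs > Date.now()) return hit.value;
    const fresh = await this.refresh(key);
    this.cache.set(key, fresh);
    return fresh.value;
  }
}
```

The expiry skew matters: refreshing a little early avoids handing a server a token that expires mid-request.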

The tooling ecosystem is still early

The client implementations vary significantly in quality. Some reliably handle streaming, tool call retries, and connection recovery. Others are buggy in ways that are difficult to debug because the protocol itself is new enough that there aren't many reference implementations.

I've spent more time than I'd like debugging edge cases at the transport layer. Health checks, timeout controls under high concurrency, connection teardown — these are well-understood problems in HTTP/gRPC land that MCP's ecosystem is still working through.
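The kind of control I kept reaching for is a per-call timeout guard around tool invocations. A generic sketch, not tied to any MCP SDK:

```typescript
// Wrap any promise with a deadline; the label is illustrative and just
// makes the resulting error attributable to a specific tool call.
function withTimeout<T>(p: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
    p.then(
      v => { clearTimeout(timer); resolve(v); },
      e => { clearTimeout(timer); reject(e); },
    );
  });
}
```

Under high concurrency the important detail is clearing the timer on both settle paths, so slow-but-successful calls don't leak timers.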

That said: the pace of improvement is fast. Six months in, things are meaningfully better than when I started.

Where I think it's going

Multi-agent architectures where agents discover and call each other's capabilities through MCP are genuinely compelling. We started experimenting with this in our security testing platform: agents coordinating through structured MCP tool calls rather than free-text LLM-to-LLM communication.

The structure helps. When an agent invokes a sub-agent via MCP, the inputs and outputs are typed and validated. Compare that to passing a text description and asking for text back. The structured approach makes the whole system much easier to debug and improve.
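
Concretely, the win is that a sub-agent's output can be checked against a declared shape before anything acts on it. A hypothetical sketch (the `TriageResult` type is invented for illustration):

```typescript
// What a structured sub-agent result might look like, versus free text.
interface TriageResult {
  severity: "low" | "medium" | "high";
  summary: string;
}

// Validate an untyped response at the boundary; a malformed result fails
// loudly here instead of silently corrupting downstream reasoning.
function parseTriageResult(raw: unknown): TriageResult {
  const r = raw as Partial<TriageResult>;
  if (
    (r.severity !== "low" && r.severity !== "medium" && r.severity !== "high") ||
    typeof r.summary !== "string"
  ) {
    throw new Error("sub-agent returned an invalid TriageResult");
  }
  return { severity: r.severity, summary: r.summary };
}
```

With free-text hand-offs, the equivalent failure shows up three steps later as a confused agent; with a schema boundary, it shows up immediately with a stack trace.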

Is it worth it?

For multi-agent systems and environments where tools need to be modular and swappable: yes, absolutely. The protocol overhead is worth the composability.

For simple single-agent use cases where you control the tool implementations directly: probably not yet. Just write tool handlers directly in your language of choice.
