What Is an MCP Server? Explained With a Real Example [2026]

An MCP server is a program that exposes tools and data to an AI assistant through a standard protocol. Instead of every assistant needing a custom integration for every service, both sides implement the Model Context Protocol once, and anything that speaks MCP can talk to anything else that does.

Most explanations of MCP stay abstract — boxes, arrows, and a USB-C metaphor. This one doesn't. We build and run a production MCP server at Hooklistener (46 tools, used daily from Claude Code and Cursor), so every example below is taken from a real server you can connect to in about two minutes. If you'd rather skip the theory, jump straight to the Hooklistener MCP page for setup instructions.

The Plain-Language Definition

MCP — the Model Context Protocol — is an open protocol introduced by Anthropic in November 2024 that standardizes how AI assistants connect to external tools and data sources. An MCP server is the piece that sits on the service side: it advertises a set of capabilities (tools the model can call, resources it can read, prompts it can use) and responds to requests from an AI client using JSON-RPC 2.0 messages.

The point of the protocol is to kill the integration matrix. Before MCP, connecting N assistants to M services meant N × M bespoke integrations: a Claude plugin for your database, a separate Cursor extension for the same database, and so on. With MCP it becomes N + M: each assistant implements the client side once, each service implements the server side once, and every pairing works.
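To make the arithmetic concrete, here is a small sketch. The counts (4 assistants, 25 services) are made-up numbers chosen only for illustration.

```python
def bespoke_integrations(assistants: int, services: int) -> int:
    """One custom integration per (assistant, service) pair."""
    return assistants * services


def mcp_implementations(assistants: int, services: int) -> int:
    """Each assistant implements the client once; each service implements the server once."""
    return assistants + services


# Hypothetical counts, for illustration only.
n_assistants, n_services = 4, 25
print(bespoke_integrations(n_assistants, n_services))  # 100
print(mcp_implementations(n_assistants, n_services))   # 29
```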

About that “USB-C for AI” analogy

The most common way MCP gets explained is “USB-C for AI tools.” It's not wrong — one connector, many devices — but it undersells what's actually happening, because USB-C is a passive physical standard and MCP is a two-way conversation.

A better analogy for developers: MCP is to AI assistants what the Language Server Protocol is to code editors. Before LSP, every editor needed a custom plugin for every language. LSP defined a standard way for an editor to ask “what's at this position?” and for a language server to answer, and suddenly one Rust language server worked in every editor. MCP does the same thing one level up: the assistant asks “what tools do you have?” and “run this tool with these arguments,” and any conforming server can answer. Same shape, different layer.

The Architecture: Host, Client, Server

MCP defines three roles. The names trip people up because two of them live in the same process, so here's the short version:

Host

The application the user actually interacts with — Claude Code, Cursor, or any other AI-powered tool. The host decides which servers to connect to, manages permissions, and feeds tool results back into the model's context.

Client

The protocol-speaking component inside the host. Each client holds a one-to-one connection with a single server: it performs the initialization handshake, asks the server what it can do, and relays tool calls and results. As a user you never touch the client directly — the host runs one per configured server.

Server

The program exposing capabilities. It can be a 50-line script wrapping a local SQLite file or a full production service. Hooklistener's server, for example, identifies itself as hooklistener version 1.0.0 and exposes 46 tools across 9 categories: debug endpoints, captured requests, automations, schedules, secrets, a datastore, uptime monitors, email inboxes, and AI analysis.
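The handshake the client performs can be sketched as three JSON-RPC messages. This is a minimal sketch of the message shapes as defined by the MCP specification, not any particular client's code. The protocol version string and the clientInfo values are assumptions for illustration; a real client negotiates the version with the server.

```python
def initialize_request(request_id: int) -> dict:
    """First message on a new connection: the client introduces itself."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-06-18",  # assumed revision; negotiated in practice
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
        },
    }


def initialized_notification() -> dict:
    """Sent after the server answers initialize. Notifications carry no "id"."""
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}


def tools_list_request(request_id: int) -> dict:
    """Asks the server which tools it offers."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
```

The host runs this sequence once per configured server, which is why the tool definitions are already in context by the time you type your first prompt.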

Tools, Resources, and Prompts

A server can advertise three kinds of capabilities, and it declares which ones it supports during the handshake:

Tools

Functions the model can call: a name, a description, and a JSON Schema for the inputs. The model decides when to invoke them. This is where almost all the action is.

Resources

Readable data identified by URIs — files, database rows, documents. The host (or user) chooses what to load into context, rather than the model calling for it.

Prompts

Reusable prompt templates the server offers to users, typically surfaced as slash commands or menu items in the host.

Honest note: plenty of real servers ship tools and nothing else. Hooklistener's server declares capabilities: tools only — no resources, no prompts — because for a webhook debugging workflow, callable tools cover everything the assistant needs. If you're evaluating MCP, judge servers by their tools first.
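A client can check what a server declared by looking at the capabilities object in the initialize result. The result below is an illustrative reconstruction: the server name, version, and tools-only declaration come from this article, while the protocol version and the empty tools object are assumptions.

```python
# Illustrative initialize result; only the name, version, and "tools only"
# declaration are taken from the article. The rest is assumed.
EXAMPLE_INITIALIZE_RESULT = {
    "protocolVersion": "2025-06-18",
    "capabilities": {"tools": {}},
    "serverInfo": {"name": "hooklistener", "version": "1.0.0"},
}

KNOWN_KINDS = ("tools", "resources", "prompts")


def declared_kinds(initialize_result: dict) -> list[str]:
    """Return which of the three capability kinds the server advertised."""
    capabilities = initialize_result.get("capabilities", {})
    return [kind for kind in KNOWN_KINDS if kind in capabilities]


print(declared_kinds(EXAMPLE_INITIALIZE_RESULT))  # ['tools']
```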

Transports: stdio vs Streamable HTTP

The protocol messages are identical either way — what changes is how the bytes move:

stdio (local servers)

The host spawns the server as a child process and exchanges JSON-RPC messages over stdin/stdout. Zero network setup, no auth handshake — the server runs as you, on your machine. This is how filesystem, git, and database servers typically run.
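As a rough sketch of the stdio transport, the snippet below spawns a child process and exchanges one newline-delimited JSON-RPC message with it. TOY_SERVER is a hypothetical stand-in that answers every request with an empty tool list and skips the initialization handshake a real server would require.

```python
import json
import subprocess
import sys

# Hypothetical stand-in server: one JSON-RPC request per line in, one response per line out.
TOY_SERVER = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    response = {"jsonrpc": "2.0", "id": request["id"], "result": {"tools": []}}
    sys.stdout.write(json.dumps(response) + "\n")
    sys.stdout.flush()
'''


def call_over_stdio(request: dict) -> dict:
    """Spawn the toy server, send one request on stdin, read one response from stdout."""
    proc = subprocess.Popen(
        [sys.executable, "-c", TOY_SERVER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        proc.stdin.write(json.dumps(request) + "\n")
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        proc.stdin.close()   # EOF ends the toy server's read loop
        proc.wait(timeout=5)
        proc.stdout.close()
```

A real host keeps the child process alive for the whole session instead of spawning one per call; the one-shot form here just keeps the example short.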

Streamable HTTP (remote servers)

The server lives at a URL. Clients POST JSON-RPC messages to it, optionally open a GET connection with Accept: text/event-stream for server-sent events, and close the session with DELETE. Hooklistener's server runs this way at https://app.hooklistener.com/api/mcp, tracking each session via an mcp-session-id header and sending SSE keepalives every 15 seconds.

What an MCP Tool Actually Looks Like

Tutorials love to show a get_weather toy example. Here's a real one instead: wait_for_request, from Hooklistener's server. It blocks until a webhook request arrives on one of your debug endpoints, or until a timeout expires. When a client asks the server what it can do (via tools/list), this is roughly the definition it gets back:

{
  "name": "wait_for_request",
  "description": "Block until a webhook request arrives on a debug endpoint, or until timeout.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "endpoint_id": {
        "type": "string",
        "description": "ID of the debug endpoint to watch"
      },
      "timeout": {
        "type": "integer",
        "description": "Max seconds to wait (default 30, max 60). Set to 0 to only check for existing requests."
      }
    },
    "required": ["endpoint_id"]
  }
}

Notice that the schema does real work. The model reads the description and parameter docs to decide when and how to call the tool — there is no other documentation channel. The server, meanwhile, defends itself: the timeout is clamped to the 0–60 second range and defaults to 30 if the model sends something invalid. Good MCP tools are designed like public APIs, because that's exactly what they are.
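The defensive handling of timeout can be sketched in a few lines. This is one plausible reading of the behavior described above (out-of-range integers are clamped, anything that is not an integer falls back to the default), not Hooklistener's actual implementation. The function and constant names are hypothetical.

```python
DEFAULT_TIMEOUT = 30  # seconds, per the tool description
MAX_TIMEOUT = 60      # seconds, per the tool description


def normalize_timeout(value: object) -> int:
    """Clamp integer timeouts to 0..60; fall back to 30 for anything else."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return DEFAULT_TIMEOUT
    return max(0, min(MAX_TIMEOUT, value))


print(normalize_timeout(90))      # 60
print(normalize_timeout(-5))      # 0
print(normalize_timeout("soon"))  # 30
```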

When the model decides to use it, the client sends a JSON-RPC tools/call request. On a remote server this is an HTTP POST:

POST https://app.hooklistener.com/api/mcp
Authorization: Bearer <access-token>
Accept: application/json, text/event-stream
Content-Type: application/json

{
  "jsonrpc": "2.0",
  "id": 12,
  "method": "tools/call",
  "params": {
    "name": "wait_for_request",
    "arguments": {
      "endpoint_id": "0b6e3c9a-...",
      "timeout": 60
    }
  }
}

The server verifies the endpoint exists, then subscribes to an internal pub/sub topic for that endpoint and blocks until a request is captured (the wait runs in a supervised task so a slow caller can't wedge the server). When a webhook lands, the response comes back as a JSON-RPC result whose content is the captured request:

{
  "jsonrpc": "2.0",
  "id": 12,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "{ \"method\": \"POST\", \"path\": \"/\", \"headers\": { ... }, \"body\": { \"event\": \"payment.succeeded\", ... } }"
      }
    ],
    "isError": false
  }
}

If nothing arrives in time, the tool doesn't error — it returns a structured timeout the model can reason about: {"status": "timeout", "message": "No request received within 60 seconds."}. That distinction matters in practice: an assistant that gets a clean timeout can decide to retry, check configuration, or tell you the provider never fired — instead of choking on an exception.
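On the consuming side, telling the three outcomes apart might look like the sketch below. It assumes the timeout object arrives the same way the captured request does, as JSON inside the first text content item; that detail is an assumption, and the function name is hypothetical.

```python
import json


def interpret_wait_result(response: dict) -> tuple[str, object]:
    """Classify a wait_for_request response as 'error', 'timeout', or 'captured'."""
    result = response["result"]
    text = result["content"][0]["text"]
    if result.get("isError"):
        # Tool-level failure: the text is a message meant for the model.
        return ("error", text)
    payload = json.loads(text)
    if payload.get("status") == "timeout":
        # A clean timeout is a normal result, not an exception.
        return ("timeout", payload.get("message", ""))
    return ("captured", payload)


timeout_response = {
    "jsonrpc": "2.0",
    "id": 12,
    "result": {
        "content": [{"type": "text", "text": json.dumps(
            {"status": "timeout", "message": "No request received within 60 seconds."}
        )}],
        "isError": False,
    },
}
print(interpret_wait_result(timeout_response)[0])  # timeout
```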

Local vs Remote Servers — and How Auth Works

Local stdio servers mostly skip authentication: the process runs under your OS user, and any credentials it needs (a database password, an API key) arrive through environment variables in its config. Remote servers are a different story. They're multi-tenant services on the public internet, so they need to know who you are — and the MCP ecosystem has converged on OAuth 2.0 for that. Here's the actual flow Hooklistener's server implements, which is worth walking through because it's the same dance you'll see on most serious remote servers:

1. The 401 that starts everything

Your client hits /api/mcp with no credentials and gets a 401 with a WWW-Authenticate header pointing at /.well-known/oauth-protected-resource. That's not a failure — it's the server telling the client where to learn how to authenticate.

2. Discovery

The client fetches two standard metadata documents — the protected-resource metadata (RFC 9728) and the authorization-server metadata (RFC 8414) — and now knows the authorization, token, and registration endpoints without any hardcoding.

3. Dynamic client registration

There's no “create an OAuth app” dashboard step. The client registers itself programmatically via POST /oauth/register and gets a client ID on the spot. This is what makes connecting a brand-new MCP client feel like zero setup.

4. Browser sign-in with PKCE

The client opens your browser to the authorization page, you sign in, and an authorization code (valid for 10 minutes) comes back. PKCE with S256 is mandatory — the server rejects any exchange missing the code verifier — so a stolen code is useless to an attacker.

5. Tokens

The code is exchanged at POST /oauth/token for an access token (1-hour lifetime) and a refresh token (30 days). The client silently refreshes as needed; the only grants supported are authorization_code and refresh_token.

What about API keys?

Hooklistener's server still accepts a legacy hklst_ API key as a bearer token, but it's deprecated: every response on that path carries a note asking you to reconnect with OAuth. The reasoning generalizes to any remote server: long-lived static keys pasted into editor config files are exactly the kind of credential that leaks, while OAuth gives you short-lived tokens, scoped access, and revocation. If a remote MCP server only offers a static key, treat that as a yellow flag.

End to End: “Create a Webhook Endpoint and Wait for the Next Request”

Here's what actually happens, step by step, when you type that sentence into Claude Code with the Hooklistener server connected. This is the part most MCP explainers skip, and it's where the protocol stops being abstract.

1. The handshake already happened

When the session started, the client initialized the connection and called tools/list. The model now has all 46 tool definitions — names, descriptions, schemas — available as context. It knows a tool called create_endpoint exists without you mentioning it.

2. The model calls create_endpoint

It emits a tools/call for create_endpoint with a name it invents from your request. The server creates the endpoint and returns its details, including the public webhook URL. (If your plan's endpoint quota is full, the tool returns a plain-language error — “Debug endpoint limit reached for your plan” — which the model relays instead of failing cryptically.)

3. It tells you the URL, then calls wait_for_request

The assistant surfaces the URL so you can point a provider (or a curl command) at it, then issues wait_for_request with the new endpoint's ID. On the server, that call subscribes to the endpoint's pub/sub topic and blocks — up to 60 seconds — while your terminal just shows the assistant waiting.

4. The webhook arrives and flows back as a tool result

You trigger the webhook. The capture broadcasts to the waiting task, the tool call returns with the full request — method, path, headers, body — and that JSON lands directly in the model's context. No copy-pasting payloads out of a dashboard.

5. The model reasons over the payload

This is the payoff. The assistant can now compare the actual payload against the handler code in your repo, spot the field your parser misses, and fix it — in one loop, without you leaving the editor. Deeper analysis tools like diagnose_request and compare_requests follow the same call shape when you need them. We cover that workflow in depth in agentic webhook testing.

Worth internalizing: the model never saw an HTTP client, a database, or our infrastructure. It saw tool names, schemas, and JSON results. Everything else — auth, pub/sub, timeouts, quota enforcement — lives behind the protocol boundary, which is exactly where it belongs.
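The five steps above can be compressed into a toy simulation. FakeServer is an in-memory stand-in written for this example, not Hooklistener's implementation. The result fields (id, url), the example.invalid URL, and the endpoint name are all hypothetical; only the tool names and the wait_for_request arguments come from this article.

```python
import itertools
import json


class FakeServer:
    """In-memory stand-in for a remote MCP server (illustration only)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.pending: dict[str, list[dict]] = {}  # endpoint id -> captured requests

    def handle(self, message: dict) -> dict:
        name = message["params"]["name"]
        args = message["params"]["arguments"]
        if name == "create_endpoint":
            endpoint_id = f"ep-{next(self._ids)}"
            self.pending[endpoint_id] = []
            payload = {"id": endpoint_id, "name": args["name"],
                       "url": f"https://example.invalid/{endpoint_id}"}
        elif name == "wait_for_request":
            queue = self.pending[args["endpoint_id"]]
            # A real server blocks here; the fake just checks what is queued.
            payload = queue.pop(0) if queue else {"status": "timeout"}
        else:
            raise ValueError(f"unknown tool: {name}")
        return {"jsonrpc": "2.0", "id": message["id"],
                "result": {"content": [{"type": "text", "text": json.dumps(payload)}],
                           "isError": False}}


def call_tool(server: FakeServer, request_id: int, name: str, arguments: dict) -> dict:
    """What the client does: wrap the call in tools/call and unwrap the text result."""
    response = server.handle({"jsonrpc": "2.0", "id": request_id, "method": "tools/call",
                              "params": {"name": name, "arguments": arguments}})
    return json.loads(response["result"]["content"][0]["text"])


def run_flow(server: FakeServer) -> tuple[dict, dict]:
    endpoint = call_tool(server, 1, "create_endpoint", {"name": "stripe-test"})
    # Stand-in for the user triggering a webhook at endpoint["url"].
    server.pending[endpoint["id"]].append(
        {"method": "POST", "path": "/", "body": {"event": "payment.succeeded"}})
    captured = call_tool(server, 2, "wait_for_request",
                         {"endpoint_id": endpoint["id"], "timeout": 60})
    return endpoint, captured
```

Everything the model-facing code touches is a tool name, an arguments object, and a JSON result, which is the protocol boundary the paragraph above describes.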

Try a Real MCP Server in 2 Minutes

Reading about protocols only gets you so far. The fastest way to make MCP concrete is to connect to a live server and watch your assistant use it. With Claude Code:

claude mcp add --transport http hooklistener https://app.hooklistener.com/api/mcp

On first use, Claude Code walks the OAuth flow described above: your browser opens, you sign in, and the connection is live. No keys to copy. A free Hooklistener account is enough — the core endpoint and request tools work on the free plan. For clients that read a config file instead, the equivalent .mcp.json entry is:

{
  "mcpServers": {
    "hooklistener": {
      "type": "streamable-http",
      "url": "https://app.hooklistener.com/api/mcp"
    }
  }
}

Then ask your assistant something like “create a webhook endpoint and wait for the next request,” send a test webhook at the URL it gives you, and watch the loop close. The full tool list and per-editor setup live on the MCP page, and the AI coding assistants guide covers Cursor and other editors.

When an MCP Server Is the Wrong Tool

We ship an MCP server, so take this section as informed rather than dismissive: MCP is not the answer to everything, and bolting it on where it doesn't fit wastes effort.

One-off scripts and deterministic pipelines

If you know exactly which API calls need to happen and in what order (a cron job, a CI step, a migration script), call the REST API directly. MCP's value is letting a model decide which tools to use. When there's no decision to make, the protocol is pure overhead.

Latency-critical paths

Every MCP tool call is wrapped in a model inference step: the model reads the result, thinks, and decides what to do next. That loop is measured in seconds, not milliseconds. Anything on a production request path or a tight feedback loop should call services directly and leave the model out of it.

When a CLI already does the job

Coding assistants are good at running shell commands. If a well-documented CLI exists for your service, the assistant can often drive it through its built-in terminal without any MCP server at all. MCP earns its keep when you need typed inputs, structured results, server-side waiting (like wait_for_request), or auth that shouldn't live in shell history — not for wrapping git status.

Context budget is real

Every connected server's tool definitions occupy space in the model's context window. Connect five servers with dozens of tools each and you've spent a meaningful slice of context before the conversation starts — and given the model more ways to pick the wrong tool. Connect the servers you'll actually use for the task at hand.
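For a rough sense of scale, you can estimate how much context a set of tool definitions consumes. The sketch below uses the common rule of thumb of about four characters per token, which is only an approximation and varies by model and tokenizer. The function name is hypothetical, and the sample definition is the wait_for_request schema shown earlier in this article.

```python
import json

# Crude heuristic, not a real tokenizer: roughly four characters per token.
CHARS_PER_TOKEN = 4

WAIT_FOR_REQUEST = {
    "name": "wait_for_request",
    "description": "Block until a webhook request arrives on a debug endpoint, or until timeout.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "endpoint_id": {"type": "string",
                            "description": "ID of the debug endpoint to watch"},
            "timeout": {"type": "integer",
                        "description": "Max seconds to wait (default 30, max 60). "
                                       "Set to 0 to only check for existing requests."},
        },
        "required": ["endpoint_id"],
    },
}


def approx_tokens(tool_definitions: list[dict]) -> int:
    """Very rough token estimate for a list of tool definitions."""
    return len(json.dumps(tool_definitions)) // CHARS_PER_TOKEN


# Pretend every tool is about this size; real definitions vary widely.
print(approx_tokens([WAIT_FOR_REQUEST] * 46))
```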

See an MCP Server in Action

The best way to understand MCP is to watch your own assistant create an endpoint, catch a live webhook, and debug the payload without you touching a dashboard. Sign up free, run one claude mcp add command, and you're connected to the 46-tool server this article was written from.

Connect the Hooklistener MCP Server Free →
