The Model Context Protocol turns two this November. What started as an Anthropic-shipped spec for plugging Claude into external tools has become the closest thing the agent ecosystem has to a working standard. Cursor speaks it. Claude Desktop, Claude Code, Continue, Cline, Zed, and Goose all speak it. The MCP registry has crossed five thousand servers. If you build with AI assistants and you have data your assistant should reach, you can hand it that data with about a hundred lines of code and a config block.

This post walks through what it actually takes to ship a real, useful MCP server in 2026. It is not a theoretical primer. It is the practical version: what to choose, what to skip, how to get listed, and which mistakes to avoid. The reference implementation is the server running at terminalfeed.io/api/mcp and the npm-distributed sister at @tensorfeed/mcp-server. Both are public, both are in production, both got listed on every major registry within a week of shipping.

Pick your distribution model first

The biggest decision is also the easiest. There are two ways to ship an MCP server in 2026, and they make very different tradeoffs.

Stdio packages. You publish your server as an npm or PyPI package. The user installs it with npm install -g your-server, configures Claude Desktop to spawn it as a subprocess, and the protocol runs over the spawned process's stdin and stdout. This is what TensorFeed's MCP does. The advantages are clean local execution (the user's secrets and their LLM client both sit on the same machine), simple distribution (npm publish, done), and zero infra to run. The disadvantages are that you need to handle every install environment (different Node versions, missing dependencies, OS-specific path quirks) and you cannot share state across users without an external service to call into.
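
For reference, the stdio config the user ends up pasting into Claude Desktop's claude_desktop_config.json looks roughly like this; whether the package is spawned through npx or a globally installed binary depends on how @tensorfeed/mcp-server exposes its bin, so treat the exact command as an assumption and check the package README:

{
  "mcpServers": {
    "tensorfeed": {
      "command": "npx",
      "args": ["-y", "@tensorfeed/mcp-server"]
    }
  }
}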

Hosted servers. You run an HTTP endpoint. Claude Desktop talks to it through mcp-remote, a small adapter that bridges stdio to HTTP. This is what TerminalFeed's MCP does at /api/mcp. The advantages are central state (caches, rate limits, authentication, billing all live in one place you control), zero install footprint (the user pastes a config block, that is it), and the same server can serve thousands of users with consistent behavior. The disadvantages are that you need an HTTPS endpoint with reasonable uptime, and your users have to trust your service with whatever data they are sending it.
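
The hosted equivalent is the same config shape, bridged through mcp-remote; the "terminalfeed" key is just a local label:

{
  "mcpServers": {
    "terminalfeed": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://terminalfeed.io/api/mcp"]
    }
  }
}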

Pick stdio if your server is mostly compute (parsing files, running scripts, talking to local databases). Pick hosted if your server is mostly read-from-the-world (APIs, datasets, anything cached centrally). For TerminalFeed, hosted made sense because the data is already on Cloudflare Workers and the cache is already shared across users. For TensorFeed, npm distribution made sense because some users want to set their own provider keys and run lookups locally.

The minimum viable server

An MCP server needs to handle three JSON-RPC methods: initialize, tools/list, and tools/call. For a tools-only server, that is the whole required surface; resources, prompts, and the other convenience methods are optional.

Here is a hosted server in roughly forty lines, running on a Cloudflare Worker. This is a trimmed version of what handles requests to /api/mcp on TerminalFeed:

export default {
  async fetch(request) {
    // Non-POST requests get a plain info payload so humans and registries can poke the endpoint.
    if (request.method !== 'POST') {
      return Response.json({ name: 'terminalfeed-mcp', protocol_version: '2024-11-05' });
    }
    const body = await request.json();
    const { id, method, params } = body;

    // Notifications (no id, e.g. notifications/initialized) expect no JSON-RPC response body.
    if (id === undefined) {
      return new Response(null, { status: 202 });
    }

    if (method === 'initialize') {
      return jsonRpc(id, { protocolVersion: '2024-11-05', capabilities: { tools: {} } });
    }

    if (method === 'tools/list') {
      return jsonRpc(id, { tools: TOOLS });
    }

    if (method === 'tools/call') {
      const { name, arguments: args } = params;
      const result = await dispatchTool(name, args);
      return jsonRpc(id, { content: [{ type: 'text', text: JSON.stringify(result) }] });
    }

    return jsonRpc(id, null, { code: -32601, message: 'Method not found' });
  }
};

function jsonRpc(id, result, error) {
  return Response.json({ jsonrpc: '2.0', id, ...(error ? { error } : { result }) });
}

The interesting part is what goes into TOOLS and dispatchTool. Each tool needs three things: a name (snake_case, like get_btc_price), a human-readable description, and an input schema in JSON Schema format. The description is what the LLM sees when it decides whether to call your tool, so it actually matters for tool selection. Be specific. "Get current Bitcoin price in USD" is better than "Bitcoin price". "Latest 24h earthquakes magnitude 2.5+ from USGS" is better than "Earthquake data".
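
To make that concrete, here is a sketch of how TOOLS and dispatchTool wire together. The get_btc_price handler and the upstream endpoint it calls are illustrative, not TerminalFeed's actual implementation:

const TOOLS = [
  {
    name: 'get_btc_price',
    description: 'Returns the current Bitcoin price in USD with 24h change percentage and trading volume.',
    inputSchema: { type: 'object', properties: {}, additionalProperties: false }
  }
];

// Handlers keyed by tool name; tools/call routes through dispatchTool.
const HANDLERS = {
  get_btc_price: async () => {
    // Illustrative upstream call; swap in whatever source actually backs the tool.
    const res = await fetch('https://api.coingecko.com/api/v3/simple/price?ids=bitcoin&vs_currencies=usd');
    return res.json();
  }
};

async function dispatchTool(name, args) {
  const handler = HANDLERS[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(args ?? {});
}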

Tool definitions that an LLM will actually pick

The single biggest mistake in MCP server design is fuzzy tool descriptions. The LLM sees your tool definition once at the top of its context, then it has to decide for itself when your tool is the right answer to a user's question. If the description is vague, your tool doesn't get picked. If the description is too broad, your tool gets picked when something else would be a better fit.

Good tool definitions follow three rules. First, the name says exactly what the tool does. get_btc_price, not btc. list_recent_earthquakes, not earthquakes. Second, the description starts with the verb and ends with the data shape. "Returns the current Bitcoin price in USD with 24h change percentage and trading volume." Third, parameters are strict. If a tool takes a date range, declare both the start and the end as required, give them ISO 8601 format hints in the description, and validate them server-side. Loose tool definitions cause the LLM to call your tool with garbage inputs.

{
  name: 'get_world_briefing',
  description: 'Returns a one-call composed snapshot of current world state: BTC price, Fear and Greed Index, last 24h earthquakes, top Hacker News stories, current ISS crew, top prediction market odds. Use this when the user asks for a general "what is happening" briefing or when assembling context for a research task.',
  inputSchema: {
    type: 'object',
    properties: {},
    additionalProperties: false
  }
}

That description tells the LLM exactly when to call this tool, which is the entire point.
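
For a parameterized tool, the same rules produce something like this. The parameters here are hypothetical, shown only to illustrate strict, required, format-hinted inputs:

{
  name: 'list_recent_earthquakes',
  description: 'Returns earthquakes from the USGS feed between start_time and end_time (ISO 8601, UTC), filtered to a minimum magnitude.',
  inputSchema: {
    type: 'object',
    properties: {
      start_time: { type: 'string', description: 'Start of the window, ISO 8601, e.g. 2026-03-01T00:00:00Z' },
      end_time: { type: 'string', description: 'End of the window, ISO 8601, e.g. 2026-03-02T00:00:00Z' },
      min_magnitude: { type: 'number', description: 'Minimum magnitude to include, e.g. 2.5' }
    },
    required: ['start_time', 'end_time'],
    additionalProperties: false
  }
}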

Authentication for paid tools

If your server has any tool that costs you money to run (a third-party API, a paid data feed, an LLM call inside the tool), you will eventually need to charge for it. The cleanest pattern in 2026 is bearer-token authentication on the HTTP endpoint, with the token coupled to a credit balance.

The token is a long random string the user gets when they buy credits. They pass it in the Authorization header on every request. Your server checks the token's balance, decrements it atomically, and only then runs the tool. If the user is on the free tier, you reject the call with a clear "this tool requires credits, here is how to buy them" error message that the LLM will surface to the user.

async function handleToolCall(request, body) {
  const auth = request.headers.get('Authorization');
  const token = auth?.replace('Bearer ', '');
  const tool = TOOLS_BY_NAME[body.params.name];

  // Unknown tool names should fail cleanly instead of throwing on cost_credits below.
  if (!tool) return errorResponse('UNKNOWN_TOOL');

  // Paid tools: verify the token and decrement the balance before doing any work.
  if (tool.cost_credits > 0) {
    if (!token) return errorResponse('PAYMENT_REQUIRED', tool.cost_credits);
    const ok = await chargeAndDecrement(token, tool.cost_credits);
    if (!ok) return errorResponse('INSUFFICIENT_CREDITS');
  }

  const result = await tool.handler(body.params.arguments);
  return successResponse(result);
}
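
What chargeAndDecrement looks like depends on where you keep balances. Here is a minimal sketch assuming a Cloudflare D1 table; the binding, the table schema, and the extra db parameter are assumptions, and in the handler above you would thread the Worker env through. The point is that the balance check and the decrement happen in one conditional UPDATE, so two concurrent calls cannot both spend the last credit:

// Assumes a D1 table: credits(token TEXT PRIMARY KEY, balance INTEGER).
async function chargeAndDecrement(token, cost, db) {
  const result = await db
    .prepare('UPDATE credits SET balance = balance - ? WHERE token = ? AND balance >= ?')
    .bind(cost, token, cost)
    .run();
  // changes is 0 when the balance was too low or the token does not exist.
  return result.meta.changes > 0;
}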

For the actual payment rail, USDC on Base is the path of least resistance in 2026. Stripe Link for Agents handles the human-in-the-loop case (the user approves each spend), but for autonomous agents that need to make ten paid calls per minute, on-chain micropayments are the only thing that actually works at sub-second latency. Both terminalfeed.io/api/mcp and TensorFeed's MCP take USDC on Base, with credits cross-redeemable across both servers. How AI Agents Pay for Real-Time Data on TerminalFeed walks through the whole flow.

Getting listed

An MCP server with no listing is a tree falling in an empty forest. The discovery problem is real, and it is solved by submitting to the registries. There are five that matter in 2026, in priority order.

For each, you need the same metadata: name, slug, description (a one-line and a three-line version), repo URL, install snippet, categories. Prepare these once, paste five times. The whole batch takes about 90 minutes. Most registries approve within 3-7 days.
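
It helps to keep that metadata in one file you can copy from. The shape below is just a working checklist, not any registry's actual submission schema, and the values are illustrative:

{
  "name": "TerminalFeed MCP",
  "slug": "terminalfeed-mcp",
  "description_short": "Real-time world-state data (BTC, earthquakes, Hacker News, ISS, prediction markets) for agents.",
  "description_long": "Three sentences covering what the server does, who it is for, and how to install it.",
  "repo_url": "https://github.com/your-org/your-mcp-server",
  "install_snippet": "npx -y mcp-remote https://terminalfeed.io/api/mcp",
  "categories": ["data", "finance", "news"]
}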

One thing worth noting: registries care about your install snippet being accurate and current. If you ship a breaking change later, update it on the registries. Stale snippets are the most common reason MCP servers stop working for new users.

Common mistakes to skip

What's next

The MCP spec is evolving. Resources (read-only data attachments separate from tools) and prompts (server-defined templates) are stable now. Sampling (the server asks the client's LLM for a completion) and roots (client-declared filesystem boundaries the server is expected to stay within) are landing this quarter. The official spec doc at modelcontextprotocol.io is the source of truth; treat anything older than three months as suspect.
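
If you want to experiment with resources in the hosted worker above, the extra handling is small. The briefing URI and its content are illustrative, the exact result shapes should be checked against the current spec, and you would also advertise resources: {} in the initialize capabilities:

if (method === 'resources/list') {
  return jsonRpc(id, {
    resources: [
      { uri: 'terminalfeed://briefing/latest', name: 'Latest world briefing', mimeType: 'application/json' }
    ]
  });
}

if (method === 'resources/read') {
  // Reuse the composed briefing tool as the resource body.
  const briefing = await dispatchTool('get_world_briefing', {});
  return jsonRpc(id, {
    contents: [{ uri: params.uri, mimeType: 'application/json', text: JSON.stringify(briefing) }]
  });
}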

If you build something interesting, consider listing it. The agent ecosystem benefits from more high-quality MCP servers, and the cost to publish is genuinely two days of work plus a registry submission round. The downside is small. The upside is that your server might be running inside thousands of agent contexts a year from now, doing useful work for people you will never meet.

See a working MCP server in production at /api/mcp. Free tier, no signup required.


Further reading: How AI Agents Pay for Real-Time Data on TerminalFeed, the full TerminalFeed MCP setup page, the official MCP docs, and the community servers list.