
How to Deploy a Remote MCP Server (Cloudflare Workers, Vercel, AWS)

Step-by-step guide to deploying remote MCP servers using Streamable HTTP. Covers Cloudflare Workers, Vercel, and AWS with OAuth 2.1 authentication.

March 27, 2026
9 min read

Most MCP servers today run locally. You install them with npx, they spawn a subprocess on your machine, and they communicate over stdio. That works for developer tools, but it breaks down fast: mobile clients cannot spawn subprocesses, web apps cannot run Node.js, and every user must install every server individually.

Remote MCP servers fix all of this. They run on your infrastructure, communicate over HTTPS, and any client connects with just a URL. Figma, Supabase, Linear, and Sentry already ship production remote servers. This guide shows you how to deploy your own on Cloudflare Workers and Vercel.

If you are new to MCP servers entirely, start with our tutorial on building one from scratch. If you are weighing MCP against a plain REST API, read MCP vs API: when each makes sense.

Why Remote Matters

Local MCP servers (stdio transport) require the client to launch a subprocess. That means:

  • The client machine needs Node.js, Python, or whatever runtime the server uses
  • Mobile and web clients cannot participate at all
  • Every user installs and configures the server independently
  • There is no central place to enforce auth, rate limits, or logging

Remote servers flip all of this. The server runs on your infrastructure. Clients connect with a single URL. You control access, monitor usage, and deploy updates without touching any client machine.

Production examples already live:

Service          Remote MCP Endpoint
Figma            https://mcp.figma.com/mcp
Supabase         https://mcp.supabase.com/mcp
Linear           https://mcp.linear.app/mcp
Sentry           https://mcp.sentry.dev/mcp
GitHub Copilot   https://api.githubcopilot.com/mcp/

Clients that support remote connections today: Claude Desktop, Claude Code, Cursor, and VS Code with GitHub Copilot.

Transport Comparison: stdio vs SSE vs Streamable HTTP

MCP defines three transport mechanisms. Understanding when to use each one saves you from picking a deprecated path.

stdio

  • Spec version: all
  • How it works: client spawns a subprocess, communicates via stdin/stdout
  • Requires local install: yes
  • Stateful sessions: implicit (process lifetime)
  • Resumable streams: no
  • Use case: dev tools, CLI integrations

HTTP+SSE (deprecated)

  • Spec version: 2024-11-05
  • How it works: two HTTP endpoints, SSE for server-to-client and POST for client-to-server
  • Requires local install: no
  • Stateful sessions: yes
  • Resumable streams: no
  • Use case: legacy remote servers

Streamable HTTP

  • Spec version: 2025-03-26
  • How it works: single HTTP endpoint handling both POST and GET
  • Requires local install: no
  • Stateful sessions: yes (via the Mcp-Session-Id header)
  • Resumable streams: yes (Last-Event-ID)
  • Use case: all new remote servers

The takeaway: Use stdio for local-only servers (CLI tools, file system access). Use Streamable HTTP for everything remote. Do not start new projects on HTTP+SSE — the 2025-03-26 spec replaces it with Streamable HTTP.

Streamable HTTP works through a single endpoint (e.g., https://example.com/mcp). Clients send JSON-RPC messages via POST. The server can respond with a single JSON object or open an SSE stream for long-running operations. Clients can also GET the endpoint to listen for server-initiated messages.
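To make the handshake concrete, here is a minimal client-side sketch of that first POST, written with the standard fetch API. The endpoint URL is a placeholder, and the sketch assumes the server replies with plain JSON rather than opening an SSE stream:

```typescript
// Sketch of the Streamable HTTP handshake: POST a JSON-RPC initialize
// message to the single /mcp endpoint and capture the session ID the
// server assigns via the Mcp-Session-Id response header.
async function initializeSession(endpoint: string) {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Clients must accept both a plain JSON reply and an SSE stream
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2025-03-26",
        capabilities: {},
        clientInfo: { name: "example-client", version: "1.0.0" },
      },
    }),
  });
  // Send this session ID back on every subsequent request
  const sessionId = res.headers.get("Mcp-Session-Id");
  return { sessionId, result: await res.json() };
}
```

Real clients handle the text/event-stream case as well; this sketch only shows the happy path for a single JSON response.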

Deploy on Cloudflare Workers

Cloudflare Workers is the most popular hosting option for remote MCP servers. The free tier gives you 100,000 requests per day, the cold start is under 5ms, and Cloudflare maintains an official MCP template.

Step 1: Scaffold the Project

npm create cloudflare@latest -- my-mcp-server \
  --template=cloudflare/ai/demos/remote-mcp-authless
cd my-mcp-server

This generates a project with the Cloudflare Agents SDK pre-configured.

Step 2: Define Your Tools

Open src/index.ts. The template ships a calculator example. Replace it with your own tools:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { McpAgent } from "agents/mcp";
import { z } from "zod";

export class MyMCP extends McpAgent {
  server = new McpServer({
    name: "My Remote Server",
    version: "1.0.0",
  });

  async init() {
    this.server.tool(
      "get_status",
      { service: z.string().describe("Service name to check") },
      async ({ service }) => {
        const res = await fetch(`https://api.example.com/status/${service}`);
        const data = await res.json();
        return {
          content: [{ type: "text", text: JSON.stringify(data, null, 2) }],
        };
      }
    );

    this.server.tool(
      "search_docs",
      {
        query: z.string().describe("Search query"),
        limit: z.number().int().min(1).max(20).default(5),
      },
      async ({ query, limit }) => {
        const res = await fetch(
          `https://api.example.com/search?q=${encodeURIComponent(query)}&limit=${limit}`
        );
        const results = await res.json();
        return {
          content: [{ type: "text", text: JSON.stringify(results, null, 2) }],
        };
      }
    );
  }
}

export default {
  fetch(request: Request, env: Env, ctx: ExecutionContext) {
    const url = new URL(request.url);
    if (url.pathname === "/mcp") {
      return MyMCP.serve("/mcp").fetch(request, env, ctx);
    }
    return new Response("Not found", { status: 404 });
  },
};

Key details:

  • McpAgent uses Durable Objects for per-session state. Each client connection gets its own instance.
  • Tools are registered in the init() method, which runs once per session.
  • The fetch handler routes /mcp to the MCP server and returns 404 for everything else.
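One thing the tool examples above gloss over: they call res.json() without checking res.ok, so an upstream failure surfaces as a thrown exception instead of a useful tool result. A hedged pattern for this, using a helper name of our own invention, is to convert failures into an MCP tool result with isError set:

```typescript
// Illustrative helper: wrap an upstream fetch and convert failures into
// an MCP tool result with isError set, instead of throwing from the tool.
async function fetchAsToolResult(url: string) {
  try {
    const res = await fetch(url);
    if (!res.ok) {
      return {
        isError: true,
        content: [{ type: "text" as const, text: `Upstream error: HTTP ${res.status}` }],
      };
    }
    const data = await res.json();
    return {
      content: [{ type: "text" as const, text: JSON.stringify(data, null, 2) }],
    };
  } catch (err) {
    return {
      isError: true,
      content: [{ type: "text" as const, text: `Request failed: ${String(err)}` }],
    };
  }
}
```

Returning isError lets the model see what went wrong and retry or adjust, rather than the client surfacing an opaque protocol error.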

Step 3: Configure Wrangler

The template generates wrangler.jsonc with the required Durable Objects binding:

{
  "name": "my-mcp-server",
  "main": "src/index.ts",
  "compatibility_date": "2025-03-10",
  "compatibility_flags": ["nodejs_compat"],
  "migrations": [
    { "new_sqlite_classes": ["MyMCP"], "tag": "v1" }
  ],
  "durable_objects": {
    "bindings": [
      { "class_name": "MyMCP", "name": "MCP_OBJECT" }
    ]
  }
}

The migrations block registers MyMCP as a Durable Object class backed by SQLite. This is how sessions persist across requests.

Step 4: Test Locally

npm start

The server runs at http://localhost:8788/mcp. Test it with the MCP Inspector:

npx @modelcontextprotocol/inspector@latest

The Inspector opens a web UI in your browser and prints the local URL it is served on. Enter your server URL, connect, and you will see your tools listed. Call them to verify they work.

Step 5: Deploy

npx wrangler@latest deploy

Your server is now live at https://my-mcp-server.<your-account>.workers.dev/mcp.

Step 6: Connect a Client

For clients that support remote URLs natively (Claude Desktop 2025+, Claude Code), add it directly:

{
  "mcpServers": {
    "my-server": {
      "url": "https://my-mcp-server.your-account.workers.dev/mcp"
    }
  }
}

For older clients that only support stdio, use mcp-remote as a local bridge:

{
  "mcpServers": {
    "my-server": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://my-mcp-server.your-account.workers.dev/mcp"
      ]
    }
  }
}

Deploy on Vercel

Vercel works well if you already have a Next.js app and want to add MCP capabilities as an API route. The mcp-handler package handles transport negotiation.

Step 1: Install Dependencies

npm install mcp-handler @modelcontextprotocol/sdk zod

Use @modelcontextprotocol/sdk version 1.26.0 or later. Earlier versions have a known security vulnerability.

Step 2: Create the Route Handler

Create app/api/[transport]/route.ts:

import { createMcpHandler } from "mcp-handler";
import { z } from "zod";

const handler = createMcpHandler(
  (server) => {
    server.registerTool(
      "get_status",
      {
        title: "Get Status",
        description: "Check the status of a service.",
        inputSchema: {
          service: z.string(),
        },
      },
      async ({ service }) => {
        const res = await fetch(`https://api.example.com/status/${service}`);
        const data = await res.json();
        return {
          content: [{ type: "text", text: JSON.stringify(data, null, 2) }],
        };
      }
    );

    server.registerTool(
      "search_docs",
      {
        title: "Search Docs",
        description: "Search documentation by keyword.",
        inputSchema: {
          query: z.string(),
          limit: z.number().int().min(1).max(20).default(5),
        },
      },
      async ({ query, limit }) => {
        const res = await fetch(
          `https://api.example.com/search?q=${encodeURIComponent(query)}&limit=${limit}`
        );
        const results = await res.json();
        return {
          content: [{ type: "text", text: JSON.stringify(results, null, 2) }],
        };
      }
    );
  },
  {},
  {
    basePath: "/api",
    maxDuration: 60,
  }
);

export { handler as GET, handler as POST };

The [transport] dynamic segment lets the adapter handle both /api/mcp (Streamable HTTP) and /api/sse (legacy SSE) from the same route file.

Step 3: Deploy

vercel --prod

Or push to your connected Git repository. Your MCP endpoint is now live at https://your-app.vercel.app/api/mcp.

Step 4: Connect a Client

{
  "mcpServers": {
    "my-vercel-server": {
      "url": "https://your-app.vercel.app/api/mcp"
    }
  }
}

Cloudflare vs Vercel: Which to Pick

Factor           Cloudflare Workers                 Vercel
Best for         Standalone MCP servers             Adding MCP to an existing Next.js app
Session state    Built-in via Durable Objects       Stateless by default (add Redis for SSE)
Free tier        100K requests/day                  100K function invocations/month
Cold start       <5ms                               ~250ms (serverless functions)
Auth template    Official OAuth + GitHub template   Roll your own
SDK              agents (Cloudflare Agents SDK)     mcp-handler

Authentication with OAuth 2.1

Public MCP servers work for read-only, non-sensitive tools. Anything that accesses user data or performs actions on their behalf needs authentication. The MCP spec (2025-03-26) standardizes on OAuth 2.1 for remote servers.

For a deep dive on securing MCP servers, see our MCP server security best practices guide.

How the OAuth Flow Works

  1. Client connects to your MCP endpoint
  2. Server returns 401 Unauthorized with a WWW-Authenticate header pointing to the authorization endpoint
  3. Client opens a browser for the user to log in and authorize
  4. Server issues an access token
  5. Client includes the token in subsequent MCP requests
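Step 2 of the flow above can be sketched as a small guard in front of your MCP handler. This is illustrative, not the template's actual code: the metadata URL is a placeholder, and the header value shown follows the protected-resource metadata style used by newer spec revisions, so the exact discovery mechanism depends on the spec version you target:

```typescript
// Sketch: reject unauthenticated requests with a 401 and a
// WWW-Authenticate header so the client knows where to begin OAuth.
function requireAuth(request: Request): Response | null {
  const auth = request.headers.get("Authorization");
  if (auth?.startsWith("Bearer ")) {
    return null; // token present; still validate it before trusting it
  }
  return new Response("Unauthorized", {
    status: 401,
    headers: {
      "WWW-Authenticate":
        'Bearer resource_metadata="https://example.com/.well-known/oauth-protected-resource"',
    },
  });
}
```

In practice the OAuth template below wires all of this up for you; the sketch just shows what the client sees on its first unauthenticated request.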

Cloudflare Workers: GitHub OAuth Example

Cloudflare provides an official template with GitHub OAuth pre-wired:

npm create cloudflare@latest -- my-mcp-server-auth \
  --template=cloudflare/ai/demos/remote-mcp-github-oauth
cd my-mcp-server-auth

The entry point wraps your MCP server in an OAuthProvider:

import OAuthProvider from "@cloudflare/workers-oauth-provider";
import GitHubHandler from "./github-handler";

export default new OAuthProvider({
  apiRoute: "/mcp",
  apiHandler: MyMCP.serve("/mcp"),
  defaultHandler: GitHubHandler,
  authorizeEndpoint: "/authorize",
  tokenEndpoint: "/token",
  clientRegistrationEndpoint: "/register",
});

You then access the authenticated user's context inside your tools via this.props:

export class MyMCP extends McpAgent<Env, {}, { login: string; accessToken: string }> {
  async init() {
    this.server.tool("my_repos", {}, async () => {
      const res = await fetch("https://api.github.com/user/repos", {
        headers: {
          Authorization: `Bearer ${this.props.accessToken}`,
          "User-Agent": "my-mcp-server",
        },
      });
      const repos = await res.json();
      return {
        content: [{ type: "text", text: JSON.stringify(repos, null, 2) }],
      };
    });
  }
}

Configure your GitHub OAuth app at github.com/settings/developers and set the secrets:

npx wrangler secret put GITHUB_CLIENT_ID
npx wrangler secret put GITHUB_CLIENT_SECRET
npx wrangler secret put COOKIE_ENCRYPTION_KEY

Create the KV namespace for OAuth state:

npx wrangler kv namespace create "OAUTH_KV"

Add the KV binding to wrangler.jsonc:

{
  "kv_namespaces": [
    { "binding": "OAUTH_KV", "id": "<your-namespace-id>" }
  ]
}

Deploy:

npx wrangler@latest deploy

The same approach works with any OAuth provider — Google, Slack, Auth0, or your own identity system. Swap out GitHubHandler for your provider's handler.

Testing Your Remote Server

MCP Inspector

The official testing tool. Works with any transport.

npx @modelcontextprotocol/inspector@latest

Enter your server URL (local or deployed), connect, and interactively call tools. The Inspector shows request/response payloads, making it easy to debug schema issues.

curl

Test the raw Streamable HTTP endpoint:

# Initialize the session (-i prints response headers, including Mcp-Session-Id)
curl -i -X POST https://your-server.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
      "protocolVersion": "2025-03-26",
      "capabilities": {},
      "clientInfo": { "name": "curl-test", "version": "1.0.0" }
    }
  }'

A successful response returns the server's capabilities and name. Note the Mcp-Session-Id header in the response — include it in subsequent requests.

# List available tools
curl -X POST https://your-server.workers.dev/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Mcp-Session-Id: <session-id-from-init>" \
  -d '{
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/list",
    "params": {}
  }'

Claude Code

The fastest way to test end-to-end. Register your server from the terminal, then ask Claude to use one of your tools:

claude mcp add --transport http my-server https://your-server.workers.dev/mcp

Deployment Checklist

Before going to production:

  • HTTPS only. Never expose MCP over plain HTTP. All production examples use HTTPS.
  • Validate the Origin header. The MCP spec explicitly requires servers to validate Origin to prevent DNS rebinding attacks. Both Cloudflare Workers and Vercel handle TLS, but you must add Origin validation if your server binds to localhost during development.
  • Add authentication for any server that accesses user data. OAuth 2.1 is the standard; API keys work for server-to-server use cases.
  • Rate limit aggressively. AI clients can call tools in rapid loops. Cloudflare Workers has built-in rate limiting. On Vercel, use the @vercel/kv rate limiter or an external service.
  • Set maxDuration on Vercel to prevent runaway function execution. The 60 seconds used in the example above is reasonable for most tools.
  • Monitor usage. Cloudflare Workers analytics and Vercel function logs give you visibility into which tools get called and how often.
  • Write clear tool descriptions. The AI reads these to decide when to call your tool. Vague descriptions lead to incorrect tool selection. Specific descriptions like "Get the current weather for a city. Returns temperature in Celsius, humidity percentage, and conditions" outperform "Get weather data."
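Two of the checklist items above, Origin validation and rate limiting, can be sketched as a guard in front of your fetch handler. This is a naive illustration under stated assumptions: the allowed origins are placeholders, the in-memory counter lives only within a single isolate (production code would use a durable store or the platform's rate limiter), and the CF-Connecting-IP header is Cloudflare-specific:

```typescript
// Illustrative guard: block unexpected browser origins (DNS rebinding
// defense) and apply a naive fixed-window rate limit keyed by client IP.
const ALLOWED_ORIGINS = new Set(["https://claude.ai", "https://example.com"]);
const WINDOW_MS = 60_000; // one-minute windows
const MAX_REQUESTS = 100; // per IP per window
const counters = new Map<string, { windowStart: number; count: number }>();

function guard(request: Request): Response | null {
  // Browser-originated requests carry an Origin header; reject unknown
  // origins. Non-browser MCP clients typically send no Origin at all.
  const origin = request.headers.get("Origin");
  if (origin && !ALLOWED_ORIGINS.has(origin)) {
    return new Response("Forbidden origin", { status: 403 });
  }

  // Fixed-window counter: reset when the window expires, otherwise
  // increment and reject once the cap is exceeded.
  const ip = request.headers.get("CF-Connecting-IP") ?? "unknown";
  const now = Date.now();
  const entry = counters.get(ip);
  if (!entry || now - entry.windowStart > WINDOW_MS) {
    counters.set(ip, { windowStart: now, count: 1 });
  } else if (++entry.count > MAX_REQUESTS) {
    return new Response("Too many requests", { status: 429 });
  }
  return null; // request may proceed to the MCP handler
}
```

Call guard(request) at the top of your fetch handler and return its response when it is non-null.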

What Comes Next

Remote MCP servers are still early. The Streamable HTTP transport shipped in March 2025, and most production servers went live in Q1-Q2 2025. Expect rapid changes in auth flows, client support, and tooling.

The pattern is clear, though: MCP is moving from local-first to remote-first. If you are building a developer tool or SaaS product, shipping a remote MCP endpoint is becoming as standard as shipping a REST API. The infrastructure is ready — Cloudflare's free tier and Vercel's serverless functions mean you can deploy one in under an hour with zero ongoing cost for moderate traffic.

Start with the Cloudflare authless template for a quick prototype. Add OAuth when you need user-specific data. And read our security best practices guide before going to production.

Tags: mcp, mcp-servers, remote-mcp, cloudflare-workers, vercel, developer-tools, tutorial