Every MCP Server Needs a Data Moat: Lessons from Building Toolradar MCP
We built an MCP server that connects AI agents to 8,500+ software tools. Here are five hard-won lessons about data quality, tool design, token efficiency, distribution, and pricing — for anyone considering building their own.
We launched the Toolradar MCP server three weeks ago. It gives AI agents — Claude, Cursor, Windsurf, Cline — access to our database of 8,500+ software tools so they can answer questions like "What's the best free CRM for a 5-person startup?" with verified data instead of hallucinated pricing from 2023.
The MCP server itself is open source on GitHub. The npm package is 127 lines of TypeScript. The entire thing took about three days to build.
But the data behind it took 14 months.
That asymmetry is the whole story. And it's the first lesson anyone building an MCP server needs to internalize.
Why We Built It
The trigger was simple: LLMs are confidently wrong about software pricing. Ask Claude or GPT what Notion costs, and you'll get a number that was accurate sometime in 2024. Ask about a tool launched six months ago, and you'll get a fabrication presented as fact.
We had a database of 8,500+ tools with weekly-verified pricing, editorial scores, AI-identified alternatives, and structured pros/cons. That data was trapped behind a web UI. Meanwhile, developers were increasingly making software decisions inside AI-powered IDEs, not on comparison websites.
MCP was the bridge. One npm package, and suddenly every Claude Desktop user could query our entire database through natural conversation. The tools they were already using became the distribution channel.
Lesson 1: Your MCP Server Is Only as Good as Your Data
This is the lesson that matters more than all the others combined.
Our MCP server works because the data behind it is genuinely hard to replicate:
- 8,500+ tools with structured metadata. Not scraped descriptions — editorial summaries, TL;DRs, feature lists, and pros/cons written by humans.
- Weekly-verified pricing. We fetch and parse pricing pages for 6,300+ tools with verified tier-level details. Not "freemium" as a label — actual plan names, prices, and feature breakdowns.
- Editorial scores (0-100). Every tool is rated by our team. This isn't a popularity contest or a star rating average. It's an opinionated editorial judgment.
- 43,000+ AI-identified alternative pairs. We don't match alternatives by category. We use an LLM to analyze each tool and identify 3-6 direct competitors from the full database. Figma's alternatives are Sketch, Adobe XD, and Framer — not Canva.
- 402 granular categories. Not 20 broad buckets. Specific enough that "real-time analytics" and "big-data-analytics" are separate categories with different tool sets.
Without this data, our MCP server would just be a thin wrapper around web scraping. Anybody can build that. Nobody would use it.
The test is simple: if someone could replicate your MCP server's functionality by calling a public API, you don't have a moat. Your server becomes a convenience layer, not a product. The moment a better wrapper appears, your users leave.
This applies broadly. If you're building an MCP server for financial data, the value isn't the MCP protocol — it's your proprietary dataset. If you're building one for code analysis, the value is your static analysis engine, not the JSON formatting.
Lesson 2: Fewer Tools, Better Descriptions
Our MCP server exposes exactly 6 tools:
- search_tools — Search and filter by keyword, category, pricing model
- get_tool — Full details for a specific tool
- compare_tools — Side-by-side comparison of 2-4 tools
- get_alternatives — Competitors for any tool
- get_pricing — Detailed pricing with all tiers
- list_categories — Browse all 402 categories
That's it. Not 60 tools. Not 20. Six.
This was a deliberate choice backed by research. LLMs become unreliable when exposed to more than 30-40 tools — they start hallucinating tool calls and picking the wrong tool. At 6 tools, the model picks the right one almost every time.
But the number alone isn't enough. The descriptions do the heavy lifting. Each tool description is written for LLM consumption, not human documentation. Compare:
Bad (human-oriented):
"Search for tools in the Toolradar database."
Good (LLM-oriented):
"Search and filter software tools from Toolradar's database of 8,600+ tools. Returns tools with names, descriptions, pricing, scores, and categories."
The second description tells the model exactly what it will get back. The model knows this is the right tool when the user asks "find me a project management tool under $10/month" because the description mentions pricing and categories explicitly.
We also deliberately avoided overlapping tool descriptions. get_tool returns "full details including pricing" while get_pricing returns "detailed pricing with all tiers." An LLM reading both descriptions understands the difference: one is comprehensive, the other is pricing-specific.
Optimized descriptions can increase tool selection probability by up to 260% compared to generic ones. We've seen this firsthand — the routing accuracy on our 6 tools is effectively perfect for well-formed queries.
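The description discipline above is easy to encode as data. Here is a minimal sketch of how the tool declarations might look before being registered with the MCP SDK — names plus return-oriented descriptions. Only three of the six tools are shown, and the `describes` helper is purely illustrative, not part of the real server:

```typescript
// Hypothetical sketch: tool declarations as plain data.
// Every description says what the tool RETURNS, so the model can route correctly.
interface ToolDef {
  name: string;
  description: string; // written for LLM consumption, not human docs
}

const tools: ToolDef[] = [
  {
    name: "search_tools",
    description:
      "Search and filter software tools from Toolradar's database of 8,500+ tools. " +
      "Returns tools with names, descriptions, pricing, scores, and categories.",
  },
  {
    name: "get_tool",
    // Comprehensive: everything about one tool, pricing included.
    description: "Full details for a specific tool, including pricing, scores, and categories.",
  },
  {
    name: "get_pricing",
    // Pricing-specific: deliberately narrower than get_tool, so the two don't overlap.
    description: "Detailed pricing for a specific tool with all tiers: plan names, prices, and feature breakdowns.",
  },
  // compare_tools, get_alternatives, and list_categories follow the same pattern
];

// Illustrative lookup helper for checking descriptions at a glance.
export function describes(toolName: string): string | undefined {
  return tools.find((t) => t.name === toolName)?.description;
}
```

Keeping the declarations as reviewable data makes the "no overlapping descriptions" rule something you can audit in one screenful.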
Lesson 3: Token Efficiency Matters More Than You Think
Every byte your MCP server returns costs your users tokens. This isn't theoretical — it's money. At scale, a chatty API response can double or triple the cost of every interaction.
Our API responses are structured JSON, not markdown. Here's why that matters:
The search endpoint returns compact summaries. Each tool in a search result is roughly 150 tokens:
```json
{
  "name": "Linear",
  "slug": "linear",
  "tagline": "Streamline issues, sprints, and product roadmaps.",
  "pricing": "freemium",
  "editorialScore": 92,
  "reviewCount": 47,
  "categories": ["project-management", "issue-tracking"]
}
```
A 10-result search costs about 1,500 tokens. That's the entire recommendation — search, results, and enough data for the LLM to make a decision — for roughly $0.003 at current API pricing.
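That arithmetic is worth sanity-checking in code. The sketch below assumes ~150 tokens per result and a $2-per-million-input-token price; both numbers are illustrative, and actual prices vary by model:

```typescript
// Back-of-envelope cost of a search response.
// tokensPerResult and pricePerMTok are assumptions, not measured constants.
export function searchCostUSD(
  results: number,
  tokensPerResult = 150,
  pricePerMTok = 2
): number {
  return (results * tokensPerResult * pricePerMTok) / 1_000_000;
}

// A 10-result search: 1,500 tokens, roughly $0.003 at the assumed price.
```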
The compare endpoint is even more efficient. Instead of returning full details for every tool, it returns a focused comparison object with computed insights:
```json
{
  "comparison": {
    "bestOverall": "linear",
    "bestValue": "clickup",
    "mostReviewed": "asana",
    "scoreComparison": { "linear": 92, "clickup": 85, "asana": 88 },
    "pricingComparison": { "linear": "freemium", "clickup": "freemium", "asana": "freemium" }
  }
}
```
The LLM doesn't need to compute "which one scored highest" from raw data. We pre-compute it. This saves tokens and reduces reasoning errors.
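Pre-computing insights like this is straightforward. Here is a hedged sketch of how such a comparison object could be derived; the field names mirror the payloads above, but the "best value" heuristic (highest score among free-tier tools) is our illustrative assumption, not Toolradar's production logic:

```typescript
// Illustrative derivation of a compare-endpoint response.
interface ToolRecord {
  slug: string;
  editorialScore: number;
  reviewCount: number;
  pricing: string; // e.g. "free", "freemium", "paid"
}

interface Comparison {
  bestOverall: string;
  bestValue: string;
  mostReviewed: string;
  scoreComparison: Record<string, number>;
  pricingComparison: Record<string, string>;
}

export function buildComparison(tools: ToolRecord[]): Comparison {
  const byScore = [...tools].sort((a, b) => b.editorialScore - a.editorialScore);
  const byReviews = [...tools].sort((a, b) => b.reviewCount - a.reviewCount);
  // Assumed heuristic: best value = highest-scoring tool with a free tier.
  const free = byScore.filter((t) => t.pricing === "free" || t.pricing === "freemium");
  return {
    bestOverall: byScore[0].slug,
    bestValue: (free[0] ?? byScore[0]).slug,
    mostReviewed: byReviews[0].slug,
    scoreComparison: Object.fromEntries(tools.map((t) => [t.slug, t.editorialScore])),
    pricingComparison: Object.fromEntries(tools.map((t) => [t.slug, t.pricing])),
  };
}
```

A few sorts server-side spare every downstream model the same reasoning, on every call.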
What we didn't do: We didn't return markdown-formatted responses. We didn't include HTML. We didn't embed logos as base64. Every field in our response exists because an LLM needs it to answer a user's question.
If your MCP server returns 10,000 tokens per call when 1,500 would suffice, you're not just wasting tokens — you're making your users' AI slower and more expensive. They'll switch to a leaner alternative.
Lesson 4: Distribution Is the Hard Part
Building the MCP server took three days. Getting people to install it is an ongoing effort.
Here's what we've tried and what's worked:
npm package: npx -y toolradar-mcp is the install command. One line. No dependencies to manage. This is table stakes — if your MCP server requires cloning a repo and running a build step, most people won't bother.
GitHub repo: Open source with an MIT license. The README is optimized for copy-paste setup across Claude Desktop, Cursor, Windsurf, and Cline. Each client gets its own config block.
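For Claude Desktop, the copy-paste config block follows the standard mcpServers shape (the server key "toolradar" is just a label; the command and args match the npm install line above):

```json
{
  "mcpServers": {
    "toolradar": {
      "command": "npx",
      "args": ["-y", "toolradar-mcp"]
    }
  }
}
```

Cursor, Windsurf, and Cline use near-identical JSON, which is why one README can serve all four clients.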
Content that showcases the MCP in action: This is the underrated channel. Every pricing page, every comparison page, every best-of list on Toolradar is powered by the same data the MCP server exposes. When someone reads our content and finds it useful, the implicit pitch is: "You can get this same data inside your IDE."
We wrote a step-by-step tutorial on building an MCP server from scratch. It drives traffic, establishes authority, and naturally links to our own server as a reference implementation.
What hasn't worked (yet): Cold outreach to MCP directories. Most discovery still happens through word of mouth, blog content, and people searching for specific tool comparisons who then discover the MCP option.
The uncomfortable truth: distribution for MCP servers is still nascent. There's no "App Store for MCP" with meaningful traffic. Smithery and PulseMCP exist but aren't major acquisition channels yet. Your best bet is building something genuinely useful and making it discoverable through the content you create around it.
Lesson 5: Free Tier Drives Adoption, Revenue Comes from Elsewhere
Our MCP server is free. 100 API calls per day, no credit card required. The npm package is open source.
This was not a difficult decision. Here's the math:
The marginal cost of an API call is near zero. We're querying our own database and Meilisearch instance. There's no third-party API cost per call. The infrastructure runs regardless of whether anyone calls the API.
The value of an MCP user is indirect but real:
- Every tool result includes a Toolradar link. AI agents surface our URLs in their responses. That's organic traffic we didn't pay for.
- Every comparison includes an affiliate tracking URL. When an agent recommends a tool and the user clicks through, we can attribute that referral.
- API users become platform users. Someone who discovers Toolradar through the MCP server might later submit a review, claim a company profile, or use the web interface directly.
The revenue model is the platform, not the API. Dofollow links, premium company features, sponsored placements — that's where the money is. The MCP server is a distribution channel that costs almost nothing to run.
If you're building an MCP server and trying to charge per call from day one, reconsider. The friction of payment will kill adoption when free alternatives exist. Give the data away through the MCP, and monetize the attention it generates.
What We'd Do Differently
Ship the MCP server earlier. We spent months perfecting the web UI before building the API. In hindsight, the API and MCP server should have been a launch-week feature. The data was ready long before we got around to exposing it.
Invest in a Claude Code skill from day one. We eventually built a /recommend-tool skill that chains search, compare, and formatting into a single command. It's more useful than the raw MCP tools for most users. We should have shipped it alongside the MCP server, not after.
Track tool selection accuracy from the start. We know our 6-tool design works well because the queries we see match the tools being called. But we didn't instrument this properly at launch. If you're building an MCP server, log every tool invocation with the original query from day one. You'll discover which descriptions need tuning.
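Instrumenting this takes only a thin wrapper around each handler. The sketch below is a generic suggestion, not our actual implementation; the log shape (tool, args, timestamp) is an assumption you would adapt to your own analytics pipeline:

```typescript
// Hypothetical invocation-logging wrapper for MCP tool handlers.
interface Invocation {
  tool: string;
  args: Record<string, unknown>;
  at: string; // ISO timestamp
}

export const invocationLog: Invocation[] = [];

// Wraps any handler so every call is recorded before it runs.
// Generic over the return type, so async handlers work unchanged.
export function withLogging<T>(
  tool: string,
  handler: (args: Record<string, unknown>) => T
): (args: Record<string, unknown>) => T {
  return (args) => {
    invocationLog.push({ tool, args, at: new Date().toISOString() });
    return handler(args);
  };
}
```

With the original query stored alongside each call, mis-routed invocations point directly at the description that needs tuning.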
Build a playground. A web-based "try before you install" experience where people can type a question and see the MCP response. We still don't have this. It would dramatically reduce the friction from "reading about it" to "installing it."
Your Turn: When to Build an MCP Server for Your Product
Not every product needs an MCP server. Build one if:
- You have proprietary data that LLMs get wrong. If GPT-4 already answers questions about your domain accurately, your MCP server adds no value. Ours works because LLMs are reliably wrong about software pricing and tools launched in the past year.
- Your data changes frequently. Static data gets absorbed into training sets. Dynamic data — pricing that changes monthly, tools that launch weekly, ratings that update daily — stays valuable because it can't be memorized.
- Your users already work inside AI tools. Developers live in Cursor and Claude Code. If your audience uses AI assistants as part of their daily workflow, meeting them there is higher-leverage than building another web app.
- You can keep the tool count under 10. If exposing your product requires 30+ MCP tools, the LLM will struggle to pick the right one. Redesign around fewer, more powerful tools with precise descriptions.
- You have a monetization path that doesn't depend on API revenue. Unless you're Snowflake or Bloomberg, charging per API call for an MCP server is premature. Find the indirect value — traffic, attribution, conversion — and build for that.
The MCP ecosystem is still early. Most MCP servers today are thin wrappers around existing APIs, which means the bar for standing out is low. If you have genuinely differentiated data, now is the time to expose it.
Our full API documentation is live, and you can generate a free API key in 30 seconds. If you want to see how we structured the implementation, the source code is 127 lines of TypeScript.
The MCP server was the easy part. The 14 months of building the data was the hard part. That's the moat.

Written by
Louis Corneloup
Founder of Toolradar. Building the software discovery platform for the AI era.