Engineering · March 23, 2026 · 7 min · Ilmenite Team

MCP Isn't a Plugin. It's How AI Agents Talk to the Web Now.

AI agents are limited by their training data. To interact with the live web, they need a standardized way to fetch, parse, and process data in real time. This is where an MCP server for web browsing changes how LLMs interact with external information.

For a long time, the industry relied on "plugins" or "actions." These were fragmented, proprietary integrations that required developers to build a new interface for every different LLM provider. The Model Context Protocol (MCP) replaces this fragmented approach with a universal standard.

What is the Model Context Protocol (MCP)?

The Model Context Protocol is an open standard that enables AI models to connect to external data sources and tools. Instead of building a custom integration for every AI application, developers build an MCP server that exposes specific capabilities. Any MCP-compliant client—such as Claude Desktop or a custom AI agent—can then connect to that server and use its tools.

Think of MCP as the USB port for AI. Before USB, every peripheral needed a specific port and driver. USB standardized the connection, allowing a single port to handle keyboards, mice, and hard drives. MCP does the same for the "context" an AI model uses.

An MCP server provides three main primitives:

  1. Resources: Read-only data that the LLM can pull into its context (like a local file or a database record).
  2. Tools: Executable functions that the LLM can call to perform an action (like searching the web or calculating a value).
  3. Prompts: Pre-defined templates that help the LLM interact with the server's data more effectively.

When an AI agent needs to browse the web, it doesn't "open a browser" in the human sense. It calls a tool provided by an MCP server, which fetches the data, cleans it, and returns it as text the model can understand.
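To make the "Tools" primitive concrete, here is a minimal sketch of what a tool definition can look like and how a client might check arguments before calling it. The `name`, `description`, and `inputSchema` fields follow the MCP tool format; the schema contents and the `missing_arguments` helper are assumptions for illustration, not Ilmenite's published definition.

```python
# Hypothetical tool definition. The field layout (name, description,
# inputSchema) follows the MCP tool format; the values are illustrative.
SCRAPE_TOOL = {
    "name": "browser_scrape",
    "description": "Fetch one URL and return its content as markdown.",
    "inputSchema": {
        "type": "object",
        "properties": {"url": {"type": "string"}},
        "required": ["url"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list:
    """Return the required argument names that the caller did not supply."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]


# An empty argument set is missing "url"; a complete one is missing nothing.
print(missing_arguments(SCRAPE_TOOL, {}))
print(missing_arguments(SCRAPE_TOOL, {"url": "https://example.com"}))
```

The input schema is what lets the model generate valid arguments without any tool-specific glue code on the client side.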

Why an MCP Server for Web Browsing Matters

The primary challenge for AI agents is the "noise" of the modern web. Raw HTML is filled with navigation menus, tracking scripts, and CSS that waste the LLM's context window and confuse its reasoning.

Traditional plugins often tried to solve this by wrapping a headless browser in a heavy Node.js or Python layer. This created significant latency and infrastructure overhead. If an agent has to wait five seconds for a browser to cold-start, the user experience suffers.

By using a dedicated MCP server for web browsing, we move the heavy lifting of rendering and cleaning to a specialized server. The LLM can then focus on reasoning while the server handles the technicalities of the web.

This shift matters for three practical reasons:

First, it preserves the context window. Converting a website into clean markdown before it reaches the LLM removes roughly 90% of the irrelevant characters, so the agent can "read" more pages without hitting token limits.

Second, it enables autonomy. An agent equipped with an MCP server can decide which tool to use based on the task. If it needs a single page, it calls a scrape tool. If it needs to understand a site's structure, it calls a map tool.

Third, it solves the infrastructure burden. Developers no longer need to manage headless Chrome instances in their own application logic. The MCP server acts as the managed gateway to the web.
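The context-window point is easy to see on a toy page. The sketch below uses only Python's standard library to drop script, style, and navigation content and compare sizes. It is a plain-text stand-in, not the markdown conversion Ilmenite performs, and the sample HTML and the `visible_text` helper are made up for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping script, style, and nav content."""

    SKIP = {"script", "style", "nav"}

    def __init__(self):
        super().__init__()
        self.depth = 0   # greater than zero while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    """Return the page's visible text, one block per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


# Illustrative page: mostly boilerplate around two lines of real content.
SAMPLE = (
    "<html><head><style>body{margin:0}</style>"
    "<script>track('pageview')</script></head>"
    "<body><nav><a href='/'>Home</a><a href='/docs'>Docs</a></nav>"
    "<h1>Pricing</h1><p>The Pro plan costs $20 per month.</p></body></html>"
)

text = visible_text(SAMPLE)
saved = 1 - len(text) / len(SAMPLE)
print(f"{len(SAMPLE)} chars of HTML -> {len(text)} chars of text ({saved:.0%} smaller)")
```

Real pages carry far more markup than this sample, so the ratio on production sites will differ.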

How MCP Works Technically

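MCP uses a client-server architecture built on JSON-RPC 2.0. Messages typically travel over one of two transports: standard input/output (stdio) for local processes, or HTTP for remote servers. The original remote transport streams responses with Server-Sent Events (SSE); newer revisions of the specification define Streamable HTTP, which can still use SSE for streaming.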
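As a rough sketch of what travels over either transport, the snippet below builds a JSON-RPC 2.0 request and unpacks a response. The `tools/call` method and its `name`/`arguments` parameters come from the MCP specification; the helper names and the `url` argument are assumptions for illustration.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC requests carry an id so replies can be matched


def make_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request with a fresh integer id."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def read_result(raw: str) -> dict:
    """Return the result of a JSON-RPC response, raising on an error object."""
    message = json.loads(raw)
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"JSON-RPC error {err['code']}: {err['message']}")
    return message["result"]


# A tool call as the client would send it (the "url" argument is assumed).
call = make_request(
    "tools/call",
    {"name": "browser_scrape", "arguments": {"url": "https://example.com"}},
)
print(call)
```

With stdio the serialized message is written to the server process's standard input; with a remote transport the same payload is sent over HTTP.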

When a user asks an AI agent to "Find the latest pricing for X and compare it to Y," the following sequence occurs:

  1. Tool Discovery: The MCP client (the AI agent) queries the server to see what tools are available. The server responds with a list of tool definitions, including their names, descriptions, and required input schemas.
  2. Tool Selection: The LLM analyzes the user's request and determines that it needs to browse the web. It selects the appropriate tool (e.g., browser_search) and generates the necessary arguments.
  3. Execution: The client sends a JSON-RPC request to the MCP server. The server executes the code—such as launching a Rust-based headless browser—and fetches the data.
  4. Response: The server returns the result (usually in markdown) to the client. The LLM then incorporates this new information into its context to formulate the final answer.
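The four steps above can be traced end to end with an in-memory stand-in for the server. Everything here is a sketch under stated assumptions: `fake_server`, `choose_tool`, and `run` are hypothetical names, the keyword match replaces the model's actual decision, and the markdown is canned instead of fetched. Only the `tools/list` and `tools/call` method names and the `content` result shape come from MCP.

```python
# In-memory stand-in for an MCP server; a real one sits behind stdio or HTTP.
TOOLS = [{
    "name": "browser_search",
    "description": "Search the web and return results as markdown.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}]


def fake_server(request: dict) -> dict:
    """Answer tools/list and tools/call; reject any other method."""
    if request["method"] == "tools/list":
        result = {"tools": TOOLS}
    elif request["method"] == "tools/call":
        query = request["params"]["arguments"]["query"]
        # Canned markdown instead of a real fetch.
        result = {"content": [{"type": "text", "text": f"# Results for {query}"}]}
    else:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


def choose_tool(tools: list, task: str) -> dict:
    """Placeholder for the model's decision: match a keyword in the description."""
    keyword = "search" if "find" in task.lower() else "scrape"
    return next(t for t in tools if keyword in t["description"].lower())


def run(task: str) -> str:
    # 1. Tool discovery
    listed = fake_server({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
    tools = listed["result"]["tools"]
    # 2. Tool selection
    tool = choose_tool(tools, task)
    # 3. Execution
    reply = fake_server({
        "jsonrpc": "2.0", "id": 2, "method": "tools/call",
        "params": {"name": tool["name"], "arguments": {"query": task}},
    })
    # 4. Response: the text that would be added to the model's context
    return reply["result"]["content"][0]["text"]


print(run("Find the latest pricing for X"))
```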

This separation of concerns is critical. The LLM does not need to know how to handle JavaScript rendering or bypass anti-bot protections; it only needs to know how to request a URL and receive clean text.

MCP in Practice: The Ilmenite Implementation

We shipped a native MCP server on day one because we believe the bottleneck for AI agents isn't the model's intelligence, but the quality and speed of the data it consumes.

Most web browsing tools are wrappers around Chrome, which makes them slow and resource-heavy. We built Ilmenite in pure Rust to eliminate this overhead. While a Chrome-based session might use 200-500 MB of RAM, Ilmenite keeps its per-request memory footprint small. This efficiency is vital for MCP servers that may be handling hundreds of concurrent requests for different agents.

Our MCP server for web browsing provides five core tools to Claude and other AI agents:

1. browser_scrape

This tool takes a single URL and returns clean markdown. It handles JavaScript rendering for React or Next.js sites and strips away the boilerplate. This is the primary tool for reading a specific article or documentation page.

2. browser_crawl

When an agent needs more than one page, it uses browser_crawl. The agent can specify a starting URL and a depth limit. The server then traverses the site and returns the content of multiple pages, which is essential for building a knowledge base or performing deep research.
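The idea of a starting URL plus a depth limit can be sketched as a breadth-first traversal. This is an illustration of the general technique, not Ilmenite's crawler: the `SITE` link graph is made up, and a real crawl would fetch each page to discover its links.

```python
from collections import deque

# Hypothetical site: each URL maps to the links found on that page.
SITE = {
    "/": ["/docs", "/pricing"],
    "/docs": ["/docs/api", "/"],
    "/docs/api": ["/docs/api/extract"],
    "/pricing": [],
    "/docs/api/extract": [],
}


def crawl(start: str, max_depth: int, links: dict = SITE) -> list:
    """Breadth-first traversal that stops following links at max_depth."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # visit this page but do not follow its links
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return order


print(crawl("/", 1))
```

The `seen` set matters as much as the depth limit: without it, the back-link from `/docs` to `/` would be queued again.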

3. browser_map

Before crawling, an agent often needs to know what is actually on a domain. browser_map discovers all reachable URLs on a site. This allows the agent to be surgical about which pages it decides to scrape, saving tokens and time.
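URL discovery of this kind rests on collecting links from a page and keeping the ones on the same host. The sketch below shows that one step with the standard library. The `same_site_links` helper and the sample markup are assumptions for illustration; how Ilmenite actually builds its map is not shown here.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Record the href of every anchor tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url: str, html: str) -> list:
    """Resolve each link against page_url and keep same-host http(s) URLs."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            found.add(absolute)
    return sorted(found)


PAGE = (
    '<a href="/docs">Docs</a>'
    '<a href="pricing#plans">Pricing</a>'
    '<a href="https://other.example/x">Elsewhere</a>'
    '<a href="mailto:team@example.com">Mail</a>'
)
print(same_site_links("https://example.com/blog/", PAGE))
```

Dropping fragments and deduplicating keeps the map from listing the same page several times under different anchors.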

4. browser_extract

For tasks requiring structured data, browser_extract allows the agent to pass a JSON schema. The server then extracts only the requested fields (e.g., product names and prices) from the page. This prevents the LLM from having to parse a large markdown file to find a few specific numbers.
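As an illustration, a schema for the product example might look like the one below, along with a small check that a returned record matches it. The argument names `url` and `schema` are assumptions, not the documented signature of browser_extract, and `conforms` covers only a small subset of JSON Schema.

```python
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name", "price"],
}

# Hypothetical arguments for a browser_extract call; the parameter names
# "url" and "schema" are assumptions for illustration.
arguments = {"url": "https://example.com/product/1", "schema": PRODUCT_SCHEMA}

JSON_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def conforms(record: dict, schema: dict) -> bool:
    """Check required keys and primitive types against the schema."""
    for key in schema.get("required", []):
        if key not in record:
            return False
    for key, value in record.items():
        spec = schema["properties"].get(key)
        if spec is None:
            return False  # a field that was never requested
        declared = spec.get("type")
        if isinstance(value, bool) and declared != "boolean":
            return False  # bool is an int subclass in Python, so reject it here
        expected = JSON_TYPES.get(declared)
        if expected and not isinstance(value, expected):
            return False
    return True


print(conforms({"name": "Widget", "price": 19.5}, PRODUCT_SCHEMA))
```

A client-side check like this is optional, but it catches malformed records before they reach the model.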

5. browser_search

This tool combines web search with scraping. The agent can search for a query, and the server returns the top results already converted into markdown. This removes the need for the agent to perform a search and then manually call browser_scrape for every result.

Setting up Ilmenite with Claude Desktop

Because we provide a hosted SSE endpoint, you don't need to run a local server. You can connect Claude Desktop directly to Ilmenite.

  1. Open your claude_desktop_config.json file.

    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json
  2. Add the Ilmenite SSE configuration:

{
  "mcpServers": {
    "ilmenite": {
      "url": "https://mcp.ilmenite.dev/sse"
    }
  }
}
  3. Restart Claude Desktop.

Once restarted, Claude will have access to the five browser tools. You can now ask it to "Map the documentation at ilmenite.dev and tell me how the /v1/extract endpoint works," and it will use the MCP server to fetch the live data.
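If the config file already lists other MCP servers, editing it by hand risks overwriting them. The following sketch merges one entry into the existing JSON. The `add_mcp_server` helper is a hypothetical convenience, not part of Claude Desktop or Ilmenite, and it writes the same `url` entry shown in the configuration above.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, url: str) -> dict:
    """Add or update one entry under "mcpServers", preserving everything else."""
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = {"url": url}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example for macOS (uncomment to apply to the real config file):
# add_mcp_server(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",
#     "ilmenite",
#     "https://mcp.ilmenite.dev/sse",
# )
```

Back up the file before running anything like this against a real configuration.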

Comparison: MCP vs. Traditional Scraping

The difference between an MCP-based approach and traditional scraping is the difference between "providing a tool" and "providing a result."

In a traditional setup, a developer writes a Python script using Playwright or Puppeteer. They manage the browser lifecycle, handle the HTML parsing, and then send the resulting string to the LLM. If the website structure changes, the script breaks.

With an MCP server, the developer provides the agent with the capability to browse. The agent decides how to use the tool. If the agent finds that a page is too large, it might decide to use browser_map first to find a more specific sub-page. The intelligence is moved from the static script to the dynamic agent.

Furthermore, the performance difference is measurable. Ilmenite routes static pages through a pure-Rust fast path that skips Chromium entirely — so the agent doesn't pay browser-launch cost on pages that don't actually need a browser. For an AI agent, that's the difference between a real-time conversation and one that feels like a series of slow loading screens.

Tools and Resources

If you are building AI agents and need reliable web access, the following resources are the best place to start:

  • The MCP Specification: The official documentation for the Model Context Protocol.
  • Ilmenite Documentation: Our docs explain the underlying API that powers our MCP server.
  • Ilmenite Playground: Use the playground to test how URLs are converted to markdown before wiring them into your agent.
  • MCP Integration Guide: Detailed steps on MCP integration for custom clients.

MCP is not just another plugin; it is a fundamental shift in how we provide context to AI. By standardizing the interface between the model and the web, we enable agents that are more autonomous, more accurate, and significantly faster.

To start building your own AI agents with live web access, you can sign up for a free account and connect to our MCP server today.