Is Ilmenite really a browser?

Yes — a headless browser built in pure Rust. It parses HTML, queries CSS selectors, runs JavaScript, and extracts content. Not a Chromium fork.

How does pricing work?

Credits-based, pay per use. 1 scrape = 1 credit. Free tier includes 500 credits/month. Pay as you go after that — no subscriptions, no browser-hour metering.

Use Case · March 25, 2026 · 5 min · Ilmenite Team

Building an AI Research Assistant That Reads the Web

LLMs are limited by their training cut-off dates and by a tendency to hallucinate when they lack specific, real-time information. To build a functional AI research assistant for the web, you must give the m...

The challenge isn't just getting the data, but getting it clean. Raw HTML is noisy, filled with navigation menus, scripts, and ads that waste tokens and confuse the model. You need a pipeline that converts a user's question into a set of clean, markdown-formatted documents that an LLM can synthesize into an accurate answer.

The problem with traditional web browsing for AI

Most developers attempt to build research assistants by wrapping a headless browser like Playwright or Puppeteer. While these tools are powerful, they introduce significant infrastructure overhead.

A single Chrome instance consumes between 200MB and 500MB of RAM. If your research assistant needs to scrape five different sources to answer a single query, you are suddenly managing massive memory spikes and potential process crashes. Furthermore, the "cold start" time for these browsers—the time it takes to launch the process and navigate to a page—is often between 500ms and 2,000ms.
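
To make that overhead concrete, the following back-of-the-envelope sketch multiplies the per-instance figures quoted above by the number of sources fetched for one query. It assumes one Chrome instance per source with no memory shared between instances, which is a simplification, and the function name is purely illustrative.

```python
def chrome_overhead(num_sources, ram_mb=(200, 500), cold_start_ms=(500, 2000)):
    """Estimate RAM and cold-start ranges for scraping several sources.

    Assumes one headless Chrome instance per source (a simplification).
    """
    low_ram, high_ram = ram_mb
    low_start, high_start = cold_start_ms
    return {
        # Instances running side by side add their memory together.
        "peak_ram_mb": (num_sources * low_ram, num_sources * high_ram),
        # Launched in parallel, cold starts overlap rather than add up.
        "parallel_cold_start_ms": (low_start, high_start),
        # Launched one after another, cold starts accumulate.
        "sequential_cold_start_ms": (num_sources * low_start, num_sources * high_start),
    }


estimate = chrome_overhead(5)
print(estimate["peak_ram_mb"])  # (1000, 2500)
```

Under these assumptions, five sources cost roughly 1 GB to 2.5 GB of RAM if fetched in parallel, or 2.5 to 10 seconds of cold-start time if fetched one by one.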

Then there is the data cleaning problem. LLMs do not need <div> tags or CSS classes; they need the actual content. Manually writing selectors for every website your assistant might visit is impossible. You need a system that automatically strips the boilerplate and returns clean markdown.

The architecture of an AI research assistant

To solve these problems, we use a four-stage pipeline: query expansion and search, parallel scraping, context window management, and synthesis. Instead of managing browser infrastructure, we use Ilmenite, a web scraping API built in Rust that handles rendering and cleaning in a single call.

1. Query Expansion and Search

The process begins when a user asks a question. The assistant uses an LLM to turn that question into a search query. This query is sent to the /v1/search endpoint.

The search endpoint performs a web search and returns the top relevant URLs. By combining search and scraping into one API, you eliminate the need to manage separate search API keys and custom parsing logic for search engine result pages (SERPs).
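
The query-expansion step can be sketched as follows. This is a minimal illustration, not a prescribed implementation: `expand_query` is a hypothetical helper, and `llm` is assumed to be any callable that takes a prompt string and returns text, standing in for whichever model client you use.

```python
import re


def expand_query(question, llm, max_queries=3):
    """Turn a user question into a short list of search queries.

    `llm` is a caller-supplied callable (prompt -> text); it is an
    assumption of this sketch, not part of any specific SDK.
    """
    prompt = (
        f"Rewrite the question below as up to {max_queries} concise "
        "web search queries, one per line.\n\n"
        f"Question: {question}"
    )
    raw = llm(prompt)

    queries, seen = [], set()
    for line in raw.splitlines():
        # Drop list markers such as "1." or "-" that models often add.
        cleaned = re.sub(r"^\s*(?:[-*]|\d+[.)])\s+", "", line).strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            queries.append(cleaned)

    # Fall back to the original question if nothing usable came back.
    return queries[:max_queries] or [question]
```

Each returned string can then be sent as the search query to the /v1/search endpoint.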

2. Parallel Scraping

Once the assistant has a list of URLs, it must fetch the content. This is where performance becomes critical. Because Ilmenite is built in pure Rust, it has a fast path for static pages and a small memory footprint for static scrapes.

The assistant sends these URLs to the /v1/scrape endpoint. Ilmenite renders the JavaScript (handling React, Vue, or Next.js sites) and strips away the noise. The result is clean markdown that preserves the structure of the page without the HTML clutter.
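
One way to issue the scrape calls in parallel is a thread pool, sketched below with the standard library only. `scrape_all` and the `scrape_one` callback are hypothetical names: `scrape_one` is assumed to take a URL and return markdown, for example by POSTing to the /v1/scrape endpoint.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scrape_all(urls, scrape_one, max_workers=5):
    """Scrape URLs concurrently.

    Returns (results, errors): results maps url -> markdown for successful
    scrapes, errors maps url -> error message for failed ones.
    `scrape_one` is a caller-supplied function (an assumption of this sketch).
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scrape_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                content = future.result()
            except Exception as exc:  # one bad page should not sink the batch
                errors[url] = str(exc)
                continue
            if content:  # ignore empty responses
                results[url] = content
    return results, errors
```

Threads suit this workload because each call spends most of its time waiting on the network. Total latency then approaches that of the slowest page rather than the sum of all pages.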

3. Context Window Management

The assistant receives several markdown documents. Since even large context windows have limits, the assistant may need to chunk the text or use a RAG (Retrieval-Augmented Generation) approach to find the most relevant snippets across the scraped pages.
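
As a rough illustration of the chunk-and-select idea, the sketch below splits each document on paragraph boundaries and ranks chunks by word overlap with the question. Word overlap is a deliberately simple stand-in for embedding-based retrieval, and both function names are hypothetical.

```python
import re


def chunk_markdown(text, max_chars=1200):
    """Group whole paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept as its own chunk.
    """
    chunks, current = [], ""
    for para in re.split(r"\n\s*\n", text):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def top_chunks(question, documents, k=3, max_chars=1200):
    """Return up to k (url, chunk) pairs from {url: markdown}, best match first."""
    q_words = set(re.findall(r"\w+", question.lower()))
    scored = []
    for url, text in documents.items():
        for chunk in chunk_markdown(text, max_chars):
            overlap = len(q_words & set(re.findall(r"\w+", chunk.lower())))
            if overlap:  # drop chunks that share no words with the question
                scored.append((overlap, url, chunk))
    # Stable sort: chunks with equal scores keep their document order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(url, chunk) for _, url, chunk in scored[:k]]
```

Keeping the source URL attached to every chunk matters later, because the synthesis step needs it to cite where each snippet came from.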

4. Synthesis

The final stage is the synthesis. The LLM takes the clean markdown, the original user question, and a system prompt instructing it to cite its sources. It then generates a comprehensive answer based solely on the retrieved web data.
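
A minimal sketch of how the synthesis input might be assembled is shown below. The helper name and the numbered-citation convention are assumptions. The message format is the common system/user chat structure, which most chat-completion clients accept.

```python
def build_synthesis_messages(question, sources):
    """Build chat messages asking the model to answer only from `sources`.

    `sources` is a list of (url, markdown) pairs. Each pair gets a numbered
    label so the model can cite it as [1], [2], and so on.
    """
    system = (
        "You are a research assistant. Answer using only the numbered sources. "
        "Cite sources inline as [n]. If the sources do not contain the answer, "
        "say you don't know."
    )
    blocks = [
        f"[{i}] {url}\n{markdown.strip()}"
        for i, (url, markdown) in enumerate(sources, start=1)
    ]
    user = "Sources:\n\n" + "\n\n---\n\n".join(blocks) + f"\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

Putting the citation rules in the system message and the retrieved text in the user message keeps the instructions separate from untrusted web content.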

Implementation with Python and Ilmenite

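Below is a minimal end-to-end implementation of this architecture in Python. It uses the OpenAI client for the synthesis stage, so it assumes you have an OpenAI API key as well as an Ilmenite API key for the web data. To keep the example short, it scrapes the results one at a time and passes the full markdown to the model without chunking.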

import requests
import openai

# Configuration
ILMENITE_API_KEY = "your_ilmenite_key"
OPENAI_API_KEY = "your_openai_key"
ILMENITE_BASE_URL = "https://api.ilmenite.dev/v1"

client = openai.OpenAI(api_key=OPENAI_API_KEY)

def research_web(query):
    print(f"Searching for: {query}...")
    
    # Step 1: Search for top results
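    search_response = requests.post(
        f"{ILMENITE_BASE_URL}/search",
        headers={"Authorization": f"Bearer {ILMENITE_API_KEY}"},
        json={"q": query, "num_results": 3},
        timeout=30,  # avoid hanging indefinitely on a stalled connection
    )
    search_response.raise_for_status()  # fail early if the search request was rejected
    search_results = search_response.json().get("results", [])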
    
    # Step 2: Scrape top results into markdown
    context_data = []
    for result in search_results:
        url = result['url']
        print(f"Scraping {url}...")
        
        scrape_response = requests.post(
            f"{ILMENITE_BASE_URL}/scrape",
            headers={"Authorization": f"Bearer {ILMENITE_API_KEY}"},
            json={"url": url, "format": "markdown"},
            timeout=60,  # avoid hanging indefinitely on a slow page
        )

        # Skip failed scrapes and empty pages instead of passing "None" to the LLM
        if scrape_response.status_code == 200:
            markdown_content = scrape_response.json().get("content")
            if markdown_content:
                context_data.append(f"Source: {url}\nContent:\n{markdown_content}")

    # Combine all scraped data into one context block
    full_context = "\n\n---\n\n".join(context_data)
    
    # Step 3: Synthesize answer with LLM
    print("Synthesizing answer...")
    prompt = f"""
    You are a professional research assistant. Use the following web context to answer the user's question.
    If the answer is not in the context, say you don't know. Always cite the source URL.

    Context:
    {full_context}

    Question: {query}
    """
    
    completion = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return completion.choices[0].message.content

# Execution
user_query = "What are the current benchmarks for Rust-based headless browsers compared to Chrome?"
answer = research_web(user_query)
print("\nFinal Answer:\n", answer)

Results and performance

When building an AI research assistant, the perceived speed of the agent depends largely on the latency of data retrieval.

Using traditional headless Chrome infrastructure, the time spent waiting for browser boot-up and page rendering often exceeds the time the LLM spends generating the answer. In contrast, Ilmenite's per-request latency stays low on the fast path.

Resource Efficiency

If you are deploying this assistant as a microservice, the infrastructure requirements are drastically lower. A standard Chrome-based scraper requires significant memory to avoid OOM (Out of Memory) errors. Because Ilmenite is a single binary with a hosted API (no Docker to deploy), you can run thousands of concurrent sessions on a small server.

Token Optimization

By converting HTML to markdown, you reduce the token count by 60-80% per page. This not only lowers your LLM API costs but also allows you to fit more sources into the context window, increasing the accuracy and depth of the research assistant's answers.
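
To check the savings on your own pages, a rough estimate can be made with the standard library, as sketched below. This is not how Ilmenite cleans pages: it only strips tags, scripts, and styles, and it approximates token counts with a four-characters-per-token heuristic, so treat the output as a ballpark figure. All names are illustrative.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def estimate_token_savings(html, chars_per_token=4):
    """Return (html_tokens, text_tokens, fraction_saved), all approximate."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts)
    html_tokens = max(1, len(html) // chars_per_token)
    text_tokens = len(text) // chars_per_token
    return html_tokens, text_tokens, 1 - text_tokens / html_tokens
```

For accurate numbers, replace the character heuristic with the tokenizer of the model you actually call.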

| Metric | Traditional Headless Chrome | Ilmenite API |
| --- | --- | --- |
| Output Format | Raw HTML (noisy) | Clean Markdown |
| Deployment | Large Docker images (>1GB) | Hosted API (no Docker to deploy) |

Going further with your AI assistant

Once you have a basic research loop working, you can enhance the assistant with more advanced features available in the documentation.

Structured Data Extraction

If your research assistant needs to compare specific data points—such as pricing tables or technical specifications—don't rely on the LLM to find them in a wall of text. Use the /v1/extract endpoint. You can provide a JSON schema, and Ilmenite will return structured data directly, which can be fed into a database or a comparison table.
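
The sketch below shows what a schema-driven request body could look like. The schema itself is standard JSON Schema, but the body keys (`url`, `schema`) and the helper name are assumptions made for illustration, so check the Ilmenite documentation for the actual request format before relying on them.

```python
import json


def build_extract_request(url, fields):
    """Build a request body for a schema-driven extraction call.

    `fields` maps field names to JSON Schema types, e.g. {"price": "number"}.
    The body keys ("url", "schema") are assumed, not taken from the API docs.
    """
    schema = {
        "type": "object",
        "properties": {name: {"type": json_type} for name, json_type in fields.items()},
        "required": list(fields),
    }
    return {"url": url, "schema": schema}


body = build_extract_request(
    "https://example.com/pricing",
    {"plan_name": "string", "monthly_price_usd": "number"},
)
print(json.dumps(body, indent=2))
```

Declaring the fields and their types up front means the response can be validated and loaded into a database or comparison table without further parsing by the LLM.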

Integration with Claude via MCP

For those using Claude, you don't need to build a custom wrapper. Ilmenite provides an MCP server integration, allowing Claude to use the headless browser as a native tool. This removes the need to write the orchestration code entirely.

Privacy and Enterprise

For research assistants that handle sensitive queries, Ilmenite offers dedicated enterprise infrastructure with custom data-residency, SLA, and compliance guarantees. Contact hello@ilmenite.dev to discuss requirements.

To start building your own agent, you can sign up for a free account or test the scraping logic in the playground. For a full breakdown of costs per request, visit our pricing page.