Engineering · March 20, 2026 · 7 min · Ilmenite Team

Why We Wrote Our Headless Browser in Rust


AI agents require a constant stream of live web data to function. Whether they are performing research, updating a RAG pipeline, or executing autonomous tasks, the bottleneck is almost always the browser. Traditional headless browsers are designed for human interaction, not for the high-frequency, low-latency requirements of AI infrastructure. To solve this, we built a Rust headless browser designed specifically for machine consumption.

What Is a Rust Headless Browser?

A headless browser is built around a browser engine: the core software component that transforms raw HTML, CSS, and JavaScript into a structured representation of a web page, typically a Document Object Model (DOM). Most modern browser engines, like Chrome's Blink and Safari's WebKit, are written primarily in C++. While C++ provides the necessary performance, it leaves the developer responsible for manual memory management, which often leads to memory leaks, segmentation faults, and security vulnerabilities.

A Rust headless browser uses the Rust programming language to achieve the same hardware-level performance as C++ but with a fundamental guarantee: memory safety. Rust uses a system of ownership and borrowing to manage memory at compile time. This means the compiler ensures that memory is allocated and freed correctly without needing a garbage collector (GC).
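To make the ownership claim concrete, here is a minimal sketch using only the standard library. `PageBuffer` and `run_scopes` are hypothetical names invented for illustration and are not taken from our engine. The sketch records when each value is freed, showing that cleanup happens at the end of the owning scope rather than whenever a collector decides to run.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a buffer holding a fetched page.
struct PageBuffer {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for PageBuffer {
    // Runs at the point where the owner goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("freed {}", self.name));
    }
}

/// Creates two buffers in nested scopes and returns the order of events.
fn run_scopes() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = PageBuffer { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = PageBuffer { name: "inner", log: Rc::clone(&log) };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_inner` is dropped here
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_outer` is dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run_scopes() {
        println!("{event}");
    }
}
```

Each "freed" entry appears immediately after its scope's closing message, which is the deterministic cleanup the paragraph above describes.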

In the context of a web scraping API, the engine's job is not to render pixels on a screen for a human to see. Instead, it is to parse the network response, execute necessary JavaScript to handle dynamic content (like React or Vue), and extract clean text or markdown. By building this in Rust, we can strip away the overhead of a graphical user interface and focus entirely on data extraction efficiency.

Why a Rust Headless Browser Matters for AI Infrastructure

When building AI agents, the latency of the "browse" step directly impacts the user experience. If an agent takes 5 seconds to start a browser and another 3 seconds to render a page, the total loop time becomes unacceptable.

The Problem with Garbage Collection (GC)

Most high-level languages used for web scraping, such as JavaScript (Node.js) or Python, rely on garbage collection. A GC periodically pauses the execution of the program to identify and reclaim unused memory. These "stop-the-world" pauses create unpredictable latency spikes.

For a developer monitoring p95 latency, GC pauses are a nightmare. You might have an average response time of 200ms, but your p95 could be 2,000ms because a GC cycle triggered at the wrong moment. Because Rust has no garbage collector, it provides deterministic performance. There are no random pauses, allowing us to keep per-request latency low on the fast path.
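The gap between average and tail latency is easy to reproduce with a few lines of arithmetic. The sketch below uses an invented latency profile (94 requests at 100ms and 6 at 2,000ms); these are illustrative numbers, not measurements from any real system. It assumes a non-empty sample list and uses the nearest-rank definition of a percentile.

```rust
/// Arithmetic mean of the samples, in milliseconds.
fn mean_ms(samples: &[u64]) -> f64 {
    samples.iter().sum::<u64>() as f64 / samples.len() as f64
}

/// Nearest-rank percentile: the value at rank ceil(pct / 100 * n) in sorted order.
/// Assumes `samples` is non-empty.
fn percentile_ms(samples: &[u64], pct: u32) -> u64 {
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // Integer form of ceil(pct * n / 100).
    let rank = (pct as usize * sorted.len() + 99) / 100;
    sorted[rank.max(1) - 1]
}

/// Hypothetical profile: 94 fast requests and 6 that hit a long pause.
fn sample_latencies() -> Vec<u64> {
    let mut v = vec![100; 94];
    v.extend(vec![2_000; 6]);
    v
}

fn main() {
    let samples = sample_latencies();
    println!("mean = {} ms", mean_ms(&samples));
    println!("p50  = {} ms", percentile_ms(&samples, 50));
    println!("p95  = {} ms", percentile_ms(&samples, 95));
}
```

With this profile the mean is 214ms and the median is 100ms, yet the p95 is 2,000ms: only 6% of requests are slow, but that is enough to define the tail.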

The RAM Tax

Headless Chrome is resource-intensive. A single Chrome instance can easily consume 200MB to 500MB of RAM. When scaling to thousands of concurrent AI agent sessions, the infrastructure cost becomes prohibitive.

Using a Rust headless browser allows us to minimize the memory footprint. By avoiding the overhead of a full browser suite and using Rust's efficient memory layout, we can run each session in far less memory than Chrome-based alternatives.

Cold Start Latency

In a serverless or highly scaled environment, "cold starts"—the time it takes to initialize a new process—are a critical metric. Starting a headless Chrome instance involves launching a heavy binary, initializing a profile, and setting up a communication bridge (like CDP). This typically takes between 500ms and 2,000ms.

A specialized Rust binary can start in a fraction of that time. Our engine uses a pure-Rust fast path for static pages. For an AI agent making hundreds of requests, the difference between 500ms and the fast path is the difference between a tool that feels instantaneous and one that feels sluggish.

How the Engine Works Technically

The efficiency of our architecture comes from three core Rust principles: memory safety, zero-cost abstractions, and fearless concurrency.

Memory Safety and Ownership

In C++, a "use-after-free" bug can crash a scraper or create a security hole. Rust prevents this via the borrow checker. The compiler tracks every piece of data, ensures that only one part of the code "owns" it at a time, and rejects any reference that could outlive the data it points to.

When our engine parses a 12KB HTML page, it does so in 134 microseconds. This speed is possible because Rust allows us to use "slices"—references to a part of the original network buffer—rather than copying strings into new memory locations. This minimizes allocations and keeps the CPU cache efficient.
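As a rough illustration of the slicing idea, the sketch below pulls a page title out of a buffer without copying it. It is a naive string search, not a real HTML parser and not our engine's code; `title_slice` and `is_borrowed_from` are hypothetical helpers written for this example.

```rust
/// Returns the text between `<title>` and `</title>` as a slice borrowed from `html`.
/// No new String is allocated; the result points into the original buffer.
fn title_slice(html: &str) -> Option<&str> {
    let open = "<title>";
    let start = html.find(open)? + open.len();
    let len = html[start..].find("</title>")?;
    Some(&html[start..start + len])
}

/// True if `part` lies entirely inside the memory occupied by `whole`.
fn is_borrowed_from(whole: &str, part: &str) -> bool {
    let whole_start = whole.as_ptr() as usize;
    let part_start = part.as_ptr() as usize;
    part_start >= whole_start && part_start + part.len() <= whole_start + whole.len()
}

fn main() {
    // Stand-in for a network response buffer.
    let buffer = String::from("<html><head><title>Pricing</title></head><body></body></html>");
    if let Some(title) = title_slice(&buffer) {
        println!("title = {title:?}, borrowed = {}", is_borrowed_from(&buffer, title));
    }
}
```

Because the returned `&str` borrows from `buffer`, the compiler will refuse to let the buffer be freed or mutated while the slice is still in use, which is what makes this kind of zero-copy access safe.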

Zero-Cost Abstractions

Rust allows us to write high-level, readable code without sacrificing low-level performance. This is known as zero-cost abstractions. For example, we can use complex iterators and closures to filter navigation menus and ads from a page, and the Rust compiler will optimize that code down to the same assembly as a manual for loop in C.

This allows us to build complex logic for converting HTML to markdown (which takes only 3.84μs) without adding runtime overhead.
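The following sketch shows the style of code this refers to. The `Block` enum is a deliberately simplified, hypothetical page model (a real DOM is a tree, and our actual converter is not shown here). It filters out navigation and ad blocks with an iterator chain, renders the rest as markdown, and compares the result against a hand-written loop.

```rust
/// Hypothetical, simplified page model; a real DOM is a tree, not a flat list.
#[derive(Debug)]
enum Block {
    Heading(String),
    Paragraph(String),
    Nav,
    Ad,
}

/// Iterator version: drop nav and ad blocks, render the rest as markdown.
fn to_markdown(blocks: &[Block]) -> String {
    blocks
        .iter()
        .filter_map(|b| match b {
            Block::Heading(t) => Some(format!("# {t}")),
            Block::Paragraph(t) => Some(t.clone()),
            Block::Nav | Block::Ad => None,
        })
        .collect::<Vec<_>>()
        .join("\n\n")
}

/// Hand-written loop producing the same output, for comparison.
fn to_markdown_loop(blocks: &[Block]) -> String {
    let mut out = String::new();
    let mut first = true;
    for b in blocks {
        let line = match b {
            Block::Heading(t) => format!("# {t}"),
            Block::Paragraph(t) => t.clone(),
            Block::Nav | Block::Ad => continue,
        };
        if !first {
            out.push_str("\n\n");
        }
        out.push_str(&line);
        first = false;
    }
    out
}

fn main() {
    let page = vec![
        Block::Nav,
        Block::Heading("Pricing".into()),
        Block::Ad,
        Block::Paragraph("Pay per scrape.".into()),
    ];
    assert_eq!(to_markdown(&page), to_markdown_loop(&page));
    println!("{}", to_markdown(&page));
}
```

The sketch only demonstrates that both versions produce the same result. Whether they compile to the same machine code depends on the optimizer, and this particular iterator version allocates an intermediate `Vec` that the loop avoids.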

Fearless Concurrency with Tokio

Web scraping is an I/O-bound task. The CPU spends most of its time waiting for the network. To handle this, we use the Tokio runtime, an asynchronous event loop for Rust.

Rust's type system ensures that data shared between threads is thread-safe. This "fearless concurrency" allows our engine to handle thousands of concurrent requests on a single small server without the risk of data races. While a Node.js application runs its JavaScript on a single-threaded event loop by default, our Rust engine can spread work across every core of the processor.
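Here is a small sketch of what compiler-checked sharing looks like. To stay dependency-free it uses scoped OS threads from the standard library rather than Tokio tasks, so it illustrates the type-system guarantee and not our async setup. `process` and `crawl` are hypothetical names, and `process` does no real network I/O.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for "fetch and parse one page".
/// Returns the URL length as a fake "bytes processed" value.
fn process(url: &str) -> usize {
    url.len()
}

/// Splits `urls` across up to `workers` threads and returns (pages, total bytes).
fn crawl(urls: &[&str], workers: usize) -> (usize, usize) {
    let pages = AtomicUsize::new(0);
    let total = Mutex::new(0usize);
    let w = workers.max(1);
    let chunk = ((urls.len() + w - 1) / w).max(1);

    // Shared state is passed to the threads as plain references.
    let (pages_ref, total_ref) = (&pages, &total);
    thread::scope(|s| {
        for batch in urls.chunks(chunk) {
            s.spawn(move || {
                for url in batch {
                    let n = process(url);
                    pages_ref.fetch_add(1, Ordering::Relaxed);
                    *total_ref.lock().unwrap() += n;
                }
            });
        }
    }); // every spawned thread has been joined at this point

    (pages.load(Ordering::Relaxed), total.into_inner().unwrap())
}

fn main() {
    let urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"];
    let (pages, bytes) = crawl(&urls, 2);
    println!("processed {pages} pages, {bytes} bytes");
}
```

If `pages` or `total` were plain `usize` variables mutated from several threads, the program would fail to compile. The shared counters have to be wrapped in a type such as `AtomicUsize` or `Mutex` that is safe to share, so the data race is ruled out before the code ever runs.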

Single-Binary Deployment

One of the most practical advantages of Rust is that it compiles to a static binary. There is no need to install a runtime (like Node.js or Python) or a massive browser installation (like Chromium) on the host machine.

For users, the result is a hosted API with no Docker image to deploy. Compared to the 500MB to 2GB images required for Chrome-based scrapers, a single binary makes deployment and scaling nearly instantaneous.

The Rust Headless Browser in Practice

We applied these principles to create Ilmenite, a web scraping API designed for AI agents. By routing static pages through a pure-Rust fast path and only launching Chromium when a page actually needs it, we shifted the performance ceiling of what a scraping API can do.

Real-World Benchmarks

The architectural choice to use Rust manifests in the actual numbers: parsing a 12KB HTML page takes 134 microseconds, and converting HTML to markdown takes 3.84μs.

Handling JavaScript with Boa

A major challenge in building a Rust headless browser is JavaScript execution. Most scrapers simply wrap Chrome's V8 engine. We use Boa, a JavaScript engine written entirely in Rust.

It is important to be honest about the trade-offs here: Boa is not as fast as V8 for extremely complex, computation-heavy Single Page Applications (SPAs). V8 has had decades of JIT (Just-In-Time) compilation optimization. For the majority of web pages, Boa is more than sufficient and maintains our low memory footprint. For the small percentage of sites that require V8-level performance, we implement a fallback to Chrome. This hybrid approach gives us the efficiency of Rust for 95% of requests while maintaining 100% compatibility.
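The routing decision behind a hybrid setup can be sketched as follows. The heuristic here is invented for illustration: it sends a page to the fallback only when the HTML contains a script tag and no visible text. It is an assumption, not the rule our engine uses, and `visible_text` is a naive tag stripper rather than a real parser.

```rust
/// Which engine should handle the page.
#[derive(Debug, PartialEq)]
enum Engine {
    FastPath,
    ChromeFallback,
}

/// Naive tag stripper: keeps only characters outside `<...>`.
/// Inline script bodies would be counted as text; acceptable for a sketch.
fn visible_text(html: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            '>' => in_tag = false,
            _ if !in_tag => out.push(c),
            _ => {}
        }
    }
    out.trim().to_string()
}

/// Hypothetical heuristic: a page with scripts but no visible text probably
/// renders its content client-side, so it goes to the full browser.
fn route(html: &str) -> Engine {
    let has_script = html.contains("<script");
    if has_script && visible_text(html).is_empty() {
        Engine::ChromeFallback
    } else {
        Engine::FastPath
    }
}

fn main() {
    let static_page = "<html><body><p>Hello</p></body></html>";
    let spa_shell =
        "<html><body><div id=\"root\"></div><script src=\"app.js\"></script></body></html>";
    println!("static page -> {:?}", route(static_page));
    println!("SPA shell   -> {:?}", route(spa_shell));
}
```

A production router would need more signals than this, but the shape is the same: try the cheap check first and pay for the heavy browser only when the page demands it.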

Integration with AI Pipelines

Because the engine is so lightweight, we can offer a pricing model based on a prepaid USD balance where users pay per scrape rather than per browser-hour. This is only possible because our infrastructure costs are orders of magnitude lower than those of companies running cloud Chrome clusters.

For developers building RAG pipelines, this means they can index thousands of pages into a vector database without worrying about the memory leaks or crashes associated with Puppeteer or Playwright.

The Trade-offs of Using Rust

Building a headless browser in Rust is not without its challenges. It is a harder path than wrapping an existing browser.

The Learning Curve

The Rust borrow checker is notorious for its steep learning curve. Developers cannot simply "write code and fix it at runtime." They must prove to the compiler that their memory management is safe. This increases initial development time. We spent significantly more time in the design phase than we would have in Node.js, as we had to strictly define data ownership for every part of the DOM tree.

Compile Times

Rust's compiler does a massive amount of work to ensure safety and optimization. As a result, compile times are significantly longer than in Go or TypeScript. Large changes to the engine can take several minutes to compile. While this doesn't affect the end-user of the API, it slows down the inner development loop for the engineering team.

Ecosystem Maturity

While the Rust ecosystem is growing rapidly, it is not as vast as the JavaScript ecosystem. We had to rely on libraries like html5ever (the parser used by Servo) and reqwest for HTTP handling. In some cases, we had to implement custom logic that would have been a simple npm package install in a Node.js environment.

Tools and Resources for Rust Web Infrastructure

For developers interested in building their own high-performance web tools in Rust, we recommend the following crates and resources:

  1. Tokio: The de facto standard asynchronous runtime for Rust. Essential for any I/O-bound application.
  2. html5ever: A highly compliant HTML5 parser. It is the foundation for many Rust-based browser projects.
  3. Boa: A Rust-native JavaScript engine. Great for lightweight JS execution without the Chrome overhead.
  4. Axum: A web framework built by the Tokio team that is excellent for building high-performance APIs.
  5. Reqwest: A widely used, robust HTTP client for Rust, supporting connection pooling and async requests.

Building a headless browser from scratch is a significant undertaking, but for infrastructure that supports AI agents, the investment in Rust pays off in every single request. By eliminating the garbage collector and keeping Chrome off the fast path, we can provide the sub-millisecond startup and minimal memory footprint that the next generation of AI agents requires.

If you want to see this architecture in action, you can try the Ilmenite playground to convert any URL to clean markdown in milliseconds.