Use Case · April 9, 2026 · 5 min · Ilmenite Team

Price Monitoring at Scale — Architecture Guide

Monitoring competitor prices at scale requires a reliable price monitoring API that can handle JavaScript rendering without the overhead of managing a browser cluster. Most e-commerce sites today use React, Next.js, or Vue, meaning a plain HTTP request often returns an empty application shell instead of the actual price.

To build a production-grade monitor, you need to solve three problems: discovering product pages, extracting structured data, and running infrastructure that doesn't buckle under the weight of headless Chrome.

The problem with traditional price monitoring

Most developers start by using Puppeteer or Playwright. While these tools are powerful, they create a significant infrastructure burden. A single Chrome instance consumes 200-500MB of RAM. If you are monitoring 10,000 products daily, the memory leaks and process management overhead become a full-time job.

Beyond infrastructure, there is the "messy HTML" problem. Prices are often buried in deeply nested div tags or rendered dynamically via API calls after the page loads. Writing custom CSS selectors for every competitor is fragile; as soon as a site updates its UI, your scrapers break.

Finally, anti-bot protections like Cloudflare often block basic headless browsers. You need a tool that mimics a real browser environment but operates with the efficiency of a backend service.

The architecture of a price monitoring API system

A scalable price monitoring system consists of four primary stages: Discovery, Extraction, Storage, and Diffing.

1. Discovery (Mapping)

You cannot monitor what you cannot find. Instead of hardcoding URLs, use the /v1/map endpoint to discover all product URLs on a competitor's domain. This allows your system to automatically detect new product launches or category changes.
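A map run usually returns every URL on the domain, so the next step is filtering down to product pages. A minimal sketch, assuming /v1/map returned a flat list of URLs and that the competitor uses a "/p/" path prefix for products (the prefix is illustrative, not part of the API):

```python
# Sketch: filtering mapped URLs down to product pages.
# Assumes /v1/map returned a flat list of URLs and that the
# competitor marks products with a "/p/" path segment (illustrative).

def filter_product_urls(mapped_urls, prefix="/p/"):
    """Keep only URLs whose path marks them as product pages."""
    return [u for u in mapped_urls if prefix in u]

discovered = [
    "https://competitor.com/p/gaming-laptop-x1",
    "https://competitor.com/about",
    "https://competitor.com/p/wireless-mouse-z2",
]

print(filter_product_urls(discovered))
```

In practice you would persist the filtered list and re-run the map on a schedule, so newly launched products enter the pipeline automatically.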

2. Structured Extraction

Once you have the URLs, you need the data. Rather than parsing raw HTML, use the /v1/extract endpoint. This endpoint allows you to pass a JSON schema. Ilmenite renders the JavaScript, strips the boilerplate, and returns exactly the fields you requested.
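Even with structured extraction, it is worth validating each record before it reaches the database. A minimal sketch of that check; the "string"/"number"/"boolean" names mirror the schema passed to /v1/extract, but the Python type mapping below is our own assumption, not part of the Ilmenite SDK:

```python
# Sketch: validating an extracted record against the JSON schema
# before persisting it. The TYPE_MAP below is our own mapping for
# illustration, not something the Ilmenite SDK provides.

TYPE_MAP = {"string": str, "number": (int, float), "boolean": bool}

price_schema = {
    "product_name": "string",
    "price": "number",
    "currency": "string",
    "in_stock": "boolean",
}

def validate(record, schema):
    """Return True if every schema field is present with the right type."""
    return all(
        field in record and isinstance(record[field], TYPE_MAP[kind])
        for field, kind in schema.items()
    )

good = {"product_name": "Laptop X1", "price": 999.0,
        "currency": "USD", "in_stock": True}
bad = {"product_name": "Laptop X1", "price": "999"}  # wrong type, missing fields
print(validate(good, price_schema), validate(bad, price_schema))
```

Rejecting malformed records here keeps a single broken page from corrupting your price history.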

3. Storage and State

Store the extracted price, currency, and timestamp in a database (such as PostgreSQL or MongoDB). You must maintain a state for each product to compare the current scrape against the previous one.
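The state table and its upsert can be sketched with Python's stdlib SQLite for illustration (the implementation below uses PostgreSQL, but the INSERT ... ON CONFLICT DO UPDATE shape is the same in both):

```python
import sqlite3

# Sketch of the state table using stdlib SQLite. The production example
# in this article uses PostgreSQL; the upsert syntax carries over.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE product_prices (
        id TEXT PRIMARY KEY,
        price REAL NOT NULL,
        currency TEXT NOT NULL,
        checked_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

def upsert_price(product_id, price, currency):
    """Insert a fresh price, or overwrite the stored one for this product."""
    db.execute(
        "INSERT INTO product_prices (id, price, currency) VALUES (?, ?, ?) "
        "ON CONFLICT (id) DO UPDATE SET price = excluded.price",
        (product_id, price, currency),
    )
    db.commit()

upsert_price("gaming-laptop-x1", 999.00, "USD")
upsert_price("gaming-laptop-x1", 949.00, "USD")  # second run overwrites
print(db.execute("SELECT price FROM product_prices WHERE id = ?",
                 ("gaming-laptop-x1",)).fetchone()[0])
```

Keying on a stable product identifier (rather than the URL, which can change) makes the state survive site redesigns.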

4. Diffing and Alerting

A background worker compares the new price with the stored price. If the difference exceeds a certain percentage, the system triggers an alert via a webhook, Slack, or email.
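The diff step reduces to a pure function, which keeps it easy to test. A minimal sketch; the 5% threshold is an illustrative default, not a recommendation from any documentation:

```python
# Sketch: the diff step as a pure function. The 5% default threshold
# is illustrative; tune it per product category.

def should_alert(old_price, new_price, threshold_pct=5.0):
    """True when the price moved more than threshold_pct in either direction."""
    if old_price == 0:
        return False  # avoid division by zero on free/unpriced items
    change_pct = abs(new_price - old_price) / old_price * 100
    return change_pct > threshold_pct

print(should_alert(100.0, 104.0))  # 4% move: below threshold
print(should_alert(100.0, 89.0))   # 11% drop: alert
```

Using a percentage threshold rather than raw equality avoids alert storms from rounding noise or currency-conversion jitter.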

System Flow: Competitor Domain → /v1/map → Product URL List → /v1/extract → Database → Diff Engine → Alert

Implementation with Python

Below is a professional implementation using the Ilmenite Python SDK. This example demonstrates how to extract prices from a list of URLs and trigger an alert on change.

Prerequisites

Install the SDK and a database client:

pip install ilmenite-sdk psycopg2-binary

Complete Implementation

import os
from ilmenite import Ilmenite
import psycopg2

# Initialize Ilmenite client
client = Ilmenite(api_key=os.environ.get("ILMENITE_API_KEY"))

# Database connection for state management
db = psycopg2.connect(os.environ["DATABASE_URL"])  # avoid hardcoding credentials
cur = db.cursor()

# Define the schema for the price monitoring api extraction
# This ensures we get structured JSON back, not messy HTML
price_schema = {
    "product_name": "string",
    "price": "number",
    "currency": "string",
    "in_stock": "boolean"
}

def monitor_prices(urls):
    for url in urls:
        try:
            # Use /v1/extract to get structured data
            # This handles JS rendering automatically
            result = client.extract(
                url=url,
                schema=price_schema
            )
            
            data = result.data
            product_id = data['product_name']
            current_price = data['price']
            
            # Fetch previous price from DB
            cur.execute("SELECT price FROM product_prices WHERE id = %s", (product_id,))
            row = cur.fetchone()
            
            if row:
                # psycopg2 returns Decimal for NUMERIC columns; cast to float
                # so the percentage math in trigger_alert does not mix types
                old_price = float(row[0])
                if current_price != old_price:
                    trigger_alert(product_id, old_price, current_price)
            
            # Update DB with latest price
            cur.execute(
                "INSERT INTO product_prices (id, price) VALUES (%s, %s) "
                "ON CONFLICT (id) DO UPDATE SET price = EXCLUDED.price",
                (product_id, current_price)
            )
            db.commit()
            
        except Exception as e:
            print(f"Error scraping {url}: {e}")

def trigger_alert(name, old, new):
    diff = ((new - old) / old) * 100
    print(f"ALERT: {name} price changed from {old} to {new} ({diff:.2f}%)")

# Example usage
product_urls = [
    "https://competitor.com/p/gaming-laptop-x1",
    "https://competitor.com/p/wireless-mouse-z2"
]

monitor_prices(product_urls)

Results and performance

When running this architecture, the primary bottleneck is usually browser startup time and memory consumption. Because Ilmenite is built in pure Rust, it eliminates the "Chrome Tax."

Infrastructure Comparison

| Metric | Traditional Chrome Cluster | Ilmenite API |
| --- | --- | --- |
| Cold Start Time | 500ms - 2,000ms | 0.19ms |
| RAM per Session | 200MB - 500MB | ~2MB |
| Deployment | Heavy Docker / Kubernetes | Single Binary / 12MB Image |
| Latency (p95) | 200ms - 2,000ms | 47ms |

In a traditional setup, a $5/month VPS might struggle to run 10 concurrent Chrome sessions without swapping to disk. With Ilmenite, that same server can handle 1,000 concurrent sessions. This represents a 100x reduction in memory requirements.

From a cost perspective, you pay per operation via credits rather than paying for "browser-hours." A standard scrape costs 1 credit, while an LLM-powered extraction costs 5 credits. This makes the cost per product monitored predictable and significantly lower than maintaining a dedicated headless browser fleet.
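Using the per-operation prices quoted above (1 credit per standard scrape, 5 per LLM-powered extraction), a back-of-envelope credit budget is easy to sketch; the product counts below are illustrative:

```python
# Back-of-envelope monthly credit budget, using the per-operation
# prices quoted above: 1 credit per standard scrape, 5 per
# LLM-powered extraction. Workload sizes are illustrative.

STANDARD_SCRAPE = 1   # credits
LLM_EXTRACTION = 5    # credits

def monthly_credits(products, checks_per_day, llm_share=0.0):
    """Total credits for a 30-day month of monitoring.

    llm_share: fraction of checks routed through LLM extraction.
    """
    checks = products * checks_per_day * 30
    llm_checks = int(checks * llm_share)
    standard_checks = checks - llm_checks
    return standard_checks * STANDARD_SCRAPE + llm_checks * LLM_EXTRACTION

# 10,000 products, checked once a day, all standard scrapes:
print(monthly_credits(10_000, 1))       # 300000 credits
# Same workload with 10% of pages needing LLM extraction:
print(monthly_credits(10_000, 1, 0.1))  # 420000 credits
```

Because the cost is linear in the number of checks, you can trade check frequency against coverage without surprises.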

Going further with your price monitoring API

Once the basic pipeline is operational, you can optimize for scale and accuracy.

Handling Complex SPAs

While Ilmenite's native Rust-based JavaScript engine (Boa) handles most sites, some extremely complex Single Page Applications (SPAs) require full V8 compatibility. In these cases, you can enable Chrome rendering in your request. This increases the credit cost to 3 credits but ensures 100% compatibility with the most complex web apps.
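A per-page engine switch might look like the sketch below. Note that the `render="chrome"` parameter name is an assumption for illustration; check the Ilmenite API reference for the exact flag:

```python
# Sketch: opting individual requests into full Chrome rendering.
# The `render="chrome"` key is assumed for illustration; consult
# the Ilmenite API reference for the actual parameter name.

def build_extract_request(url, schema, needs_v8=False):
    """Assemble a /v1/extract payload, escalating engines per page."""
    payload = {"url": url, "schema": schema}
    if needs_v8:
        payload["render"] = "chrome"  # 3 credits instead of 1
    return payload

req = build_extract_request(
    "https://competitor.com/p/gaming-laptop-x1",
    {"price": "number"},
    needs_v8=True,
)
print(req["render"])
```

Keeping the escalation per-page means you only pay the higher credit cost on the handful of SPAs that actually need V8.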

Enterprise Scaling and Self-Hosting

For teams with strict data residency requirements or massive scale (millions of pages per day), the Enterprise tier allows you to self-host Ilmenite as a single binary or Docker container. Because the image is only 12MB, it can be deployed into sidecar containers or edge functions with almost zero overhead.

AI-Driven Analysis

Instead of simple diffing, you can pipe the markdown output from Ilmenite into an LLM to analyze why a price changed. By combining the search endpoint with the extract endpoint, your agent can determine if a competitor's price drop is part of a wider seasonal sale or a targeted promotion.

If you are building an autonomous agent to handle this entire workflow, you can use the MCP integration to give Claude native access to these browsing capabilities.

Ready to build your monitor? Sign up for a free account and start with 500 free credits.