Aether


The Problem with Price Comparison for AI

Price comparison is one of the most obvious use cases for AI agents. A user asks for the cheapest place to buy a specific product, and the agent goes and finds out. It should be straightforward. In practice, it is surprisingly difficult - and the tools that exist for human comparison shopping do not translate well to agent use.

How price comparison works for humans

The traditional price comparison model is built around websites. A user visits a comparison site, types in a product, and gets a list of prices from different retailers. The interface is visual - sortable tables, retailer logos, buy-now buttons. The data behind it is messy, but the presentation layer smooths over the inconsistencies. A human can glance at a results page and immediately understand the options, even if the product names do not quite match or the formatting differs between listings.

This model has worked well for two decades. But it was designed entirely for humans interacting through a web browser, and it falls apart when an AI agent tries to use it.

Why agents cannot use comparison websites

An AI agent does not see a webpage the way a person does. It does not interpret layout, scan visual hierarchies, or infer meaning from context the way a human reader can. When an agent encounters a price comparison page, it has to parse the underlying HTML - which is designed for rendering, not for data extraction.

This creates several problems. Product names are displayed as formatted text, not as structured data with separate fields for brand, model, and variant. Prices might appear in one format on the results page and another on the detail page. Stock status might mean different things on different sites - immediately available, available within 48 hours, or available for pre-order. The distinction matters when someone wants to buy something today, but it is buried in the presentation rather than exposed as a filterable data point.
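To make the stock-status point concrete, here is a minimal sketch of what normalising those free-text phrases into a filterable field might look like. The enum values and the phrase table are illustrative assumptions, not taken from any real retailer or feed:

```python
from enum import Enum

class Availability(Enum):
    """Canonical stock states an agent can actually filter on."""
    IN_STOCK = "in_stock"      # can be bought and shipped today
    BACKORDER = "backorder"    # available, but not immediately
    PREORDER = "preorder"      # not yet released
    UNKNOWN = "unknown"        # phrase not recognised

# Illustrative mapping from display phrases to canonical values.
# A real system needs a much larger, per-merchant table.
_PHRASES = {
    "in stock": Availability.IN_STOCK,
    "immediately available": Availability.IN_STOCK,
    "available within 48 hours": Availability.BACKORDER,
    "available for pre-order": Availability.PREORDER,
}

def normalise_stock(text: str) -> Availability:
    """Collapse a retailer's free-text stock phrase into one enum value."""
    return _PHRASES.get(text.strip().lower(), Availability.UNKNOWN)
```

The point is that "can I buy this today?" becomes a single comparable value, instead of a per-site phrase buried in the presentation.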

Some comparison sites offer APIs, but these are typically designed for other websites to embed widgets, not for autonomous agents to query. They return HTML snippets or require browser-like sessions. They were not built with machine-to-machine communication in mind.

The scraping trap

In the absence of clean data sources, the default approach for agents has been to scrape retail websites directly. An agent visits a product page, extracts the price, checks the stock status, and moves on to the next retailer. This works in demos. It does not work reliably at scale.

Scraping is fragile. Retailers redesign their pages, change their markup, or add anti-bot protections. A scraper that worked yesterday breaks today. Maintaining scrapers across even a handful of UK retailers is a significant ongoing engineering burden.

It is also slow. Visiting five retailer websites sequentially, waiting for JavaScript to render, and parsing each page takes seconds at best. Meanwhile, the user is waiting for an answer. That delay compounds when the agent needs to check multiple products or when the comparison involves more than a few merchants.

And it raises legal and ethical questions. Many retailers explicitly prohibit scraping in their terms of service. An agent-mediated service built on scraping is building on unstable ground.

What agents actually need

The solution is not a better scraper or a faster way to parse HTML. It is a purpose-built data layer that sits between agents and retailers - one that does the ingestion and normalisation work once, centrally, and serves the result in a format that agents can consume directly.

This means an API, not a website. JSON responses, not HTML pages. Consistent field names across all merchants. A standard product identifier (like a GTIN) for cross-merchant matching. Timestamps on every data point so the agent knows how fresh the information is.

It also means working with retailers and affiliate networks rather than around them. Retailer product feeds - the structured data files that merchants provide to affiliate networks and comparison platforms - are the cleanest available source of product information. They are updated regularly, provided with the retailer’s consent, and already designed for machine consumption. They are not perfect, but they are a dramatically better starting point than scraping.
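The "not perfect" caveat is where the central normalisation work happens: two merchants' feeds describe the same product with different field names and price formats. A small sketch of that mapping step, with entirely made-up feed rows and field names:

```python
# Two hypothetical feed rows for the same product. The differing field
# names and price formats mimic the variation real feeds show.
feed_a = {"ean": "5012345678900", "title": "Widget Pro 2",
          "price": "£24.99", "stock": "Y"}
feed_b = {"gtin": "5012345678900", "name": "WIDGET PRO-2",
          "price_gbp": "22.50", "available": True}

def normalise_row(row: dict, mapping: dict) -> dict:
    """Map one merchant's field names onto a shared schema (illustrative)."""
    price = str(row[mapping["price"]]).lstrip("£")
    return {
        "gtin": row[mapping["gtin"]],
        "name": row[mapping["name"]],
        "price_pence": round(float(price) * 100),
        "in_stock": row[mapping["in_stock"]] in (True, "Y", "yes"),
    }

rows = [
    normalise_row(feed_a, {"gtin": "ean", "name": "title",
                           "price": "price", "in_stock": "stock"}),
    normalise_row(feed_b, {"gtin": "gtin", "name": "name",
                           "price": "price_gbp", "in_stock": "available"}),
]
```

Done once, centrally, this mapping is maintained per feed format rather than per page layout - which is a far smaller and far more stable surface than a fleet of scrapers.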

The price comparison problem for AI is not primarily a technology problem. It is a data architecture problem. The information exists - it is just in the wrong shape. Reshaping it for agents is where the real work lies.