Aether

Affiliate Product Feeds: What Developers Need to Know

If you are building an agent or application that involves product data, affiliate product feeds are one of the most underappreciated data sources available. They are structured, regularly updated, and provided with the retailer’s explicit consent. But they come with real limitations that are worth understanding before you build on them.

What product feeds are

When a retailer joins an affiliate network, they typically provide a product data feed - a structured file containing their catalogue or a subset of it. These feeds follow standardised formats (most commonly the Google Product Data Specification) and include fields like product name, price, availability, images, brand, identifiers, and category.

The affiliate network hosts these feeds and makes them available to approved publishers. Feeds are refreshed on a schedule set by the retailer - anywhere from multiple times daily to once a week, depending on the merchant and the network.

For a developer, the key point is that these feeds are the cleanest source of structured product data available from UK retailers without a direct commercial relationship. They are machine-readable from the start, unlike website content, which has to be scraped and parsed before it can be used.

What the feeds contain

A typical product feed entry includes the following fields, mapped here to the Google Product Data Specification naming:

Core fields: product ID, title, description, link (product page URL), image link, availability, price, sale price, brand, condition.

Identifiers: GTIN (the number encoded in an EAN/UPC barcode), MPN (manufacturer part number). These are critical for cross-merchant product matching - if two feed entries from different retailers share the same valid GTIN, they should refer to the same physical product, barring data-entry errors in the retailer’s feed.

Categorisation: Google product category (a standardised taxonomy), plus the retailer’s own category path.

Attributes: colour, size, material, and other variant-level detail. Coverage varies significantly by retailer - some provide rich attribute data, others provide very little.
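
As a rough illustration, a feed row can be mapped onto a small typed record. The sketch below assumes each row arrives as a dictionary keyed by Google-style attribute names (id, title, link, availability, price, brand, gtin, mpn) and that prices use the "24.99 GBP" form. Real network feeds often rename columns, so FeedItem, parse_price, and parse_row are hypothetical names, not any network's API.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class FeedItem:
    """Hypothetical in-memory shape for one feed entry."""
    id: str
    title: str
    link: str
    availability: str  # e.g. "in_stock", "out_of_stock", "preorder", "backorder"
    price: Decimal
    currency: str
    brand: Optional[str] = None
    gtin: Optional[str] = None
    mpn: Optional[str] = None


def parse_price(raw: str) -> tuple[Decimal, str]:
    """Split a price string such as '24.99 GBP' into amount and currency.

    Assumes exactly two whitespace-separated parts; raises ValueError otherwise.
    """
    amount, currency = raw.strip().split()
    return Decimal(amount), currency


def parse_row(row: dict[str, str]) -> FeedItem:
    """Build a FeedItem from one row, treating empty optional fields as missing."""
    price, currency = parse_price(row["price"])
    return FeedItem(
        id=row["id"],
        title=row["title"],
        link=row["link"],
        availability=row["availability"],
        price=price,
        currency=currency,
        brand=row.get("brand") or None,
        gtin=row.get("gtin") or None,
        mpn=row.get("mpn") or None,
    )
```

Using Decimal rather than float avoids rounding surprises when prices are later compared or summed. Optional fields default to None so that patchy attribute coverage is visible in the data rather than hidden behind empty strings.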

What the feeds do not contain

Understanding the gaps is as important as understanding the content. Most affiliate product feeds do not include:

Customer ratings and reviews: These are proprietary to each retailer and are not shared through affiliate feeds. If your application needs ratings data, you will need a separate source.

Detailed specifications: A laptop feed entry might include the product name and price, but not the RAM, processor, or screen resolution as separate queryable fields. Some of this information may be embedded in the product title or description, but extracting it reliably requires parsing free text - which undermines the benefit of having structured data in the first place.

Real-time stock levels: The availability field indicates whether a product is in stock, out of stock, on pre-order, or on backorder. But it is only as current as the last feed refresh. A product that showed as in stock when the feed was generated at 6am might be sold out by 10am. Feed freshness is a real constraint.

Shipping costs and delivery times: These are typically calculated dynamically based on the customer’s location and are not included in static product feeds.
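
To make the detailed-specifications gap concrete, here is a minimal sketch of pulling a RAM size out of a product title with a regular expression. The pattern and the function name extract_ram_gb are illustrative assumptions, and the example titles are invented. The point is how easily a title that omits the expected keyword slips through.

```python
import re
from typing import Optional

# Assumed pattern: a 1-3 digit number, "GB", then a memory keyword.
# Titles that omit the keyword (e.g. "16GB/512GB") will not match.
RAM_PATTERN = re.compile(r"\b(\d{1,3})\s?GB\s+(?:RAM|memory|DDR\d)\b", re.IGNORECASE)


def extract_ram_gb(title: str) -> Optional[int]:
    """Return the RAM size in GB if the title states it in a recognised form."""
    match = RAM_PATTERN.search(title)
    return int(match.group(1)) if match else None
```

A title such as "Acme Laptop 15.6in 16GB RAM 512GB SSD" yields 16, while "Acme Laptop 16GB/512GB" yields None even though a human reader would infer the same specification. Every retailer's title convention needs its own handling, which is why free-text extraction is best treated as a best-effort supplement.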

Working with feed data effectively

If you are building on affiliate product feeds, a few practical considerations:

Freshness varies by merchant. Some retailers refresh their feeds multiple times a day. Others update weekly. You need to track when each feed was last updated and communicate that freshness to consumers of your data. Treating all prices as equally reliable when some are hours old and others are days old will erode trust.
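
One way to surface freshness is to attach the feed's last-updated time and a coarse label to every record you serve. The sketch below is an assumption-laden example: the thresholds (6 hours, 2 days), the label names, and the functions freshness_label and annotate are all hypothetical and would need tuning per use case.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical thresholds; choose values that suit your merchants and users.
FRESH_WITHIN = timedelta(hours=6)
AGEING_WITHIN = timedelta(days=2)


def freshness_label(last_updated: datetime, now: Optional[datetime] = None) -> str:
    """Classify a feed by the age of its last refresh."""
    now = now or datetime.now(timezone.utc)
    age = now - last_updated
    if age <= FRESH_WITHIN:
        return "fresh"
    if age <= AGEING_WITHIN:
        return "ageing"
    return "stale"


def annotate(record: dict, feed_updated_at: datetime,
             now: Optional[datetime] = None) -> dict:
    """Return a copy of the record with explicit freshness metadata attached."""
    return {
        **record,
        "feed_updated_at": feed_updated_at.isoformat(),
        "freshness": freshness_label(feed_updated_at, now),
    }
```

Passing timezone-aware datetimes throughout avoids mixing naive and aware values, which Python refuses to subtract. Exposing the raw timestamp alongside the label lets consumers apply their own tolerance.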

GTIN coverage is incomplete. Not all products have GTINs, and not all retailers include them in their feeds. Own-brand products and accessories are the most common gaps. Any cross-merchant matching strategy that relies solely on GTINs will miss a portion of the catalogue. Fuzzy matching on title and brand can fill some gaps, but it introduces its own reliability problems.
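
A minimal sketch of GTIN-first matching might look like the following. It validates each GTIN with the standard GS1 check-digit rule, pads valid values to 14 digits so that 12-, 13-, and 14-digit forms of the same number compare equal, and sets aside anything that fails for a fallback strategy. The item shape (dictionaries with a "gtin" key) and the function names are assumptions for illustration.

```python
from collections import defaultdict


def is_valid_gtin(gtin: str) -> bool:
    """Check length and the GS1 mod-10 check digit of a GTIN-8/12/13/14."""
    if not (gtin.isascii() and gtin.isdigit()) or len(gtin) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in gtin]
    body, check = digits[:-1], digits[-1]
    # Working from the right of the body, weights alternate 3, 1, 3, 1, ...
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


def group_by_gtin(items: list[dict]) -> tuple[dict[str, list[dict]], list[dict]]:
    """Group items by normalised GTIN; return (groups, unmatched)."""
    groups: dict[str, list[dict]] = defaultdict(list)
    unmatched: list[dict] = []
    for item in items:
        gtin = (item.get("gtin") or "").strip()
        if is_valid_gtin(gtin):
            groups[gtin.zfill(14)].append(item)  # normalise to 14 digits
        else:
            unmatched.append(item)  # candidates for title/brand matching
    return dict(groups), unmatched
```

Keeping the unmatched list separate makes the coverage gap measurable: its size relative to the whole catalogue shows how much depends on less reliable fuzzy matching.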

Feed quality is the retailer’s responsibility. If a retailer provides inaccurate descriptions or incorrect prices in their feed, those errors will flow through to any service built on that data. You cannot fix bad source data - you can only be transparent about where it came from and when it was last checked.

Feeds are large. A major electronics retailer might have tens of thousands of products in their feed. Ingesting, parsing, and normalising this data at scale requires thought about storage, indexing, and update frequency.
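
For ingestion, reading the feed as a stream and handing rows on in fixed-size batches keeps memory use flat regardless of catalogue size. The sketch below assumes a tab-delimited text feed with a header row; the delimiter, batch size, and the name iter_batches are illustrative choices, and XML feeds would need a different reader.

```python
import csv
import io
from itertools import islice
from typing import Iterator, TextIO


def iter_batches(stream: TextIO, batch_size: int = 1000,
                 delimiter: str = "\t") -> Iterator[list[dict]]:
    """Yield lists of up to batch_size rows without loading the whole feed."""
    reader = csv.DictReader(stream, delimiter=delimiter)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


# Tiny in-memory stand-in for a feed file (invented sample data).
sample = (
    "id\ttitle\tprice\n"
    "1\tKettle\t24.99 GBP\n"
    "2\tToaster\t19.99 GBP\n"
    "3\tBlender\t34.50 GBP\n"
)
batch_sizes = [len(batch) for batch in iter_batches(io.StringIO(sample), batch_size=2)]
```

Each batch can then be normalised and written to storage in a single bulk operation, which is usually far cheaper than row-by-row inserts.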

These constraints are manageable, but they shape the architecture of any service built on feed data. The design decisions in Aether’s schema and API - explicit freshness timestamps, GTIN-based matching, compact default responses - are direct responses to the realities of working with affiliate product feeds.