commerce-agentic/agentic-catalog-scanner

agentic-catalog-scanner

Open-source scoring rubric for measuring how AI shopping agents (ChatGPT, Claude, Gemini, Perplexity, Mistral, DeepSeek) read e-commerce product catalogs. Calibrated against 270,000+ ground-truth captures from real AI agents.

License: CC0 1.0 Universal (public domain, no attribution required). Use it, fork it, embed it in a competing app. We open-sourced the methodology because the rubric is worth more to the ecosystem in the open than it is to us behind a paywall.

The reference implementation that runs this rubric on live Shopify stores ships as a closed app at aicatalogscore.com — that's where the dataset, the LLM rewrites, and the Score Guarantee live. The rubric here is the audit definition itself.


What it scores

An AI shopping agent's recommendation depends on whether your product page has clean, machine-readable signals. We score 8 dimensions, 100 points total:

| # | Dimension | Max | What it measures |
|---|-----------|-----|------------------|
| 1 | Title quality | 15 | Length 30-80 chars, product-type noun present, distinctive attribute, not ALL CAPS, no placeholder text |
| 2 | Description | 20 | 150+ words, ≥3 factual markers (units/ingredients/dimensions), ≥1 <ul><li> bullet list, ≥1 subheading, ≥2 use-case mentions, zero fluff terms |
| 3 | Images & alt text | 15 | ≥3 images, ≥80% with alt text length > 5 chars, alt text describes product not just brand |
| 4 | Variant structure | 10 | Variants present (not just "Default Title"), SKU set, barcode set on at least one variant |
| 5 | Metafields | 15 | Google Product Category set, vertical-aware "material bucket" set (skincare → key_ingredient; apparel → material; food → ingredients; etc.), vertical-aware "dimensions" + "care" buckets set |
| 6 | Category & tags | 10 | Shopify Standard Product Taxonomy category assigned, ≥5 tags, productType set |
| 7 | SEO | 10 | seo.title 30-60 chars and ≠ product title, seo.description 70-160 chars and present |
| 8 | Pricing & inventory | 5 | price > 0 set, compareAtPrice present on ≥1 variant, inventory tracking enabled |
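
A few of the criteria above are purely mechanical and can be sketched without the vocab tables. The Python below is a minimal illustration, not the reference implementation: the `Product` class, the function names, and the check labels are hypothetical, and the rubric summary here does not say how many points each individual check is worth, so the functions only report pass/fail. Checks that need vocabulary (product-type noun, fluff terms, factual markers) are deliberately left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical container; field names are assumptions, not a Shopify schema.
    title: str
    description_html: str = ""
    seo_title: str = ""
    seo_description: str = ""


def title_checks(p: Product) -> dict[str, bool]:
    """Mechanical parts of dimension 1 (length and ALL CAPS)."""
    title = p.title.strip()
    return {
        "length_30_80": 30 <= len(title) <= 80,
        "not_all_caps": not title.isupper(),
    }


def description_checks(p: Product) -> dict[str, bool]:
    """Mechanical parts of dimension 2 (word count, bullet list, subheading)."""
    html = p.description_html
    # Strip tags before counting words so markup does not inflate the count.
    text = re.sub(r"<[^>]+>", " ", html)
    return {
        "words_150_plus": len(text.split()) >= 150,
        "has_bullet_list": bool(re.search(r"<ul\b[^>]*>.*?<li\b", html, re.I | re.S)),
        # Assumption: any <h1>-<h6> tag counts as a subheading.
        "has_subheading": bool(re.search(r"<h[1-6]\b", html, re.I)),
    }


def seo_checks(p: Product) -> dict[str, bool]:
    """Dimension 7: SEO title and description length, title must differ."""
    seo_title = p.seo_title.strip()
    seo_desc = p.seo_description.strip()
    return {
        "seo_title_30_60": 30 <= len(seo_title) <= 60,
        "seo_title_differs": bool(seo_title) and seo_title != p.title.strip(),
        "seo_description_70_160": 70 <= len(seo_desc) <= 160,
    }
```

Returning named booleans rather than a number keeps the sketch honest about what the table specifies; a port would map each check to points using the weights in RUBRIC.md.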

Grade bands:

  • A+ : 95-100 (AI-ready, top 1%)
  • A : 85-94 (likely recommended)
  • B : 70-84 (sometimes recommended)
  • C : 50-69 (rarely recommended)
  • D : 30-49 (occasional discovery)
  • F : 0-29 (effectively invisible)
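
The dimension caps and grade bands can be expressed as a small lookup. This is a sketch under stated assumptions: the dictionary keys and function names are made up for illustration, and clamping each dimension to its maximum is a defensive choice of this example rather than something the rubric states.

```python
# Per-dimension maximums from the table above (keys are hypothetical names).
MAX_POINTS = {
    "title": 15,
    "description": 20,
    "images": 15,
    "variants": 10,
    "metafields": 15,
    "category_tags": 10,
    "seo": 10,
    "pricing_inventory": 5,
}

# Lower bound of each band, highest first.
GRADE_BANDS = [(95, "A+"), (85, "A"), (70, "B"), (50, "C"), (30, "D"), (0, "F")]


def total_score(points: dict[str, float]) -> float:
    """Sum per-dimension points, clamping each to the range [0, max]."""
    total = 0
    for dimension, cap in MAX_POINTS.items():
        earned = points.get(dimension, 0)
        total += min(max(earned, 0), cap)
    return total


def grade(score: float) -> str:
    """Map a 0-100 score to its letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for floor, letter in GRADE_BANDS:
        if score >= floor:
            return letter
    raise AssertionError("unreachable: the F band starts at 0")
```

For example, `grade(total_score({"title": 15, "description": 12, "seo": 10}))` evaluates to `"D"`, since the three dimensions sum to 37.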

Vertical-aware vocabulary

The audit uses a separate regex table for each merchant vertical, so that each vertical's natural metafield names and factual markers earn credit. Ten verticals are modeled, plus a universal fallback:

apparel · beauty · home · electronics · fitness · food
pets · baby · outdoor · gifts · universal

Example: beauty/skincare scores key_ingredient as the material bucket (not material, since a skincare product has ingredients rather than a material). The food vertical scores ingredients plus allergen patterns. Electronics scores housing_material plus battery_capacity.

Full vocab tables: rubric/vertical-vocab.md
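
As a rough illustration of the per-vertical lookup with a universal fallback, the sketch below checks whether a product's metafields satisfy the material bucket for its vertical. The metafield names come from the examples above, but the regex patterns, the fallback pattern, and the function name are assumptions of this example; the real tables live in the vocab file and will differ.

```python
import re

# Hypothetical, heavily simplified stand-in for the per-vertical regex tables.
MATERIAL_PATTERNS = {
    "beauty": re.compile(r"key_ingredient", re.I),
    "apparel": re.compile(r"material", re.I),
    "food": re.compile(r"ingredients", re.I),
    "electronics": re.compile(r"housing_material", re.I),
}

# Assumed fallback for verticals without their own entry.
UNIVERSAL_PATTERN = re.compile(r"material|ingredient", re.I)


def material_bucket_set(vertical: str, metafields: dict[str, str]) -> bool:
    """True if any non-empty metafield key matches the vertical's pattern."""
    pattern = MATERIAL_PATTERNS.get(vertical, UNIVERSAL_PATTERN)
    return any(
        bool(pattern.search(key)) and bool(str(value).strip())
        for key, value in metafields.items()
    )
```

With these patterns, a beauty product carrying only `custom.material` gets no credit, while the same key on a vertical without its own entry in this sketch (for example `pets`) is accepted through the universal fallback.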


Why these 8 dimensions?

We didn't pick them. We observed them by running 2,400+ synthetic shopping queries per day across 6 AI agents (ChatGPT, Claude, Gemini, Perplexity, Mistral, DeepSeek) and statistically modeling which catalog signals correlated with being recommended in the answer.

The dataset (270k+ captures, growing) is published under MIT in our sister repo: ai-visibility-metrics.


Use it

This repo intentionally ships the rubric, not a Shopify SDK. If you want a live audit against your own store, install the hosted app at aicatalogscore.com — it's free up to 15 SKUs.

If you want to embed the rubric in your own tooling:

  1. Read RUBRIC.md — the full scoring spec
  2. Read vertical-vocab.md — the regex tables
  3. Implement against your platform (Shopify, WooCommerce, BigCommerce, custom)

We'd love to see ports. Open a PR with a link to your implementation and we'll list it.


Contributing

The rubric is intentionally a living document. As AI agents shift their ranking signals, the vocab tables and weight distributions need to be recalibrated. We publish a new version every quarter.

PRs welcome for:

  • New vertical vocab tables (currently 10 — there are gaps)
  • Updated factual-marker regexes (e.g. new beauty ingredient terminology)
  • Translations of the rubric to other languages
  • Implementations against non-Shopify platforms

See CONTRIBUTING.md.


Versioning

| Version | Released | What changed |
|---------|----------|--------------|
| v2.4 | 2026-05-14 | Vertical-aware scoring across all 10 verticals. Per-vertical material bucket. |
| v2 | 2026-05-14 | Calibration tightening. Stock Shopify avg moved from 64 → 58 to discourage gaming. |
| v1 | 2026-04-12 | Initial release. 8 dimensions, generic vocab. |

Maintained by aicatalogscore.com.
