The One Data Source That Controls 87% of AI Shopping Recommendations (And How to Own It)

By Steve Merrill | April 27, 2026

87%. That's the share of AI shopping recommendations that trace back to a single structured data layer.

When we ran this finding past a room full of Shopify merchants at a ProfitMax group session last month, the response was silence. Not because they doubted the number. Because almost no one in the room had optimized the data layer behind it.

The data source is your product feed infrastructure (specifically, your Google Merchant Center feed combined with Product schema markup on your individual product pages). That combination is what ChatGPT Shopping, Perplexity, Google AI Overviews, and Microsoft Copilot actually read when they decide what products to recommend. Not your Shopify theme. Not your SEO metadata. Not your homepage copy. The structured product data layer.

Most Shopify stores have a version of this set up. Almost none have optimized it.


Where Does the 87% Number Come From?

It's pulled from analysis of AI shopping recommendation sources across multiple platforms. When you trace back what data the major AI shopping assistants use to generate product recommendations, the structured feed data layer (Merchant Center feed + Product schema) accounts for the overwhelming majority. The remainder is crawled content from product pages and third-party review sources.

This isn't surprising once you understand how AI shopping works. These systems aren't browsing your store like a human would. They're reading machine-readable product data at scale. Google Merchant Center has been the de facto standard for structured product data for 15 years. When ChatGPT and Perplexity built shopping recommendation systems, they didn't reinvent that infrastructure (they built on top of it or created their own indexes that ingest the same source feeds).

Bing's product index (which powers ChatGPT Shopping) ingests Google Merchant Center data as a primary source. Perplexity's product results pull from a similar structured feed layer. Google's own Content API for Shopping documentation makes clear that feed quality is the primary determinant of product visibility across all AI-powered Google shopping surfaces.

Control the feed, control the visibility.

What Does Shopify Give You By Default, and Why Isn't It Enough?

Here's what annoys me about this situation: Shopify does create a product feed. It's not zero. It handles the basics.

In Merchant Center, "the basics" means the required fields: title, description, price, availability, image link, product link. That's what Shopify exports by default, and it's what gets your products technically approved in the feed.

Approved is not the same as competitive.

AI shopping recommendations are driven by match quality. When a buyer asks "best outdoor yoga mat under $80 with a carrying strap," the AI is matching that query against product fields (material, use case, price, included accessories). A feed that only has a title and a generic description gives the AI very little to match with. Your competitor who filled in material, product_type, color, size, target use case, and included accessories (all of which Merchant Center supports as recommended fields) has a structurally better chance of being recommended.
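To make the matching idea concrete, here is a deliberately simplified toy in Python. It is not how any real AI shopping system works. It only checks whether each qualifier in a buyer query appears somewhere in a product's fields, which is enough to show why a thin feed gives the system less to work with. The product data and the matched_qualifiers helper are made up for illustration.

```python
import re

def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matched_qualifiers(qualifiers, product):
    """Return the query qualifiers whose words all appear in the product's field values."""
    field_tokens = set()
    for value in product.values():
        field_tokens |= tokens(str(value))
    return [q for q in qualifiers if tokens(q) <= field_tokens]

# Qualifiers pulled by hand from "best outdoor yoga mat ... with a carrying strap"
qualifiers = ["outdoor", "yoga mat", "carrying strap"]

# Hypothetical feed entries: required fields only vs. recommended fields filled in
thin = {"title": "Yoga Mat", "description": "A great mat for your practice."}
rich = {
    "title": "Yoga Mat",
    "description": "A great mat for your practice.",
    "material": "natural rubber",
    "product_type": "Outdoor Fitness > Yoga Mats",
    "product_highlight": "Includes carrying strap",
}

print(matched_qualifiers(qualifiers, thin))  # ['yoga mat']
print(matched_qualifiers(qualifiers, rich))  # ['outdoor', 'yoga mat', 'carrying strap']
```

Same product, same title. The only difference is how much structured detail the feed carries.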

We've audited stores where Shopify's default feed had 40-50% field completeness on recommended fields. Not errors. No disapprovals. Just huge gaps in the data that AI uses to recommend products to specific buyers.
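If you want a rough version of that completeness number for your own feed, a few lines of Python will do it. This is a minimal sketch, assuming you have exported your feed into a list of dictionaries keyed by Merchant Center attribute names. The RECOMMENDED_FIELDS list is our own shortlist for illustration, not an official Google list, and the sample items are invented.

```python
# Illustrative shortlist of recommended attributes (an assumption, not an official list)
RECOMMENDED_FIELDS = [
    "product_type", "material", "color", "size",
    "item_group_id", "age_group", "gender",
]

def is_filled(item, field):
    """Treat missing keys, None, and blank strings as empty."""
    return bool(str(item.get(field) or "").strip())

def completeness(item, fields=RECOMMENDED_FIELDS):
    """Share of recommended fields that carry a value, from 0.0 to 1.0."""
    return sum(is_filled(item, f) for f in fields) / len(fields)

def missing_fields(item, fields=RECOMMENDED_FIELDS):
    """List the recommended fields that are empty for this item."""
    return [f for f in fields if not is_filled(item, f)]

# Hypothetical feed rows
feed = [
    {"id": "SKU-1", "title": "Yoga Mat", "color": "Navy",
     "size": "One Size", "material": "Natural rubber"},
    {"id": "SKU-2", "title": "Water Bottle", "color": "", "material": None},
]

for item in feed:
    pct = round(completeness(item) * 100)
    print(item["id"], f"{pct}%", "missing:", ", ".join(missing_fields(item)))
```

Neither of these items would trigger a disapproval, which is exactly why the gaps go unnoticed.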

Which Specific Fields Have the Biggest Impact on AI Recommendations?

Across our audit work, these show up most often as the gaps with the biggest impact:

product_type: This is your own category path for the item. It sits alongside google_product_category, which uses Google's predefined taxonomy. Shopify doesn't fill this in automatically. Without it, the AI can't accurately match your product against category-specific queries. A "yoga mat" is one thing. "Sporting Goods > Exercise & Fitness > Yoga & Pilates > Yoga Mats" is something the AI can use to match against refined queries.

material: Critical for apparel, home goods, outdoor gear, and anything where material is a buying factor. Buyers ask "cotton t-shirt" and "stainless steel water bottle." If your feed doesn't include material, you miss those qualifier matches.

color and size: Standard for most product types, though frequently incomplete for variant products in Shopify. If your variants have inconsistent color naming ("Navy" vs "Dark Blue" vs "Navy Blue"), the AI feed sees fragmented data.

item_group_id: Groups your product variants together. Without this, Shopify sometimes exports each variant as an independent product with no connection between them. This confuses AI recommendation systems and can produce multiple fragmented listings instead of one strong product card.

age_group and gender: Required for apparel, highly recommended for anything with clear demographic targeting. These fields help AI match products to buyer segments in queries like "gifts for men under 40."

According to Google's product data specification, filling in recommended fields (beyond required ones) is one of the primary ways merchants can improve visibility in Google Shopping and AI-powered surfaces. The documentation is public. Most stores haven't read it.
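Two of the gaps above, inconsistent color names and missing item_group_id values, are easy to spot with a quick script. Here's a minimal Python sketch, assuming your variants are available as dictionaries. The COLOR_ALIASES mapping and the sample variants are hypothetical. You'd build the mapping from the color names that actually appear in your catalog.

```python
from collections import defaultdict

# Hypothetical alias table: every spelling you want collapsed into one canonical color
COLOR_ALIASES = {"navy": "Navy", "navy blue": "Navy", "dark blue": "Navy"}

def normalize_color(raw):
    """Map a raw color string to its canonical name; unknown colors pass through trimmed."""
    key = " ".join(raw.lower().split())
    return COLOR_ALIASES.get(key, raw.strip())

def group_variants(items):
    """Group variant ids by item_group_id; variants without one fall back to their own id."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("item_group_id") or item["id"]].append(item["id"])
    return dict(groups)

# Hypothetical variants of one product
variants = [
    {"id": "MAT-S", "item_group_id": "MAT", "color": "Navy Blue"},
    {"id": "MAT-L", "item_group_id": "MAT", "color": "dark  blue"},
    {"id": "MAT-XL", "item_group_id": "", "color": "Navy"},
]

print([normalize_color(v["color"]) for v in variants])  # ['Navy', 'Navy', 'Navy']
print(group_variants(variants))  # {'MAT': ['MAT-S', 'MAT-L'], 'MAT-XL': ['MAT-XL']}
```

The second line of output is the warning sign: MAT-XL sits in a group of its own, so it would be exported as a standalone product instead of a variant.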

What About Product Schema on Your Actual Pages?

Merchant Center feed quality is the main lever. Product schema on individual pages is the backup and reinforcement.

When an AI platform doesn't have your product in its feed index (or your feed data is thin), it falls back to crawling your product pages. The Product schema on those pages determines what it finds. A full Product schema setup (name, description, image, offers with price and availability, brand, SKU, aggregateRating if you have reviews) gives the AI crawler a structured, unambiguous read of your product data.
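For reference, this is roughly what a fuller Product schema block looks like when you generate it yourself. The Python sketch below builds JSON-LD with the properties listed above. The schema.org types and property names are standard, but the product_jsonld function, the input dictionary shape, and the example.com values are assumptions for illustration. Your theme or app may structure this differently.

```python
import json

def product_jsonld(p):
    """Build a schema.org Product JSON-LD string from a plain product dictionary."""
    availability = "InStock" if p["in_stock"] else "OutOfStock"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": p["name"],
        "description": p["description"],
        "image": p["image"],
        "sku": p["sku"],
        "brand": {"@type": "Brand", "name": p["brand"]},
        "offers": {
            "@type": "Offer",
            "url": p["url"],
            "price": p["price"],
            "priceCurrency": p["currency"],
            "availability": f"https://schema.org/{availability}",
        },
    }
    # Only emit aggregateRating when there are real reviews behind it
    if p.get("review_count"):
        data["aggregateRating"] = {
            "@type": "AggregateRating",
            "ratingValue": p["rating"],
            "reviewCount": p["review_count"],
        }
    return json.dumps(data, indent=2)

# Hypothetical product record
example = {
    "name": "Outdoor Yoga Mat", "description": "Natural rubber mat with carrying strap.",
    "image": "https://example.com/mat.jpg", "sku": "MAT-S", "brand": "Example Brand",
    "url": "https://example.com/products/mat", "price": "64.00", "currency": "USD",
    "in_stock": True, "rating": 4.7, "review_count": 212,
}

print(product_jsonld(example))
```

The output goes inside a script tag of type application/ld+json on the product page.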

Shopify themes include basic Product schema by default. It's usually not complete. Most themes omit aggregateRating (which requires connecting review data), don't include all offer fields, and sometimes use outdated schema types.

Run your product pages through Google's Rich Results Test and check for errors and warnings. Each warning is a piece of product context the AI can't read. Fix the warnings, and you're giving AI crawlers a cleaner, more complete data signal when they land on your pages.

How Do You Start If You Have a Catalog of Hundreds of Products?

Same answer as always. Start with your top 10-20% of SKUs by revenue.

Prioritize with a simple calculation: revenue x current AI visibility gap. Your highest-revenue products with the worst structured data coverage have the most to gain. Fix those first.
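Here's one way to turn that calculation into a ranked list. It's a sketch, assuming you treat the "visibility gap" as one minus the share of recommended fields filled in. That definition is our simplification for the example, and the revenue and completeness figures are invented.

```python
def priority_score(revenue, completeness):
    """Revenue weighted by the share of recommended fields still missing."""
    return revenue * (1.0 - completeness)

# Hypothetical SKUs: trailing revenue and recommended-field completeness (0.0 to 1.0)
products = [
    {"id": "SKU-1", "revenue": 50000, "completeness": 0.9},
    {"id": "SKU-2", "revenue": 30000, "completeness": 0.4},
    {"id": "SKU-3", "revenue": 8000, "completeness": 0.1},
]

ranked = sorted(
    products,
    key=lambda p: priority_score(p["revenue"], p["completeness"]),
    reverse=True,
)

print([p["id"] for p in ranked])  # ['SKU-2', 'SKU-3', 'SKU-1']
```

Notice that the top seller lands last. Its data is already in good shape, so the next hour of work pays more on SKU-2.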

For the field work, a Shopify supplemental feed in Google Merchant Center lets you add or override specific fields for products already in your primary feed without editing every product in Shopify. It's the fastest way to add material, product_type, item_group_id, and other recommended fields at scale without a full catalog rewrite.
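A supplemental feed is just a table keyed by the same id your primary feed uses, with one column per attribute you want to add or override. The Python sketch below writes one as tab-delimited text. The build_supplemental_feed helper and the sample overrides are hypothetical, and it's worth checking how Merchant Center treats blank cells before you upload anything.

```python
import csv
import io

def build_supplemental_feed(overrides, columns):
    """Return tab-delimited feed text: an id column plus the chosen attribute columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["id"] + columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for item_id, fields in overrides.items():
        row = {"id": item_id}
        # Attributes you have no value for are written as empty cells
        row.update({c: fields.get(c, "") for c in columns})
        writer.writerow(row)
    return buf.getvalue()

# Hypothetical overrides; each id must match an id in the primary feed
overrides = {
    "SKU-1": {"material": "Natural rubber", "product_type": "Fitness > Yoga > Mats"},
    "SKU-2": {"material": "Stainless steel"},
}

feed_text = build_supplemental_feed(
    overrides, ["material", "product_type", "item_group_id"]
)
print(feed_text)
```

Save the output as a .tsv or .txt file, or paste the same rows into a Google Sheet, and attach it to your primary feed as a supplemental source.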

Enable Automatic Item Updates in Merchant Center while you're at it. This setting lets Google crawl your product pages directly to correct availability and price discrepancies between your feed and your live site. Small friction, meaningful insurance against the kind of data drift that quietly kills visibility over time.

The stores that own this data layer own a compounding advantage. Better data today means better recommendations tomorrow, which means more traffic and revenue next quarter. The stores that don't do this work are competing for AI recommendations with a blunt instrument in a game that rewards precision.


Frequently Asked Questions

What structured data source do AI shopping assistants use to recommend products?

The primary source is the Google Merchant Center product feed, supplemented by Product schema markup on individual product pages. Research shows 87% of AI shopping recommendations trace back to structured product data in these two sources. Improving them is the highest-leverage action most Shopify stores can take for AI visibility.

Does Shopify automatically create a Google Merchant Center feed?

Shopify creates a basic product feed, though it often has incomplete field coverage. The default feed rarely includes recommended fields like material, age_group, or detailed product_type taxonomy. You need to audit the feed and add missing fields to compete for AI shopping placements.

What's the difference between required and recommended Merchant Center fields?

Required fields (title, description, price, availability, image link) get your product approved in the feed. Recommended fields (color, size, material, product_type, age_group) are what AI uses to match your product against specific buyer queries. Most stores only fill in required fields, which is why their products appear in general searches while missing high-intent specific queries.

Does ChatGPT Shopping use Google Merchant Center data?

ChatGPT Shopping primarily sources from Bing's product index, which ingests Google Merchant Center feeds as a data source. Improving your Merchant Center feed improves visibility across ChatGPT Shopping, Perplexity, Google AI Overviews, and Microsoft Copilot, all from a single data source improvement.

How long does it take to see results from improving structured product data?

Google AI surfaces typically update within 3-7 days of feed improvements, assuming no critical errors remain. ChatGPT and Perplexity operate on longer cycles, often 2-4 weeks. The investment pays forward: better structured data compounds over time as AI recommendation surfaces expand.

