The robots.txt Mistake That Hides Your Store from Every AI Crawler (We Found It on Our Own Site)

June 18, 2026

Can AI Crawlers Actually Access Your Shopify Store?

Probably not. And the reason has nothing to do with your content, your products, or your Shopify plan. It's three lines in a file most store owners haven't opened in years.

We found this problem on our own site six weeks ago. We were doing a deep audit on a client's store and decided to pull ours while we were at it.

GPTBot blocked. PerplexityBot blocked. ClaudeBot blocked.

Every major AI crawler was locked out. We'd been producing content about AI commerce for months. None of it was being indexed by the platforms we were writing about.

That stings a little to admit publicly. But I've since found the exact same configuration in 40+ store audits. This isn't a niche technical edge case. It's happening quietly across thousands of Shopify stores right now, and most owners have no idea.

Why robots.txt Matters More Than It Used To

Most store owners treat robots.txt as a Google thing. Set it once, forget it exists.

That made sense five years ago. Google was the primary crawler you cared about. Today, the list of crawlers that matter to your business includes GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity AI), and Amazonbot (Amazon). Each one powers an AI platform that's actively deciding what products to recommend to buyers.

ChatGPT Shopping is live. Perplexity is surfacing product results with direct purchase links. These aren't features in beta. They're running now, sending buying-intent traffic to the stores that show up.

Your robots.txt is the first gate every one of these crawlers hits. If they can't get through it, nothing else matters. Not your structured data. Not your product descriptions. Not your content. All of it stays invisible.

According to Google Search Central's robots.txt documentation, the file uses a simple directive format to allow or block crawler access. The spec isn't complicated. The errors aren't always obvious.

The Exact Mistake

Here's what we found.

A standard-looking robots.txt with one quiet problem buried at the bottom:

User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /orders
Disallow: /account
Disallow: /

That last line. Disallow: /

It tells every crawler that falls under the wildcard User-agent: * to stay out of the entire site. The five lines above it protect specific paths. That final line makes them redundant by blocking everything. The whole store is off limits to any crawler that doesn't have its own User-agent group somewhere else in the file.

The file probably started with just the path-specific Disallow rules. At some point, a developer or a third-party app added Disallow: /. Nobody noticed. Nobody checked. Years passed.
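The effect of that one line can be checked offline with Python's standard-library urllib.robotparser. This is a minimal sketch, not a definitive test: the storefront paths are made-up examples, and Python's parser is only one implementation, so a given crawler may read edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the example above, with the stray "Disallow: /" at the end.
BROKEN_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /orders
Disallow: /account
Disallow: /
"""

# Hypothetical storefront paths, chosen only for illustration.
SAMPLE_PATHS = ["/", "/products/example-widget", "/collections/all", "/admin"]


def blocked_paths(robots_txt, user_agent, paths):
    """Return the subset of `paths` that `user_agent` may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "PerplexityBot"):
        blocked = blocked_paths(BROKEN_ROBOTS_TXT, agent, SAMPLE_PATHS)
        print(agent, "is blocked from:", blocked)
```

With the file as shown, every sample path comes back blocked for all three crawlers. Delete the final Disallow: / from the string and only /admin stays blocked.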

The second version we find frequently looks like this:

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

This one deliberately re-allows Google and Bing. Whoever wrote it thought about search engines. Nobody thought about AI crawlers. GPTBot, ClaudeBot, PerplexityBot — not listed. Still blocked by the wildcard at the top.

OpenAI's GPTBot documentation confirms that GPTBot respects robots.txt directives. Which means blocking it works exactly as intended. Just not how you intended.
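Here is a rough way to see which crawlers this second version lets through, again using Python's urllib.robotparser as a stand-in for a real crawler's parser. The function name access_report is ours, purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# The allowlist-style robots.txt from the second example above.
ALLOWLIST_ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /
"""


def access_report(robots_txt, user_agents, path="/"):
    """Map each user agent to True (may fetch `path`) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in user_agents}


if __name__ == "__main__":
    agents = ["Googlebot", "Bingbot", "GPTBot", "ClaudeBot", "PerplexityBot"]
    for agent, allowed in access_report(ALLOWLIST_ROBOTS_TXT, agents).items():
        print(f"{agent:15} {'allowed' if allowed else 'blocked'}")
```

Googlebot and Bingbot match their own groups and get in. The three AI crawlers have no group of their own, so they fall back to the wildcard rules and are blocked.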

How to Audit Your Store in Under 10 Minutes

Five steps. No paid tools.

Step 1: Pull your robots.txt file. Go to yourdomain.com/robots.txt in any browser. It's always publicly accessible. You'll see the raw directives immediately.

Step 2: Search for wildcard Disallow rules. Find any User-agent: * block. Check what follows it. If you see Disallow: / anywhere in that block, your entire site is blocked to every crawler that doesn't have its own User-agent group elsewhere in the file.

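Step 3: Check for AI crawler allowances. Search the full file for GPTBot, ClaudeBot, PerplexityBot, and Amazonbot. A crawler that isn't named anywhere falls back to the User-agent: * rules, so if that block contains Disallow: /, the crawler is blocked. A crawler that is named follows only the rules in its own group.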

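Step 4: Test a specific user agent. Google Search Console's robots.txt report shows the file Google fetched and flags parsing problems, but it only reflects Google's own crawlers. To check an AI crawler, use a robots.txt testing tool or parser that accepts a custom user-agent. Enter / as the URL path, set the user-agent to GPTBot, and run it. Blocked or allowed, you'll know in seconds.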

Step 5: Find your Shopify template. In your Shopify admin, go to Online Store > Themes > Edit Code. Look for a file called robots.txt.liquid. Shopify's robots.txt documentation explains how this template works. If that file exists and contains custom rules, that's where your problem lives. If it doesn't exist, Shopify is generating your robots.txt automatically and the default is generally safer.
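Steps 1 through 4 can also be scripted. The sketch below uses only the Python standard library. The function names, the User-Agent header string, and the script name in the usage comment are our own assumptions for illustration, and some stores may refuse requests from scripts, in which case paste the file contents into audit() directly.

```python
import sys
import urllib.request
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Amazonbot"]


def fetch_robots_txt(domain, timeout=10):
    """Step 1: download https://<domain>/robots.txt as text."""
    request = urllib.request.Request(
        f"https://{domain}/robots.txt",
        headers={"User-Agent": "robots-audit-sketch/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def named_agents(robots_txt):
    """Step 3: collect every User-agent token in the file, lowercased."""
    agents = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        key, _, value = line.partition(":")
        if key.strip().lower() == "user-agent" and value.strip():
            agents.add(value.strip().lower())
    return agents


def audit(robots_txt, crawlers=AI_CRAWLERS, path="/"):
    """Steps 2 and 4: per crawler, can it fetch `path`, and is it named?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    mentioned = named_agents(robots_txt)
    return {
        name: {
            "allowed": parser.can_fetch(name, path),
            # Exact token match; a simplification of real group matching.
            "has_own_group": name.lower() in mentioned,
        }
        for name in crawlers
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python audit_robots.py yourdomain.com
    for name, result in audit(fetch_robots_txt(sys.argv[1])).items():
        status = "allowed" if result["allowed"] else "BLOCKED"
        source = "own group" if result["has_own_group"] else "wildcard/default rules"
        print(f"{name:15} {status:8} (via {source})")
```

A crawler reported as BLOCKED via wildcard/default rules is the pattern described in this post: nobody blocked it on purpose, it just inherited a blanket Disallow.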

The Correct Configuration

Don't nuke everything. Don't open your store to every bot on the internet.

Keep your sensitive path restrictions. Add explicit allowances for the crawlers that matter. That's it.

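User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /orders
Disallow: /account

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Amazonbot
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /orders
Disallow: /account
Allow: /

The first group blocks sensitive paths for every crawler that isn't named elsewhere in the file. The second group names each AI crawler you want to let in and repeats the same path restrictions before granting access to everything else. The repetition matters: a crawler that finds a group with its own name follows only that group and ignores the User-agent: * rules, so a named group containing nothing but Allow: / would open /admin, /cart, and the rest to that crawler. Nothing hidden. Nothing accidental.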

Read that again. You're not removing security. You're adding permissions for crawlers that simply didn't exist when your robots.txt was originally written.
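One detail is worth verifying before you deploy any configuration like this. Under the Robots Exclusion Protocol (RFC 9309), a crawler obeys only the most specific group that names it, so rules in the User-agent: * group do not carry over to a crawler with its own group. The sketch below compares two hypothetical variants using Python's urllib.robotparser; both robots.txt strings are constructed for illustration and are not taken from any real store.

```python
from urllib.robotparser import RobotFileParser

SENSITIVE_PATHS = ["/admin", "/cart", "/checkout", "/orders", "/account"]

# Shared wildcard group that restricts the sensitive paths.
WILDCARD_GROUP = "User-agent: *\n" + "".join(
    f"Disallow: {p}\n" for p in SENSITIVE_PATHS
)

# Variant A (hypothetical): the named group contains only "Allow: /".
ALLOW_ONLY = WILDCARD_GROUP + "\nUser-agent: GPTBot\nAllow: /\n"

# Variant B (hypothetical): the named group repeats the restrictions first.
REPEATED_RULES = (
    WILDCARD_GROUP
    + "\nUser-agent: GPTBot\n"
    + "".join(f"Disallow: {p}\n" for p in SENSITIVE_PATHS)
    + "Allow: /\n"
)


def reachable(robots_txt, user_agent, paths):
    """Return the paths in `paths` that `user_agent` is allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    print("Variant A exposes:", reachable(ALLOW_ONLY, "GPTBot", SENSITIVE_PATHS))
    print("Variant B exposes:", reachable(REPEATED_RULES, "GPTBot", SENSITIVE_PATHS))
```

Variant A lets GPTBot reach all five sensitive paths, because its own group never mentions them. Variant B keeps them closed while leaving the storefront open. Listing the Disallow lines before Allow: / keeps the result the same for parsers that apply the first matching rule and for parsers that apply the longest matching rule.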

After making the change, give it two to four weeks. AI crawler index cycles run slower than Google's. Don't expect next-day results. But you can't get indexed at all when the door is locked.

What This Actually Costs You

Every week your store stays blocked from AI crawlers is a week in which competitors without this problem can be crawled and you can't.

ChatGPT Shopping pulls recommendations from crawled and indexed content. Perplexity surfaces products from pages it's actually read. Stores that get indexed first build a compounding advantage. The brands getting recommended today started letting crawlers in months ago.

I've seen this pattern before. A slow, invisible loss. Traffic doesn't fall off a cliff. It just quietly doesn't come from a new source it should have been coming from all along. Then one day you look around and wonder why competitors are showing up in AI results and you're not.

The fix takes ten minutes. The audit takes five. There's no good reason to wait.

Frequently Asked Questions

Does Shopify block AI crawlers by default?
Shopify's default robots.txt doesn't specifically block AI crawlers. The problem happens when stores have customized their robots.txt with wildcard Disallow rules and never added AI crawlers to the allowlist. Check your specific configuration, not just the default.
What is GPTBot and should I allow it?
GPTBot is OpenAI's web crawler. OpenAI's crawler documentation describes it as collecting content that may be used to train its models, and lists separate user agents, OAI-SearchBot and ChatGPT-User, for ChatGPT search results and user-initiated browsing. If you want your products surfaced in ChatGPT, make sure none of these are blocked. The risk of allowing them is low. The risk of blocking them is staying invisible.
Will allowing AI crawlers hurt my Google rankings?
No. Googlebot and AI crawlers are completely separate systems. Allowing GPTBot or ClaudeBot has zero effect on Googlebot behavior or your search rankings. Your SEO stays the same. Your AI visibility improves.
How do I edit robots.txt on Shopify?
In your Shopify admin, go to Online Store > Themes > Edit Code. Look for robots.txt.liquid. If it exists, that file controls your output. If it doesn't, you can create it. Shopify processes it as a Liquid template and the changes take effect immediately after saving.
Which AI crawlers should I allow?
Start with these four: GPTBot (OpenAI/ChatGPT), ClaudeBot (Anthropic), PerplexityBot (Perplexity AI), and Amazonbot (Amazon). Each one powers an AI platform actively surfacing product recommendations to buyers right now. More crawlers will follow as the space matures.

Is Your Store Actually Visible to AI?

robots.txt is one piece. There are 11 other signals AI shopping platforms use to decide whether your store gets recommended.

We built a free audit that checks all of them. It takes about three minutes. You'll see exactly where your store stands and what to fix first.

Get Your Free AI Commerce Audit
