Published · Verified: every 6 weeks · Sources: 5 named · Authored by: SquareRank Team
ChatGPT-User vs GPTBot vs OAI-SearchBot
OpenAI documents three crawlers [1], and the difference between them decides whether ChatGPT cites you tomorrow. GPTBot trains future models. OAI-SearchBot indexes content for ChatGPT Search results. ChatGPT-User fetches a page live when a user asks a question that needs it. Only one of those three is on Squarespace's 26-bot AI checkbox list, and it is the least important one for live citations.
This leaf is the user-agent breakdown. The companion piece on unblocking GPTBot covers the Settings → Crawlers walkthrough. Together they answer the question most "how do I get cited by ChatGPT" tutorials skip: which OpenAI bot are you actually trying to let in, and why.
§01 The three jobs
Three bots, three different jobs
OpenAI's bots page documents the split in one paragraph each. GPTBot collects training data — its disallow means 'do not use my content to train future generative AI foundation models.' OAI-SearchBot indexes content for the ChatGPT Search feature — when ChatGPT decides to show source cards above an answer, OAI-SearchBot is the bot that put them there. ChatGPT-User fetches a page in real time on behalf of a user request — when you ask ChatGPT to summarise a specific URL or to find information that requires browsing, this is the bot doing the work.
The user-agent strings are distinct and published [1]. GPTBot: Mozilla/5.0 ... GPTBot/1.3, IP ranges at openai.com/gptbot.json. OAI-SearchBot: OAI-SearchBot/1.3, IP ranges at openai.com/searchbot.json [3]. ChatGPT-User: ChatGPT-User/1.0, IP ranges at openai.com/chatgpt-user.json [4]. Three separate identities, three separate IP pools, three separate jobs.
The reason this matters operationally: each bot can be allowed or disallowed independently in robots.txt. A Squarespace site can configure itself to allow OAI-SearchBot while disallowing GPTBot, which would let ChatGPT cite the site in Search results while keeping the content out of training corpora. That configuration is what most AI-visibility playbooks recommend for sites whose content is the business — be cited, but not absorbed into the model itself.
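A minimal sketch of that split and of a way to check it, assuming a generic site whose robots.txt can be read as plain text. The sample file and the `robots_rules_for` helper are illustrative names, not part of any OpenAI or Squarespace tooling. The parser is deliberately simplified: it matches the bot token exactly and ignores the `*` fallback group and path precedence.

```bash
# Illustrative sample: deny the training crawler, allow the search indexer.
sample_robots_txt() {
  cat <<'EOF'
# Keep content out of training, stay eligible for ChatGPT Search
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
EOF
}

# Print the Allow/Disallow rules in the group that names the given bot.
# $1 = bot token (e.g. GPTBot); robots.txt is read from stdin.
robots_rules_for() {
  awk -v bot="$1" '
    BEGIN { want = tolower(bot); in_group = 0; prev_was_ua = 0 }
    { sub(/#.*/, ""); gsub(/\r/, "") }        # drop comments and CRs
    /^[ \t]*$/ { next }                        # skip blank lines
    {
      pos = index($0, ":")
      if (pos == 0) next
      field = tolower(substr($0, 1, pos - 1))
      value = substr($0, pos + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", field)
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      if (field == "user-agent") {
        if (!prev_was_ua) in_group = 0         # a rule line ended the old group
        if (tolower(value) == want) in_group = 1
        prev_was_ua = 1
        next
      }
      prev_was_ua = 0
      if (in_group && (field == "allow" || field == "disallow"))
        print field ": " value
    }'
}

# Report what the sample says for each OpenAI bot.
for bot in GPTBot OAI-SearchBot ChatGPT-User; do
  rules=$(sample_robots_txt | robots_rules_for "$bot")
  printf '%s: %s\n' "$bot" "${rules:-no bot-specific rules}"
done
```

To audit a live file instead of the sample, the same helper can read from curl, for example `curl -s https://yoursite.com/robots.txt | robots_rules_for GPTBot`.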
§02 The asymmetry
The robots.txt asymmetry nobody documents
GPTBot and OAI-SearchBot respect robots.txt: they are well-behaved crawlers that read your file and obey Disallow rules. ChatGPT-User is not bound by it. OpenAI's documentation is explicit on this point: 'because these actions are initiated by a user, robots.txt rules may not apply.' The reasoning is that a user typing a query is making an active choice to fetch a page, similar to how a browser does not consult robots.txt before loading a typed URL. The implication for Squarespace: even with the AI exclusion box turned on, ChatGPT-User can still reach your site.
The behaviour parallels how Perplexity-User and Claude-User handle user-initiated requests on other AI engines: all three user-triggered fetchers prioritise the user's stated intent over the site's robots.txt. The honest framing for a Squarespace owner: robots.txt is a polite request to autonomous crawlers, not a hard enforcement mechanism for user-initiated traffic. If you want to block ChatGPT-User specifically, the path is not robots.txt but HTTP-level blocks (Cloudflare rules, hosting-side WAF) targeted at the published ChatGPT-User IP ranges [4].
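An IP-targeted block of that kind comes down to a CIDR membership test. The sketch below shows the test in plain shell arithmetic, IPv4 only. The function names are hypothetical, and the ranges are reserved documentation blocks (192.0.2.0/24 and 198.51.100.0/24), not OpenAI's real prefixes; the actual prefixes would come from the published ChatGPT-User file.

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) when address $1 falls inside CIDR block $2.
# Assumes well-formed input such as 192.0.2.77 and 192.0.2.0/24.
ip_in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  if [ "$bits" -eq 0 ]; then
    return 0           # a /0 block matches every address
  fi
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Placeholder range standing in for one published prefix.
for ip in 192.0.2.77 198.51.100.5; do
  if ip_in_cidr "$ip" 192.0.2.0/24; then
    echo "$ip: inside the block"
  else
    echo "$ip: outside the block"
  fi
done
```

In practice a Cloudflare rule or WAF performs this comparison for you. The sketch is mainly useful for spot-checking whether an address seen in analytics belongs to a published range.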
§03 The priority
Which bot actually drives live citations
OAI-SearchBot drives live citations in ChatGPT Search. When a user asks ChatGPT a question that triggers web-grounded search mode, the source cards above the answer come from OAI-SearchBot's index. ChatGPT-User drives a smaller share of citations — when a user explicitly asks ChatGPT to read a specific URL, or when the model decides mid-conversation that it needs to fetch a page. GPTBot drives almost no live citations — its contribution is to next year's model version, not today's answers.
The priority order for Squarespace owners who want today's citations: allow OAI-SearchBot first, allow ChatGPT-User second (or note that it does not respect robots.txt anyway), allow GPTBot third. The most common configuration mistake we see in audits is the inverse — sites that have specifically allowed GPTBot through a custom robots.txt rule while leaving OAI-SearchBot blocked or unverified. The result: training contribution but no live citations, which is the opposite of what most owners want.
What each bot contributes today: OAI-SearchBot's index decides today's ChatGPT Search citations.
One 2025 policy update worth noting: OAI-SearchBot and GPTBot now share crawl data, per OpenAI's revised documentation [5]. If a site has allowed both bots, OpenAI may use the results from one crawl for both purposes to avoid duplicate fetches. The change does not affect site owners directly, since it is an OpenAI-side optimisation, but it does mean that disallowing GPTBot while allowing OAI-SearchBot is now a cleaner configuration than it was when the two were strictly independent crawls.
§04 The recommendation
Per-bot recommendation for a Squarespace site
For a Squarespace owner whose goal is to be cited by ChatGPT, the cleanest configuration is to leave the Squarespace AI exclusion box unchecked, which allows all three OpenAI bots. If you specifically do not want to contribute training data, you cannot express that from the Squarespace panel directly: the toggle is all-or-nothing, and Squarespace does not let you edit robots.txt, so a custom Disallow rule for GPTBot alone is not available. The workaround is to leave the box off and add a per-bot meta directive via Code Injection, which requires the Business plan or above. Code Injection adds markup to the page, so it cannot set an HTTP response header such as X-Robots-Tag.
The honest constraint: Squarespace's panel does not give you per-bot control over OpenAI's three crawlers. The unchecked default allows all three (because none of them are in a Disallow block). The checked state disallows GPTBot (it is on the 26-bot list [2]) but does not affect OAI-SearchBot or ChatGPT-User (neither is on the list, and ChatGPT-User would not respect the rule anyway). The configuration most owners actually want (allow OAI-SearchBot, allow ChatGPT-User, deny GPTBot) is reachable through the panel only as a side effect of disallowing every other bot on the 26-bot list; the panel offers no way to deny GPTBot alone.
HTML: Per-bot meta directive via Code Injection (Business plan and above), site-wide header
<!-- Allow OAI-SearchBot and ChatGPT-User (default behaviour), block GPTBot training -->
<meta name="GPTBot" content="noindex">
<!-- Note: meta directives target specific bots by user-agent name. GPTBot respects this. -->
The cleaner method for full per-bot control is the X-Robots-Tag pattern documented in the AI Crawlers cluster, which covers Squarespace's three workarounds for the platform's lack of direct robots.txt editing. For most owners, the simplest approach — leave the panel box unchecked, accept that all three OpenAI bots can crawl — produces the best citation outcome.
§05 The audit
How to tell which OpenAI bot actually visited your site
If you have server-level access to your access logs, the user-agent string distinguishes the three bots immediately. Squarespace owners on Personal and Business plans do not get raw log access, but Squarespace Analytics surfaces some referrer and bot data. The pragmatic 2026 audit: run a curl impersonation test for each user-agent against your homepage, confirm 200 OK across all three, and watch the AI Visibility panel for branded-prompt mentions of your site.
The three curl one-liners below confirm reachability per bot. A 200 OK across all three means nothing keyed on the user-agent string is blocking OpenAI's crawlers. A 403 or 5xx on any one of them suggests a downstream block, such as a Cloudflare rule or a hosting-side WAF, and the audit is to identify which. A stray X-Robots-Tag header would not change the status code, but the -I flag prints the response headers, so it shows up in the same output.
bash: Three curl checks, one per OpenAI bot; expect 200 OK on each
# GPTBot: training crawler
curl -I -A "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.3; +https://openai.com/gptbot" https://yoursite.com/
# OAI-SearchBot: ChatGPT Search index (most important for live citations)
curl -I -A "Mozilla/5.0 ... OAI-SearchBot/1.3; +https://openai.com/searchbot" https://yoursite.com/
# ChatGPT-User: live, user-initiated fetch
curl -I -A "Mozilla/5.0 ... ChatGPT-User/1.0; +https://openai.com/bot" https://yoursite.com/
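Where raw access logs are available, the same three tokens can be tallied directly. A minimal sketch, assuming one User-Agent value per input line; the function names are hypothetical, and how you extract the User-Agent field depends on your log format.

```bash
# Map a User-Agent string to the OpenAI bot it names, or "other".
classify_openai_bot() {
  case "$1" in
    *GPTBot/*)        echo "GPTBot" ;;         # training crawler
    *OAI-SearchBot/*) echo "OAI-SearchBot" ;;  # ChatGPT Search index
    *ChatGPT-User/*)  echo "ChatGPT-User" ;;   # user-initiated fetch
    *)                echo "other" ;;
  esac
}

# Read User-Agent strings from stdin and count hits per bot.
tally_openai_bots() {
  while IFS= read -r ua; do
    classify_openai_bot "$ua"
  done | sort | uniq -c
}

# Demo using the abbreviated user-agent strings from the checks above.
printf '%s\n' \
  "Mozilla/5.0 ... OAI-SearchBot/1.3; +https://openai.com/searchbot" \
  "Mozilla/5.0 ... ChatGPT-User/1.0; +https://openai.com/bot" \
  "Mozilla/5.0 ... OAI-SearchBot/1.3; +https://openai.com/searchbot" |
  tally_openai_bots
```

With the common combined log format, something like `awk -F'"' '{print $6}' access.log | tally_openai_bots` would feed it, though that field position is an assumption about the log layout. A user-agent match alone can be spoofed, so treat the counts as indicative unless the source IPs also fall inside the published ranges.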