Published · Verified every 6 weeks · Sources: 5 named · Authored by SquareRank Team
Allow GPTBot on a Squarespace Site, the 90-Second Walkthrough
GPTBot is allowed by default on a fresh Squarespace site [2]: the "Block known artificial intelligence crawlers" checkbox in Settings → Crawlers ships unchecked. The walkthrough you actually need is the audit: confirm the box is still off, verify GPTBot is not in your live robots.txt, and live-check the OpenAI user-agent against your homepage.
This leaf is the step-by-step for OpenAI's GPTBot specifically. The companion leaf on ChatGPT-User vs OAI-SearchBot covers the two other OpenAI bots — both of which matter more for live citations than GPTBot itself does. If you have not opened Settings → Crawlers in twelve months, the audit below will tell you in under five minutes whether ChatGPT can read your site.
§01 The short answer
TL;DR — Squarespace allows GPTBot by default
A new Squarespace 7.1 site ships with the 'Block known artificial intelligence crawlers' checkbox unchecked, which means GPTBot and the 25 other AI bots Squarespace's panel controls can crawl your site. (OAI-SearchBot and ChatGPT-User are not on the panel list at all, so they remain reachable regardless.) The fix most people search for is one they probably do not need — unless an old designer flipped the box on, or a 2024-era 'protect your content from AI' tutorial talked you into it. The full audit is below: ninety seconds to confirm or correct.
Squarespace's own help center [3] says the same thing from the other direction. The recommendation reads: leave the box unchecked so AI search engines can read your site. The reason the question shows up so often is the gap between Squarespace's recommendation and the field guidance most Squarespace owners actually encountered between 2023 and 2025: designer blog posts, Reddit threads, and Squarespace-forum advice that all encouraged checking the box. If you do not remember opening Settings → Crawlers in the last two years, the most likely state is "still unchecked, all good."
§02 The default state
What 'default-allowed' means in practice
The default state on Settings > Crawlers is two checkboxes: one for general search engines (checked, allows Google and Bing to crawl normally), one for AI crawlers (unchecked, allows the 26 named AI bots Squarespace's panel controls). With both at their default, your site is reachable by Google, Bing, and OpenAI's bots without any further action. GPTBot will respect robots.txt and crawl normally. The implication: most Squarespace sites already pass the GPTBot test.
The 26-bot list Squarespace's checkbox controls [2] includes GPTBot, ClaudeBot, Google-Extended, CCBot, Meta-ExternalAgent, Applebot-Extended, and twenty other AI training crawlers. The exclusion is all-or-nothing: there is no per-bot toggle in the Squarespace UI. Checking the box adds a Disallow: / rule to robots.txt for each of the 26 user-agents, and unchecking it removes all 26 rules at once.
Why this matters for ChatGPT
Off: the default state of the AI exclusion box on a new Squarespace site, per Squarespace's own help docs.
From the Squarespace dashboard: click Settings in the left sidebar. Click Crawlers. Confirm the box labelled 'Block known artificial intelligence crawlers' is unchecked. If it is checked, uncheck it and click Save. That is the entire panel action. The remaining ninety seconds is for verification: open your live robots.txt in a private window and confirm GPTBot is not in a Disallow block, then live-check the user-agent.
The path is the same one Squarespace's help center documents [2]: "Open the Settings panel. Click Crawlers. Check the box next to 'Block known artificial intelligence crawlers.'" Read that as the opposite for owners who want citations: leave it unchecked, or uncheck it if a previous designer turned it on. There is no save confirmation pulse; the change writes to your live robots.txt within seconds.
One subtlety: the Squarespace UI state can briefly lag the live robots.txt file after a save. If the checkbox shows the right state but the live file does not, hard-refresh the panel, toggle once and save, then re-verify. The robots.txt itself is the authoritative source for what crawlers see.
§04 The verification
Open your live robots.txt and search for GPTBot
Open yoursite.com/robots.txt in a private browser window. Find-on-page for 'GPTBot'. If GPTBot appears followed by Disallow: /, the AI exclusion box is still effectively on — go back to Settings > Crawlers and uncheck it. If GPTBot does not appear in the file at all, the bot is allowed and the configuration is correct. This is the only verification step that matters; the Squarespace dashboard state is a UI representation, the robots.txt is the contract.
robots.txt: What you should NOT see when GPTBot is allowed (this excerpt means the AI block is ON)

```
# If your robots.txt contains lines like these, GPTBot is being told to stay out
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

# ...and 24 more Disallow blocks for the rest of the Squarespace AI list
```
When the AI box is unchecked and GPTBot is allowed, none of those Disallow rules appear in your robots.txt. The file will still contain rules for non-AI bots (the standard User-agent: * block, anything Squarespace manages for sitemap and admin paths) — but GPTBot will be absent, which means the rules-default permits crawling.
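The find-on-page check above can also be scripted. The following is a minimal sketch, not an official tool: `gptbot_status` is a hypothetical helper name, it reads a robots.txt from standard input, and it only looks for an explicit `Disallow: /` inside a group that names GPTBot. It deliberately ignores the wildcard `User-agent: *` group and partial-path rules, so treat its answer as a quick signal rather than a full robots.txt evaluation.

```shell
#!/bin/sh
# gptbot_status (hypothetical helper): reads robots.txt on stdin and prints
# "blocked" if a group naming GPTBot contains "Disallow: /", else "allowed".
# Simplification: the wildcard "User-agent: *" group is not evaluated.
gptbot_status() {
  awk '
    {
      line = $0
      sub(/#.*/, "", line)                   # drop comments
      gsub(/^[ \t]+|[ \t\r]+$/, "", line)    # trim whitespace and CR
      if (line == "") next
      colon = index(line, ":")
      if (colon == 0) next
      field = tolower(substr(line, 1, colon - 1))
      value = substr(line, colon + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", field)
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      if (field == "user-agent") {
        if (!prev_was_ua) in_group = 0       # a new group starts here
        if (tolower(value) == "gptbot") in_group = 1
        prev_was_ua = 1
        next
      }
      prev_was_ua = 0
      if (in_group && field == "disallow" && value == "/") blocked = 1
    }
    END { print (blocked ? "blocked" : "allowed") }
  '
}
```

Assuming the function is loaded in your shell, one way to use it is `curl -s https://yoursite.com/robots.txt | gptbot_status`, replacing yoursite.com with your live domain. A result of "allowed" corresponds to the state described above, where GPTBot has no Disallow rule of its own.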
§05 The live test
Live-check the OpenAI user-agent against your homepage
The robots.txt verification confirms intent. The live curl test confirms behaviour. Run curl with the GPTBot user-agent string against your homepage and confirm a 200 OK response. The user-agent string is documented on OpenAI's bots page, and the test takes one command. If the response is 200 OK, GPTBot can fetch your site. If it is 403 Forbidden, something downstream of the panel — a Cloudflare rule, a hosting-side WAF, a custom header — is intercepting the bot.
bash: The verification one-liner (replace yoursite.com with your live domain)

```bash
# Test with the published GPTBot user-agent string
curl -I -A "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.3; +https://openai.com/gptbot" https://yoursite.com/
# Expected: HTTP/2 200 OK (with normal response headers)
# Problem state: HTTP/2 403 Forbidden, or anything in the 4xx/5xx range
```
The user-agent string is published on OpenAI's bots documentation [1]. OpenAI also publishes the IP range list at openai.com/gptbot.json [5], which is useful if you want to cross-reference your server logs to see which GPTBot visits actually came from OpenAI versus user-agent-spoofers. The IP-based check is more thorough than the user-agent check, but for most Squarespace owners the curl one-liner is enough: Squarespace does not block user-agents at the platform level, so a 200 OK in this test means the bot can crawl.
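If you want to repeat the live test across several pages, the one-liner can be wrapped in a small script. This is a sketch under stated assumptions: `classify_status` and `gptbot_check` are hypothetical helper names, the user-agent string is the one quoted in this walkthrough, and the verdict wording is illustrative. It relies only on standard curl options (`-s`, `-o`, `-w '%{http_code}'`, `-A`).

```shell
#!/bin/sh
# GPTBot user-agent string as quoted in this walkthrough.
GPTBOT_UA='Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.3; +https://openai.com/gptbot'

# classify_status (hypothetical helper): map an HTTP status code to a verdict.
classify_status() {
  case "$1" in
    200) echo "reachable" ;;
    301|302|307|308) echo "redirect" ;;
    403) echo "forbidden" ;;
    4??|5??) echo "error" ;;
    *) echo "unknown" ;;
  esac
}

# gptbot_check (hypothetical helper): request a URL with the GPTBot
# user-agent, discard the body, and print the status code plus verdict.
gptbot_check() {
  code=$(curl -s -o /dev/null -w '%{http_code}' -A "$GPTBOT_UA" "$1")
  printf '%s %s\n' "$code" "$(classify_status "$code")"
}
```

A call such as `gptbot_check https://yoursite.com/` prints the status code followed by the verdict. A "forbidden" verdict matches the 403 case described earlier, where something downstream of the Squarespace panel is intercepting the bot; a "redirect" verdict suggests re-running the check against the final URL.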
§06 If the box was on
Recovering from a previously-blocked state
If you find GPTBot in the Disallow block and uncheck the panel today, OpenAI will pick up the new robots.txt on its next crawl. For ChatGPT Search citations, that typically lands within 5-14 days. For GPTBot training data, the effect is longer — model versions ship on OpenAI's release cadence, which is months. Squarespace's help center is explicit that blocking does not retroactively remove anything already scraped, so anything OpenAI learned from your site before the block is already in the model.
The recovery curve has two parts. The fast part is OAI-SearchBot and the ChatGPT Search index — once the robots.txt is clean, OAI-SearchBot will re-fetch on its normal schedule and your URL will start surfacing in source cards again. The slow part is GPTBot's training contribution, which is a one-way function: anything not crawled during the training window is not in the model, and the next chance to be included is the next model version.
The honest framing for owners who toggled the box on years ago: you may have missed a training cycle, but you have not lost long-term citation potential. Squarespace's exclusion explicitly does not delete what was already scraped [2], and ChatGPT Search uses a live-retrieval crawler (OAI-SearchBot) that does not depend on training data to surface your page. Unblock the bot, fix the citation hygiene, and the surface comes back.