§ 01 The limit

Squarespace does not let you edit robots.txt directly

Squarespace generates a default robots.txt for every site and does not provide a UI, API, or file-upload path to modify it. The file is at yoursite.com/robots.txt, but its contents are platform-managed. The only ways to influence what appears in it are the two checkboxes in Settings > Crawlers and the page-level 'Hide this page from search results' option on each page's SEO tab.

Squarespace publishes its own platform-level robots.txt at support.squarespace.com/robots.txt³ and a help article on hiding pages from search² that lists the controls available to site owners. The set is narrow on purpose. The trade-off is that owners cannot break their robots.txt because they cannot touch it.

For AI crawler control specifically, the consequence is that you cannot add a custom Disallow rule for a bot that is not on Squarespace's 26-bot list¹. You cannot block ChatGPT-User but allow OAI-SearchBot directly. You cannot block Claude-User but allow Claude-SearchBot. The Crawlers panel groups bots in fixed clusters, and the file follows the panel.
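
Because the file cannot be edited, the practical move is to audit it. The sketch below uses Python's standard-library urllib.robotparser to report which user-agents a robots.txt body blocks. The sample file contents and the crawl_permissions helper are assumptions for illustration; they are not a copy of what Squarespace actually serves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file contents for illustration only. The real
# platform-managed file names more user-agents and other paths.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: *
Disallow: /config
"""


def crawl_permissions(robots_txt, agents, url="https://example.com/"):
    """Return {agent: bool} saying whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


report = crawl_permissions(
    SAMPLE_ROBOTS_TXT,
    ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "Googlebot"],
)
for agent, allowed in report.items():
    print(f"{agent:15} {'allowed' if allowed else 'blocked'}")
```

Running the same function against the text of your own yoursite.com/robots.txt shows whether the AI checkbox is currently in effect, and makes the all-or-nothing grouping visible: ChatGPT-User and OAI-SearchBot come out blocked or allowed together.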

What the platform exposes natively

2 checkboxes in Settings > Crawlers: one for search engines, one for the 26-bot AI list.

0 ways to add a custom Disallow line for a single user-agent not on the AI list.

1 per-page toggle: 'Hide this page from search results' in the page's SEO settings.

Source for all three: Squarespace Help · 2026

§ 02 Panel

What the Crawlers panel can actually do

The Crawlers panel writes exactly two kinds of rules into robots.txt: the search-engine block (when toggled on, all major search crawlers are disallowed sitewide) and the AI block (when toggled on, the 26 named AI user-agents are disallowed sitewide). The panel does not let you select a subset, a path, or a user-agent outside the fixed groups.

The search-engine block should stay off (unchecked) for any site that wants to be indexed by Google or Bing, because checking it disallows those crawlers sitewide. The AI block is the one Squarespace owners debate. As covered in the hub and the bot-list leaf, it disallows 26 user-agents at once¹, most of which are training crawlers. For a site that wants AI citations, the default unchecked state is the recommended one.

There is no third checkbox, no advanced sub-panel, no API. If you want anything beyond the two-toggle setup, you need one of the workarounds below.
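
The two-toggle behaviour can be pictured as a function of two booleans. The following is a toy model only: the panel_rules helper and the abbreviated agent lists are assumptions made for illustration, and the real clusters (including the full 26-bot AI list) are defined by Squarespace, not by this code.

```python
# Abbreviated, illustrative clusters. These are NOT Squarespace's real lists.
SEARCH_AGENTS = ["Googlebot", "Bingbot"]
AI_AGENTS = ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot"]


def panel_rules(block_search: bool, block_ai: bool) -> str:
    """Render the sitewide Disallow groups implied by the two checkboxes."""
    blocked = []
    if block_search:
        blocked += SEARCH_AGENTS
    if block_ai:
        blocked += AI_AGENTS
    # Every blocked agent gets the same sitewide rule. There is no input
    # for a single agent or a single path, which mirrors the panel's limit.
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in blocked]
    return "\n\n".join(groups)


print(panel_rules(block_search=False, block_ai=True))
```

The function signature is the point: with only two boolean inputs, there is no way to express "block ChatGPT-User but allow OAI-SearchBot".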

§ 03 Workaround 1

Workaround 1: page-level noindex via the SEO tab

Every Squarespace plan exposes a 'Hide this page from search results' toggle in each page's SEO settings. Checking it adds a noindex directive that all major crawlers — search and AI — should respect. This is the lowest-friction way to remove a single page from AI surfaces without touching anything site-wide.

The mechanism is straightforward: Squarespace adds a <meta name="robots" content="noindex"> tag to the page's head when the toggle is on. Search and AI engines that read meta robots — which is to say all of the major ones — exclude the page from indexing².
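
To confirm the toggle took effect, you can inspect the page's HTML for the tag. The sketch below uses Python's standard-library html.parser; the RobotsMetaFinder class and is_noindex helper are hypothetical names, and the sample markup is a stand-in for a saved copy of your page source.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def is_noindex(html: str) -> bool:
    """True if the page asks crawlers not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return bool(finder.directives & {"noindex", "none"})


hidden_page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(is_noindex(hidden_page))
```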

The use cases that justify it: private membership pages, internal documentation pages, draft pages still in editing, thank-you pages whose URLs you don't want appearing in AI-search results. The use cases that do not: anything that should appear in classical search (because noindex is bot-agnostic, blocking Google as well as ChatGPT).

§ 04 Workaround 2

Workaround 2: site-wide meta robots via Code Injection (Business plan+)

On Business plan and above, Settings > Advanced > Code Injection lets you add HTML to the site's head. A site-wide meta robots tag here gives you control over snippet length, image-preview behaviour, and video-preview behaviour for every page — including how AI engines summarise your content when they cite you.

The pattern is a single line in the Header injection panel:

HTML Site-wide meta robots tag injected into the head

 <meta name="robots" content="max-snippet:-1, max-image-preview:large, max-video-preview:-1">

That tag tells crawlers that you allow snippets of any length, large image previews, and video previews of any length. For an AI-citation site, those are the permissive settings: you want engines to be able to lift longer passages and use richer image cards when they cite you. The opposite values (max-snippet:0, max-image-preview:none) would suppress those features.
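
The content attribute is a comma-separated list of directives, some of which carry a value after a colon. A minimal sketch of how a reader of that tag might parse it follows; parse_robots_content and is_permissive are hypothetical helper names, not part of any crawler's published code.

```python
def parse_robots_content(content: str) -> dict:
    """Split a meta robots content string into {directive: value}."""
    directives = {}
    for part in content.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, sep, value = part.partition(":")
        # Valueless directives such as "noindex" are stored as True.
        directives[name.strip()] = value.strip() if sep else True
    return directives


def is_permissive(directives: dict) -> bool:
    """True when snippets, image previews and video previews are unrestricted."""
    return (
        directives.get("max-snippet") == "-1"
        and directives.get("max-image-preview") == "large"
        and directives.get("max-video-preview") == "-1"
    )


tag_content = "max-snippet:-1, max-image-preview:large, max-video-preview:-1"
print(parse_robots_content(tag_content))
print(is_permissive(parse_robots_content(tag_content)))
```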

This pattern does not let you target a specific user-agent. The meta robots tag is read by all crawlers and applied uniformly. For per-bot rules, the third workaround is the only Squarespace-native option.

§ 05 Workaround 3

Workaround 3: X-Robots-Tag HTTP headers via Developer Mode

The most granular workaround, and the steepest learning curve. Squarespace Developer Mode lets a code-comfortable owner override page templates at the file level, which includes the ability to add custom HTTP headers like X-Robots-Tag. This is the closest Squarespace allows you to get to a per-user-agent allow/deny pattern.

X-Robots-Tag is the HTTP-header equivalent of meta robots, but with one important difference: it can carry a user-agent prefix, allowing different directives for different bots. The pattern looks like this:

HTTP X-Robots-Tag with per-user-agent directives

 X-Robots-Tag: GPTBot: noindex
 X-Robots-Tag: ClaudeBot: noindex
 X-Robots-Tag: Google-Extended: noindex
 # Retrieval bots get no directive, so they remain allowed.
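
How a crawler might read those headers can be sketched in a few lines of Python. The directives_for helper is a hypothetical name, and the parsing rule (treat the text before the first colon as a user-agent unless it is a directive that takes its own value) is an assumption for illustration. Whether a given bot honours prefixed values at all is up to that bot.

```python
# Directives whose own syntax contains a colon, so "name: value" here
# is a directive with a value rather than a user-agent prefix.
VALUED_DIRECTIVES = {
    "max-snippet",
    "max-image-preview",
    "max-video-preview",
    "unavailable_after",
}


def directives_for(header_values, bot):
    """Collect directives from X-Robots-Tag values that apply to bot."""
    applicable = set()
    for value in header_values:
        prefix, sep, rest = value.partition(":")
        prefix = prefix.strip().lower()
        scoped = bool(sep) and "," not in prefix and prefix not in VALUED_DIRECTIVES
        if scoped:
            # Scoped to one user-agent, e.g. "GPTBot: noindex".
            if prefix != bot.lower():
                continue
            rules = rest
        else:
            # No user-agent prefix: the value applies to every crawler.
            rules = value
        applicable.update(r.strip().lower() for r in rules.split(",") if r.strip())
    return applicable


headers = ["GPTBot: noindex", "ClaudeBot: noindex", "Google-Extended: noindex"]
print(directives_for(headers, "GPTBot"))
print(directives_for(headers, "OAI-SearchBot"))
```

With the three header values from the pattern above, a training crawler such as GPTBot sees noindex, while a retrieval bot that is not named sees no directives and proceeds as normal.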

The reality is that Squarespace's Developer Mode does not expose HTTP-header-level control on most plans. Plans that do support it route through complex template configuration, and the supported set of headers is narrow. For most owners, the third workaround is theoretical: it documents what is possible in principle, not what is feasible inside the platform's UI.

The owners who can actually use this pattern are those running Squarespace Commerce Advanced or Enterprise plans with a developer relationship, or those who have moved their templating to a custom-domain reverse-proxy setup. For everyone else, Workaround 1 (page-level noindex) and Workaround 2 (site-wide meta robots) cover the realistic surface.

§ 06 The honour system

The honour-system limit no workaround can fix

Robots.txt, meta robots, and X-Robots-Tag are all voluntary protocols. A crawler that has decided to ignore them will. Cloudflare's August 2025 investigation reported that Perplexity rotated through undeclared user-agents and IP ranges to access test domains that explicitly blocked the declared Perplexity bots. The takeaway is that none of the three workarounds is enforcement; they are requests.

Cloudflare's findings, published on August 4, 2025, are the most thoroughly documented public case⁸. The investigators created brand-new test domains with strict robots.txt blocks against Perplexity's declared crawlers, and added WAF rules to reinforce the block. Content from those domains still showed up in Perplexity's answers, fetched via a generic Chrome user-agent and from IP addresses outside Perplexity's documented ranges.

Cloudflare's response was to de-list Perplexity as a verified bot and add detection signatures to its managed bot rules. For Squarespace owners, the practical implications are narrower because Squarespace does not expose WAF-level controls — and broader because the underlying point applies to any bot, not just Perplexity. If genuine enforcement matters (and for most small-business sites trying to be cited by AI engines, it does not), the platform layer is the wrong place to do it. A reverse proxy with bot-detection rules is the right one, and that is a much larger change than this cluster covers.
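
The gap between a declared identity and real enforcement can be shown with a small check of the kind a reverse proxy might run as a first pass. Everything named here is an assumption for illustration: ExampleBot is a made-up crawler, and the address ranges are reserved documentation networks rather than any vendor's published list.

```python
import ipaddress

# Hypothetical published ranges for a made-up bot, using reserved
# documentation networks. Real bots publish their own lists.
DECLARED_RANGES = {"ExampleBot": ["192.0.2.0/24"]}


def classify_request(user_agent: str, remote_ip: str) -> str:
    """Compare a request's claimed identity with its source address."""
    ip = ipaddress.ip_address(remote_ip)
    for bot, cidrs in DECLARED_RANGES.items():
        if bot.lower() in user_agent.lower():
            in_range = any(ip in ipaddress.ip_network(cidr) for cidr in cidrs)
            return "verified" if in_range else "spoofed-or-unlisted"
    # No known bot name in the user-agent: robots rules keyed on that
    # name never match, so the request looks like ordinary browser traffic.
    return "undeclared"


print(classify_request("Mozilla/5.0 (compatible; ExampleBot/1.0)", "192.0.2.10"))
print(classify_request("Mozilla/5.0 Chrome/124.0 Safari/537.36", "198.51.100.7"))
```

The second call is the case that matters: a fetch with a generic browser user-agent from an unlisted address matches no per-bot rule, which is why header- and file-based controls stay requests rather than enforcement.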

For sites that have a normal AI-citation goal (be cited, not be hidden), the simple Squarespace setup — both checkboxes in the default state, no custom code — is the cleanest path. The complexity in this article exists for the minority of sites with specific exclusion requirements, not for the majority who should leave the panel alone and focus on the content layer.

