§ 1.2.2 ARTICLE
Published Verified Every 6 weeks Sources 5 named Reading time 8 min

Allow PerplexityBot and Perplexity-User on Squarespace

Perplexity runs two crawlers. Neither one is on Squarespace's 26-bot AI exclusion list [2], which means the Settings > Crawlers checkbox has no effect on Perplexity in either direction. The audit takes five minutes; the fix takes longer only when a CDN layer or Code Injection rule is blocking outside Squarespace's panel.

PerplexityBot and Perplexity-User do different jobs

PerplexityBot indexes content for Perplexity's search layer and respects robots.txt. Perplexity-User fetches a page live when a user asks Perplexity a question that requires a fresh visit, and per Perplexity's own documentation, generally ignores robots.txt rules because user-initiated requests drive the visit. Both crawlers carry distinct user-agent strings and operate from documented IP ranges that Perplexity publishes for WAF configuration.
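The published IP ranges are what make a WAF allow rule possible. A minimal sketch of that check in Python follows; the CIDR blocks below are placeholders taken from the reserved documentation ranges, not Perplexity's real ranges, and the function name is hypothetical.

```python
import ipaddress

# PLACEHOLDER ranges (reserved documentation blocks), NOT Perplexity's real ones.
# Swap in the CIDR blocks Perplexity publishes before relying on this.
EXAMPLE_PUBLISHED_RANGES = ["192.0.2.0/24", "198.51.100.0/25"]


def ip_in_published_ranges(ip, cidrs):
    """Return True if the address falls inside any of the given CIDR blocks."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)
```

A request that claims a Perplexity user-agent but arrives from an address outside the published ranges is a candidate for closer inspection, since user-agent strings alone can be spoofed.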

Perplexity's crawler documentation states each role plainly [1]. PerplexityBot is described as “designed to surface and link websites in search results on Perplexity. It is not used to crawl content for AI foundation models.” The user-agent string is Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot). Perplexity-User is described as supporting “user actions within Perplexity. When users ask Perplexity a question, it might visit a web page to help provide an accurate answer.” The user-agent identifies as Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user), and the docs note the bot “generally ignores robots.txt rules.”
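When reading server or CDN logs, the two crawlers can be told apart by the product token inside the user-agent string. The sketch below assumes simple substring matching on the tokens quoted above; the function name is hypothetical, and a match only shows what the client claims to be.

```python
from typing import Optional


def classify_perplexity_agent(user_agent: str) -> Optional[str]:
    """Map a raw User-Agent header to the declared Perplexity crawler, if any."""
    ua = user_agent.lower()
    if "perplexity-user/" in ua:
        return "Perplexity-User"  # live, user-initiated fetch
    if "perplexitybot/" in ua:
        return "PerplexityBot"  # search-index crawler
    return None  # not a declared Perplexity crawler
```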

The robots.txt asymmetry mirrors the one OpenAI documents for ChatGPT-User [4]. The shared reasoning across both engines: when a person asks an AI agent a live question, the resulting page fetch is treated as a user-initiated visit rather than a crawler harvest, and robots.txt is not generally interpreted to govern human navigation. The practical consequence: even if a Squarespace owner blocks PerplexityBot through a custom rule, Perplexity-User will continue to fetch pages on behalf of users who ask Perplexity a question that requires it.

Why Squarespace's panel does not control either bot

Squarespace's AI exclusion checkbox controls 26 named user-agents. The list is heavily weighted toward training crawlers — GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Meta-ExternalAgent — plus a few smaller AI training agents from Cohere, AI2, You.com and others. Neither PerplexityBot nor Perplexity-User appears on the list. This is not an oversight that needs fixing; it is consistent with how Squarespace's panel is scoped to handle training-class exclusion requests, not live-retrieval traffic.

The Squarespace help center lists the 26 controlled user-agents [2] and notes the default state: “We default to having the box unchecked (which means we haven't added any 'AI do not crawl' requests to your robots.txt file).” The companion AI-optimization help article [5] reinforces the same recommendation from the other direction: leave the box unchecked if you want AI engines to surface your site.

For Perplexity specifically, the implication is that the Squarespace UI has no effect on either crawler. Allowing PerplexityBot requires no action because the platform was never blocking it. Blocking PerplexityBot is not possible through the Squarespace panel and would need to happen at the Code Injection or CDN layer instead. Perplexity ranking and citation work on a Squarespace site therefore concentrates on the content layer rather than the crawler-access layer. That differs from ChatGPT or Claude, whose training crawlers (GPTBot, ClaudeBot) the panel does control.

The 5-minute audit to confirm both bots can reach your site

Four checks cover the entire surface where something could be blocking Perplexity on a Squarespace site: robots.txt for declared rules, Code Injection for custom injected directives, CDN dashboards for layer-level blocks, and a live user-agent impersonation test to confirm a 200 OK response. Most sites pass all four checks because the default Squarespace state is permissive; the audit is the discipline that confirms it.

Step one: open yoursite.com/robots.txt in a private browser window. Search the file (Ctrl+F or Cmd+F) for “Perplexity”. On a default Squarespace install, neither PerplexityBot nor Perplexity-User should appear. If you find a User-agent: PerplexityBot entry followed by Disallow: /, custom code is adding it somewhere — this is the rare case where intervention is needed.
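The same check can be scripted. The sketch below uses Python's standard-library robots.txt parser on text you have already copied out of yoursite.com/robots.txt; the function name is hypothetical, and the standard-library parser's rule matching may differ in edge cases from how Perplexity itself reads the file.

```python
from urllib.robotparser import RobotFileParser


def perplexitybot_allowed(robots_txt: str, path: str = "/") -> bool:
    """Return True if the given robots.txt text lets PerplexityBot fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in-memory text, no network call
    return parser.can_fetch("PerplexityBot", path)
```

A result of False for the homepage path corresponds to the rare case described above, where a Disallow rule for PerplexityBot is present.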

Step two: open Settings → Advanced → Code Injection. Search both the Header and Footer fields for “Perplexity” or “noai.” Code Injection on Business and higher plans can add meta robots tags like <meta name="robots" content="noai" /> that some AI bots interpret as a do-not-crawl signal. Remove any such tag if your goal is Perplexity citations.
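If the injected header is long, a script can do the searching. The following is a minimal sketch that scans an HTML fragment for a meta robots tag carrying the noai directive; the class and function names are hypothetical, and it only inspects static markup, not tags added by JavaScript at runtime.

```python
from html.parser import HTMLParser


class MetaRobotsScanner(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noai_directive(html: str) -> bool:
    """Return True if any meta robots tag in the HTML contains 'noai'."""
    scanner = MetaRobotsScanner()
    scanner.feed(html)
    return "noai" in scanner.directives
```

Pasting the contents of the Header and Footer fields into `has_noai_directive` gives a quick yes or no before editing anything in the dashboard.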

Step three: check your CDN or WAF layer if you have one. Squarespace sites can sit behind Cloudflare, Fastly, or other infrastructure when an owner has configured a custom DNS or proxy setup. Cloudflare in particular ships managed rules that block AI bots; the rules trigger separately from anything in the Squarespace panel. Log into your CDN dashboard and look for AI-bot-blocking rules under Security, Bot Management, or WAF rules.

robots.txt: What you should NOT see in your robots.txt if you want Perplexity citations

    # Default Squarespace robots.txt: neither user-agent appears.
    # If you see either of these blocks, custom code added them:
    User-agent: PerplexityBot
    Disallow: /

    User-agent: Perplexity-User
    Disallow: /

    # Perplexity-User ignores robots.txt regardless, but PerplexityBot
    # respects it. The Disallow rule will reduce indexed surface area.

Step four: run a live user-agent impersonation test. Send a request to your homepage with the PerplexityBot user-agent string and check for a 200 OK response. A 403 or 429 means something between Squarespace and the bot is blocking the request — usually a CDN-layer rule. The free crawler-check tool at /tools/crawler-check/ runs the impersonation for the major AI user-agents in under 60 seconds.
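For owners who prefer to run the impersonation themselves, here is a minimal sketch using only Python's standard library. The function names and the status labels are hypothetical. Because the request leaves from your own IP address, it exercises user-agent rules only; a block keyed on IP ranges would not show up in this test.

```python
import urllib.error
import urllib.request

# User-agent string as quoted from Perplexity's documentation earlier in this article.
PERPLEXITYBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)"
)


def fetch_status_as_perplexitybot(url: str, timeout: float = 10.0) -> int:
    """Request url with the PerplexityBot user-agent and return the HTTP status."""
    request = urllib.request.Request(url, headers={"User-Agent": PERPLEXITYBOT_UA})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions


def interpret_status(status: int) -> str:
    """Translate a status code into the audit outcome described above."""
    if status == 200:
        return "reachable"
    if status in (403, 429):
        return "blocked upstream, check the CDN or WAF layer"
    return "inconclusive, inspect manually"
```

Calling `interpret_status(fetch_status_as_perplexitybot("https://yoursite.com/"))` with your own domain in place of the placeholder returns one of the three labels.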

The fix when an audit step fails

If robots.txt contains a Disallow rule for PerplexityBot, the source is almost always a Code Injection block — find and remove it. If Code Injection is clean, the rule is coming from a CDN layer (most commonly Cloudflare's managed AI bot rules), and you fix it in the CDN dashboard rather than in Squarespace. If the live user-agent test returns 403 with no robots.txt rule visible, that is also a CDN-layer signal and points to the same fix path.
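The triage in the paragraph above reduces to a small decision table. The sketch below encodes it as written; the function name, parameter names, and return labels are hypothetical, and it assumes you have already collected the three audit results by hand or by script.

```python
def suggest_fix_path(
    robots_disallows_bot: bool,
    code_injection_clean: bool,
    live_status: int,
) -> str:
    """Pick a fix path from the audit results, following the rules above."""
    if robots_disallows_bot:
        # A Disallow rule usually traces to Code Injection; if that is clean,
        # the rule is coming from the CDN layer.
        return "cdn" if code_injection_clean else "code_injection"
    if live_status == 403:
        # Blocked with no visible robots.txt rule: also a CDN-layer signal.
        return "cdn"
    return "no_action"
```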

The Code Injection fix path: open Settings → Advanced → Code Injection in the Squarespace dashboard. Remove any meta robots tag containing “noai” or any inline JavaScript that adds a robots directive at runtime. Save and republish the page. The robots.txt file regenerates within minutes; reopen yoursite.com/robots.txt in a private window to confirm.

The CDN fix path depends on which provider you use. On Cloudflare, navigate to Security → Bots and look for the “Block AI bots” toggle or the managed rule named “Block AI Scrapers and Crawlers.” Disable it if your goal is Perplexity citations. On Fastly, look for VCL snippets or Edge Authorization rules targeting the PerplexityBot user-agent. On AWS WAF, look for managed rule groups that include the AI bot category. The Cloudflare-specific dashboard path changed in late 2025 after Cloudflare expanded its AI rule set in response to the stealth-crawler investigation [3].

The Cloudflare stealth-crawler caveat, documented honestly

In August 2025, Cloudflare published an investigation finding that Perplexity rotated through undeclared user-agents — generic Chrome strings on macOS, IPs outside Perplexity's published range, different ASNs — to fetch content when its declared crawlers were blocked. Cloudflare de-listed Perplexity from its verified-bot allowlist and added detection heuristics to managed rules. The investigation matters for Squarespace owners because it shows robots.txt is not a reliable enforcement layer for this engine. Allowing PerplexityBot is the recommended state regardless, because blocking it does not reliably stop the engine and the stealth fetch produces a worse index entry than the declared one.

Cloudflare's findings [3] documented two distinct crawling patterns across tens of thousands of domains. The declared crawlers (PerplexityBot and Perplexity-User) generated roughly 20 to 25 million daily requests across the Cloudflare network. The stealth crawler (a generic Chrome browser identifier on macOS, rotating through IPs outside Perplexity's published range) generated an additional 3 to 6 million daily requests, with traffic correlated to sites that had blocked the declared crawlers.

Cloudflare took two visible actions. First, the company removed Perplexity from its verified-bot allowlist, which downgrades the bot's reputation in Cloudflare's classification system. Second, it shipped detection heuristics into its managed rules to identify and block the stealth pattern for customers who want it blocked, with rule updates extended to all Cloudflare customers including the free tier. The remediation matters for Squarespace owners who proxy through Cloudflare: even sites that did nothing about Perplexity may now have the stealth pattern blocked by default, depending on which managed rule set their Cloudflare account uses.

Frequently asked questions

The three questions Squarespace owners ask most often about PerplexityBot access, answered in the format AI engines prefer.

Does the Squarespace AI exclusion checkbox block PerplexityBot?

No. The 26 named bots on Squarespace's AI exclusion list do not include PerplexityBot or Perplexity-User. Toggling the box on or off has no direct effect on either Perplexity crawler. The Squarespace panel was built primarily around AI training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) and a few search-index crawlers, but the live-retrieval bots for ChatGPT, Claude, and Perplexity are largely absent from the list.

Will blocking PerplexityBot stop Perplexity from citing me?

Partially. PerplexityBot indexes content for Perplexity's search layer and respects robots.txt, so a Disallow rule for PerplexityBot will reduce your indexed surface area. But Perplexity-User, the bot that fetches pages live when a user asks a question, generally ignores robots.txt per Perplexity's own documentation. And Cloudflare's August 2025 investigation found Perplexity rotating through undeclared user-agents when its declared crawlers were blocked. The practical answer: blocking PerplexityBot reduces but does not eliminate Perplexity traffic to your site.

Can I edit robots.txt on Squarespace to allow or block specific bots?

Squarespace generates robots.txt automatically and does not expose direct editing on any plan. The Crawlers panel modifies the file for the 26 named AI bots and the main search-engine toggle, but you cannot add per-bot Disallow rules through the UI. Custom robots.txt-like behavior requires Code Injection (Business plan and above) using meta robots tags. X-Robots-Tag is an HTTP response header rather than markup, so it would need to be set at a proxy or CDN layer instead of in injected HTML. Either way, the granularity remains limited.