To let a third-party vendor crawl your Webflow site (e.g., for legal compliance archiving), you need to make sure nothing blocks their crawler, because Webflow does not support native IP or User-Agent whitelisting.
1. Understand Webflow's Hosting Firewall
- Webflow is built on AWS CloudFront and uses a global CDN and firewall that you cannot configure manually.
- You cannot whitelist specific IPs or User Agents directly within Webflow.
- However, public Webflow sites are crawlable unless you block crawlers manually via Webflow settings or robots.txt.
2. Ensure Your Site Is Public and Crawlable
- Go to Project Settings > SEO tab.
- Make sure “Disable Webflow subdomain indexing” is unchecked.
- In the robots.txt editor, do not block the vendor's crawler. For example:
- Avoid a blanket block such as "User-agent: *" followed on the next line by "Disallow: /".
- Optionally add "User-agent: [VendorBot]" followed on the next line by "Allow: /" (replace [VendorBot] with the actual bot name).
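One way to sanity-check a robots.txt before publishing it is to parse it locally and ask whether the vendor's crawler would be allowed. The sketch below uses Python's standard urllib.robotparser module. The bot name VendorBot, the example.com URL, and the helper name is_allowed are placeholders for illustration, not values Webflow or any vendor defines.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A blanket block: every crawler, including the vendor's, is disallowed.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# An explicit allow rule for the (hypothetical) vendor bot, while other bots stay blocked.
ALLOW_VENDOR = """\
User-agent: VendorBot
Allow: /

User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    url = "https://www.example.com/pricing"  # placeholder URL
    print(is_allowed(BLOCK_ALL, "VendorBot", url))     # False
    print(is_allowed(ALLOW_VENDOR, "VendorBot", url))  # True
```

Python's parser is only an approximation of how a given vendor's crawler interprets robots.txt, so treat the result as a quick check rather than a guarantee.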
3. Share Custom User-Agent or IPs With Webflow Support
- If the vendor has a known static IP address or unique User-Agent, contact Webflow Enterprise Support.
- In rare cases, Webflow support may make exceptions for enterprise teams to reduce false flagging or rate-limiting of a known crawler.
4. Avoid Blocking by Rate Limits or Bot Detection
- Many third-party crawlers get blocked due to aggressive crawling behavior.
- Advise the vendor to:
- Throttle their crawl rate (e.g., 1 request/sec).
- Use a custom User-Agent string that clearly identifies the crawler.
- Respect robots.txt rules.
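The three recommendations above can be sketched on the vendor's side roughly as follows. This is a minimal illustration and not any vendor's actual crawler: the User-Agent string, the function names polite_fetch and http_fetch, and the one-second default interval are all assumptions. The fetch function is passed in so the throttling logic can be exercised without network access.

```python
import time
import urllib.request
from urllib.robotparser import RobotFileParser

# Hypothetical User-Agent that clearly identifies the crawler and its operator.
USER_AGENT = "ExampleArchiveBot/1.0 (+https://vendor.example/bot)"


def polite_fetch(urls, robots_txt, fetch, min_interval=1.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Fetch urls one by one, skipping disallowed paths and pausing between requests."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    results = {}
    last_request = None
    for url in urls:
        # Respect robots.txt: skip anything the site disallows for this agent.
        if not parser.can_fetch(USER_AGENT, url):
            continue
        # Throttle: leave at least min_interval seconds between request starts.
        if last_request is not None:
            wait = min_interval - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        results[url] = fetch(url, USER_AGENT)
    return results


def http_fetch(url, user_agent):
    """Fetch a URL over HTTP, sending the identifying User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()
```

In real use, the vendor would call polite_fetch(urls, robots_txt, http_fetch) after downloading the site's robots.txt.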
5. Use a Proxy or Middleware (Workaround for Enterprise Clients)
- If whitelisting is non-negotiable, consider setting up a reverse proxy server outside Webflow that mirrors your site and allows whitelisting.
- This works by:
- Hosting the proxy on a separate server.
- Serving the exact content of your Webflow site.
- Applying custom firewall rules to the proxy server.
- This requires developer resources and is typically used only in enterprise setups.
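To make the proxy idea concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not a production setup: the upstream address, the allowed network (a documentation-only IP range), the port, and the names ip_is_allowed, AllowlistProxy, and serve are all placeholders. A real deployment would more likely use a dedicated reverse proxy with TLS, caching, and proper header handling.

```python
import ipaddress
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "https://your-site.webflow.io"  # hypothetical Webflow-hosted origin
# Placeholder allowlist; 203.0.113.0/24 is a range reserved for documentation.
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def ip_is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any allowlisted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


class AllowlistProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Custom firewall rule: reject clients that are not on the allowlist.
        # Note: behind a load balancer, client_address is the balancer, not the crawler.
        if not ip_is_allowed(self.client_address[0]):
            self.send_error(403, "Client IP is not on the allowlist")
            return

        # Forward the request path to the Webflow-hosted site.
        headers = {}
        if self.headers.get("User-Agent"):
            headers["User-Agent"] = self.headers["User-Agent"]
        upstream_request = urllib.request.Request(UPSTREAM + self.path, headers=headers)
        try:
            with urllib.request.urlopen(upstream_request, timeout=10) as upstream:
                body = upstream.read()
                status = upstream.status
                content_type = upstream.headers.get(
                    "Content-Type", "application/octet-stream"
                )
        except urllib.error.HTTPError as error:
            self.send_error(error.code)
            return
        except urllib.error.URLError:
            self.send_error(502, "Upstream request failed")
            return

        # Relay the upstream content back to the allowlisted client.
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the proxy until interrupted."""
    ThreadingHTTPServer(("", port), AllowlistProxy).serve_forever()
```

Calling serve() starts the proxy on the chosen port. The vendor would then crawl the proxy's address instead of the Webflow domain.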
Summary
Webflow does not allow direct IP or User-Agent whitelisting, but sites are publicly crawlable by default. Ensure robots.txt and SEO settings allow access, and work with Webflow support if needed. For advanced control, use an external proxy solution.