Site Auditor is available as a paid add-on for accounts created before November 2023. For more information, please reach out to our friendly support team.
In most cases, your robots.txt file won't block our site auditor, and our IP addresses won't be restricted. However, if your robots.txt contains rules that block bots, or if your server enforces IP restrictions, use the information below to ensure our auditor can still run on your website.
We strongly recommend using an experienced web developer to make these changes, as the process will vary depending on the website and other factors.
Note: You can also stop our crawler from scanning portions of your website via the site auditor settings in our interface.
Use robots.txt to allow the site audit crawler
It's possible to specify URLs to ignore or allow just for our crawler. Our crawler identifies itself with the user agent string "Mozilla/5.0 (compatible; RSiteAuditor)"; in your website's robots.txt file, target it with the user agent token RSiteAuditor.
When editing your robots.txt file, the following example will allow our site auditor to crawl your website:
User-agent: RSiteAuditor
Allow: /
Note that if you've tried the above and our auditor still won't crawl your website, your server may be blocking the audit. If you see a 4xx or 5xx status code for every page in our site auditor, this is most likely the case. In this situation, please whitelist the user agent "Mozilla/5.0 (compatible; RSiteAuditor)" (without quotation marks) on your server. Note that this whitelisting is done on your server, not in your robots.txt file.
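How you whitelist a user agent depends entirely on your server stack, which is one reason we recommend involving a web developer. As one illustration, a sketch for nginx might look like the following; the variable `$blocked_bot`, the example blocking pattern, and the server block are hypothetical placeholders for whatever bot-blocking logic your setup already uses:

```nginx
# Sketch only: exempt the auditor from a hypothetical bot-blocking rule.
# "$blocked_bot" and the "~*crawler" pattern stand in for your existing setup.
map $http_user_agent $blocked_bot {
    "~*RSiteAuditor"  0;   # always let the site auditor through
    "~*crawler"       1;   # example of an existing block rule
    default           0;
}

server {
    listen 80;
    server_name example.com;

    if ($blocked_bot) {
        return 403;
    }
}
```

In an nginx `map`, regex entries are evaluated in the order they appear, so placing the RSiteAuditor exemption first ensures it wins over any broader blocking pattern.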
Use robots.txt to allow a specific path to be crawled
You may not need the site auditor to crawl your whole website, only a specific path. In this case, you can specify which path our site auditor should crawl using your robots.txt file.
The following example will allow our site auditor to crawl only the pages under example.com/categories:
User-agent: RSiteAuditor
Disallow: /
Allow: /categories
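Before deploying changes like these, you can sanity-check the rules locally. The sketch below uses Python's standard `urllib.robotparser`. One caveat: that parser applies rules in the order they appear (first match wins), whereas most modern crawlers follow the longest-match convention, so the Allow line is listed first here to get the same result under both interpretations. The domain and sample paths are placeholders:

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules mirroring the example above. Allow is listed first because
# Python's parser applies rules in file order, while most modern crawlers use
# longest-match; with this ordering, both interpretations agree.
rules = """\
User-agent: RSiteAuditor
Allow: /categories
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# example.com and the paths below are hypothetical.
print(parser.can_fetch("RSiteAuditor", "https://example.com/categories/shoes"))  # True
print(parser.can_fetch("RSiteAuditor", "https://example.com/checkout"))          # False
```

If both checks print the expected values, the rules admit the auditor to the chosen path while keeping it out of everything else.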
If your robots.txt configuration is not working, check your cache: a different version of robots.txt may be served from your website or server cache, or from a CDN cache (e.g. Cloudflare). Clearing this cache will usually solve the problem.
Whitelist our IP addresses for the site auditor
If our site auditor is having trouble connecting to your site, you can whitelist the following IP addresses in your website or server firewall.
IPv4
94.130.93.30
168.119.141.170
168.119.99.190
168.119.99.191
168.119.99.192
168.119.99.193
168.119.99.194
68.183.60.34
134.209.42.109
68.183.60.80
68.183.54.131
68.183.49.222
68.183.149.30
68.183.157.22
68.183.149.129
IPv6
2a01:4f8:c17:f386::1/128
2a01:4f8:c17:f387::1/128
2a01:4f8:c17:f38a::1/128
2a01:4f8:c17:f394::1/128
2a01:4f8:c17:f395::1/128
2a01:4f8:251:5d3::2/128
2604:a880:800:10::eb:9001/128
2604:a880:800:10::596:4001/128
2604:a880:800:10::e9:1001/128
2604:a880:800:10::65b:f001/128
2604:a880:800:10::695:7001/128
2604:a880:800:10::6da:6001/128
2604:a880:800:10::6ee:8001/128
2604:a880:800:10::6f7:3001/128
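Firewall syntax varies by vendor, so it can help to verify your allowlist logic programmatically before applying it. The sketch below uses Python's standard `ipaddress` module; only a few of the addresses above are copied in as an illustration, and a real allowlist should contain every IPv4 and IPv6 entry from both lists:

```python
import ipaddress

# A few of the auditor's published addresses, for illustration only;
# a real allowlist should include every IPv4 and IPv6 entry above.
AUDITOR_NETWORKS = [ipaddress.ip_network(n) for n in (
    "94.130.93.30/32",
    "68.183.60.34/32",
    "2a01:4f8:c17:f386::1/128",
)]

def is_auditor_ip(addr: str) -> bool:
    """Return True if addr belongs to one of the auditor's networks."""
    ip = ipaddress.ip_address(addr)
    # Membership tests across IP versions simply return False, so IPv4
    # and IPv6 entries can live in the same list.
    return any(ip in net for net in AUDITOR_NETWORKS)

print(is_auditor_ip("94.130.93.30"))          # True
print(is_auditor_ip("2a01:4f8:c17:f386::1"))  # True
print(is_auditor_ip("203.0.113.7"))           # False
```

A check like this can also be run against your server's access logs to confirm which requests actually came from the auditor.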