Use robots.txt to allow the site audit crawler

You can tell our crawler which URLs to ignore or allow by adding rules for the user agent "aabot" to your website's robots.txt file.

Note that you can also stop our crawler from scanning portions of your website via the site auditor settings in our interface.

The following example disallows all crawlers except our site auditor. Note that this rule also blocks Google's and all other crawlers, but it can be useful when auditing a site that is still under development:

User-agent: *
Disallow: /

User-agent: aabot
Allow: /
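
If you want to sanity-check how these rules are interpreted before deploying them, you can run them through Python's built-in urllib.robotparser module. This is only an illustrative sketch; the URL is a placeholder, and "aabot" refers to our site auditor's user agent as described above:

from urllib import robotparser

rules = """
User-agent: *
Disallow: /

User-agent: aabot
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# aabot may crawl, every other user agent is blocked
print(parser.can_fetch("aabot", "https://example.com/any-page"))      # True
print(parser.can_fetch("Googlebot", "https://example.com/any-page"))  # False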

Use robots.txt to allow a specific path to be crawled

There might be situations where you don't need the site auditor to crawl your whole website, only a specific path. In that case, you can specify which path our site auditor should crawl in your robots.txt file.

The following example will allow our site auditor to crawl only the pages under example.com/categories:

User-agent: aabot
Disallow: /
Allow: /categories
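
When Allow and Disallow rules overlap like this, most modern crawlers (including Google) follow the current robots.txt standard and give the longest matching rule precedence, which is why the more specific Allow: /categories wins over Disallow: / for pages under that path. If you would like to verify this behaviour locally, one option is the third-party Protego package for Python, which applies the same longest-match precedence. This is only a sketch; example.com and the sample paths are placeholders:

# pip install protego
from protego import Protego

rules = """
User-agent: aabot
Disallow: /
Allow: /categories
"""

parser = Protego.parse(rules)

# Only pages under /categories are crawlable for aabot
print(parser.can_fetch("https://example.com/categories/shoes", "aabot"))  # True
print(parser.can_fetch("https://example.com/blog/some-post", "aabot"))    # False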

If your robots.txt configuration does not seem to take effect, check your cache: an older version of robots.txt may be delivered from your website/server cache or from a CDN cache (e.g. Cloudflare). Clearing that cache will usually solve the problem.
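
To confirm which version of robots.txt is actually being served (as opposed to the file on disk), you can fetch it directly, for example with a short Python script. Here example.com is a placeholder for your own domain:

import urllib.request

# Fetch the robots.txt that is currently served to crawlers
with urllib.request.urlopen("https://example.com/robots.txt") as response:
    print(response.status)
    print(response.read().decode("utf-8", errors="replace"))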
