How to exclude specific pages from being crawled

To exclude specific paths from the site audit crawl, enter them in the "Exclude Paths" box on the site audit setup screen.

Paths should be entered relative to the campaign URL.

Example 1:

Let's say that the campaign URL is www.agencyanalytics.com, and the site hosts a blog at www.agencyanalytics.com/blog. Blog posts are in the format of:

www.agencyanalytics.com/blog/myblogpost1
www.agencyanalytics.com/blog/myblogpost2 

...and so on.

If you wanted to exclude the entire blog and all blog posts from being crawled, you would enter the /blog path, relative to the root domain.

That means that you would simply enter /blog in the exclusions box. 
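The exclusion behavior in this example can be sketched as a simple prefix check. This is a hypothetical illustration only; the function name and the exact matching logic the crawler uses are assumptions, not the tool's actual implementation.

```python
# Hypothetical sketch: exclusion paths are treated as prefixes,
# relative to the campaign URL's own path.
from urllib.parse import urlparse

def is_excluded(url, campaign_url, excluded_paths):
    """Return True if the URL's path falls under any excluded path,
    where excluded paths are relative to the campaign URL's path."""
    base_path = urlparse("//" + campaign_url).path.rstrip("/")
    url_path = urlparse("//" + url).path
    return any(url_path.startswith(base_path + p) for p in excluded_paths)

# With campaign URL www.agencyanalytics.com and exclusion /blog:
print(is_excluded("www.agencyanalytics.com/blog/myblogpost1",
                  "www.agencyanalytics.com", ["/blog"]))   # True
print(is_excluded("www.agencyanalytics.com/pricing",
                  "www.agencyanalytics.com", ["/blog"]))   # False
```

Every URL under /blog matches the prefix and is skipped, while the rest of the site is still crawled.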

Example 2: 

Let's say that the campaign URL is www.agencyanalytics.com/store, and the site has a store category for perishable items at www.agencyanalytics.com/store/perishables.

Store items are in the format of:

www.agencyanalytics.com/store/perishables/myproduct1
www.agencyanalytics.com/store/drygoods/myproduct2

...and so on.

If you wanted to exclude all items in the "perishables" category from being crawled, you would enter the /perishables path, relative to the campaign URL.

That means that you would simply enter /perishables in the exclusions box. 

If the campaign URL were www.agencyanalytics.com and you wanted to exclude this path, you would enter it as /store/perishables.

But in this example, since /store is already part of the campaign URL, that part of the path is assumed, and only /perishables needs to be entered for the exclusion.
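To illustrate why only /perishables is needed here, a hypothetical prefix check (assuming the crawler strips the campaign URL's path before matching; the names below are illustrative, not the tool's actual code) might look like:

```python
# Hypothetical illustration: the campaign URL's own path is assumed,
# so exclusions are matched against the remainder of each URL's path.
from urllib.parse import urlparse

campaign_url = "www.agencyanalytics.com/store"
base_path = urlparse("//" + campaign_url).path   # "/store"

def excluded(url, exclusion="/perishables"):
    path = urlparse("//" + url).path
    # Drop the campaign URL's path, then test the exclusion as a prefix.
    remainder = path[len(base_path):] if path.startswith(base_path) else path
    return remainder.startswith(exclusion)

print(excluded("www.agencyanalytics.com/store/perishables/myproduct1"))  # True
print(excluded("www.agencyanalytics.com/store/drygoods/myproduct2"))     # False
```

Because /store is stripped first, an exclusion of /perishables matches the perishables category but leaves /store/drygoods items crawlable.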

Note: The exclusions box doesn't currently accept wildcards or regular expressions, but that functionality is on the roadmap for a future release.
