Our site auditor tool scans for 44 of the most common and most critical issues that affect onsite SEO.
The issues that we scan for are as follows:
- **Pages returning a 5xx HTTP status code:** These are fatal errors that will prevent anyone, including search engines, from accessing your website. They are usually caused by a programming bug or a server misconfiguration.
- **Pages returning a 4xx HTTP status code:** These errors normally occur because a page does not exist (404), requires authentication (401), or is forbidden (403). Make sure you deal with each type of code appropriately to ensure the page can be crawled.
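As an illustrative sketch (not part of the auditor itself), the two status-code checks above amount to a simple classification; the function and category strings here are assumptions, not the auditor's actual API:

```python
# Illustrative sketch: classify HTTP status codes the way the 5xx and
# 4xx checks described above do. Names are hypothetical.

def classify_status(code: int) -> str:
    """Map an HTTP status code to the audit category it would trigger."""
    if 500 <= code <= 599:
        return "server error (5xx): fatal, page cannot be crawled"
    if code in (401, 403):
        return "client error (4xx): page exists but access is restricted"
    if 400 <= code <= 499:
        return "client error (4xx): page missing or request rejected"
    return "no status-code issue"

print(classify_status(503))  # server error
print(classify_status(404))  # client error
print(classify_status(200))  # no issue
```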
- **Overused canonical tags:** This test will fail if too many pages have the same canonical tag. Canonical tags identify duplicate pages so that Google indexes only one URL. Make sure your pages do not all point to the same canonical URL, or that single page will be the only one indexed.
- **Pages missing a title:** A <title> tag is one of the most important components of a page. It is often used as the link to your page in search results and should describe the purpose of the page in a few words.
- **Pages with duplicate titles:** A title is considered a duplicate if it exactly matches the title of another page. Duplicate titles diminish the quality of a page because it is unclear which page is more relevant to a given topic. They will also confuse users navigating your site.
- **Pages with duplicate content:** A page is considered to have duplicate content if its text is similar to that of another page. Duplicate content diminishes the quality of a page because it is unclear which page is more relevant to a given topic. Since there is no purpose in a search engine indexing the same page twice, it may ultimately exclude both pages from the results.
- **Broken internal links:** An internal link points to another page on your server and is considered broken when that page cannot be accessed, either because it does not exist or because an error occurs while connecting to it. Make sure the URL is entered correctly and that you clear up any issues with the page. Excessive broken links will not only harm your visitors' experience; they may also cause search engines to diminish the importance of your website.
- **Broken external links:** An external link points to another website and is considered broken when the page cannot be accessed. Since the linked website is not under your control, your best option is to remove the link. Otherwise, it will diminish the reliability of your website in the eyes of search engines and visitors.
- **Broken internal images:** An internal image links to an image file hosted on your own website and is considered broken when it will not load. This can happen because the file does not exist, or because the image is so large that it intermittently times out while loading.
- **Broken external images:** An external image links to an image hosted on another website and is considered broken when it will not load. It is generally bad practice to reference external images, since it limits your control. A simple solution is to download the image and host it internally.
- **Pages with duplicate meta descriptions:** A meta description is a hidden tag that describes the purpose of a page. Search engines may use this description in the listing for the page and in determining its topic. If the same description is used on other pages, it may be difficult to differentiate between them. Make sure meta descriptions are unique and use topical keywords to describe the content of the page.
- **Blocked crawls by robots.txt:** A robots.txt file gives instructions to web crawlers, including search engines, about which pages of a website they should crawl. This way, you can choose which pages you want indexed on Google, for example. Any errors in this file could cause a search engine not to index your website at all.
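For illustration, a minimal robots.txt that lets all crawlers in while keeping one directory out might look like this (the `/admin/` path is hypothetical):

```txt
User-agent: *
Disallow: /admin/
Allow: /
```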
- **Invalid sitemap.xml format:** A sitemap.xml lists all the public pages of your website so a crawler can find them easily. You should only include pages that you wish a search engine to crawl. An error is triggered if the XML syntax is incorrect.
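A well-formed sitemap.xml follows the sitemaps.org protocol; this skeleton shows the expected structure (the URL and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
</urlset>
```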
- **Missing canonical tags in AMP pages:** AMP stands for Accelerated Mobile Pages, a format that strips down a page's HTML so it renders faster on mobile devices. Every AMP page should include a canonical tag pointing to the standard version of the page so that search engines can associate the two versions.
- **Missing viewport tag:** This is a meta tag that allows you to control the scale at which the page appears on a mobile device. It ensures that the page is neither too small nor too large and is easily legible on a user's device.
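The standard form of this tag, placed in the page's `<head>`, is:

```html
<meta name="viewport" content="width=device-width, initial-scale=1">
```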
- **Large page size:** To keep page load time low, you should try to minimize the amount of content and HTML a page contains. Generally, a page's file size should be less than 2 MB to avoid any search engine penalties.
- **Missing HTTPS redirect:** Every website should be accessible securely over HTTPS. To make sure HTTPS is always used, your website should redirect any HTTP request to HTTPS.
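One common way to set this up, assuming an nginx server (other servers have equivalent mechanisms, and the domain here is a placeholder):

```nginx
server {
    listen 80;
    server_name example.com;               # hypothetical domain
    return 301 https://$host$request_uri;  # permanent redirect to HTTPS
}
```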
- **Incorrect URLs in sitemap.xml:** A sitemap.xml lists all the public pages of your website so a crawler can find them easily. You should only include pages that you want a search engine to crawl. An error is triggered if any listed URL cannot be found.
- **Missing canonical tag:** Canonical tags help to avoid duplicate content when unique content is accessible via multiple URLs. Defining a correct canonical tag for every page protects them from potential duplicate-content issues.
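For example, a page reachable at several URLs declares its preferred address with a tag like this in its `<head>` (the URL is illustrative):

```html
<link rel="canonical" href="https://www.example.com/preferred-page/">
```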
- **Pages with a short title:** Concise titles are generally recommended; however, titles of ten characters or fewer do not provide enough information about what your web page covers and limit its potential to show up in search results for different keywords.
- **Pages with a long title:** Any title with more than 70 characters is generally considered too long. Most search engines and sites will automatically truncate such a title. A long title could also penalize your site, especially if you are keyword stuffing.
- **Pages with multiple H1 tags:** Generally, it is best to have only one H1 tag on a page to define its topic precisely. Multiple H1 tags can confuse a search engine or a user in determining the focus of the page.
- **Pages with missing H1 tags:** H1 tags serve as the main heading of a page and help define its topic. Creating a descriptive heading is an effective way to improve your search engine presence and make it easier for users to navigate your page.
- **Pages with matching H1 and title content:** Using the same text for your title and your H1 is an ineffective way of defining the page topic. Use the opportunity to create two distinct phrases that illustrate the purpose of the page.
- **Pages missing a meta description:** A meta description is a hidden tag that describes the purpose of a page. Search engines may use this description in the results listing and in determining the topic of the page. Make sure each of your pages has a meta description that is unique and topical.
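The tag goes in the page's `<head>`; the wording here is only a placeholder:

```html
<meta name="description" content="A one- or two-sentence summary of this page, written with its topical keywords.">
```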
- **Pages with too many on-page links:** This test will fail if a page has more than 500 links. Too many links on a page can overwhelm the user and offer too many exit options. Search engines also limit the number of links they crawl on a page.
- **Pages with temporary redirects:** Temporary redirects are triggered when a page returns a 302 or 307 HTTP status code, meaning the page has moved temporarily to a new location. Although the page will be indexed by a search engine, it will not pass any link juice to the destination page.
- **Images missing an alt attribute:** An alt attribute describes an image in textual form. Search engines may read the alt text to identify the purpose of the image, which is a great way to increase your page's relevance to a topic.
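For example (the file path and wording are placeholders), a descriptive alt attribute looks like this:

```html
<img src="/images/monthly-visits.png" alt="Bar chart of monthly visits by channel">
```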
- **Pages with slow load time:** A slow page is frustrating for users and will lower your relevance in the eyes of a search engine. Most users will not put up with a slow page and will go elsewhere; a search engine understands this and will do the same. This test fails if the page takes longer than 7 seconds to load.
- **Pages with a low text-to-HTML ratio:** The amount of text compared to HTML markup is your text-to-HTML ratio. This test will fail if the ratio is less than 10%. A search engine can only examine your text to determine the page's relevance, and if there is an abundance of HTML compared to actual content, it will have difficulty segmenting the content. Too much HTML may also cause your page to load much more slowly.
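As a rough sketch of how such a ratio could be computed, assuming it is defined as visible text length divided by total HTML length (the auditor's exact formula may differ):

```python
# Illustrative text-to-HTML ratio check; the 10% threshold mirrors
# the rule described above, but the exact definition is an assumption.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the visible text nodes of an HTML document."""
    def __init__(self):
        super().__init__()
        self.text = []
    def handle_data(self, data):
        self.text.append(data)

def text_to_html_ratio(html: str) -> float:
    parser = TextExtractor()
    parser.feed(html)
    visible = "".join(parser.text).strip()
    return len(visible) / len(html) if html else 0.0

page = "<html><body><div><p>Hello world</p></div></body></html>"
ratio = text_to_html_ratio(page)
print(f"{ratio:.0%}, fails audit: {ratio < 0.10}")  # → 20%, fails audit: False
```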
- **Pages with too many URL parameters:** Overusing parameters in a URL is not the proper way to segment a page. It often creates an ugly URL that is hard to read and from which it is difficult to pick out any defining keywords. Generally, parameters should be transformed into a path-based structure (e.g., /param1/param2). This test fails if there are more than two parameters in the URL.
- **Pages missing an encoding declaration:** Specifying an encoding for a page ensures that each character is displayed properly. This is usually set to utf-8, but it may differ depending on the language of the page. Example: <meta charset="utf-8">.
- **Pages missing a doctype declaration:** A doctype is the first thing that appears in your page's source code and tells a web browser which version of HTML you are using. If it is not specified, your code could be interpreted incorrectly and become uncrawlable.
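A minimal HTML5 document showing the doctype, encoding, and title declarations discussed above:

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Page title</title>
  </head>
  <body>...</body>
</html>
```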
- **Pages with low word count:** This test fails if a page contains fewer than 200 words. If a page does not have much content, a search engine will struggle to assign a topic to it and may not bother indexing it at all. Try to provide substantive, relevant content that naturally incorporates your target keywords.
- **Pages using Flash:** It is generally a bad idea to use Flash on your website. Search engines cannot interpret Flash content and may skip your page when crawling it. Flash also creates a bad user experience: visitors have to wait for it to load and may not be able to see anything at all on a mobile device.
- **Pages with underscores in the URL:** Underscores are technically allowed in a URL but are bad practice in terms of SEO. Separating words in a URL is a good idea; however, you should use hyphens to do so.
- **Internal links containing nofollow attributes:** If a link contains rel='nofollow', it instructs a search engine not to follow the link or pass ranking credit through it. This may be done on purpose, but if you wish to pass link juice, you should remove the attribute.
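A nofollowed link looks like this (the path and anchor text are placeholders); removing the rel attribute restores normal link behavior:

```html
<a href="/members-area" rel="nofollow">Members area</a>
```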
- **Missing sitemap.xml references in robots.txt:** If your site contains both a robots.txt and a sitemap.xml file, it is a good idea to reference the location of sitemap.xml within robots.txt. Your robots.txt file is what a search engine reads when indexing your site, so make it easy for the crawler to find the links you want indexed.
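The reference is a single line in robots.txt, using the sitemap's absolute URL (the domain here is a placeholder):

```txt
Sitemap: https://www.example.com/sitemap.xml
```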
- **Missing sitemap.xml file:** A sitemap.xml is simply a way to list the pages of your site that you would like a search engine to index. It makes it easier and faster for search engines to crawl your site and notifies them of any new or updated pages.
- **Pages with long URLs:** URLs longer than 100 characters are considered less than ideal for SEO. A long URL can be difficult to read or share and can even cause problems with some browsers or applications.
- **Missing robots.txt file:** A robots.txt file gives instructions to web crawlers, including search engines, about which pages of a website they should crawl. This way, you can choose which pages you want indexed on Google, for example. Without this file, some of your pages may be ignored by Google.
- **External links containing nofollow attributes:** If a link contains rel='nofollow', it instructs a search engine not to follow the link or pass ranking credit through it. This may be done on purpose, but if you wish to pass link juice, you should remove the attribute.
- **Pages blocked from being indexed:** If a page contains a noindex meta tag, it tells search engines not to include the page in their index. This may have been done intentionally, but if not, the page will have no presence in search results. Simply remove the noindex meta tag to resolve this.
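The tag in question sits in the page's `<head>` and looks like this:

```html
<meta name="robots" content="noindex">
```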
- **Frames used:** HTML frames are considered dated and should be avoided. They are difficult for search engines to read and create a bad user experience. Try to replace any frames on your pages with modern layout techniques that accomplish the same thing.