The Statistics tab of Site Audit provides additional reports about the pages crawled in your Site Audit project. The tab covers the following metrics: markup, crawl depth, sitemap vs. crawled pages, HTTP status codes, AMP links, internal links, canonicalization, and hreflang usage.
Use the buttons above the report to toggle between the List and Graph views.
The first bar graph, Pages Markup, shows how many of your pages use each type of markup, including:
- Schema.org (microdata)
- Schema.org (JSON-LD)
- Open Graph
- Twitter Cards
If this report shows that most of your pages are not using any markup at all, consider adding some to your website’s HTML.
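As a rough illustration of what counts as each markup type, the sketch below checks a single HTML document for the four types listed above. It is not how Site Audit detects markup; the class name MarkupDetector, the function detect_markup, and the detection rules (for example, treating every JSON-LD script as Schema.org markup) are assumptions made for this example.

```python
from html.parser import HTMLParser


class MarkupDetector(HTMLParser):
    """Collects which of the four markup types appear in one HTML document."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Microdata: an element whose itemtype points at schema.org.
        if "schema.org" in (attrs.get("itemtype") or ""):
            self.found.add("Schema.org (microdata)")
        # JSON-LD: <script type="application/ld+json"> (assumed to be Schema.org).
        if tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self.found.add("Schema.org (JSON-LD)")
        if tag == "meta":
            # Open Graph uses property="og:..."; Twitter Cards use name="twitter:...".
            if (attrs.get("property") or "").startswith("og:"):
                self.found.add("Open Graph")
            if (attrs.get("name") or "").startswith("twitter:"):
                self.found.add("Twitter Cards")


def detect_markup(html):
    detector = MarkupDetector()
    detector.feed(html)
    return detector.found


# Hypothetical page used only for this example.
sample = """
<html><head>
<meta property="og:title" content="Example">
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Article"}</script>
</head><body></body></html>
"""
print(sorted(detect_markup(sample)))  # ['Open Graph', 'Schema.org (JSON-LD)']
```

A page that returns an empty set here would fall into the "no markup" group.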
To the right is the Pages Crawl Depth chart. Crawl depth is the number of clicks the crawler needed to reach each page during the crawl. The lower a page’s crawl depth, the easier it is for a web crawler to find its content. At a crawl depth of 4 or more, there is a possibility that a user or bot won’t reach the page at all.
So, if important pages on your website can’t be reached in fewer than 4 clicks, consider adding more internal links to help people find them faster.
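To make the idea of click depth concrete, here is a minimal sketch that computes it with a breadth-first search over a link map. It assumes depth means the fewest clicks from the start page; the function crawl_depths and the site structure are hypothetical, and the Site Audit crawler may count depth differently.

```python
from collections import deque


def crawl_depths(links, start):
    """Breadth-first search: depth is the fewest clicks from the start page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure: page -> pages it links to.
links = {
    "/": ["/blog", "/about"],
    "/blog": ["/blog/page-2"],
    "/blog/page-2": ["/blog/page-3"],
    "/blog/page-3": ["/blog/old-post"],
}
depths = crawl_depths(links, "/")
deep_pages = [page for page, depth in depths.items() if depth >= 4]
print(deep_pages)  # ['/blog/old-post']
```

In this example, a single link from the homepage to /blog/old-post would bring its depth down from 4 to 1.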
Sitemap vs. Crawled Pages
Below the Pages Markup graph is the Sitemap vs. Crawled Pages chart. It shows how many pages were found in the sitemap (blue) compared to the total number of pages that were crawled in your Site Audit (green).
If the number of pages specified in your sitemap.xml doesn’t match the number of crawled pages, this may be a sign that your website has poor crawlability due to weak internal linking or other technical issues. To make sure that search engines can efficiently crawl and index your website, keep an eye on both the number of pages in your sitemap.xml and the number of pages we crawled.
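Beyond comparing the two totals, it can help to see which URLs differ. The sketch below assumes you have the sitemap as text and a list of crawled URLs (for example, from an export); the function sitemap_urls and all URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)}


# Hypothetical sitemap and crawl results.
sitemap_xml = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://example.com/orphan</loc></url>
</urlset>"""
crawled = {
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/blog",
}

in_sitemap = sitemap_urls(sitemap_xml)
not_crawled = in_sitemap - crawled      # listed in the sitemap, never reached
not_in_sitemap = crawled - in_sitemap   # reached by links, missing from the sitemap
print(sorted(not_crawled))
print(sorted(not_in_sitemap))
```

URLs in the first group often lack internal links pointing to them, while URLs in the second group are candidates for adding to the sitemap.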
HTTP Status Codes
After the sitemap chart is a chart detailing the HTTP status codes found during the crawl. It shows how many pages returned 5xx, 4xx, 3xx, 2xx, or 1xx status codes, and how many pages returned no HTTP status code at all. You can filter the chart and click through to a pre-filtered Crawled Pages report that lists the exact pages with the status codes you want to see.
HTTP status codes are a web server’s responses to requests made by search engines or website visitors. Having a lot of pages on your site that return 4xx or 5xx status codes can negatively affect both your site’s user experience and its crawlability, which could lead to a drop in traffic and organic positions.
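The grouping used in the chart can be sketched as follows: each code is mapped to its class by its first digit, and pages without a response are counted separately. The page paths and codes below are invented for the example, and status_class is a hypothetical helper.

```python
from collections import Counter


def status_class(code):
    """Map a status code to its class label; None means no response was received."""
    if code is None:
        return "No status"
    return f"{code // 100}xx"


# Hypothetical crawl results: page -> HTTP status code (None = no response).
statuses = {
    "/": 200,
    "/blog": 200,
    "/old": 301,
    "/missing": 404,
    "/api": 500,
    "/timeout": None,
}
counts = Counter(status_class(code) for code in statuses.values())

# Pages worth fixing first: client and server errors.
errors = sorted(page for page, code in statuses.items() if code is not None and code >= 400)
print(counts)
print(errors)
```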
Next is the AMP (Accelerated Mobile Pages) link graph. Here you’ll see the percentage of pages on your site that are linked to an AMP version by a rel="amphtml" link tag. AMP is a way to build web pages for static content that load faster on mobile devices.
If you have only AMP pages without an equivalent desktop page, the widget will show 100% of pages without an AMP link. In that case, check the Issues tab to find your actual AMP errors.
Below the Sitemap vs. Crawled Pages chart is a bar graph detailing the Internal Links on your website.
Internal links are important for SEO because they can help structure the flow of your website and allow crawlers to easily navigate through your site’s content. Hover over each bar and you’ll see how many of your pages have each number of internal links pointing to them.
Aim for a balance between too many and too few internal links on your pages, so that users and crawlers can easily navigate your site without being overwhelmed by too many link paths.
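The distribution behind a chart like this can be sketched by counting, for each page, how many other pages link to it. This example assumes that only unique linking pages are counted and that self-links are ignored; Site Audit may count differently, and the link map and the function incoming_link_counts are hypothetical.

```python
from collections import Counter


def incoming_link_counts(links):
    """Count how many distinct other pages link to each page."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):  # count each linking page once
            if target != source:     # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts


# Hypothetical site structure: page -> pages it links to.
links = {
    "/": ["/blog", "/pricing"],
    "/blog": ["/", "/pricing", "/blog/post"],
    "/blog/post": ["/blog"],
    "/pricing": ["/"],
    "/orphan": [],
}
incoming = incoming_link_counts(links)
# Number of pages per incoming-link count, which is what each bar represents.
distribution = Counter(incoming.values())
print(incoming)
print(distribution)
```

Pages with zero incoming links, such as /orphan here, can only be found through the sitemap or external links.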
At the bottom of this report is the Canonicalization chart. This chart shows how many of the crawled pages have the rel="canonical" tag in their HTML.
The rel="canonical" tag helps solve SEO issues related to having duplicate content on your website. If there are multiple pages with the same content on your site, they will all suffer weakened SEO, because crawlers won’t know which page to list in their index. The use of a canonical tag clears up any confusion for crawlers and will communicate which page featuring duplicate content is the one that you want to be indexed.
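For reference, a canonical tag is a link element in the page’s head. The sketch below pulls the canonical URL out of a page’s HTML; the class CanonicalFinder, the function find_canonical, and the example URLs are assumptions for illustration, not part of Site Audit.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attrs.get("href")


def find_canonical(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical duplicate page that points to its preferred version.
duplicate_page = """
<html><head>
<link rel="canonical" href="https://example.com/shoes">
</head><body>Shoes, sorted by price</body></html>
"""
print(find_canonical(duplicate_page))    # https://example.com/shoes
print(find_canonical("<html></html>"))   # None
```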
The hreflang attribute (rel="alternate" hreflang="x") is an HTML attribute used to specify the language and region that a version of a web page is intended for. For webmasters who operate global business websites, it is an important part of an international SEO strategy. For more information on your website’s use of this attribute, navigate to the International SEO tab of Site Audit or click on the red “with issues” section of the donut chart.
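To show what hreflang annotations look like in practice, this sketch collects the language versions declared in a page’s HTML. The class HreflangFinder, the function find_hreflang, and the URLs are hypothetical; the example only reads the annotations and does not validate them the way an audit would.

```python
from html.parser import HTMLParser


class HreflangFinder(HTMLParser):
    """Collects hreflang value -> URL from <link rel="alternate" hreflang="..."> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "alternate" in rels and attrs.get("hreflang"):
            self.alternates[attrs["hreflang"]] = attrs.get("href")


def find_hreflang(html):
    finder = HreflangFinder()
    finder.feed(html)
    return finder.alternates


# Hypothetical page with English (US), German, and fallback versions.
page = """
<head>
<link rel="alternate" hreflang="en-us" href="https://example.com/us/">
<link rel="alternate" hreflang="de" href="https://example.com/de/">
<link rel="alternate" hreflang="x-default" href="https://example.com/">
</head>
"""
alternates = find_hreflang(page)
print(alternates)
```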