Understanding the Site Audit
Ubersuggest’s Site Audit helps you identify on-page SEO issues and errors that could be holding your site back from its full potential.
Site Audit is one of the most complex features in Ubersuggest. Beyond the technical output, many external variables influence a successful crawl, and Ubersuggest doesn’t control all of them. For most issues, a webmaster or developer can help you apply the fixes.
How Does the Audit Work?
Ubersuggest will first check the HTTPS version of your site without “www” (for example, https://example.com). If no SSL certificate is present, it falls back to the HTTP version (again without the “www” prefix). If your main URL uses “www” and there is no redirect to it from https://example.com or http://example.com, the site will not be crawled. That might look like a problem, but it is not: redirecting the alternate versions of your domain to the main one is considered best practice.
Here is a great article on redirects: https://neilpatel.com/blog/301-redirects/.
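If you need to set up those redirects yourself, here is a minimal sketch of what they can look like in an Apache .htaccess file, assuming Apache with mod_rewrite enabled and https://example.com as the main version of the domain (adjust the host name to your own; other servers use different syntax):

# Send every http:// and www request to https://example.com with a permanent (301) redirect
RewriteEngine On
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} ^www\. [NC]
RewriteRule ^(.*)$ https://example.com/$1 [L,R=301]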
The audit will only check HTML pages. Users often expect every page to be crawled, but pages that are not HTML are skipped by the crawlers.
Internal links are the paths our crawlers follow. Ubersuggest follows the internal links of your website to scan its pages. If a link carries a “nofollow” attribute, the crawlers obey it and will not go further. Checking for broken or misdirected links is a crucial aspect of SEO.
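For reference, this is what a nofollow internal link looks like in HTML; our crawlers will not follow it to the linked page (the URL below is just a placeholder):

<!-- rel="nofollow" tells crawlers not to follow this link to the target page -->
<a href="https://example.com/some-page" rel="nofollow">Some page</a>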
The following articles can help you understand more about internal linking:
- A Guide to Internal Linking - https://neilpatel.com/blog/the-complete-guide-to-internal-linking/
- Must Do on Internal Linking - https://neilpatel.com/blog/commandments-of-internal-linking/
Why Does It Not Crawl My Website?
It’s important to understand that Ubersuggest will try to visit as many URLs as possible, but only HTML pages. Ubersuggest does not check pages that are not HTML, and it can also be blocked by pages that rely heavily on JavaScript, for example.
Apart from those already mentioned, a couple of other variables could prevent or affect our tool’s ability to crawl your website:
- Loading time
If it takes too long for our crawlers to reach a given page, the crawl might stop due to a time-out. Run a Google PageSpeed test to verify that your pages don’t have excessive loading times.
- Blocked by the server/host
We use several IPs to crawl, and some hosts, especially those with integrated firewalls or running Cloudflare, might interpret this as scraping and block it. To avoid any potential blockage, add our IPs as exceptions on your host.
You can find our list of IPs here: Ubersuggest IP addresses to add as exceptions
There are other, less common reasons our Site Audit might get blocked:
- Blocking parameters in the robots.txt;
- Blocked by .htaccess file;
- Canonical loop.
It’s also important to understand that even though other crawlers and site audits can crawl your site, it doesn’t mean Ubersuggest can as well. That’s because each tool and site uses different parameters, settings, and means to do so.
Common Issues and How to Deal With Them
The Site Audit Shows Errors I’ve Already Fixed
If you’re a PRO user, it is worth trying a recrawl. The “Recrawl Website” button appears under the “Search” button once the audit is complete. Because different external variables influence a site audit, a recrawl can often show better results. It’s also worth cross-checking with other tools to make sure the changes were made correctly.
We recommend leaving a couple of hours between crawls. If a change that should appear still doesn’t, clearing your browser cache might help.
No SSL Certificate and No Sitemap
First, check how many pages Ubersuggest crawled: by default, Ubersuggest will report a missing SSL certificate and/or sitemap when it can’t crawl the site. If Ubersuggest can’t crawl more than one page, please refer to the previous section, “Why Does It Not Crawl My Website?”
Google understands and reads different sitemap layouts and structures, but Ubersuggest might have difficulty reading certain ones. What matters most is whether Google can read and validate your sitemap, which you can confirm in your Search Console.
Here is a great article on creating an SEO sitemap.
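For reference, a minimal XML sitemap looks like the sketch below (the URLs and date are placeholders; list your own pages and keep the file at the root of your site, for example https://example.com/sitemap.xml):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page; <lastmod> is optional -->
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-01</lastmod>
  </url>
  <url>
    <loc>https://example.com/contact</loc>
  </url>
</urlset>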
4XX Pages That Don’t Exist or You Can’t Find Them
Ubersuggest will attempt to crawl all HTML pages on your domain. Within those pages, Ubersuggest follows internal links, so if one points to a page that doesn’t exist (returns a 404 error), Ubersuggest will show the URL that returns the 404, but NOT the page it is linked from.
Note that even a typo in an internal link can generate a 404 error. For example, if your contact page is www.example.com/contact and any page links to www.example.com/conttact, that misspelled URL will show up as a 404.
How do you find those broken links? You can copy the URL and use Inspect Element (or View Source) to search for it directly in the code of the pages that might contain the link. There are also free extensions that can help you with that. Here’s an article with more insight: finding and fixing 404 errors.
Duplicate Tags and Meta Descriptions
If Ubersuggest shows duplicate tag issues (titles and descriptions) on paginated pages, you can safely ignore them. For example, if pages like www.example.com/mybook/page-1 and www.example.com/mybook/page-2 show duplicate descriptions and/or titles, you shouldn’t be concerned. Google understands those pages are part of the same content, so having the same title and description is logical.
Ubersuggest still flags them so that you are aware of them and can decide whether or not to change them.
If duplicate tags are detected on pages that don’t share the same content and aren’t related (as in pagination), then you probably need to check and fix that.
Low Word Count On Pages That Aren’t Relevant for Content
Remember that Ubersuggest will visit and analyze as many pages as possible.
Better safe than sorry, right?
Your audit can flag a low word count on pages like “Contact”, “Categories”, and other pages that might not seem relevant or may even be set to “noindex”. In that case, consider how important those pages are for your site and whether you want them to rank. For example, you might not want to rank your contact page, so a low word count there is irrelevant to your SEO strategy.
In Ubersuggest, a page is considered to have a low word count if it has fewer than 200 words.
Ubersuggest’s Site Audit counts the words in the main content of the page, that is, the text inside the body tag, so make sure your content is within that tag. Text in scripts, styles, buttons, comments, etc., is ignored (see the example below).
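As a simplified, hypothetical illustration, only the visible text inside the body tag counts toward the word total:

<html>
  <head><title>Example page</title></head>
  <body>
    <!-- This HTML comment is ignored by the word count -->
    <script>var ignored = "text inside scripts is not counted";</script>
    <p>Only visible text like this paragraph counts toward the word total.</p>
  </body>
</html>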
We recommend the following articles for a better understanding of content length in SEO:
- How Long Should Your Blog Articles Be? (With Word Counts for Every Industry)
- Is Longer Really Better? How Focused and Comprehensive Content Ranks Better
Poorly Formatted URL
Ubersuggest’s Site Audit checks whether your URLs follow best practices. Don’t know them? Check this article on How to Create SEO Friendly URLs.
There are three checks that the Site Audit performs on URLs:
- Characters Check
- Dynamic Check
- Keywords Check
The check that fails most often is the keywords check, typically because a keyword present in the title tag is missing from the URL. Here, Ubersuggest checks whether the keywords in the page’s title tag also appear in the URL. Take the link above, Neil’s article, as a reference: two of the main keywords in the page title (“How to Create SEO Friendly URLs”) are present in the URL (“SEO” and “URL”).
The other two checks are straightforward, and both relate to the characters in the URL. For the dynamic check, take www.example.com/some_url.php?adsasd=5: dynamic characters like question marks, equals signs, or anything that isn’t ‘readable’ by a human will be identified as dynamic. For the characters check, make sure you’re using safe characters: alphanumerics [0-9a-zA-Z], unreserved characters, and reserved characters only when used for their reserved purposes (e.g., a question mark used to denote a query string).
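To make the contrast concrete, here is a hypothetical dynamic URL next to a friendlier, human-readable version of the same page:

Flagged by the dynamic check (query string and parameter are not human-readable): https://www.example.com/some_url.php?adsasd=5
Passes both checks (lowercase words separated by hyphens, safe characters only): https://www.example.com/seo-friendly-urls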
For further information on each error message, hover over the question mark next to each column title to see more details on that topic.
Try to deliver friendly, easy-to-read URLs.
Site Audit Can’t Crawl My Domain (I Use Cloudflare)
Cloudflare is a CDN that usually blocks our crawlers. That said, there are certain things that we suggest you try in Cloudflare to allow our bots to visit the site:
- Disable Bot Fight Mode (free plans) in Cloudflare under the Firewall app > Bots tab. Toggle the Bot Fight Mode option to Off. Please visit this link for more information.
- If the above doesn’t do the trick, it is worth allowing our IPs to access the site in your Cloudflare Dashboard under the Firewall app > Tools tab. More details are here.
If you don’t have our IPs yet, you can find them here, and in case you need the name of our bot, please use RSiteAuditor as the user agent.
After making those changes, please run a recrawl on your domain.
Site Audit Error Codes
Error code: “no_errors” – Scenario: no specific error is identified;
Copy: We couldn’t crawl your website. At this time, we haven’t been able to identify the cause of this error.
Possible Solutions:
1. Check Your Connection
2. Disable Firewall and Antivirus Software Temporarily
3. Clear Browser Cache
4. Go to the Site Audit to restart the crawl.
Error code: “site_unreachable” – Scenario: site is not available/the connection failed;
Copy: We couldn’t crawl your website. <b>Error</b>: Site is not available/Connection failed
A connection timed-out error appears when your website is trying to do more than your server can manage. It’s particularly common on shared hosting where your memory limit is restricted.
Possible Solutions:
1. Check Your Connection
2. Disable Firewall and Antivirus Software Temporarily
3. Clear Browser Cache
4. Go to Site Audit to restart the crawl.
Error code: “invalid_page_status_code” – Scenario: the first page crawled returns an invalid status code;
Copy: We couldn’t crawl your website. <b>Error</b>: Invalid status code in the first page crawled.
If the first page crawled has a bad status code, the scan stops. In that case, Ubersuggest can’t crawl more than one page.
Error code: “forbidden_meta_tag” – Scenario: a meta robots noindex tag is present on the page;
Copy: We couldn’t crawl your website. <b>Error</b>: Forbidden by Noindex instructions.
If you see the following code on the main page of a website, it tells us that we’re not allowed to index/follow links on it and our access is blocked.
<meta name="robots" content="noindex, nofollow">
More generally, a page containing at least one of the following directives: “noindex”, “nofollow”, or “none”, will lead to this crawling error.
To allow our bot to crawl such a page, remove these “noindex” tags from your page’s code. For more information on the noindex tag, please refer to this Google Support article.
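For reference, a robots meta tag that allows crawling and indexing looks like this (omitting the tag entirely has the same effect, since index and follow are the defaults):

<!-- Explicitly allows indexing and link following; this is also the default when no robots meta tag is present -->
<meta name="robots" content="index, follow">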
Error code: “forbidden_robots” – Scenario: the page is disallowed in robots.txt;
Copy: We couldn’t crawl your website. <b>Error</b>: Blocked by robots.txt instructions.
A robots.txt file gives instructions to bots about how to crawl (or not crawl) the pages of a website. You can allow or forbid bots such as Googlebot or Ubersuggest from crawling all of your site, or specific areas of it, using directives such as Allow, Disallow, and Crawl-delay.
If your robots.txt is disallowing our bot from crawling your site, our Site Audit tool will not be able to check your site.
You can inspect your robots.txt for any Disallow rules that would prevent crawlers like ours from accessing your website. Please refer to this Google Support article.
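As a quick illustration (using a wildcard user agent rather than any specific bot name), compare a robots.txt that blocks the whole site with one that allows it; these are two alternative files, not one:

# Blocks every crawler from the entire site and would trigger this error
User-agent: *
Disallow: /

# Allows every crawler to access the entire site
User-agent: *
Disallow: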
Error code: “forbidden_http_header” – Scenario: an X-Robots-Tag: noindex header is present in the HTTP response;
Copy: We couldn’t crawl your website. <b>Error</b>: Forbidden by X-Robots-Tag: noindex instructions.
The X-Robots-Tag is an HTTP response header that can control the indexing of a page as a whole, as well as specific elements on a page.
If the X-Robots-Tag is disallowing our bot from crawling your site, our Site Audit tool will not be able to check your site. Please refer to this Google Support article.
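For illustration, this is what a blocking response header looks like, together with the kind of Apache directive that typically sets it (treat the directive as an assumption about an Apache setup using mod_headers; your server configuration may differ):

X-Robots-Tag: noindex, nofollow

# Apache example: a line like this in .htaccess or the vhost config adds that header;
# removing or commenting it out lets crawlers index the site again
Header set X-Robots-Tag "noindex, nofollow"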
Error code: “too_many_redirects” – Scenario: there are recursive redirects on the page;
Copy: We couldn’t crawl your website. <b>Error</b>: Recursive redirects.
The too_many_redirects error occurs when your website is experiencing an infinite redirect loop. Essentially, the site is stuck: for example, URL 1 points to URL 2 and URL 2 points back to URL 1, or the domain has redirected the crawler too many times (a minimal sketch of such a loop is shown after the list below).
Possible Solutions:
1. Delete cookies on that specific site
2. Clear the WordPress site, server, proxy, and browser cache
3. Determine the nature of the redirect loop
4. Check your HTTPS settings
5. Check third-party services
6. Check your WordPress site settings
7. Temporarily disable WordPress plugins
8. Check redirects on your server
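As a minimal, hypothetical sketch (assuming Apache with mod_alias; the paths are placeholders), this is the kind of configuration that creates a loop, and what the healthy one-way version looks like:

# Redirect loop: /old sends visitors to /new, and /new sends them straight back to /old
Redirect 301 /old /new
Redirect 301 /new /old

# Healthy setup: redirect in one direction only
Redirect 301 /old /new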
Error code: “unknown” – Scenario: none of the above and the reason is not identified;
Copy: We couldn’t crawl your website. At this time, we haven’t been able to identify the cause of this error. Please go to the Site Audit to restart the crawl.
Possible Solutions:
1. Check Your Connection
2. Disable Firewall and Antivirus Software Temporarily
3. Clear Browser Cache
4. Go to Site Audit to restart the crawl.
Error code: “invalid_page_status_code” – Scenario: caused by status codes of 400 or higher;
These status codes indicate a fault with the request (4xx) or with the server (5xx).
Please refer to this Google Support article.
Still need help? Contact us at support@ubersuggest.com.