Robot Crawlability

By Matthew Edgar · Last Updated: December 16, 2023

Search engine robots need to crawl through your website’s pages, images, videos, and other content. This is the most fundamental level of technical SEO. If robots cannot crawl your website successfully, then they will not be able to index or rank your website.

How Crawling Works

Google has automated programs (referred to as robots, spiders, or crawlers) that crawl through the web. These programs perform four distinct operations:

  1. Discoverability. Before robots can crawl a website, they first need to discover the URLs the website contains. Robots discover new pages primarily through links, including external links from other websites and internal links within your website. This is why updating your website’s navigation and avoiding orphan URLs is so critical. Pages can also be discovered through your website’s XML sitemap.
  2. Fetching. After discovering a URL, Google queues up the page for crawling. You need to make sure that the page can be crawled and is not blocked by the robots.txt file. If the page can be crawled, Google will send its crawlers to your website to fetch the page’s contents. Along with the page contents, Google will also check the page’s status code and HTTP headers (you can replicate this check yourself; see the first sketch after this list).
  3. Rendering. After fetching the page’s content, Google will queue the page up for rendering. This is where Google executes any JavaScript code needed to view the page’s content, so you need to make sure Google can access those JavaScript files. Google renders pages using a headless version of Chrome, which means you can set up a headless browser of your own to test what Googlebot sees (see the second sketch after this list). Once rendered, Google will process the URLs and determine how to index them. Learn more about page indexing.
  4. Refresh. Google wants to always have the latest version of every URL on the web, so it will frequently return to the URLs it knows about to check for updates. The more important Google thinks a page is, the more often Google will return to that page. However, even pages that don’t seem that important will be recrawled. For example, I regularly see Googlebot returning to redirected URLs or error URLs, even though those errors and redirects have been in place for years.
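
To replicate the fetch step yourself, you can check a page against robots.txt and then inspect its status code and headers with a short script. Below is a minimal Python sketch using requests and the standard library’s robotparser; the URLs are placeholders to swap for your own pages.

    import requests
    from urllib.robotparser import RobotFileParser

    # Placeholder URLs; substitute pages from your own website.
    PAGE_URL = "https://www.example.com/some-page/"
    ROBOTS_URL = "https://www.example.com/robots.txt"

    # Step 1: confirm robots.txt allows Googlebot to fetch the page.
    parser = RobotFileParser(ROBOTS_URL)
    parser.read()
    allowed = parser.can_fetch("Googlebot", PAGE_URL)
    print(f"Googlebot allowed by robots.txt: {allowed}")

    # Step 2: fetch the page and inspect the status code and HTTP
    # headers, roughly what Google checks during its fetch step.
    if allowed:
        response = requests.get(PAGE_URL, allow_redirects=False)
        print(f"Status code: {response.status_code}")
        for name, value in response.headers.items():
            print(f"{name}: {value}")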

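Because Google renders with a headless version of Chrome, you can approximate the rendering step locally with a headless browser of your own. Below is a sketch using Selenium with headless Chrome; it assumes Selenium 4+ and a local Chrome install, and it approximates rather than exactly reproduces Googlebot’s rendering environment. The URL is a placeholder.

    from selenium import webdriver

    # Placeholder URL; substitute the page you want to test.
    PAGE_URL = "https://www.example.com/some-page/"

    # Launch headless Chrome, similar in spirit to the headless
    # rendering Google performs (an approximation, not an exact
    # reproduction of Googlebot's environment).
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)

    try:
        driver.get(PAGE_URL)
        # page_source returns the DOM after JavaScript has run, so you
        # can compare it to the raw HTML to see what rendering adds.
        rendered_html = driver.page_source
        print(rendered_html[:500])
    finally:
        driver.quit()
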
Checking Crawlability in Google Search Console

In Google Search Console, you can use the “Live Test” tool within URL Inspection to see your website as Google sees it. Click “URL Inspection”, enter the URL you want to test at the top of the screen, and then press enter.

[Screenshot: Inspect a URL in Google Search Console]

This will load the URL Inspection report. By default, the report shows information about the last time Google crawled this page, along with how Google discovered the URL. However, in my experience, the Discovery information is not always reliable.

You can also live test the URL to check any recent changes. Click “Test Live URL” in the upper right corner of the report, and the report will refresh with information from a fresh crawl.

From the URL Inspection report, you can also view more details about the tested page. Click the link that says “View Tested Page” to expand a panel showing what Google crawled, including the crawled HTML and a screenshot of the rendered page. The “More Info” tab shows the page’s HTTP headers, its status code, and the resources Google had to fetch to render the page. This information can help you diagnose any issues with how Google has crawled the page.

[Screenshot: Live Test in Google Search Console]
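
If you need to check many URLs, the same inspection data is also available programmatically through the Search Console API’s URL Inspection endpoint. Below is a minimal Python sketch; it assumes you already hold a valid OAuth 2.0 access token with Search Console scope, and the token and URL values are placeholders.

    import requests

    # Placeholders: a real OAuth 2.0 token with Search Console scope,
    # a verified property, and a page within that property.
    ACCESS_TOKEN = "your-oauth2-access-token"
    SITE_URL = "https://www.example.com/"
    PAGE_URL = "https://www.example.com/some-page/"

    response = requests.post(
        "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        json={"inspectionUrl": PAGE_URL, "siteUrl": SITE_URL},
    )
    response.raise_for_status()

    # The indexStatusResult section includes crawl details such as the
    # last crawl time and whether the page fetch succeeded.
    result = response.json()["inspectionResult"]
    print(result["indexStatusResult"])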

Checking Crawlability in Bing Webmaster Tools

Bing also offers a URL Inspection report so you can confirm that Bing can crawl a given page. After logging into Bing Webmaster Tools, click “URL Inspection” in the sidebar and then enter the URL you want to inspect.

[Screenshot: URL Inspection in Bing Webmaster Tools]

After inputting your URL in the search box, Bing will show information about the last time the page was crawled, along with when the page was first discovered. You can click “Live URL” to run a test crawl against the current state of the page.

[Screenshot: URL Inspection Report in Bing Webmaster Tools]
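
Bing’s live test is the authoritative check, but you can also confirm from your own machine that your server returns the same response to Bing’s crawler as it does to a regular browser by fetching the page with the published bingbot user-agent string. A quick Python sketch, with a placeholder URL:

    import requests

    # Placeholder URL; substitute your own page.
    PAGE_URL = "https://www.example.com/some-page/"

    # Bingbot's published user-agent string. Some servers vary their
    # response by user agent; this comparison can surface that.
    BINGBOT_UA = (
        "Mozilla/5.0 (compatible; bingbot/2.0; "
        "+http://www.bing.com/bingbot.htm)"
    )

    bot_response = requests.get(PAGE_URL, headers={"User-Agent": BINGBOT_UA})
    browser_response = requests.get(PAGE_URL)

    print(f"As bingbot: {bot_response.status_code}, {len(bot_response.content)} bytes")
    print(f"As browser: {browser_response.status_code}, {len(browser_response.content)} bytes")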

Technical SEO Services

Want help improving your website’s technical SEO factors? Contact me today to discuss how I can help review and improve your current technical structure.

Prefer a more DIY approach? Order our Tech SEO Guide, a reference guide to help you address the technical SEO issues on your website.
