Robot Crawlability
By Matthew Edgar · Last Updated: November 25, 2019
What It Is
The most critical technical SEO task is making your website accessible to the search engine robots that need to crawl the pages, images, and other content on your website. To support robot crawlability, the majority of your content should be placed as text within HTML tags. That sounds simple. Occasionally, though, some content will need to be served dynamically via a JavaScript function or placed within an image.
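To illustrate the difference (a minimal sketch with hypothetical wording), the first snippet below keeps the content in plain HTML text that robots can read directly, while the second only produces that text after a script runs:

```html
<!-- Crawlable: the text lives directly in the HTML -->
<h1>Our Services</h1>
<p>We repair bicycles and sell replacement parts.</p>

<!-- Harder to crawl: the text only exists after JavaScript runs -->
<div id="services"></div>
<script>
  document.getElementById('services').innerHTML =
    '<h1>Our Services</h1><p>We repair bicycles and sell replacement parts.</p>';
</script>
```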
Common Examples
If you need to place content within dynamic elements, like JavaScript, you want to make sure search engine robots can access the JavaScript code that renders that content. When robots can access that code, they can typically run it the same way a human visitor's browser does, which lets them see the content the script produces. One common reason robots can't run the JavaScript code is that the JavaScript files are blocked by the robots.txt file.
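As a sketch of how this happens (the directory and file names here are hypothetical), a robots.txt file like the one below blocks every script under /js/, which prevents robots from rendering any content those scripts generate; an explicit Allow rule for the rendering script is one way to restore access:

```
User-agent: *
# This rule blocks all JavaScript files, so robots cannot render
# any content those scripts produce
Disallow: /js/

# One possible fix: explicitly allow the script that renders content
Allow: /js/render-content.js
```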
A common way content gets blocked from search engine spiders is by placing text within an image. Often, designers will place text that needs a special font in an image, which ensures the text looks the same to every visitor. Unfortunately, search engine spiders cannot (easily) read the text contained in an image. As an alternative, you can use a font from Google's font library. Using a font from the library allows the words to be placed in text where Google can easily access them, while still ensuring visitors see the content in the desired font. Along with the font library, CSS and jQuery give you other ways to style and animate the text, offering even more design choices than an image can.
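As a quick sketch (the font choice, class name, and headline are arbitrary), loading a font from Google's library takes one link tag and one CSS rule, and the headline remains real, crawlable text:

```html
<head>
  <!-- Load the Lobster font from Google's font library -->
  <link href="https://fonts.googleapis.com/css?family=Lobster&display=swap"
        rel="stylesheet">
  <style>
    /* Style the headline with the web font; the words stay plain text */
    .branded-heading { font-family: 'Lobster', cursive; }
  </style>
</head>
<body>
  <h1 class="branded-heading">Summer Sale Starts Friday</h1>
</body>
```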
Providing Alternative Text
If the content absolutely must be placed in an image or in JavaScript where a robot will be unable to view it, then alternative text, or alt text, needs to be provided. For an image, alt text can tell Google what words the image contains. For content contained in JavaScript, you can place the same content in a <noscript> tag so that Google can still find the right text.
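Here is a minimal sketch of both techniques (the file name and wording are hypothetical):

```html
<!-- Alt text tells robots what words the image contains -->
<img src="free-shipping-banner.png"
     alt="Free shipping on all orders over $50">

<!-- A noscript fallback carries the same text the script renders -->
<div id="offer"></div>
<script>
  document.getElementById('offer').textContent =
    'Free shipping on all orders over $50';
</script>
<noscript>
  <p>Free shipping on all orders over $50</p>
</noscript>
```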
One additional way content can be blocked from robots is within videos. While videos can provide a great means of communicating with visitors, robots aren’t able to watch and understand the video. You can use alternative text here too by providing the content in a transcript or plain text format that robots can more easily access. Along with helping bots access this material, the transcript will also help your human visitors who don’t want to watch the video (or who did watch the video but want to refer back to the transcript in the future).
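One simple way to structure this (the file name and wording are placeholders) is to publish the transcript as plain HTML text directly alongside the video:

```html
<video controls width="640">
  <source src="product-demo.mp4" type="video/mp4">
</video>

<!-- The transcript keeps the video's words in crawlable text -->
<section class="transcript">
  <h2>Video Transcript</h2>
  <p>Hi, in this video I'll walk you through setting up the product...</p>
</section>
```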
Hidden Content
A final consideration when thinking about crawlability is hidden content. For example, it is common practice on websites to place some content behind tabs. Done poorly, though, this tab technique can hide your content from Google, which can get you penalized. Google's general rule of thumb is that if the content is hidden but still available to humans (for instance, by clicking on a tab), then that is acceptable behavior. If the content is hidden and a human has no way to un-hide it (or if the means of doing so is buried where no human is likely to find it), then you run the risk of robots not finding the content, as well as the risk of humans being unable to see it.
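As a sketch of the acceptable pattern (element IDs and wording are hypothetical), the tab panels below stay in the HTML source where robots can read them, while a click reveals them for human visitors:

```html
<button onclick="showTab('details')">Details</button>
<button onclick="showTab('specs')">Specs</button>

<!-- Both panels exist in the HTML source, so robots can read them;
     humans reveal them by clicking the tabs above -->
<div id="details" class="tab-panel">Full product details...</div>
<div id="specs" class="tab-panel" style="display:none">Technical specs...</div>

<script>
  function showTab(id) {
    document.querySelectorAll('.tab-panel').forEach(function (panel) {
      panel.style.display = (panel.id === id) ? 'block' : 'none';
    });
  }
</script>
```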
Checking Crawlability
Google Search Console
In Google Search Console, you can use the "Fetch as Google" tool to see your website the way Google sees it. Click on Crawl, then click on Fetch as Google.
Input your page's URL and click Fetch. Once the fetch completes, you can check that Google finds the content you expect it to find. Helpfully, you can also use the dropdown to fetch as a desktop or mobile device, since your mobile design might hide or position some elements differently than your desktop design.
Google also offers a report on blocked resources. The blocked resources page shows you what content you are preventing Google from accessing. For example, it will show you if you are blocking a JavaScript file that is responsible for rendering some of your website's content.
Bing Webmaster Tools
Bing also offers the ability to fetch a page on your website as Bingbot. After logging in, go to Diagnostics & Tools, then click on Fetch as Bingbot.
After inputting your URL in the search box, you will see a copy of the page's code as Bing sees it. You can search through this code to make sure the most important content you need Bing to find is contained in the output.
Resources
- Can Google Crawl JavaScript?
- Hidden Text and Links
- When You Can Hide Content
- Fetch as Googlebot
- Crawling and Rendering
Technical SEO Services
Want help improving your website’s technical SEO factors? Contact me today to discuss how I can help review and improve your current technical structure.
Prefer a more DIY approach? Order our Tech SEO Guide, a reference guide to help you address the technical SEO issues on your website. Order now at Amazon.