Find & Fix Orphan Pages
By Matthew Edgar · Last Updated: June 14, 2022
What Are Orphaned Pages?
An orphaned page on a website is a page that has no internal links pointing to it. There is also the concept of a near-orphaned page, which is a page that has very few internal links pointing to it. Because there are no (or few) internal links referencing this page, the orphaned page is effectively cut off from the rest of the website, which can harm that page’s ability to perform.
Having orphaned pages isn’t great but isn’t always a problem. There are legitimate examples of pages that shouldn’t have or wouldn’t have internal links. For example, landing pages shouldn’t have any internal links pointing to them since those pages are only intended to receive traffic from ads. As another example, confirmation pages visitors see after placing an order or submitting a form wouldn’t have internal links since you wouldn’t want people to find that confirmation page in some other way.
Orphaned pages become a problem and hurt performance when pages that shouldn’t be orphaned are orphaned. To understand that, we need to understand why different types of internal links matter. When it comes to SEO, internal links help guide search engine robots through a website. If a page isn’t linked to (at all or at least not very much) from within the website, then robots will have a harder time finding that page and may not find the page at all. If they can’t find (or discover) that page, then that page won’t be able to rank in search results. You might be able to correct the discoverability problem by listing the page in an XML sitemap. However, discoverability isn’t the only SEO benefit of internal links.
The other benefit of internal links for SEO is that internal links pass authority between pages. By doing so, internal links act as a signal indicating the relative value of any given page on a website. If a website doesn’t even link to a page contained on that website, how important could that page be? The lack of internal links suggests the page isn’t very important. When discussing orphaned pages, John Mueller from Google said “Google Search will assume that these are not very critical for your website” and “probably we won’t give them as much weight in search.” In short, orphaning a page decreases Google’s overall assessment of the page, which reduces the page’s overall chance of ranking higher in search results.
Along with the SEO implications, orphaned content can also affect visitor engagement and conversions. When people come to your website, they will use internal links to navigate your website. If a page isn’t linked to or isn’t linked to that much, then visitors will have a harder time finding that page. This becomes a problem if the orphaned page is an important page that visitors ought to find or a page that is critical for visitors to find in order for them to convert. For example, if your contact page is orphaned then this might decrease conversions if you have a large number of people that want to find your contact page before placing an order.
How Do You Fix Orphan Pages?
Fixing orphaned pages seems simple: add internal links! However, simply adding a bunch of internal links pointing to various pages won’t necessarily correct the problem if those internal links aren’t relevant. Adding internal links where they aren’t relevant can actually create new problems that are far worse than orphaned pages.
One common way people try to fix orphaned pages is by adding a link to every page (or nearly every page) on their website into the website’s main navigation. If this isn’t done correctly, it can result in an overcrowded navigation menu that is almost impossible to use. Adding this many links to a website’s navigation requires a clear organizational plan that carefully considers what terms should be used, provides a clear hierarchical structure, and is designed to support visitors’ short-term memory limitations. Once you consider all these factors, it is almost never the case that every page on the website should be listed in the navigation.
Another potential solution some people try is adding a link to every page on their website in the website’s footer. This can result in an overcrowded footer, which can harm the website’s usability. As well, the website’s footer is a valuable space that can help visitors find information and if designed correctly, can help with conversions. It doesn’t make sense to fix orphaned content by ignoring the opportunities presented by a well-designed footer.
An even bigger reason to avoid fixing orphaned content by adding links to the navigation or footer is that Google appears to give footer and navigation links less weight, relying more on links contained within the website’s main content. Google hasn’t confirmed this, but it makes sense when you consider Google’s push to value user experience. Google’s engineers are smart enough to know people can put irrelevant and unnecessary links in the navigation or footer, so why would they let their bots rely heavily on these parts of a website?
In short, you don’t want to fix orphaned content by adding links to the navigation or footer. Of course, if the links make sense to add to your website’s navigation or footer, and those links fit within the organizational schemes for those parts of your website, then add the links. Otherwise, you want to resolve the orphaned content by finding highly relevant areas to link to the orphaned pages within your website’s content. When you find orphaned pages, review those pages and determine which other pages on your website could link to them.
What do you do, though, if you’ve reviewed your orphaned pages but can’t find a relevant place to link to one of the orphaned pages? Every page on your website should discuss related topics, and therefore each page should be connected in some way. Those connections are represented by internal links—that’s what visitors and robots are expecting to see as they use internal links to understand your website. As a result, if you can’t find a relevant place to link to the orphaned page, then that suggests the solution is to delete that orphaned page from your website.
How Do You Find Orphan Pages?
There are a few different ways you can find orphaned pages. I’ll walk through two: a more involved method of analyzing your site and a simpler method you can use to locate orphaned pages.
Method #1: Comparing Pages and Links
Finding orphan pages requires two pieces of information: a full list of pages that exist on your website and a full list of pages linked to from within your website (internal links). You want to compare these two lists to see which pages have no links pointing to them or have only a few links pointing to them.
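The comparison can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name classify_pages, the sample URLs, and the near-orphan cutoff of one link are assumptions made for the example, since there is no fixed definition of "only a few links."

```python
from collections import Counter


def classify_pages(all_pages, internal_links, near_orphan_max=1):
    """Split pages into orphaned and near-orphaned groups.

    all_pages: iterable of page URLs that exist on the site.
    internal_links: iterable of (source_url, target_url) pairs.
    near_orphan_max: assumed cutoff for "only a few" links (hypothetical).
    """
    # Count how many internal links point at each target URL.
    # Self-links are skipped, since a page linking to itself does not
    # connect it to the rest of the site.
    inlinks = Counter(
        target for source, target in internal_links if source != target
    )

    orphaned = sorted(p for p in all_pages if inlinks[p] == 0)
    near_orphaned = sorted(
        p for p in all_pages if 0 < inlinks[p] <= near_orphan_max
    )
    return orphaned, near_orphaned


# Hypothetical sample data for illustration only.
pages = ["/", "/about", "/contact", "/old-offer"]
links = [
    ("/about", "/"),
    ("/contact", "/"),
    ("/", "/about"),
    ("/contact", "/about"),
    ("/", "/contact"),
    ("/old-offer", "/"),  # outgoing links do not rescue a page
]

orphaned, near_orphaned = classify_pages(pages, links)
print("Orphaned:", orphaned)
print("Near-orphaned:", near_orphaned)
```

Note that /old-offer links out to the home page but is still orphaned, because only links pointing to a page count.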
List of All Pages
Listing every page on your website usually requires working with your developers. The developers will need to query the database to locate every page on the website or will need to find every page hosted in your website’s root directory. However, there is an easier approach if you are using WordPress.
In WordPress, you can export a list of all pages and posts contained on your website via the Export All URLs plugin. Once the plugin has been installed, you can go to Tools and then click “Export All URLs”. You’ll want to export All Types of data to obtain pages, posts, and custom post types you might be using. Under Export Fields, select URLs, and under Post Status, select Published for all pages that are currently live on your website. For easier analysis, you’ll want to export this as a CSV file.
List of All Internal Links
Next, you need to obtain a list of all internal links contained on your website. The easiest way to do this is to run a crawl. In this example, I’ll demonstrate how to obtain internal links using JetOctopus, though the methodology is similar in other site crawlers. In JetOctopus, click New Crawl. On the New Crawl screen, uncheck the box that says “Respect robots rules”. This will allow JetOctopus’s crawler to look at all links on the website, even if those links are typically blocked by robots, which will give you a complete understanding of the links on your website.
Once the crawl has run, you can go to Pages under Data Tables in the sidebar. In the All Pages table, you’ll see a lot of information about each page on your website, including the number of internal links pointing to that page (in the InLinks All column). You can also export this table to Excel or Google Sheets.
As another example, you can also find this list of pages using Screaming Frog. You’ll want to ignore the robots.txt file. This can be found under the Configuration menu. Select Robots.txt and then click on Settings. From this screen, toggle the setting to Ignore Robots.txt.
Once the crawl has run, you can go to the Internal tab to view all pages. From here, you can see the number of internal links pointing to each page in the Inlinks column. You can then export this to a CSV or Excel file.
Compare Pages and Links
Now that you’ve exported the list of pages and the list of links, you can bring these two lists together in Excel (or Google Sheets). On one sheet, you can have your list of pages and on a second sheet, you can have your list of links. Using a SUMIF, you can add up all the links referencing each page on your website. Once you’ve summed the list of links, you can sort by the Sum of Links column, smallest to largest, and see which pages have no internal links. Those are your orphaned pages. Pages with only a few links are your near-orphaned pages.
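If you would rather script this step than build it in a spreadsheet, the same comparison can be sketched in Python. The column names (URL in the page export, Address and Inlinks in the crawl export) and the inline sample data are assumptions for illustration, so check the headers in your own exports before adapting it. As with SUMIF, a page that is missing from the crawl export ends up with a total of 0.

```python
import csv
import io


def sum_inlinks(pages_csv, crawl_csv, page_col="URL",
                crawl_url_col="Address", inlinks_col="Inlinks"):
    """Return (url, total_inlinks) pairs sorted smallest to largest.

    Both arguments are open file-like objects containing CSV data.
    Column names are assumptions; adjust them to match your exports.
    """
    # Total the inlink counts per URL from the crawl export.
    totals = {}
    for row in csv.DictReader(crawl_csv):
        url = row[crawl_url_col].strip()
        totals[url] = totals.get(url, 0) + int(row[inlinks_col] or 0)

    # Look up each known page; pages the crawler never saw get 0.
    results = []
    for row in csv.DictReader(pages_csv):
        url = row[page_col].strip()
        results.append((url, totals.get(url, 0)))

    # Sort by link total so orphaned pages appear first.
    return sorted(results, key=lambda item: (item[1], item[0]))


# Hypothetical exports, inlined here so the sketch runs on its own.
pages_export = io.StringIO(
    "URL\n"
    "https://example.com/\n"
    "https://example.com/about\n"
    "https://example.com/old-offer\n"
)
crawl_export = io.StringIO(
    "Address,Inlinks\n"
    "https://example.com/,12\n"
    "https://example.com/about,3\n"
)

report = sum_inlinks(pages_export, crawl_export)
for url, total in report:
    print(total, url)
```

With real exports, you would pass opened files instead of the inline samples, for example open("pages.csv", newline="") and open("crawl.csv", newline=""), where both file names are placeholders.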
Method #2: Finding Orphan Pages in JetOctopus
Now for the easier method, which doesn’t require running any formulas in Excel. You can find orphaned pages directly from JetOctopus’s crawl. When setting up the crawl, make sure “Process Sitemaps” is selected before you run it (this is checked by default). After the crawl has finished running, go to Pages under Data Tables. On this page, click “Add a Filter” above the table. Select “Is Orphaned” and “Yes” from the two filter dropdowns, then click Apply. The table will reload and show you all orphaned pages JetOctopus found on your website.
One quick note: JetOctopus is looking at a full list of URLs contained on your XML sitemap and seeing how many links there are to each of those pages. This connects back to the idea discussed earlier that simply listing a page in the XML sitemap isn’t enough; if you don’t have links (or only a few links) referencing the page on your website as well, Google will typically devalue the pages.
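The idea behind this kind of check can be sketched as follows: read the URLs from an XML sitemap and flag any that the crawl found no internal links to. This illustrates the general technique under stated assumptions and is not a description of how JetOctopus is implemented. The function name, sample sitemap, and inlink counts are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_orphans(sitemap_xml, inlink_counts):
    """Return sitemap URLs that have no internal links in the crawl data.

    sitemap_xml: string containing a <urlset> sitemap.
    inlink_counts: dict mapping URL -> number of internal links found.
    """
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    # A URL missing from the crawl data is treated as having 0 links.
    return [url for url in urls if inlink_counts.get(url, 0) == 0]


# Hypothetical sitemap and crawl results for illustration only.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/old-offer</loc></url>
</urlset>"""

sample_inlinks = {
    "https://example.com/": 12,
    "https://example.com/about": 3,
}

sitemap_only = sitemap_orphans(sample_sitemap, sample_inlinks)
print(sitemap_only)
```

A page flagged this way is discoverable through the sitemap but still has no internal links to signal its importance.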
Cleaning up orphaned pages is an essential part of improving the overall quality of your website and ensuring that each page is performing as well as possible. If you need help finding and fixing orphaned pages, please contact me. Or, for more information, check out my new book, Tech SEO Guide, which provides a reference for the technical SEO factors that can help your website perform better in organic search results.