
Understanding Entities

By Matthew Edgar · Last Updated: December 15, 2023

How does Google understand what a word or phrase is referring to? When Google encounters a word or phrase, its algorithms determine what thing that word or phrase references. It could be a product for sale, a location, a business, or a person. Once Google determines what is being referenced, it begins assigning attributes to that item, such as a name, an address, an email, or a website.

To make this more concrete, let’s consider the phrase “Matthew Edgar”. When Google encounters that phrase, its bots first determine that it represents a person’s name. Now the bots have to determine who this person is and what information may already be known about this person.

Unfortunately for me, Matthew Edgar isn’t that unusual a name. There are other people with the name Matthew Edgar and, not surprisingly, Google will find references to multiple people with this name while crawling the web. So, when Google finds a reference to “Matthew Edgar”, its bots need to decide which “Matthew Edgar” is being referenced.

Once Google knows which “Matthew Edgar” is being referenced, it can assign attributes to the right person, keeping a separate set of attributes for each person who happens to share that name. When Google comes across a “Matthew Edgar” referenced alongside terms like “technical SEO” or “web consultant”, it knows to assign those attributes to me instead of the other people who share my name. Of course, I can do things to help Google make the right decisions.

All of this work Google does is referred to as entity recognition. In this article, I’ll review what entity recognition is and how you can optimize entities to improve your SEO performance.

Entity Recognition

Google approaches the web by looking at entities. They define an entity as:

“[An] entity is a thing or concept that is singular, unique, well-defined and distinguishable. For example, an entity may be a person, place, item, idea, abstract concept, concrete element, other suitable thing, or any combination thereof.”

Seeing it this way provides a deeper understanding of how crawling and indexing work. As Google crawls the web, its bots aren’t just trying to pick up on commonly used words or phrases. Instead, the bots are trying to learn what entities a given page discusses.

As an example, when Google crawls my site, they understand that one of the entities mentioned on this site is a person named Matthew Edgar. Of course, there are other entities mentioned too, like my books, Tech SEO Guide and Elements of a Successful Website, or my company, Elementive.

If I want my website to show up when somebody searches Matthew Edgar, then I need to make sure that entity is being discussed within my website’s text. When evaluating your website, you need to determine if your content correctly discusses the entities you want to be associated with and if you are discussing those entities in a way that Google’s bots can detect.

To know what entities our websites are discussing, we need to try to review our content the same way Google does. While we don’t know exactly how Google detects entities, there are free code libraries we can use to do something similar. Python’s spaCy library, for example, offers named entity recognition (NER). As a simple example, running my professional bio through NER code, we find these entities:

  • Matthew Edgar PERSON
  • Colorado GPE
  • SEO ORG
  • Tech SEO Guide ORG
  • Matthew PERSON
  • 2001 DATE
  • Forbes ORG
  • American Express ORG
  • MozCon ORG
  • SMX ORG
  • MarTech ORG
  • O’Reilly Media ORG
  • Matthew PERSON
  • Information and Communications Technology ORG
  • the University of Denver ORG

Thankfully, my name is detected and recognized as a person. Unfortunately, Elementive is not detected as an entity even though other organizations listed in my bio are. That is likely a limitation of this library, and Google’s more sophisticated systems probably do understand that Elementive is an organization. You can also see that some entities are labeled incorrectly. For example, SEO is detected as an organization instead of as a concept, and Tech SEO Guide, a book, is also labeled as an organization.
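The kind of NER run described above can be sketched roughly as follows. This is a minimal sketch, not the exact code behind the list above: it assumes spaCy is installed along with its small English model, en_core_web_sm (the article does not say which model was used), and the function names extract_entities and count_entities are made up for illustration.

```python
from collections import Counter


def extract_entities(text, model_name="en_core_web_sm"):
    """Return (entity text, label) pairs found by spaCy's NER.

    Assumption: the model named in `model_name` has been downloaded,
    e.g. with `python -m spacy download en_core_web_sm`.
    """
    # Imported here so count_entities() can be used without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    doc = nlp(text)
    # doc.ents holds the entity spans; label_ is the label as a string.
    return [(ent.text, ent.label_) for ent in doc.ents]


def count_entities(pairs):
    """Count how often each (text, label) pair appears."""
    return Counter(pairs)
```

Calling extract_entities on the text of a page or bio returns pairs such as a name with the label PERSON, and count_entities shows which entities come up most often. Results will vary by model and spaCy version, so treat the labels as a rough approximation of what a search engine might detect.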

Entity Relationships

However, it isn’t just about knowing the entities discussed on the website. In addition, Google needs to understand how those entities are connected. This is where links come into the picture. In this case, I’m not discussing links in terms of quality or quantity but rather, links strictly as a way to explain relationships. That is, if two pages link to each other, chances are those pages discuss subjects (or entities) that are somehow connected. Relationships can be established via external and internal links.

Google uses the words around that link and the words within the link itself (the anchor text) to help learn something about the way these pages might be connected. If the anchor text includes entities, that helps Google make a clearer connection.
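To see what your own anchor text says, you can pull the links out of a page’s HTML. The sketch below uses only Python’s standard library and collects each link’s URL together with its anchor text. It does not capture the words around the link, and the names AnchorCollector and extract_anchors are my own for illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, anchor text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join the fragments and collapse runs of whitespace.
            anchor = " ".join("".join(self._text).split())
            self.links.append((self._href, anchor))
            self._href = None


def extract_anchors(html):
    """Return every (href, anchor text) pair found in the HTML string."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Anchor text like “Matthew Edgar” or “technical SEO” names an entity directly, while anchor text like “click here” tells Google very little about how the two pages are connected.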

Let’s bring this back to the example of my website. As Google crawls through the web, it sees references to an entity called “Matthew Edgar” and can detect that the entity is a person. However, Google wants to understand more about that entity, so it looks for other entities related to that person by following related links. Sometimes the entity of “Matthew Edgar” is connected to the entity of “SEO” or “web consulting”, and Google knows the reference is to me. Other times the entity of “Matthew Edgar” is referenced alongside other words and phrases.

As Google’s bots crawl the web and find these references, Google starts to learn there are multiple people with the name Matthew Edgar. Google collects information about each person with this name as a separate entity. As Google continues to crawl the web, they can look for more information about each distinct entity.

The action item is to review all the entities related to your name, company name, and product names within links (external or internal). You want to know if you are explaining the right relationships to Google. The cleaner the signals about the relationships, the easier it will be for Google to rank your website when people search those entities.
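One rough way to start that review is to check which of your links actually mention the entities you care about. The sketch below assumes you already have a list of (URL, anchor text) pairs from a crawl or export. The function name anchors_mentioning is hypothetical, and the simple substring match will miss variations such as nicknames or abbreviations.

```python
def anchors_mentioning(links, entity_names):
    """Map each entity name to the links whose anchor text mentions it.

    `links` is a list of (href, anchor text) pairs.
    Matching is a case-insensitive substring check, which is only
    a rough stand-in for real entity detection.
    """
    report = {name: [] for name in entity_names}
    for href, anchor in links:
        lowered = anchor.lower()
        for name in entity_names:
            if name.lower() in lowered:
                report[name].append((href, anchor))
    return report
```

An entity that comes back with an empty list is one that none of the reviewed links describe, which is a good place to start improving anchor text.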

Which Entity Is The Right Entity

We’ve now talked through how Google has found all the entities and the relationships between these entities. But what does this mean for search results? That is, Google knows there are multiple people with the name Matthew Edgar and knows something about each person who shares that name. Great, but which Matthew Edgar should appear when somebody searches for that name?

As a first step, Google has to figure out how notable the various entities are. Notability depends partly on how frequently an entity is mentioned on the web. The more mentions an entity has, the more likely it is to be important and, therefore, more deserving of ranking in the search results. As another way of thinking about that, the more mentions there are, the more likely it is that this is the entity people conducting searches would prefer to find. In the case of a search for my name, I’m not the most notable Matthew Edgar given how few mentions I have relative to other people who share my name.

However, Google also has to consider the relationships. Some entities might be discussed on the web infrequently but are important because of the other related entities. This can happen with product names where the product name itself might not be discussed too often, but a bigger name company behind the product might be discussed more often. Understanding that a big company is related to the product name, Google might give preference to the big company’s website in search results for that product name. In the case of my name, another Matthew Edgar is related to more prominent entities, giving Google greater reasons to rank that Matthew Edgar higher than me.
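To make the interplay between mentions and relationships concrete, here is a toy scoring sketch. It is not Google’s formula, which isn’t public. The function name notability, the weight of 0.5, and the idea of simply adding a share of related entities’ mentions are all assumptions made up for illustration.

```python
def notability(entity, mentions, related, weight=0.5):
    """Toy score: an entity's own mentions plus a weighted share of
    the mentions of the entities it is related to.

    `mentions` maps entity name -> mention count.
    `related` maps entity name -> list of related entity names.
    `weight` is an arbitrary illustrative value, not a known constant.
    """
    own = mentions.get(entity, 0)
    borrowed = sum(mentions.get(other, 0) for other in related.get(entity, ()))
    return own + weight * borrowed
```

With made-up numbers, a product with 10 mentions that is related to a company with 1,000 mentions scores 510, ahead of an unrelated product with 50 mentions that scores 50. That mirrors how a rarely discussed product name can still matter because of the bigger name behind it.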

Google’s bots are smart enough, though, to know that search preferences may differ and that some people searching for a particular entity might want to find an alternative entity that has fewer mentions and weaker connections. In these cases, Google will highlight the alternative entities on the search result page. That could be in the suggested searches near the bottom of the page or in a “See Results About” card in the sidebar.

[Screenshot: “See Results About” card for Matthew Edgar]

The final thing to keep in mind is that Google’s consideration isn’t just determining which entity is the best overall. Instead, it is also about looking at the entities within their proper context. These contexts are understood from the relationships Google built up about a given entity. In certain contexts, the entity that represents me is the more correct Matthew Edgar to surface than any other entity who shares my name. For example, when you search “Matthew Edgar SEO”, Google knows to show the entity representing me as the top search result and not an entity representing somebody else. This is because Google understands that I am more related to SEO than other people who share my name.

[Screenshot: Google search results for “Matthew Edgar SEO”]
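The context matching described above can be pictured as an overlap check between the extra words in a search and the terms related to each entity. This is only a sketch of the idea, not how Google actually does it. The function name best_match, the entity IDs, and the related terms in the usage note are all invented for illustration.

```python
def best_match(query_terms, profiles):
    """Pick the entity whose related terms overlap most with the query context.

    `query_terms` are the extra words in the search (e.g. ["SEO"]).
    `profiles` maps an entity ID -> list of terms related to that entity.
    Returns (best entity ID, all scores). On a tie, the entity listed
    first in `profiles` wins.
    """
    query = {term.lower() for term in query_terms}
    scores = {
        entity_id: len(query & {term.lower() for term in terms})
        for entity_id, terms in profiles.items()
    }
    return max(scores, key=scores.get), scores
```

Given one hypothetical profile related to “SEO” and “web consulting” and another related to unrelated topics, a query context of “SEO” selects the first profile. With no context at all, every profile scores zero, which is where overall notability would have to decide instead.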

If you do happen to be in a situation where other people share your name, you may not be able to become the primary entity in Google’s eyes. The competitors with the same name might have far more mentions from other prominent entities. However, you can work to show up within specific contexts or be a strong alternative entity by building up the right types of relationships.

Key Takeaways

  • Google crawls the web learning what entities a given page discusses.
  • Make sure your pages are discussing appropriate entities.
  • Google sees how entities are connected by reviewing internal and external links.
  • Check search results to make sure Google is detecting the correct entities on your website.

Final Thoughts

It is important to think about your website’s content in terms of the entities being discussed and the ways those entities are connected. You want to be intentional in explaining the most important entities to Google’s bots and you want to be intentional in showing how those entities connect to other entities. You want to know what information backlinks are providing about the entities related to your website. If you aren’t ranking highly enough in search results, whether for your name or any term, reviewing the entities can help highlight why that might be the case.

If you need help reviewing the entities related to your website and your organization or need help with other aspects of your website’s SEO, please let me know.
