LSI and SEO
Latent semantic indexing, or LSI, is one of those terms that means something different depending on which industry you’re in. For those of us in SEO, it is a concept, like domain authority, that sounds good and jargon-y. Understanding what it actually does, however, can be a different story.
There’s no proof that LSI has ever been a part of the Google algorithm, even though both deal with how keywords relate to the topic of a page. Keywords, which are an important part of LSI, have always been a part of the Google algorithm. However, updates to the algorithm, specifically the Panda and Hummingbird updates, have drastically changed how keywords are treated. Neither before nor after those updates is there evidence that the LSI process has been significant to Google’s ability to index a page correctly.
What Is Latent Semantic Indexing?
Latent semantic indexing is a method of keyword analysis, intended to provide searchers with the pages that are most statistically relevant to them. By analyzing pages for co-occurrences of certain words, you can gain insight into the topic of those pages. In theory, this kind of analysis would allow a search engine like Google to index your page more accurately. It can be particularly helpful for synonyms and for homonyms, which are words that are spelled the same but have different meanings.
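To make the idea of co-occurrence concrete, here is a minimal sketch in Python. It is a simplification rather than full LSI: real latent semantic indexing builds a term-document matrix and reduces it mathematically, while this sketch only counts which words share a page. The toy pages and the function names are assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus; each string stands in for the text of one page.
PAGES = [
    "jaguar dealership price for sale",
    "jaguar car dealership lease price",
    "jaguar habitat rainforest prey",
    "jaguar big cat habitat prey",
]

def cooccurrence_counts(pages):
    """Count how many pages each unordered pair of distinct words shares."""
    counts = Counter()
    for page in pages:
        # Sort the distinct words so each pair is always stored the same way.
        words = sorted(set(page.lower().split()))
        counts.update(combinations(words, 2))
    return counts

def related_terms(word, counts):
    """Return (term, count) pairs for words sharing a page with `word`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == word:
            related[b] += n
        elif b == word:
            related[a] += n
    return related.most_common()

counts = cooccurrence_counts(PAGES)
print(related_terms("dealership", counts))
```

In this toy corpus, ‘dealership’ shares pages with ‘price’ but never with ‘habitat,’ which is the kind of signal that hints at what a page is about.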
A History of LSI
The patent for latent semantic indexing was filed in 1988, long before Google ever existed. LSI exists outside the world of SEO, and was first used as an information retrieval tool in computer science. For professionals and amateurs looking to improve their pages in the early days of SEO, having something like LSI was an important part of understanding how their pages might be indexed.
When Google came around, it was initially heavily reliant on keywords to index pages. This made LSI a valuable practice, as people could look at how certain keywords were getting indexed on the SERP, and ensure those keywords were exactly matched in their content. However, black hat practices like keyword stuffing were running rampant at this time, because of how heavily weighted exact-match keywords were in determining ranking. Keyword stuffing made it hard for Google to determine what was quality content and deserved to rank. This prompted a slew of updates to how the Google algorithm treats keywords.
How Does LSI Work in SEO?
For an example of LSI at work, we can look at the word ‘jaguar.’ Jaguar can mean either an animal or a car brand, making it a homonym. If you searched for ‘jaguar purchase,’ an LSI analysis would look for terms like ‘dealership,’ ‘price,’ and ‘for sale,’ to take you to pages that relate to the selling and buying of jaguar cars. This is because it’s more statistically probable that someone is looking to purchase a jaguar-the-car over a jaguar-the-animal.
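The ‘jaguar purchase’ example can be sketched roughly as follows. This is an illustration under stated assumptions, not how Google or any real LSI system is implemented: the `RELATED_TERMS` mapping is hand-written and hypothetical, standing in for associations that an actual LSI analysis would derive statistically from a large corpus, and the two pages are invented.

```python
# Hypothetical mapping: terms assumed to co-occur with a query word.
# A real LSI system would learn these associations from a large corpus.
RELATED_TERMS = {
    "purchase": {"dealership", "price", "sale", "buy"},
}

# Hypothetical pages, one for each meaning of "jaguar".
PAGES = {
    "cars": "visit our jaguar dealership for the best price on models for sale",
    "wildlife": "the jaguar is a big cat that hunts prey in the rainforest",
}

def context_score(query, page_text):
    """Count distinct page words matching the query or its related terms."""
    query_words = set(query.lower().split())
    expanded = set(query_words)
    for word in query_words:
        expanded |= RELATED_TERMS.get(word, set())
    return len(expanded & set(page_text.lower().split()))

def rank_pages(query, pages):
    """Return page names ordered from highest to lowest context score."""
    return sorted(
        pages,
        key=lambda name: context_score(query, pages[name]),
        reverse=True,
    )

print(rank_pages("jaguar purchase", PAGES))
```

Both pages contain ‘jaguar,’ so the keyword alone cannot separate them. It is the surrounding purchase-related terms that push the car page ahead of the wildlife page.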
LSI keywords have never been a ranking factor for Google, as directly stated by search advocates like John Mueller. In fact, “LSI keywords” is a misnomer: latent semantic indexing is a process for analyzing keywords, not a type of keyword. To understand how Google treats keywords, and how they contribute to your page rankings, you need to understand the two major keyword updates, Panda and Hummingbird.
Panda Update
The Google Panda update was an early major update to Google’s algorithm. It was implemented in 2011 to filter low-quality content out of the top-ranking results. It targeted practices like keyword stuffing, lowering the importance of a keyword’s direct presence and raising the importance of its contextual presence.
This affected LSI as an SEO practice because Google was now automatically doing much of the analysis that professionals using LSI had been doing themselves.
Hummingbird Update
Google’s Hummingbird update focused on fine-tuning how contextual and natural language is handled, creating space for content creators to use keyword variants, rather than exact-match keywords, that would still signal the same relevance. This was a push even further away from keyword stuffing and other black hat practices. Keyword-focused content writers now had to show a better understanding of the topic they were writing about through contextual language, rather than just limiting their exact-match keywords to appropriate places.
The Hummingbird update made using LSI to predict which SERPs your page might show up on functionally useless, because it created a balance between contextual information and keyword variants, rather than relying on exact-match keywords. This update expanded the idea of contextual searching past the limitations of LSI.
Natural Language and Keyword Variants
The above algorithm updates have created a shift toward using organic variation in language, also known as natural language. Not only has this allowed Google to filter out lower-quality content, but it also tends to better match how people search. Your average searcher won’t type in specific queries or keyword combinations when looking for something. Rather, they’ll use conversational queries, as if they were talking to another person. This tendency has only increased with the advancement of voice-command tools like Alexa and Siri.
These shifts in how Google indexes content, as well as in how people search for it, have rendered LSI pretty redundant as an SEO practice. People are contextualizing synonyms and homonyms themselves within their queries, and if they aren’t, the Panda and Hummingbird updates are filling in the gaps for them.
Why (Some) People Prioritize LSI
There are a few reasons why some people, both within and outside of the professional SEO industry, may prioritize LSI. The first could be simple ignorance of current best practices. SEO is a fast-paced industry, and there is a lot of contradictory information out there. If you don’t have the time or capacity to keep up with new updates and why best practices change, it can be easy to hold on to antiquated views. If you’re interested in working with an SEO firm, it may be worth refreshing your knowledge or reading the strategic breakdowns your firm provides, so you can maintain an up-to-date grasp of exactly what you’re paying for.
LSI can also be a jargon word that less reputable SEO professionals use to make clients think they’re getting more for their contract price. This is why it’s important for not only SEO professionals, but clients who work with SEO firms too, to understand what practices are of actual value.
The Big Picture on LSI
You can find some dissenting opinions on whether or not LSI is worth your money or time. A lot of the sites that claim LSI is still important are really just saying that it is important to understand contextual keywords and the role they play in how your webpage is indexed and shown on the SERP. This is very true. Keyword-focused content is still an important part of SEO practices. It helps build your E-A-T (expertise, authoritativeness, and trustworthiness) and signals what your domain is about. Focusing on specific keywords and terms, which is what LSI does, just isn’t as important as it used to be.
As Google continues to update its indexing, a lot of the contextual work is being done behind the scenes. This makes processes like LSI not worth your time or energy, as they’re being automated faster, and oftentimes better, by the very search engine you’re trying to rank highly on.