One of the web’s biggest mysteries could be solved: Google’s algorithm leaked

2,500 pages of an internal Google document have been leaked. They reveal the secrets of the Google Search algorithm, which has ruled the Internet for several decades. After initially remaining silent, the company confirms that this leak is authentic.

Google’s search engine has rarely been so talked about.

A few weeks after Google announced a major overhaul of the service that made its name, adding AI-generated summaries of search results (summaries that have also drawn strong criticism), the company has suffered an unprecedented leak.

Since May 28, a 2,500-page document purporting to contain the secrets of the Google Search algorithm has been circulating. It reveals how the search engine sorts results when you enter a query. That secret weighs heavily on the Internet, given Google’s influence on a website’s audience. Rand Fishkin, an expert in SEO (search engine optimization, the practice of optimizing pages for search engines), says he received this mysterious document from an anonymous source on May 5, 2024. Caution was initially advised, but Google has since confirmed its authenticity.

Google results soon to be manipulated?

The 2,500-page document consists mainly of technical details. The general public is unlikely to find it interesting, but SEO specialists will surely use it to better understand Google’s priorities when it chooses to promote a site.

In some respects, the document shows that Google has not always been very honest. The company says it does not collect data from its Chrome browser or rank article authors based on trustworthiness, but the leaked document suggests that it does. Chrome reportedly sends Google the list of the most popular sites so that the search engine can highlight them.

Why does Google choose one site over another? This is the mystery of its algorithm. // Source: Numerama screenshot

Among the revelations in this document are several technical terms Google uses to rank the web. We learn, for example, that the NavBoost technology measures clicks and engagement rate, that human evaluations can lead to a site being considered more reliable, and that whitelists exist for important subjects (Covid, for example). The age of a domain name is reportedly also used to estimate its reliability, as is the reputation of its brand.

On the other hand, some data in the document suggests that many SEO experts overestimate certain optimizations. E-E-A-T (experience, expertise, authoritativeness and trustworthiness) may not carry the weight that search engine specialists previously assumed. Rand Fishkin recommends that industry experts study the document to understand what Google is really doing, instead of relying on the group’s statements.

Google confirms the leak, but suggests it may be outdated

In an email sent to certain American media outlets, such as The Verge, Google confirms the authenticity of the document, while stressing that conclusions should not be drawn from these 2,500 pages alone. “We caution against making inaccurate assumptions about how search works based on out-of-context, outdated, or incomplete information,” says a company spokesperson.

“We’ve shared a lot of information about how search works and the types of factors our systems take into account, while working to protect the integrity of our results from manipulation,” adds Google, which suggests that the whole truth about its PageRank algorithm is not in this document, or that things may have changed since then.

The document looks like this. You need to know how to read code to analyze it. // Source: SparkToro

Now that they have access to this large amount of data, SEO specialists will be able to run experiments to try to manipulate the Google algorithm. The risk is that certain sites will exploit the breach and cripple others until Google changes its algorithm again. This is another of the web’s mysteries: why does Google change its website sorting system so often?
