Google's "Florida" Update: 2


In the first installment of this series, we looked at some of the common theories surrounding the reasoning behind Google’s Florida update. Today, we’ll look at some alternate views about what really happened.

What Really Happened
Although Google's results state how many matches were found for a searchterm (e.g. "1 - 10 of about 2,354,000"), they will only show a maximum of 1000 results. I decided to compare the entire sets of results produced with and without the -nonsense word, to see if I could discover why a page would be filtered out and why other pages made it to the top. I did it with a number of searchterms and I was very surprised to find that, in some cases, over 80% of the results had been filtered out and replaced by other results. Then I realised that the two sets of results were completely different - the filtered sets were not derived from the unfiltered sets.

The partners in each pair of result sets were completely different. The 'filtered' set didn't contain what was left from the 'unfiltered' set, and very low ranking pages in the 'unfiltered' set got very high rankings in the 'filtered' set. I saw one page, for instance, that was ranked at #800+ unfiltered and #1 filtered. That can't happen with a simple filter. It can't jump over the other pages that weren't filtered out. All the theories about various kinds of filters and lists were wrong, because they all assumed that the result set is always compiled in the same way, regardless of the searchterm, and then modified by filters. That clearly isn't the case.
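The reasoning above can be checked mechanically. The following is only a minimal sketch, assuming each result set has been saved as an ordered list of URLs (the function names and sample data are hypothetical, not from any Google tool). A simple filter can only drop pages, so whatever survives must appear in the unfiltered list in the same relative order.

```python
def could_be_simple_filter(unfiltered, filtered):
    """Return True if `filtered` can be produced from `unfiltered`
    purely by dropping pages, i.e. it is an order-preserving subsequence."""
    remaining = iter(unfiltered)
    # `url in remaining` advances the iterator until it finds `url`,
    # so a page that appears out of order (or not at all) fails the test.
    return all(url in remaining for url in filtered)


def overlap_fraction(unfiltered, filtered):
    """Fraction of the filtered set that also appears in the unfiltered set."""
    if not filtered:
        return 0.0
    seen = set(unfiltered)
    return sum(1 for url in filtered if url in seen) / len(filtered)


# Hypothetical data: page "d" is ranked last unfiltered but first filtered.
unfiltered = ["a", "b", "c", "d"]
print(could_be_simple_filter(unfiltered, ["a", "c"]))  # pages only dropped
print(could_be_simple_filter(unfiltered, ["d", "a"]))  # "d" jumped over "a"
print(overlap_fraction(unfiltered, ["d", "x"]))
```

A page moving from #800+ to #1 while pages that ranked above it are still present is exactly the case where the first check fails.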

It is inescapable that the Google engine now compiles the search results for different queries in different ways. For some queries it compiles them in one way, and for others it compiles them in a different way. The different result sets are not due to filters; they are simply compiled differently in the first place. That is, the result set without the -nonsense word and the result set with the -nonsense word are compiled in different ways and are not related to each other as the filter theories suggest. One set is not the result of filtering the other set.

The most fundamental change that Google made with the Florida update is that they now compile the results set for the new results in a different way than they did before. That's what all the previous theories failed to spot. The question now is, how does Google compile the new results set?

Back in 1999, a system for determining the rankings of pages was conceived and tested by Krishna Bharat. His paper about it is here. He called his search engine "Hilltop". At the time he wrote the paper, his address was Google's address, and people have often wondered if Google might implement the Hilltop system.

Hilltop employs an 'expert' system to rank pages. It compiles an index of expert web pages - these are pages that contain multiple links to other pages on the web of the same subject matter. The pages that end up in the rankings are those that the expert pages link to. Of course, there's much more to it than that, but it gives the general idea. Hilltop was written in 1999 and, if Google have implemented it, they have undoubtedly developed it since then. Even so, every effect that the Florida update has caused can be attributed to a Hilltop-type, expert-based system. An important thing to note is that the 'expert' system cannot create a set of results for all search queries. It can only create a set for queries of a more general nature.
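To make the general idea concrete, here is a deliberately simplified toy sketch. It is not the Hilltop algorithm from the paper, and certainly not Google's code: the data structures, the `min_links` and `min_experts` thresholds, and the topic labels are all assumptions made for illustration. It treats a page as an "expert" when it links to several pages on the same subject, ranks targets by how many experts point to them, and returns nothing when too few experts exist, which is the case where the normal ranking would have to take over.

```python
from collections import defaultdict


def find_expert_pages(links, topics, topic, min_links=2):
    """Pages that link to at least `min_links` distinct pages on `topic`.

    links:  dict mapping a page to the list of pages it links to
    topics: dict mapping a page to a subject label (a stand-in for
            real subject matching, which is far more involved)
    """
    experts = []
    for page, targets in links.items():
        on_topic = {t for t in targets if topics.get(t) == topic}
        if len(on_topic) >= min_links:
            experts.append(page)
    return experts


def rank_by_experts(links, topics, topic, min_experts=2, min_links=2):
    """Rank on-topic pages by the number of expert pages linking to them.

    Returns an empty list when no page is backed by `min_experts`
    experts; the caller would then fall back to its normal ranking.
    """
    votes = defaultdict(int)
    for expert in find_expert_pages(links, topics, topic, min_links):
        for target in set(links[expert]):
            if topics.get(target) == topic:
                votes[target] += 1
    backed = [page for page, count in votes.items() if count >= min_experts]
    return sorted(backed, key=lambda page: (-votes[page], page))


# Hypothetical link graph.
links = {
    "dir1": ["p1", "p2", "p3"],
    "dir2": ["p1", "p2"],
    "blog": ["p3"],
}
topics = {"p1": "hotels", "p2": "hotels", "p3": "cars"}

print(rank_by_experts(links, topics, "hotels"))  # expert-based results
print(rank_by_experts(links, topics, "cars"))    # empty: use normal ranking
```

Note how the "cars" query produces no expert-based results at all. In a system of this kind, that is the point at which a different mechanism has to compile the result set.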

We see many search results that once contained useful commercial sites now containing much more in the way of information or authority pages. That's because expert pages would have a significant tendency to point to information pages. We see that the results with and without the -nonsense word are sometimes different and sometimes the same. That's because an expert system cannot handle all search queries, as the Krishna Bharat paper states. When it can't produce a set of results, Google's normal mechanisms do it instead. We see that a great many home pages have vanished from the results (that was the first thing that everyone noticed). It's because expert pages are much more likely to point to the inner pages that contain the information than to home pages. Every effect we see in the search results can be attributed to an expert system like Hilltop.

I can see flaws in every theory that has been put forward thus far. The flaw in the SEO filter idea is that there are highly SEOed pages still ranking in the top 10 for searchterms that they should have been filtered out for. The flaw in the LocalRank theory is that LocalRank doesn't drop pages, but a great many pages have been dropped. The flaw in the list-of-searchterms idea is that if a filter can be applied to one searchterm, it can be applied to them all, so why bother maintaining a list? The flaw in the money-words list idea is that, if it ever came out that they were doing it, Google would run the risk of going into a quick decline. I just don't believe that the people at Google are that stupid. The flaw in the stemming theory is not that Google hasn't introduced stemming, it's that the theory doesn't take into account the fact that the Florida results set is compiled in a different way to the -nonsense set. Stemming is additional to the main change, but it isn't the main change itself.

The expert system, or something like it, accounts for every Florida effect that we see. I am convinced that this is what Google rolled out in the Florida update. Having said that, I must also add that it is still a theory, and cannot be relied upon as fact. I cannot say that Google has implemented Hilltop, or a development of Hilltop, or even a Hilltop-like system. What I can say with confidence is that the results without a -nonsense word (the normal results) are not derived from the results with a -nonsense word, as most people currently think. They are a completely different results set and are compiled in a different way. And I can also say that every effect that the Florida update has caused would be expected with a Hilltop-like expert-based system.

Where Do We Go From Here?
At the moment, Google's search results are in poor shape, in spite of what their representatives say. If they leave them as they are, they will lose users, and risk becoming a small engine as other top engines have done in the past. We are seeing the return of some pages that were consigned to the void, so it is clear that the people at Google are continuing to tweak the changes.

If they get the results to their satisfaction, the changes will stay and we will have to learn how to optimize for Google all over again. But it can be done. There are reasons why certain pages are at the top of the search results and, if they can get there, albeit accidentally in many cases, other pages can get there too.

If it really is an expert system, then the first thing to realise is that the system cannot deal with all searchterms, so targeting non-generalised and lesser searchterms, using the usual search engine optimization basics, will still work.

For more generalised searchterms, the page needs to be linked to by multiple expert pages that are unaffiliated with the page. By "unaffiliated" I mean that they must reside on servers whose IP addresses are in different C blocks from each other and from the target page, and their URLs must not use the same domain name as each other or as the target page. These expert pages can either be found and links requested, or they can be created.
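The "unaffiliated" test described above can be sketched in a few lines. This is a rough illustration under stated assumptions, not a description of how Google or Hilltop actually does it: each page is given as a hypothetical `(url, ipv4_address)` pair supplied by the caller, the C block is taken to be the first three octets of the address, and the domain name is naively taken to be the last two labels of the hostname (which is wrong for suffixes such as .co.uk).

```python
from urllib.parse import urlsplit


def c_block(ipv4):
    """First three octets of a dotted IPv4 address, e.g. '192.0.2'."""
    return ".".join(ipv4.split(".")[:3])


def domain_name(url):
    """Naive domain: the last two labels of the hostname (an assumption;
    multi-part suffixes like .co.uk would need a proper suffix list)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def are_affiliated(page_a, page_b):
    """Two (url, ip) pages are affiliated if they share a C block or a domain."""
    (url_a, ip_a), (url_b, ip_b) = page_a, page_b
    return c_block(ip_a) == c_block(ip_b) or domain_name(url_a) == domain_name(url_b)


def unaffiliated_experts(target, experts):
    """Keep experts that are unaffiliated with the target and with each
    other, taking them in the order given (a simple greedy choice)."""
    kept = []
    for expert in experts:
        if are_affiliated(expert, target):
            continue
        if any(are_affiliated(expert, other) for other in kept):
            continue
        kept.append(expert)
    return kept


# Hypothetical pages using reserved documentation addresses.
target = ("http://www.shop.example/page", "192.0.2.10")
experts = [
    ("http://dir.example.org/links", "198.51.100.7"),
    ("http://other.example.org/x", "203.0.113.5"),   # same domain as the first
    ("http://guide.example.net/", "192.0.2.99"),     # same C block as the target
    ("http://hub.example.com/", "203.0.113.77"),
]
for url, ip in unaffiliated_experts(target, experts):
    print(url, ip)
```

Under these assumptions only the first and last expert pages would count as independent links to the target.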

The Latest News
8th December 2003:
Since soon after the Florida update began, some pages that disappeared from the results have been returning. In some cases they are back at, or close to, the rankings that they had before Florida. In other cases they are quite highly ranked but lower than before. Day after day, more of them are returning.

I put this down to Google recognizing that Florida caused a sharp decline in the quality (relevancy) of the search results. It appears that they are adjusting the algorithm's parameters, trying to find a balance between the new page selection process and good relevancy. In doing so, some of the pages that were dumped out of the results are getting back into the results set, and they are achieving high rankings because they already matched the ranking algorithm quite well, so once they are back in the results set, they do well in the rankings.

Reminder: Don't forget that all this is just theory, but what we see happening does appear to fit an expert system, although there can be other explanations. We can be sure that Google compiles the results sets in different ways depending on the searchterm, and that the Florida results are not derived, via one or more filters, from the -nonsense results, but we can't yet be certain that an expert system is used to compile the Florida results set.

22nd December 2003:
Google has now dealt with the -nonsense search trick for seeing the non-Florida results, and it no longer works. That doesn't mean the non-Florida results are no longer different from the Florida results; it's just that we can no longer see them.

5th January 2004:
Dan Thies, of SEO Research Labs, came up with the interesting theory that the Florida changes are due to Google now using Topic Sensitive PageRank (TSPR). His PDF article can be found here. It's an interesting theory because, like the 'expert system' theory, it would cause Google to use two different algorithms depending on the searchterm used. To date, it's the only other theory that I believe has a chance of being right.

Editor's Note: Understanding how and why these changes were made at Google may help you to regain lost position, or gain a new and improved position. We'll be featuring more Google tech and tips here at XBiz over the coming weeks to help you out: Stay Tuned! ~ Stephen