Hold the Pickles, Hold the Lettuce

September 19th, 2007

Personalization has been around as long as, well, people. Each of us is different, so it’s no surprise that people and businesses strive to customize their messages and products to cater to individual preferences. Occasionally this process can take a while (Henry Ford wrote in his autobiography, “Any customer can have a car painted any colour that he wants so long as it is black”), but eventually everyone gets there. It is practically an economic imperative. The internet is, of course, no different. Despite the technology’s current limitations, the march towards greater degrees of personalization is inevitable, and welcome.

While as recently as 2003 Jupiter Research slammed personalization in a report entitled “Beyond the Personalization Myth,” Read/WriteWeb this month listed “Personalization” as one of the Top 10 Future Web Trends, and rightly so. The My.Yahoo! homepage has been hugely popular for many years and personalized home page sites, such as Pageflakes, have also received rave reviews. Book, movie and music recommendations, from sites such as Amazon and Netflix, personalized to individuals’ tastes, are not only well-received, but are also lucrative. And then, of course, there is search.

In their paper entitled “Beyond the Commons: Investigating the Value of Personalizing Web Search,” Teevan et al. make the case that not only is search personalization beneficial, but perhaps essential: “Web queries are very short, and it is unlikely that a two- or three-word query can unambiguously describe a user’s informational goal.” Not only that, but as the quantity of information indexed by the major search engines grows (Google announced 6 billion items in 2004, and today’s figure is probably much higher), it becomes increasingly difficult, if not impossible, to access all of that information with two- and three-word queries. Something additional is needed to help users locate the information they desire, and that requires personalization.

A June 3rd, 2007 New York Times article entitled “Google Keeps Tweaking Its Search Engine” talked about Google’s continuous efforts to improve the relevancy of its search results, and personalization technology (much of which was acquired from Kaltix in 2003) plays a big role: “Increasingly, Google is using signals that come from its history of what individual users have searched for in the past, in order to offer results that reflect each person’s interests. For example, a search for ‘dolphins’ will return different results for a user who is a Miami football fan than for a user who is a marine biologist.” This is likely why in February Google decided to turn on personalization by default for any new accounts.

Personalization holds a great deal of promise, but it has yet to fully deliver. Last month Read/WriteWeb conducted a poll in which the majority of respondents didn’t see any improvement in their personalized results on Google. Certainly the poll was less than scientific and hardly exhaustive, but current implementations of personalized search nevertheless suffer from a few drawbacks:

  1. Without any history, it is difficult to personalize. This would be the case for anyone who’s new to a search engine or who has erased his or her history. Importantly, however, it is also the case for anyone who might be researching a topic for the first time or is interested in discovering new results for a previously researched topic: a common occurrence since, after all, that’s why the user is searching. If there is no data in the user’s search history that’s related to the content of the investigation in progress, then there is not much current personalization technology can do.
  2. Interests can change significantly with time. One classic example is the purchase of a new car. The intent of a student who enters the query “ford cars” will change significantly after the student buys the car. It will change significantly again once the student is assigned to write a report on the history of the Ford Motor Company. Existing personalization technologies, even those that utilize decay and topic drift, have difficulty accounting for this.
  3. Incomplete and imperfect information significantly restricts customization. Primarily for the reason given above, the intent inferred from a user’s query history can be trusted only to a limited degree. What happens when the marine biologist wants to attend a Miami Dolphins game? It is for this reason that personalized search results can only be “nudged” slightly in one direction or the other. Typically this involves customizing only a few, lower-ranked items on the search result page.
  4. Finally, even when the results are personalized to the benefit of the user, it can be challenging to “perceive” any advantage. In addition to the limited degree to which the results are customized, unless the user is able to observe personalized and non-personalized results side by side in order to see the difference, it may be difficult to get enthusiastic about the technology.
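
To make the first three drawbacks concrete, here is a minimal sketch of history-based re-ranking with decay. It is an illustration only, not a description of how Google or any other engine actually works: the function names, the 30-day half-life, the topic labels, the base scores and the 0.1 nudge cap are all assumptions made up for the example.

```python
from collections import defaultdict

def interest_profile(history, half_life_days=30.0):
    """Build a topic profile from (topic, age_in_days) search history entries.

    Each past search contributes an exponentially decayed weight, so older
    searches count for less. The half-life is an assumed, illustrative value.
    """
    profile = defaultdict(float)
    for topic, age_days in history:
        profile[topic] += 0.5 ** (age_days / half_life_days)
    total = sum(profile.values())
    # Normalize so the weights sum to 1; an empty history gives an empty profile.
    return {t: w / total for t, w in profile.items()} if total else {}

def rerank(results, profile, nudge=0.1):
    """Re-rank (title, topic, base_score) results using the profile.

    A result's score is raised by at most `nudge`, so personalization can
    only shift results that were already close in rank.
    """
    scored = [(base + nudge * profile.get(topic, 0.0), title)
              for title, topic, base in results]
    # Sort by score, highest first; ties keep their original order.
    return [title for _, title in sorted(scored, key=lambda p: p[0], reverse=True)]

# Hypothetical results for the query "dolphins" with made-up base scores.
results = [
    ("Miami Dolphins official site", "football", 0.90),
    ("Dolphin (marine mammal)", "marine biology", 0.85),
    ("Dolphins tickets", "football", 0.80),
]

# A marine biologist: recent biology searches, one very old football search.
biologist = [("marine biology", 2), ("marine biology", 10), ("football", 400)]

personalized = rerank(results, interest_profile(biologist))
no_history = rerank(results, interest_profile([]))
```

With the biologist’s history, the marine mammal page moves ahead of the football site; with no history, the order is left exactly as it was, which is the first drawback. And when the same biologist wants tickets to a game, the profile still pushes the marine mammal page upward, which is why the nudge has to stay small.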

In spite of these limitations, personalization continues to be one of the most promising areas of web technology. The next generation of personalized search technology will address these issues by significantly customizing the user’s search experience based on inferred intent, without having to rely on potentially obsolete and deficient search histories. To that end, Surf Canyon is developing Discovery for Search technology to enable post-query disambiguation. Please stay tuned for upcoming announcements.

Tags: Discovery Personalization Research