Anomaly and similarity detection in multidimensional series have a long
history and have found practical use in many fields, such as medicine,
networks, and finance. Anomaly detection appeals to many disciplines:
mathematicians seek a unified formulation based on probability, statisticians
seek error-bound estimates, and computer scientists try to design fast
algorithms, to name just a few.
The correlation of the result lists provided by search engines is of
fundamental interest and has deep, multidisciplinary ramifications. Here, we
present automatic, unsupervised methods to assess whether search engines
provide results that are comparable or correlated. We make two main
contributions:
First, we provide evidence that for more than 80% of the input queries,
regardless of their frequency, the two major search engines share three or
fewer URLs in their search results, which points to an increasing divergence
between them.