Web-scale distributed search

Michael Isard
Microsoft Research

A number of practical complications arise when scaling a search engine from a prototype running on tens of computers to a web-scale production system. These include fault tolerance, responsiveness, and serving results from a corpus while it is being continuously re-crawled. I will discuss some design hints and principles for large-scale search, based on publicly available information about leading search engines and on my own experience building large-scale distributed systems.


Workshop II: Numerical Tools and Fast Algorithms for Massive Data Mining, Search Engines and Applications