Bigfoot, Sasquatch, the Yeti and Other Missing Links: What We Don't Know About the AS Graph.

Matthew Roughan
University of Adelaide

Study of the Internet's high-level structure has for some time
intrigued scientists. The AS-graph (showing interconnections between
Autonomous Systems) has been measured, studied, modelled and
discussed in many papers over the last decade. However, the quality
of the measurement data has always been in question. It is by now
well known that most measurements of the AS-graph are missing some
set of links. Many efforts have been undertaken to correct this,
primarily by increasing the set of measurements, but the issue
remains: how much is enough? When will we know that we have enough
measurements to be sure we can see all (or almost all) of the
links. This paper aims to address the problem of estimating how many
links are missing from our measurements. We use techniques pioneered
in biostatistics and epidemiology for estimating the size of
populations (for instance of fish or disease carriers). It is rarely
possible to observe entire populations, and so sampling techniques
are used. We extend those techniques to the domain of the
AS-graph. The key difference between our work and the biological
literature is that all links are not the same, and so we build a
stratified model and specify an EM algorithm for estimating its
parameters. Our estimates suggest that a very significant number of
links (many of thousands) are missing from standard route monitor
measurements of the AS-graph. Finally, we use the model to derive
the number of monitors that would be needed to see a complete
AS-graph with high-probability. We estimate that 700 route monitors
would see 99.9\% of links.


Back to Long Programs