Modeling of microsatellite genotyping data and allele calling

Lei Li
Florida State University

Large-scale genotyping is a key practice in current genomic research. Dense, highly polymorphic, and uniformly distributed genetic markers are crucial for capturing the crossovers that occur at each meiosis. This information is essential for constructing the human genomic map, on which disease genes are to be located. The density of the markers and their degree of polymorphism determine the resolution of the map, and we also want that resolution to be roughly uniform across the genome. Microsatellite interspersed repeats such as $(CA)_n$ belong to this class of genetic markers. On the practical side, large-scale genotyping needs machinery and techniques that are economical and fast. Two techniques form the basis of microsatellite genotyping: polymerase chain reaction (PCR) for amplification and electrophoresis for fragment-length sizing. Allele calling concerns collecting, processing, and analyzing microsatellite genotyping data. In this talk, we provide a mathematical account of the microsatellite genotyping procedure. That is, we model each step of the process explicitly using the available knowledge. Doing so separates the different complications from one another and sharpens the focus of the technical efforts to be made. Consequently, algorithms that are optimal with respect to relevant criteria can be proposed on the basis of these models. More importantly, the models provide a platform for further criticism and improvement at the modeling level.