Algorithms for the Multidimensional Analysis of Multidimensional LC-Mass Spectrometry Data

Benno Schwikowski
Institute for Systems Biology
Computational Biology

Abstract: The analysis of complex protein mixtures by liquid chromatography (LC) followed by mass spectrometry (LC-MS or LC-MS/MS) is one of the key technologies for the systematic, large-scale exploration of cellular processes. Mass spectrometry itself is exquisitely sensitive, and the reproducibility of experiments at the signal level is high. It is generally suspected, however, that a significant number of peptides, especially those with modifications, are missed in current computational analyses. I present an approach that globally integrates all data acquired in the course of one experiment. Instead of attempting to detect the presence of a protein or its fragments from individual signals (peaks) in a single mass spectrum, all data acquired across a whole experiment are first aligned into an n-dimensional space, where n is the number of dimensions used for the LC separation. This brings together all peaks that were generated by the same protein fragment throughout the experiment; taken together, these peaks constitute a much stronger signal than any single peak. I will present algorithms that implement this approach for n = 2 dimensions and demonstrate, on a large data set from yeast whole-cell lysate, that the peaks of independently identified peptides are indeed integrated as intended. In addition, many spectra that remain unidentified but are arguably peptide-generated suggest that typical experiments contain a large number of yet-unidentified peptides.
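
The pooling idea described in the abstract can be illustrated with a minimal sketch. It is not the algorithm from the talk, which the abstract does not specify. It assumes that peaks have already been aligned to common coordinates, that each peak is a hypothetical tuple (lc1, lc2, mz, intensity) for n = 2 LC dimensions, and that peaks from the same peptide fall into the same cell of a fixed grid. The function name pool_peaks and the tolerance parameters lc_tol and mz_tol are illustrative assumptions.

```python
from collections import defaultdict


def pool_peaks(peaks, lc_tol=(1.0, 0.5), mz_tol=0.5):
    """Pool aligned peaks that share a grid cell in (lc1, lc2, m/z).

    peaks  : iterable of (lc1, lc2, mz, intensity) tuples (assumed layout)
    lc_tol : cell width in each of the two LC dimensions (assumed values)
    mz_tol : cell width along m/z (assumed value)

    Returns a dict mapping each occupied cell to
    (number of peaks in the cell, summed intensity).
    """
    cells = defaultdict(list)
    for lc1, lc2, mz, intensity in peaks:
        # Integer cell index along each axis; peaks in the same cell
        # are treated as coming from the same peptide.
        key = (
            int(lc1 // lc_tol[0]),
            int(lc2 // lc_tol[1]),
            int(mz // mz_tol),
        )
        cells[key].append(intensity)
    return {key: (len(vals), sum(vals)) for key, vals in cells.items()}


if __name__ == "__main__":
    # Made-up illustrative peaks, not data from the experiment.
    example = [
        (3.0, 10.2, 500.3, 100.0),
        (3.0, 10.4, 500.4, 150.0),
        (7.0, 22.1, 812.6, 80.0),
    ]
    for cell, (count, total) in sorted(pool_peaks(example).items()):
        print(cell, count, total)
```

A fixed grid is the simplest possible choice and splits groups of peaks that straddle a cell boundary; a clustering step with explicit distance tolerances would avoid this at the cost of more code.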


Back to Workshop I: High Throughput Technologies and Methods of Analysis