Semiconductor sequencing reads DNA by directly detecting the hydrogen ions released as a polymerase incorporates nucleotides. This simple statement conceals interesting signal processing and inference challenges in turning voltage data from the chip into usable DNA bases. These problems can be divided into “hydrogen ion accounting”, in which we infer the mean number of hydrogen ions produced per molecule from the time-varying signal, and “phase correction”, in which we reconstruct the ensemble populations of polymerase locations on the copies of DNA and recover the sequence. Because the system operates without termination chemistry, each nucleotide flow extends through an entire run of identical bases, so we recover those runs as “flowgrams”, and optimal downstream analysis requires thinking in “flow space”.
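To make the idea of "flow space" concrete, the following is a minimal sketch of how a sequence maps to an idealized flowgram and back. It is an illustration only, not the base caller discussed in the talk: the cyclic flow order "TACG", the function names, and the nearest-integer rounding of signals are all assumptions introduced here, and the sketch ignores phasing and signal decay entirely.

```python
def bases_to_flowgram(seq, flow_order="TACG"):
    """Ideal flowgram: homopolymer run length consumed at each nucleotide flow.

    flow_order is an assumed cyclic order of nucleotide flows.
    """
    if any(base not in flow_order for base in seq):
        raise ValueError("sequence contains a base missing from flow_order")
    flowgram = []
    pos = 0   # next unread position in seq
    flow = 0  # index of the current flow
    while pos < len(seq):
        nuc = flow_order[flow % len(flow_order)]
        run = 0
        # Without terminators, one flow extends through the whole run.
        while pos < len(seq) and seq[pos] == nuc:
            run += 1
            pos += 1
        flowgram.append(run)
        flow += 1
    return flowgram


def flowgram_to_bases(flowgram, flow_order="TACG"):
    """Naive call: round each flow signal to the nearest run length."""
    bases = []
    for flow, signal in enumerate(flowgram):
        nuc = flow_order[flow % len(flow_order)]
        run = max(0, int(signal + 0.5))  # assumed simple rounding rule
        bases.append(nuc * run)
    return "".join(bases)


ideal = bases_to_flowgram("TTACCG")
noisy = [2.1, 0.9, 1.8, 1.2]  # made-up signals for illustration
called = flowgram_to_bases(noisy)
```

In this representation a homopolymer length error changes a single flow value rather than shifting every later base, which is one reason downstream analysis can be more natural in flow space.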
Back to Workshop I: Next-generation Sequencing Technology and Algorithms for Primary Data Analysis