We present a statistical and computational approach to modeling DNA methylation sequencing data using the Ising model of statistical physics. We introduce the model and discuss a parametric form that accounts for underlying genomic features and known properties of the methylation system. We then discuss solutions to the statistical and computational problems that arise when modeling whole-genome bisulfite sequencing (WGBS) data: computing the partition function of the Ising model, handling the highly censored nature of sequencing data during statistical estimation, and computing the probability distribution of the methylation level. Finally, we compare the efficacy of our approach with the empirical and marginal modeling methods currently employed by state-of-the-art WGBS analysis tools, and we discuss a number of new biological results obtained using this approach.
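To make the partition-function step concrete, the following is a minimal sketch and not the method described in this work. It assumes a one-dimensional Ising chain with free ends, spins x_n in {-1, +1}, and unnormalized log-weight sum_n a_n x_n + sum_n c_n x_n x_{n+1}. The names `a`, `c`, `log_partition`, and `log_partition_brute` are hypothetical and chosen only for illustration. Under these assumptions, log Z can be obtained with a forward recursion in time linear in the chain length, and checked against brute-force enumeration on a short chain.

```python
import itertools

import numpy as np

SPINS = np.array([-1.0, 1.0])  # assumed spin encoding: -1 and +1


def log_partition(a, c):
    """log Z for an assumed 1D Ising chain with free boundary conditions.

    a: per-site field parameters, length N (hypothetical name)
    c: nearest-neighbour coupling parameters, length N - 1 (hypothetical name)
    """
    a = np.asarray(a, dtype=float)
    c = np.asarray(c, dtype=float)
    # log_msg[j] = log of the summed weights of all prefixes ending in SPINS[j]
    log_msg = a[0] * SPINS
    for n in range(1, len(a)):
        # pair[i, j] = coupling term between previous spin i and current spin j
        pair = c[n - 1] * np.outer(SPINS, SPINS)
        # sum over the previous spin (axis 0) in log space for numerical stability
        log_msg = a[n] * SPINS + np.logaddexp.reduce(log_msg[:, None] + pair, axis=0)
    return float(np.logaddexp.reduce(log_msg))


def log_partition_brute(a, c):
    """Reference value by enumerating all 2^N configurations (small N only)."""
    a = np.asarray(a, dtype=float)
    c = np.asarray(c, dtype=float)
    total = -np.inf
    for config in itertools.product([-1.0, 1.0], repeat=len(a)):
        x = np.array(config)
        log_weight = np.dot(a, x) + np.dot(c, x[:-1] * x[1:])
        total = np.logaddexp(total, log_weight)
    return float(total)


if __name__ == "__main__":
    a = [0.3, -0.2, 0.5, 0.1]
    c = [0.4, -0.1, 0.7]
    print(log_partition(a, c), log_partition_brute(a, c))
```

The recursion costs O(N) operations, whereas enumeration costs O(2^N), which is why a recursive scheme of this kind is a natural choice when the chain spans many sites.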