Suppose we have a Gaussian graphical model with sample observations of only a subset of the variables. Can we separate the extra correlations induced by marginalization over the unobserved (hidden) variables from the
structure among the observed variables? In other words, is it still possible to consistently perform model selection despite the latent
variables? As we shall see, the key issue here is to decompose the concentration matrix of the observed variables into a sparse matrix
(representing graphical model structure among the observed variables) and a low-rank matrix (representing the effects of marginalization over the hidden variables). The resulting estimator is given by a tractable convex program, and it
consistently estimates model structure in the high-dimensional regime in which the numbers of observed and hidden variables grow with the number of
samples of the observed variables. In our analysis the algebraic varieties of sparse matrices and low-rank matrices play an important role. Joint work with Venkat Chandrasekaran and Alan Willsky (MIT).
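
The sparse-plus-low-rank structure mentioned above can be seen from the standard Schur-complement identity for Gaussian marginals. The following is only a sketch; the notation is assumed here and does not come from the abstract: O and H index the observed and hidden variables, K is the joint concentration matrix, and \Sigma_O is the marginal covariance of the observed variables.

```latex
% Assumed notation: K is the joint concentration (inverse covariance) matrix,
% partitioned into observed (O) and hidden (H) blocks.
K = \begin{pmatrix} K_O & K_{O,H} \\ K_{H,O} & K_H \end{pmatrix},
\qquad
% Concentration matrix of the observed variables after marginalizing out H:
(\Sigma_O)^{-1} = K_O - K_{O,H} \, K_H^{-1} \, K_{H,O}.
```

The first term K_O is sparse when the conditional graph among the observed variables is sparse, and the second term has rank at most the number of hidden variables, so it is low-rank when there are few of them.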