Matrix Factorization with Missing Data

Andrew Fitzgibbon
Microsoft Research

Work done with Aeron Buchanan, Oxford University.

Matrix factorization underpins numerous applications in computer vision, machine learning, and information retrieval. In many instances, this factorization must be performed despite missing data, a problem which has seen significant research attention. Two classes of approach dominate the literature: EM-like alternation of closed-form solutions for the two factors of the matrix; and non-linear optimization of both factors simultaneously. As has been argued previously, nonlinear optimization based on a second-order approximation of the error function should be the faster approach. However alternation schemes remain enduringly popular.

I will show through theoretical and experimental arguments why this dicohotomy remains: it’s not just that lazy programmers don’t bother to implement second order methods, but that alternation schemes often do well on "easy" problems, and that many problems are "easy". Secondly, one must take care that the second-order methods do not throw away the nice properties of the alternation schemes. We introduce a class of hybrid algorithms which combine elements of both strategies, and show that these algorithms reach the correct minimum more frequently and at reasonable speed.

Finally, I show that for some well-known structure-from-motion data, the global optimum of the factorization problem does not, in fact, yield a realistic reconstruction. To achieve a realistic result, Bayesian priors and robust error kernels must be introduced. This is pretty obvious; what is less obvious is that this further cements the advantage of the direct optimization approaches.

Audio (MP3 File, Podcast Ready) Presentation (PowerPoint File)

Back to Workshops II: Numerical Tools and Fast Algorithms for Massive Data Mining, Search Engines and Applications