The two most recent dominant paradigms for computational imaging are compressed sensing and deep learning. Compressed sensing leverages convex optimization and comes with strong theoretical guarantees, but such guarantees typically require linear measurements, and multidimensional images demand substantial memory footprints. In contrast, deep learning methods can incorporate nonlinear measurements, arbitrarily complex forward models, a variety of side information, and vast training data sets. The downside of this flexibility is a lack of theoretical grounding and robustness.
This talk investigates whether we can split the difference: what sorts of nonlinearities can enhance the power of compressed sensing while still yielding tractable inverse problems amenable to theoretical explanation? As an extended example, I will discuss the Plenoxels system for photorealistic view synthesis. Whereas prior art relied on neural networks for such view synthesis, Plenoxels represents a scene as a sparse 3D grid with spherical harmonics. This representation can be optimized from calibrated images via gradient methods and convex regularization. On standard benchmark tasks, Plenoxels can be computed more than 100 times faster than Neural Radiance Fields with no loss in visual quality.
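To make the grid-plus-spherical-harmonics idea concrete, here is a minimal, hypothetical sketch of the core operation such a system optimizes: volume rendering along a single ray, where each sample carries a density and a color obtained from spherical-harmonic coefficients. This is a simplified illustration, not the Plenoxels implementation; all names and shapes are assumptions, and only the degree-0 SH band is used for brevity.

```python
import numpy as np

SH_C0 = 0.28209479177387814  # constant basis value of the l=0 spherical harmonic

def render_ray(sigmas, sh_coeffs, deltas):
    """Composite colors along one ray via the standard volume-rendering equation.

    sigmas:    (N,) nonnegative densities at the N ray samples
    sh_coeffs: (N, 3) degree-0 SH coefficients, one per RGB channel
    deltas:    (N,) distances between consecutive samples
    Returns the accumulated (3,) RGB color of the ray.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)        # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas]))[:-1]
    weights = trans * alphas                       # compositing weights
    colors = SH_C0 * sh_coeffs                     # evaluate the l=0 SH band
    return weights @ colors

# Example: three samples along a ray; only the middle one is dense, and red.
sigmas = np.array([0.0, 50.0, 0.0])
sh = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
deltas = np.full(3, 0.1)
rgb = render_ray(sigmas, sh, deltas)  # dominated by the dense red sample
```

Because this rendering is differentiable in the densities and SH coefficients, the grid can be fit to calibrated images by gradient descent, with convex regularizers (e.g. total variation on the grid) as described in the talk.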
Back to Workshop IV: Multi-Modal Imaging with Deep Learning and Modeling