In addition to being very attractive at the theoretical level, sparse signal modeling has been shown to lead to numerous state-of-the-art results in signal processing. The standard model assumes that a signal can be efficiently represented by a sparse linear combination of atoms from a given or learned dictionary. The selected atoms form what is usually referred to as the active set, whose cardinality is significantly smaller than the size of the dictionary. In recent years, it has been shown that adding structural constraints to this active set has value both at the level of representation robustness and at the level of signal interpretation (in particular where the active set indicates some physical properties of the signal). This leads to group or structured sparse coding, where instead of considering the atoms as singletons, the atoms are grouped, and a few groups are active at a time. An alternative way to add structure (and robustness) to the problem is to consider the simultaneous encoding of multiple signals, requesting that they all share the same active set. This is a natural collaborative filtering approach to sparse coding.
In this work we extend these models in a number of directions. First, we present a hierarchical sparse model, where not only are a few (sparse) groups of atoms active at a time, but each group also enjoys internal sparsity. At the conceptual level, this means that the signal is represented by a few groups (models), and inside each group only a few members are active at a time. A simple example of this is a piece of music, where only a few instruments are active at a time (each instrument is a group), and the actual music played by each instrument is efficiently represented by a few atoms of the sub-dictionary/group corresponding to it. Numerous applications in genomics exist as well. The proposed hierarchical sparse coding framework thereby makes it possible to efficiently perform source identification and separation, where the individual sources (models) that generated the signal are identified at the same time as their efficient representation (the sparse code inside the group) is reconstructed. An efficient optimization procedure is proposed to solve the resulting hierarchical sparse coding problem.
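To make the two levels of sparsity concrete, here is a minimal sketch in Python. It assumes a standard formulation from the sparse-group-lasso literature, namely minimizing 0.5*||x - D a||^2 + lam1*||a||_1 + lam2*sum_G ||a_G||_2 over non-overlapping groups G, solved by plain proximal gradient (ISTA). The abstract does not spell out the objective or the solver, so the function names, the parameters lam1 and lam2, and the choice of ISTA are all illustrative assumptions rather than the procedure presented in the talk.

```python
import numpy as np

def soft_threshold(a, t):
    """Element-wise soft thresholding: promotes sparsity inside groups."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def hierarchical_prox(a, groups, lam1, lam2):
    """Proximal operator of lam1*||a||_1 + lam2*sum_G ||a_G||_2.

    Assumes the groups (lists of atom indices) do not overlap. In that
    case the operator is element-wise soft thresholding followed by
    group-wise shrinkage of each group's l2 norm.
    """
    out = soft_threshold(a, lam1)
    for idx in groups:
        norm = np.linalg.norm(out[idx])
        # A group whose norm falls below lam2 is switched off entirely.
        scale = max(0.0, 1.0 - lam2 / norm) if norm > 0 else 0.0
        out[idx] = scale * out[idx]
    return out

def hilasso_ista(x, D, groups, lam1, lam2, n_iter=200):
    """Hypothetical ISTA solver for the assumed hierarchical objective."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = hierarchical_prox(a - step * grad, groups, step * lam1, step * lam2)
    return a
```

In the music example, each entry of groups would hold the atom indices of one instrument's sub-dictionary: the group shrinkage decides which instruments are present, and the element-wise thresholding selects the few atoms used within each active instrument.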
Then, we go a step beyond this. Imagine now that we have multiple recordings of the same two instruments, each time playing different songs. Then, if we collaboratively apply this new hierarchical sparse coding approach, we expect that the different recordings will share the same groups (since they are of the same instruments), but each will have its unique sparsity pattern inside the group (since each recording is a different melody). We propose a collaborative hierarchical sparse coding framework addressing exactly this, a powerful new framework for collaborative source separation. An efficient optimization procedure for this case is derived as well.
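The collaborative idea can be sketched in the same spirit. The snippet below assumes the codes of all signals are stacked as columns of a matrix A (atoms by signals), that the group penalty is the Frobenius norm of each group's rows taken across all signals, and that the in-group penalty is an element-wise l1 norm. This objective, the function name, and the parameters lam1 and lam2 are assumptions made for illustration; the abstract does not state the formulation used in the talk.

```python
import numpy as np

def collaborative_hierarchical_prox(A, groups, lam1, lam2):
    """Proximal operator of lam1*||A||_1 + lam2*sum_G ||A[G, :]||_F.

    A has one column of coefficients per signal. Groups are assumed not
    to overlap. The group decision is shared by all signals, while the
    sparsity pattern inside an active group may differ per signal.
    """
    # Per-coefficient soft thresholding (independent for every signal).
    out = np.sign(A) * np.maximum(np.abs(A) - lam1, 0.0)
    for idx in groups:
        # Frobenius norm of the group's block, pooled over all signals.
        norm = np.linalg.norm(out[idx, :])
        scale = max(0.0, 1.0 - lam2 / norm) if norm > 0 else 0.0
        out[idx, :] = scale * out[idx, :]
    return out

# Two recordings (columns), two instruments (groups of two atoms each).
A = np.array([[2.0, 0.0],
              [0.0, 1.5],
              [0.2, 0.1],
              [0.1, 0.2]])
groups = [[0, 1], [2, 3]]
codes = collaborative_hierarchical_prox(A, groups, lam1=0.3, lam2=0.5)
```

In this toy input, both columns keep only the first group, but the first recording uses atom 0 and the second uses atom 1: the same active instrument, with a different in-group pattern per recording.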
During the talk we introduce these new models and their corresponding optimization procedures, give theoretical bounds for them, present numerous examples illustrating them (in both audio and image processing), and outline possible directions of research opened by these new frameworks, including some theoretical ones.
We conclude the talk with a brief presentation of a different model of structured sparsity, tuned to image analysis, that relates sparse modeling to Gaussian Mixture Models and, via very simple and computationally efficient linear operations, achieves state-of-the-art performance in a number of image enhancement applications.
(joint work with P. Sprechmann, I. Ramirez, Y. Eldar, G. Yu, and S. Mallat)
[SRSE] Pablo Sprechmann, Ignacio Ramirez, Guillermo Sapiro, and Yonina Eldar, C-HiLasso: A Collaborative Hierarchical Sparse Modeling Framework, http://arxiv.org/abs/1006.1346, June 2010.
[YSM] Guoshen Yu, Guillermo Sapiro, and Stephane Mallat, Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity, http://arxiv.org/abs/1006.3056, June 2010.