Nested Dictionary Learning for Hierarchical Organization of Imagery and Text

Lawrence Carin
Duke University
Electrical and Computer Engineering

A tree-based dictionary learning model is developed for joint analysis of imagery and associated text. The dictionary learning may be applied directly to image patches, or to feature vectors extracted from patches or superpixels, using any type of image features. Each image is associated with a tree branch, and each of the image's patches is associated with one node on that branch. Nodes near the tree root are shared among multiple branches and represent image characteristics that are common to different types of images. Moving toward the leaves, nodes become more specialized and represent details specific to particular image classes. If words (text) are available, they are modeled jointly with the imagery through a node-dependent probability distribution over words, thereby relating image patches to words. The tree structure is inferred via a nested Dirichlet process, and a retrospective stick-breaking sampler is used to infer the tree depth and width automatically.
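
To make the stick-breaking idea concrete, the following is a minimal Python sketch of drawing image-to-branch assignments from a nested stick-breaking prior, with child nodes created lazily, in the spirit of a retrospective sampler. It is not the sampler from the talk: it draws from the prior only (no posterior inference, no dictionary or word model), it fixes the tree depth rather than inferring it, and the names Node, draw_child, draw_path and the concentration parameter gamma are hypothetical and chosen for illustration.

```python
import random


class Node:
    """One tree node; child sticks are created lazily, only when a draw needs them."""

    def __init__(self, depth):
        self.depth = depth
        self.sticks = []      # weights of the children instantiated so far
        self.children = []    # child nodes, aligned with self.sticks
        self.remaining = 1.0  # stick mass that has not been broken off yet

    def draw_child(self, gamma, rng):
        """Pick a child by comparing a uniform draw against the cumulative
        stick weights, breaking new sticks only when the draw reaches past
        the ones that already exist."""
        u = rng.random()
        cum = 0.0
        k = 0
        while True:
            if k == len(self.sticks):
                # Break a new piece off the remaining stick: v ~ Beta(1, gamma).
                v = rng.betavariate(1.0, gamma)
                self.sticks.append(v * self.remaining)
                self.remaining *= 1.0 - v
                self.children.append(Node(self.depth + 1))
            cum += self.sticks[k]
            # The second condition guards against floating-point round-off.
            if u < cum or self.remaining < 1e-12:
                return k, self.children[k]
            k += 1


def draw_path(root, depth, gamma, rng):
    """Draw one root-to-depth path, returned as a list of child indices.
    The fixed depth is an assumption of this sketch."""
    path = []
    node = root
    for _ in range(depth):
        k, node = node.draw_child(gamma, rng)
        path.append(k)
    return path


rng = random.Random(0)
root = Node(depth=0)
# One path (branch) per image; images whose paths share a prefix share those nodes.
paths = [draw_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(50)]
```

Because children are instantiated only when a draw reaches them, the width at each node is determined by the draws themselves rather than by a preset truncation level. Smaller values of gamma concentrate mass on the first few children, so more images share nodes; larger values spread images over more branches.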


Back to Advances in Scientific Computing, Imaging Science and Optimization: Stan Osher's 70th Birthday Conference