Conditional random fields

John Lafferty
Carnegie Mellon University
Computer Science

Many classification problems involve the annotation of data items having multiple components, with each component requiring a class
label. Such problems are challenging because the interaction between the components can be rich and complex. In text, speech, and image
processing, for example, it is often useful to label individual words, sounds, or image patches with categories to enable higher level
processing; but these labels can depend on one another in a highly complex manner. Other natural examples include gene prediction and
the annotation of proteins with labels representing the global geometric structure of the molecule. Such classification tasks violate
the assumption of independent and identically distributed instances that is made in the majority of classification procedures in
statistics and machine learning.

A conditional random field (CRF) is an undirected graphical model globally conditioned on a set of input covariates. Conditional random
fields can be effective for structured classification tasks; as conditional models of the labels given inputs, they relax the
independence assumptions made by generative models, such as hidden Markov models, that have been used more traditionally. They also
provide a framework for the natural integration of kernel methods and probabilistic graphical models. In this tutorial overview we discuss
the basic ideas and techniques behind CRFs, and some of the areas to which they have recently been applied.
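
To make the model concrete (the abstract itself does not state a formula, so what follows is the standard linear-chain formulation), a CRF defines the conditional distribution of a label sequence y = (y_1, ..., y_T) given an input sequence x as

p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y_{t-1}, y_t, x, t) \Big),

where the f_k are feature functions over adjacent labels and the full input, the \lambda_k are learned weights, and Z(x) = \sum_{y'} \exp\big( \sum_{t,k} \lambda_k f_k(y'_{t-1}, y'_t, x, t) \big) is a normalizing constant that depends on the input but not on the labels. Because the normalization is global and conditioned on x, the model avoids the per-state normalization used by generative models such as hidden Markov models.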


