Recent Results in Parsing

Eugene Charniak
Brown University
Computer Science

Parsing is the problem of mapping a string (in, say, English) to a phrase structure. It is important because it gives us a first rough cut at meaning. During the 1990s there was a flurry of new results using statistical techniques that gave us our first robust parsers ready for everyday use. While there have been continued results since then, the practical parsers at the start of 2005 were no better than what was available in 2000. The first part of the talk will recap this ancient history.


The last 12 months, however, have seen a dramatic turnaround, with error rates decreasing by 25%. The second and third parts of the talk describe the two techniques responsible for this state of affairs: discriminative reranking and self-training. We also show that the latest results seem to be less corpus-specific than the previous results. (That is, they carry over to text corpora reasonably different from those upon which they were trained.)

