Modeling semantic overlap for applications

Bill Dolan
Microsoft Research

The last few years have seen increased interest in measuring the semantic overlap between two segments of text. Work on paraphrase recognition, for instance, attempts to identify when two sentences “mean the same thing” at some abstract level, despite superficial differences:



On its way to an extended mission at Saturn, the Cassini probe on Friday makes its closest rendezvous with Saturn's dark moon Phoebe.

The Cassini spacecraft, which is en route to Saturn, is about to make a close pass of the ringed planet's mysterious moon Phoebe.



Modeling this sort of semantic overlap poses major challenges, as it encompasses issues of lexical choice, syntactic alternation, and reference/discourse structure. The assumption driving work in this area, though, is that reliable metrics for semantic overlap will play a crucial role in building applications that appear to “understand” natural language. Problems as diverse as question answering, multi-document summarization, and text editing can all benefit from advances in this area.



This talk will focus on the role of semantic overlap metrics in real applications such as search, where “deep” natural language processing techniques have so far had disappointingly little impact.


Workshop I: Dynamic Searches and Knowledge Building