A wide variety and large volume of protein-protein interaction data is now available in the form of protein interaction networks and protein complexes.
Valuable biological knowledge about protein functions, and about the mechanisms by which these functions are carried out, can be derived from this data.
However, the usefulness of this data is often undermined by substantial noise, which appears primarily as spurious interactions between proteins. In this talk, we describe how association analysis techniques developed in our group, primarily hypercliques and h-confidence, can be used to filter out these spurious interactions and extract more accurate functional information than the original data provides. These techniques extract tightly associated groups of items from binary data, a form in which several types of interaction data can be represented. In particular, we show how hypercliques can be used to extract accurate functional modules from protein complex data, and how the h-confidence measure can be used to generate a new version of a given interaction network that contains less noise and includes biologically viable interactions.
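As a rough illustration of the network-filtering idea, the sketch below scores protein pairs by h-confidence over a binary adjacency representation. It relies on the standard definition of h-confidence, the support of an itemset divided by the largest support of any single item in it, and treats each protein's neighbor list as one transaction. The function names `h_confidence` and `filter_network`, the toy network, and the 0.5 threshold are hypothetical choices made for this example and are not taken from the talk.

```python
from itertools import combinations


def h_confidence(items, transactions):
    """h-confidence of an itemset: joint support / largest single-item support."""
    items = set(items)
    joint = sum(1 for t in transactions if items <= t)
    max_single = max(sum(1 for t in transactions if i in t) for i in items)
    return joint / max_single if max_single else 0.0


def filter_network(neighbors, threshold):
    """Return protein pairs whose h-confidence meets the threshold.

    Each protein's neighbor set is one transaction, so a pair's joint
    support is its number of shared neighbors and a single protein's
    support is its degree (for a symmetric network).
    """
    transactions = [set(n) for n in neighbors.values()]
    kept = {}
    for p, q in combinations(sorted(neighbors), 2):
        score = h_confidence((p, q), transactions)
        if score >= threshold:
            kept[(p, q)] = score
    return kept


# Hypothetical toy network (illustrative only, not real interaction data).
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "E"},
    "D": {"A", "B"},
    "E": {"C"},
}
kept = filter_network(toy, threshold=0.5)  # threshold is an assumed value
```

In this toy case the pairs A-B and C-D pass the threshold. C-D is not an edge in the input, which shows how a pair with many shared neighbors can be proposed as a new interaction, while the weakly supported edge C-E is dropped. The same `h_confidence` function applies to larger itemsets: a hyperclique is a set of items whose h-confidence meets a chosen threshold, which is how groups of proteins could be scored when each protein complex is treated as a transaction.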