In this talk I will describe our recent comparative study of methods for predicting gene function from heterogeneous data. All the methods we consider are based on a common ensemble of individual discriminative classifiers, where a separate classifier is built for each GO term and each data type. Our work focuses on calibrating these classifiers and on combining their predictions to obtain a set of probabilistic predictions that is consistent with the topology of the ontology. We call this procedure "reconciliation." We consider three heuristic methods, four variants of a Bayesian network, and an extension of logistic regression to the structured case, and we propose two projection methods: isotonic regression and a Kullback-Leibler projection. Our results indicate that, except for terms with few annotations, enforcing consistency is typically detrimental to precision: no reconciliation method performs systematically better in terms of precision than independent logistic regression predictions, which are a priori inconsistent. We nevertheless suggest isotonic regression as a reasonable method: it rarely performs worse than logistic regression, and it performs significantly better than the other methods in several modes of evaluation.
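
To make the idea of reconciliation by projection concrete, the following is a minimal sketch of isotonic regression on a small term hierarchy. It assumes that consistency means a child term's probability may not exceed that of any of its parents, and it finds the closest consistent set of probabilities in the least-squares sense. The solver used here (Dykstra's alternating projections), the function name `reconcile_isotonic`, and the toy term names and probabilities are all assumptions made for illustration, not the implementation or data used in the study.

```python
def reconcile_isotonic(probs, edges, n_iter=1000, tol=1e-10):
    """Least-squares projection of `probs` onto {p[child] <= p[parent]}.

    probs: dict mapping term -> independent (possibly inconsistent) probability
    edges: list of (parent, child) pairs describing the hierarchy
    Uses Dykstra's alternating projections, one half-space per edge.
    This is an illustrative solver choice, not the one from the study.
    """
    x = dict(probs)
    # One correction scalar per edge (Dykstra's increment for that half-space).
    corr = {edge: 0.0 for edge in edges}
    for _ in range(n_iter):
        max_change = 0.0
        for parent, child in edges:
            lam = corr[(parent, child)]
            # Add back the previous correction before re-projecting.
            z_child = x[child] + lam
            z_parent = x[parent] - lam
            gap = z_child - z_parent
            # If the constraint is violated, both values move to their mean.
            new_lam = max(gap / 2.0, 0.0)
            new_child = z_child - new_lam
            new_parent = z_parent + new_lam
            max_change = max(max_change,
                             abs(new_child - x[child]),
                             abs(new_parent - x[parent]))
            x[child], x[parent] = new_child, new_parent
            corr[(parent, child)] = new_lam
        if max_change < tol:
            break
    return x


# Hypothetical toy hierarchy: "root" is the parent of "A" and "B".
toy_edges = [("root", "A"), ("root", "B")]
toy_probs = {"root": 0.6, "A": 0.8, "B": 0.3}  # "A" exceeds its parent
reconciled = reconcile_isotonic(toy_probs, toy_edges)
print(reconciled)
```

In this toy case the violating pair ("root", "A") is pooled to a common value while "B", which already satisfies its constraint, is left untouched. Because pooled values are averages of the inputs, probabilities that start in [0, 1] stay in [0, 1] without an explicit box constraint.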