When you make a change to a complex piece of software, how do you know that your change has the intended effect? How do you detect unintended effects? How, in short, do you rectify problems without creating others?
We answer these general questions for the specific case of the structure prediction software Rosetta. Rosetta originated in the Baker Lab at the University of Washington and is under active, collaborative development by the RosettaCommons, a loose collection of 15 universities and labs. Like
most successful structure prediction tools, it is based on a collection of "knowledge-based potentials" --- energy functions, parameterized by fitting to sets of known structures, that try to capture different
aspects of molecular structure. As researchers add new potentials to
try to rectify problems they see in predictions or add new modules for new application areas, the software grows almost monotonically, with an
occasional upheaval for redesign or refactoring. Infrastructure for
version control and testing helps ensure that code continues to compile and execute, and performance benchmarks report if the running time suddenly increases or if new values are computed for test cases.
We provide a framework that supports the creation of scientific benchmarks based on databases of geometric models, drawn from known molecular structures or from structure prediction runs with varying parameters.
Module developers can write short R programs that query an SQL database of models and produce small-multiple plots (of distances, angles, packing, or any local measure of quality) for different atom or residue types, secondary structure classes, and so on. They can also test against thresholds to raise alerts when unexpected changes occur. This framework supports the development of new modules and records the developers' tests that make good scientific benchmarks for those modules.
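The threshold-alert idea can be sketched as follows. This is a minimal illustration, not the framework's actual code: the real system uses R programs against Rosetta's feature databases, while this toy uses Python with an in-memory SQLite database, and the table schema, feature name, and threshold are all hypothetical.

```python
# Sketch: query a database of per-residue geometric measurements and raise an
# alert when a summary statistic drifts past a threshold. Schema is invented.
import sqlite3

def check_feature(conn, feature, max_mean):
    """Return (mean value, alert flag) for one named geometric feature."""
    (mean_value,) = conn.execute(
        "SELECT AVG(value) FROM features WHERE name = ?", (feature,)
    ).fetchone()
    return mean_value, mean_value > max_mean

# Build a toy in-memory database of hydrogen-bond distance measurements.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (name TEXT, residue TEXT, value REAL)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [("hbond_dist", "ALA", 1.9),
     ("hbond_dist", "GLY", 2.0),
     ("hbond_dist", "SER", 2.1)],
)

mean, alert = check_feature(conn, "hbond_dist", max_mean=2.2)
print(f"mean hbond_dist = {mean:.2f}, alert = {alert}")
```

In the real framework the same query would be grouped by residue type or secondary structure class and rendered as small-multiple plots, with the threshold check deciding whether a code change warrants a closer look.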
This is the work of Matt O'Meara with the assistance of many members of the RosettaCommons.