Model engineering project logo

The University of Washington, along with collaborators from the Icahn Mount Sinai and University of Connecticut Schools of Medicine, has recently received a $6.5M grant from the National Institutes of Health to fund a Center for Reproducible Biomedical Modeling. Among the activities of the Center will be the development of technologies adapted from software engineering to improve the reproducibility and reuse of biomedical models.

Quantitative models are at the core of science and engineering. However, current practice in biomedical modeling has numerous shortcomings that greatly limit the value it provides to translational medicine and related areas. In particular, published models are: (a) difficult to reproduce, and so are not widely used or trusted; (b) hard to understand, especially when trying to ascertain the biology being modeled; and (c) difficult to embed, which makes it challenging to use existing models as building blocks for larger models (e.g., whole cell models).

Over the last 40 years, the software industry has developed a large inventory of tools and practices to address these issues, and in that time software has grown into a multi-trillion-dollar industry. This success has motivated a research initiative that we refer to as model engineering. The objective of model engineering is to improve the quantity and quality of biomedical models by using best practices and tools adapted from software engineering.

Model engineering is organized along the lines of software engineering, with separate phases for model requirements, design, and construction. Requirements involve the development of “use cases” that describe the information provided by a model. Design focuses on model reuse. Applicable techniques from software engineering include name scoping (e.g., by biological compartment to handle interactions between submodels) and code structuring such as modularization, object hierarchies, and aspect-oriented programming. Construction emphasizes: (a) domain-specific languages (DSLs), (b) dependency management that assists with handling embedded models, (c) testing to detect defects by running models, and (d) linters that perform static error checking.

Our initial effort is to develop a linter for biomedical models that involve reactions, such as kinetic and constraint models. A well-formed reaction should preserve mass balance: the sum of the masses of the reactants should equal the sum of the masses of the products. Mass balance is easy to check if reactants and products have annotations that provide machine-readable chemical formulas. Unfortunately, in current practice, it is unusual to have such detailed specifications. We are developing techniques that can check the consistency of mass balance relationships without knowledge of the chemical structures used in the reactions.

For further information about the Reproducibility Center, contact Herbert Sauro at the Department of Bioengineering at the University of Washington. A forthcoming F1000 paper provides details on Model Engineering; for additional information, contact Joseph Hellerstein at the eScience Institute.
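As an illustration of the simple case described above, where every species carries a machine-readable chemical formula, a mass balance check can be sketched in a few lines of Python. This is a minimal sketch and not the project's linter: the function names (parse_formula, side_total, mass_imbalance), the dictionary-based reaction representation, and the restriction to flat formulas without parentheses or charges are all assumptions made for the example. It compares element counts on each side, which implies equal total mass.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count elements in a flat formula such as 'C6H12O6'.

    Assumption for this sketch: no parentheses, hydrates, or charges.
    """
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = Counter()
    for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(digits) if digits else 1
    return counts


def side_total(side, formulas):
    """Sum element counts over one side of a reaction.

    `side` maps a species name to its stoichiometric coefficient;
    `formulas` maps a species name to its annotated formula.
    """
    total = Counter()
    for species, coefficient in side.items():
        for element, count in parse_formula(formulas[species]).items():
            total[element] += coefficient * count
    return total


def mass_imbalance(reactants, products, formulas):
    """Return {element: products minus reactants} for unbalanced elements.

    An empty dictionary means the reaction is balanced.
    """
    left = side_total(reactants, formulas)
    right = side_total(products, formulas)
    return {
        element: right[element] - left[element]
        for element in sorted(set(left) | set(right))
        if left[element] != right[element]
    }


# Hypothetical annotations; the species names are chosen for illustration only.
formulas = {"H2": "H2", "O2": "O2", "H2O": "H2O"}

print(mass_imbalance({"H2": 2, "O2": 1}, {"H2O": 2}, formulas))  # {}
print(mass_imbalance({"H2": 1, "O2": 1}, {"H2O": 1}, formulas))  # {'O': -1}
```

A linter built this way could report the returned dictionary as a diagnostic, naming the elements that are gained or lost. It does not cover the harder case mentioned above, in which formulas are unavailable and consistency must be inferred from the reactions alone.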