Biochemical reaction networks represent complex cellular regulatory mechanisms. These networks are typically analyzed using discrete stochastic simulation models. The models may involve many reactions among a large number of chemical species, governed by highly uncertain parameters.
Given existing data pertaining to a biochemical reaction network, one is often interested in inferring the values of the model parameters that likely generated the data. The data may come from earlier simulations or from physical experiments. Approximate Bayesian Computation (ABC) is a well-established approach to such parameter inference problems: it uses simulation models as a tool to find the region of the parameter space corresponding to the least deviation from the given data.
The rejection sampling algorithm forms the basis of the ABC framework. Samples are drawn from a specified prior distribution and subsequently simulated. The simulated responses are compared to existing data by means of a distance function and appropriate summary statistics. Samples that result in distance function values below a specified tolerance threshold are accepted, and the rest are rejected. The sampling algorithm proceeds until the desired number of accepted samples has been obtained. The inferred parameters are then reported as the mean parameter values over the accepted samples.
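The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the group's implementation: the Poisson "simulator", the uniform prior bounds, the mean as summary statistic and the tolerance are all toy assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: "observed" data generated by a Poisson model with an
# unknown rate theta_true that we pretend not to know.
theta_true = 4.0
observed = rng.poisson(theta_true, size=50)

def summary(x):
    """Summary statistic; here simply the sample mean."""
    return x.mean()

def distance(s_sim, s_obs):
    """Distance between summary statistics."""
    return abs(s_sim - s_obs)

def abc_rejection(n_accept, tol, prior_low=0.0, prior_high=10.0):
    """Rejection-sampling ABC: draw from a uniform prior, simulate,
    and accept parameters whose simulated summaries lie within tol
    of the observed summary."""
    accepted = []
    s_obs = summary(observed)
    while len(accepted) < n_accept:
        theta = rng.uniform(prior_low, prior_high)    # draw from the prior
        simulated = rng.poisson(theta, size=50)       # run the simulator
        if distance(summary(simulated), s_obs) < tol: # compare to data
            accepted.append(theta)
    return np.array(accepted)

samples = abc_rejection(n_accept=200, tol=0.5)
# The inferred parameter is reported as the mean of the accepted samples.
print(samples.mean())
```

Note how the design choices discussed below (summary statistic, distance function, tolerance, prior) all appear explicitly as arguments or helper functions here; swapping any of them changes the quality of the inferred posterior.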
Design choices such as selection of distance functions, summary statistics and acquisition function for the inference process have a deep impact on the solution quality. Furthermore, increasing problem complexity often leads to impractically high inference times using rejection sampling.
Our research explores methods to accelerate high-quality parameter inference by leveraging state-of-the-art methods from the fields of computational biology, machine learning, optimization and statistics. Some of our active research topics include investigating intelligent construction of priors, methods for automated large-scale summary statistic selection, and training fast local and global approximations or surrogate models of computationally expensive simulators.
The HASTE project, an SSF-funded project on computational science and big data, takes a holistic approach to new, intelligent ways of processing and managing very large amounts of microscopy images, to leverage the imminent explosion of image data from modern experimental setups in the biosciences. One central idea is to represent datasets as intelligently formed and maintained information hierarchies, and to prioritize data acquisition and analysis for certain regions of the data based on automatically obtained metrics of usefulness and interestingness.
The project is a collaboration between the Wählby lab (PI), Hellander lab (co-PI), both at the Department of Information Technology, Uppsala University, the Spjuth lab (co-PI) at the Department of Pharmaceutical Biosciences, Uppsala University, the Nilsson lab at the Department of Biochemistry and Biophysics at Stockholm University and SciLifeLab, Vironova AB and AstraZeneca AB.
Read more on the project webpage.
Integrating data, models and algorithms on the one hand with the specification, coordination and execution of large-scale, data-intensive computational experiments on the other is a fundamental problem in all scientific disciplines that rely on modeling and simulation. Today it is largely left to the modeler or engineer to manually tune models to fit data, to choose algorithms, to configure simulation workflows and to analyze simulation results. This is a heavy burden to place on, for example, a biologist who is mainly interested in how she can use modeling and simulation to learn new things about a biological system of interest.

By utilizing machine learning and cloud computing, we are developing smart systems for scalable and efficient model exploration. An example workflow is shown in the image below: a high-dimensional parameter sweep application is augmented with automated feature extraction and clustering, followed by training a model for classification based on user-defined labels (such as interesting or non-interesting realizations). With this model, the smart sweep application learns to explore areas of interest in the parameter space more efficiently.
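The sweep-then-learn pipeline can be sketched end to end with plain NumPy. Everything here is hypothetical: the `simulate` function stands in for a real simulator, the three summary features, the two-cluster k-means and the nearest-centroid classifier are simplifications of the actual workflow.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta):
    """Hypothetical simulator: a noisy time series whose behavior
    depends on the parameter vector theta."""
    t = np.linspace(0.0, 10.0, 100)
    return theta[0] * np.sin(theta[1] * t) + 0.1 * rng.standard_normal(t.size)

def features(ts):
    """Automated feature extraction: simple descriptive statistics."""
    return np.array([ts.mean(), ts.std(), ts.max() - ts.min()])

# Parameter sweep (2-D here for illustration) and feature extraction.
thetas = rng.uniform([0.1, 0.1], [2.0, 3.0], size=(200, 2))
X = np.array([features(simulate(th)) for th in thetas])

# Cluster the realizations with a tiny k-means (k = 2).
k = 2
centroids = X[rng.choice(len(X), k, replace=False)]
for _ in range(20):
    labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
    centroids = np.array([X[labels == j].mean(0) if np.any(labels == j)
                          else centroids[j] for j in range(k)])

# After a user labels the clusters (e.g. "interesting" = large amplitude),
# a nearest-centroid classifier can steer the smart sweep toward one cluster.
def classify(theta):
    f = features(simulate(theta))
    return int(np.argmin(((f - centroids) ** 2).sum(-1)))
```

In the real system the classifier would be queried before running expensive simulations, so that compute time is concentrated on the regions the user has labeled as interesting.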
Open source computational science and engineering (CSE) software is an integral part of methodology-oriented computational research and a priority in the group. Due to the ongoing transformation of e-infrastructure to clouds, methods and workflows that promote horizontal scalability and elasticity for cloud applications are needed, and this may in many cases require re-thinking of how we best make use of computational resources. Other important questions include reproducibility and handling of large and complex data.
Selected recent publications:
- B. Drawert, A. Hellander, B. Bales, D. Banerjee, G. Bellesia, B.J. Daigle Jr., G. Douglas, M. Gu, A. Gupta, S. Hellander, C. Horuk, D. Nath, A. Takkar, S. Wu, P. Lötstedt, C. Krintz, L. R. Petzold (2016) Stochastic Simulation Service: Bridging the gap between the computational expert and the biologist, PLoS Comput. Biol. (to appear)
- B. Drawert, M. Trogdon, S. Toor, L. Petzold and A. Hellander (2016) MOLNs: A cloud appliance for interactive, reproducible and scalable spatial stochastic computational experiments, SIAM J. Sci. Comput. 38(3), C179–C202.
- J. H. Abel, B. Drawert, A. Hellander, and L. R. Petzold (2015). GillesPy: A Python package for stochastic model building and simulation, IEEE Life Sci. Lett. (to appear)
- C. Horuk, G. Douglas, A. Gupta, C. Krintz, B. Bales, G. Bellesia, B. Drawert, R. Wolski, L. Petzold, and A. Hellander, Automatic and Portable Cloud Deployment for Scientific Simulations, IEEE/ACM International Conference on High Performance Computing and Simulation, July 2014.
Life spans a vast range of sizes, from small organisms consisting of single cells to complex organisms built up of billions of cells. Even the single-cell organisms are challenging to fully understand and study: their function depends on a rich set of reaction networks. Important molecules inside a cell may exist in only a few copies, which makes them exceedingly difficult and costly to study.
The aim of our research is to develop algorithms and software that can assist in discoveries in basic science and medicine. We use mathematical models to describe how molecules move and interact inside cells, and then simulate these models to gain an understanding of how cells work. The multiscale nature of the problem is an interesting challenge. At the finest level we would consider single biomolecules and their exact molecular structure. There are models and methods for simulating systems at that level, but they are computationally expensive.
We couldn’t simulate the behavior of a large, complex system with such a method. Instead of considering the true structure of molecules, we could use a model that approximates them by spheres. At this level we can simulate medium-sized systems inside a cell on a time scale of seconds to minutes. An even more coarse-grained model doesn’t model individual molecules, but counts the number of molecules of different species in different parts of the domain. At this scale we can simulate bigger systems for hours.
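The coarse-grained, copy-number view described above is what stochastic simulation algorithms such as Gillespie's SSA operate on. As a minimal sketch (the birth-death model and its rate constants are illustrative choices, not taken from the group's software):

```python
import numpy as np

rng = np.random.default_rng(0)

def gillespie_birth_death(k_birth=10.0, k_death=0.1, x0=0, t_end=50.0):
    """Gillespie SSA for a birth-death process: tracks only the copy
    number of one species, not individual molecules or positions."""
    t, x = 0.0, x0
    times, counts = [t], [x]
    while t < t_end:
        a_birth = k_birth         # propensity of the birth reaction
        a_death = k_death * x     # propensity of the death reaction
        a_total = a_birth + a_death
        t += rng.exponential(1.0 / a_total)   # time to the next reaction
        if rng.random() * a_total < a_birth:  # pick which reaction fires
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return np.array(times), np.array(counts)

times, counts = gillespie_birth_death()
# The copy number fluctuates around the steady-state mean
# k_birth / k_death = 100.
```

Because each step advances the system by one reaction event, the cost depends on the number of events rather than the number of molecules tracked in space, which is why this level of description scales to much larger systems and longer time horizons than particle-based models.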
We have developed methods with the aim of coupling accurate fine-grained methods with less computationally expensive coarse-grained methods. In doing so, we obtain methods that are more accurate than the coarse-grained method, but still more efficient than the fine-grained method. These methods are called multiscale methods. By adding scales to our simulations, both more accurate models that incorporate some of the many complex internal structures vital to the function of the cell, and more coarse-grained models, we attempt to move beyond the boundaries of what is currently possible to simulate with state-of-the-art methods.
- S. Hellander, A. Hellander, and L. Petzold (2017) Mesoscopic-microscopic spatial stochastic simulation with automatic system partitioning, Submitted.
- E. Blanc, S. Engblom, A. Hellander and P. Lötstedt (2016) Mesoscopic modeling of stochastic reaction-diffusion kinetics in the subdiffusive regime, Multiscale Model. Simul., 14(2), 668–707.
- L. Meinecke, S. Engblom, A. Hellander, P. Lötstedt (2016) Analysis and design of jump coefficients in discrete stochastic diffusion models, SIAM J. Sci. Comput. 38(1), A55–A83.
- M. Lawson, L. Petzold and A. Hellander (2015) Accuracy of the Michaelis-Menten approximation when analyzing effects of molecular noise, J. Roy. Soc. Interface, 12(106), 2015.
- S. Hellander, L. Petzold and A. Hellander (2015), Reaction rates for mesoscopic reaction-diffusion kinetics, Phys. Rev. E., 92(2), 023312.