The HASTE project, an SSF-funded project on computational science and big data, takes a holistic approach to new, intelligent ways of processing and managing very large numbers of microscopy images to leverage the imminent explosion of image data from modern experimental setups in the biosciences. One central idea is to represent datasets as intelligently formed and maintained information hierarchies, and to prioritize data acquisition and analysis for certain regions or sections of the data based on automatically obtained metrics for usefulness and interestingness.
The project is a collaboration between the Wählby lab (PI), Hellander lab (co-PI), both at the Department of Information Technology, Uppsala University, the Spjuth lab (co-PI) at the Department of Pharmaceutical Biosciences, Uppsala University, the Nilsson lab at the Department of Biochemistry and Biophysics at Stockholm University and SciLifeLab, Vironova AB and AstraZeneca AB.
Read more on the project webpage.
Integrating data, models and algorithms with the specification, coordination and execution of large-scale, data-intensive computational experiments is a fundamental problem in all scientific disciplines that rely on modeling and simulation. Today it is largely left to the modeler or engineer to manually tune models to fit data, choose algorithms, configure simulation workflows and analyze simulation results. This places a heavy burden on, for example, a biologist who is mainly interested in how she can use modeling and simulation to learn new things about a biological system of interest. By utilizing machine learning and cloud computing, we are developing smart systems for scalable and efficient model exploration. An example of a workflow is shown in the image below, where a high-dimensional parameter sweep application is augmented with automated feature extraction and clustering, followed by training a classification model based on user-defined labels (such as interesting or non-interesting realizations). With this model, the smart sweep application learns to explore interesting regions of the parameter space more efficiently.
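The workflow described above can be illustrated with a minimal sketch. This is not the group's implementation: the toy simulator, the two summary features, the `user_label` function (a stand-in for a human labeling realizations) and the k-nearest-neighbour classifier are all assumptions chosen for brevity, and the clustering step is omitted. A coarse sweep is simulated and labeled in full, and the classifier then decides which new parameter points are worth simulating.

```python
import math
import random


def simulate(params, rng, n_steps=200, dt=0.05):
    """Toy stand-in for an expensive simulation: a noisy damped oscillation."""
    freq, damping = params
    return [
        math.exp(-damping * i * dt) * math.sin(freq * i * dt) + rng.gauss(0.0, 0.02)
        for i in range(n_steps)
    ]


def extract_features(series):
    """Reduce a time series to (variance, fraction of sign changes)."""
    mean = sum(series) / len(series)
    variance = sum((x - mean) ** 2 for x in series) / len(series)
    sign_changes = sum(1 for a, b in zip(series, series[1:]) if a * b < 0)
    return (variance, sign_changes / len(series))


def user_label(features):
    """Hypothetical stand-in for a user marking a realization as
    interesting (1) or non-interesting (0)."""
    return 1 if features[0] > 0.1 else 0


def knn_predict(train, query, k=3):
    """Majority vote among the k labeled points nearest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def smart_sweep(n_initial=30, n_candidates=200, seed=0):
    rng = random.Random(seed)

    def draw():
        # Assumed parameter ranges: frequency and damping of the toy model.
        return (rng.uniform(0.5, 10.0), rng.uniform(0.0, 2.0))

    # Stage 1: coarse sweep in which every point is simulated and labeled.
    labeled = []
    for _ in range(n_initial):
        p = draw()
        labeled.append((p, user_label(extract_features(simulate(p, rng)))))

    # Stage 2: keep only the candidates the classifier predicts to be
    # interesting, so simulation effort is spent in those regions.
    candidates = [draw() for _ in range(n_candidates)]
    selected = [p for p in candidates if knn_predict(labeled, p) == 1]
    return labeled, candidates, selected


if __name__ == "__main__":
    labeled, candidates, selected = smart_sweep()
    print(f"{len(selected)} of {len(candidates)} candidates selected for simulation")
```

In a real system the labels would come from user interaction, and a more capable classifier would replace the nearest-neighbour vote; the sketch only shows how the pieces connect.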
Open source computational science and engineering (CSE) software is an integral part of methodology-oriented computational research and a priority in the group. Due to the ongoing transformation of e-infrastructure to clouds, methods and workflows that promote horizontal scalability and elasticity for cloud applications are needed, and this may in many cases require re-thinking of how we best make use of computational resources. Other important questions include reproducibility and handling of large and complex data.
Selected recent publications:
- B. Drawert, A. Hellander, B. Bales, D. Banerjee, G. Bellesia, B.J. Daigle, Jr., G. Douglas, M. Gu, A. Gupta, S. Hellander, C. Horuk, D. Nath, A. Takkar, S. Wu, P. Lötstedt, C. Krintz, L. R. Petzold (2016) Stochastic Simulation Service: Bridging the gap between the computational expert and the biologist, PLoS Comput. Biol. (to appear)
- B. Drawert, M. Trogdon, S. Toor, L. Petzold and A. Hellander (2016) MOLNs: A cloud appliance for interactive, reproducible and scalable spatial stochastic computational experiments, SIAM J. Sci. Comput. 38(3), C179–C202.
- J. H. Abel, B. Drawert, A. Hellander and L. R. Petzold (2015) GillesPy: A Python package for stochastic model building and simulation, IEEE LSL (to appear)
- C. Horuk, G. Douglas, A. Gupta, C. Krintz, B. Bales, G. Bellesia, B. Drawert, R. Wolski, L. Petzold and A. Hellander (2014) Automatic and Portable Cloud Deployment for Scientific Simulations, IEEE/ACM International Conference on High Performance Computing and Simulation, July 2014.
A theme in the last decade of computational systems biology has been how molecular noise is a factor that needs to be accounted for, both to understand how gene regulatory networks are able to operate robustly in a noisy molecular environment and to explain phenotypic variability on both the individual cell and population levels. A particularly intriguing question is how the interplay between spatial and temporal aspects of intracellular signaling is organized. Numerically, efficient spatial stochastic methods are needed to study this, but they become much more computationally demanding, largely due to the multiscale nature of the pathways and processes. A central area in the group is the development of hybrid simulation methods for stochastic reaction-diffusion processes.
Selected recent publications:
- E. Blanc, S. Engblom, A. Hellander and P. Lötstedt (2016) Mesoscopic modeling of stochastic reaction-diffusion kinetics in the subdiffusive regime, Multiscale Model. Simul., 14(2), 668–707.
- L. Meinecke, S. Engblom, A. Hellander, P. Lötstedt (2016) Analysis and design of jump coefficients in discrete stochastic diffusion models, SIAM J. Sci. Comput. 38(1), A55–A83.
- M. Lawson, L. Petzold and A. Hellander (2015) Accuracy of the Michaelis-Menten approximation when analyzing effects of molecular noise, J. R. Soc. Interface, 12(106).
- S. Hellander, L. Petzold and A. Hellander (2015) Reaction rates for mesoscopic reaction-diffusion kinetics, Phys. Rev. E, 92(2), 023312.
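To make the notion of molecular noise concrete, the following is a minimal sketch of the standard Gillespie direct method for a well-mixed birth-death process (constant production, first-order decay). It is an illustration only: it is neither spatial nor hybrid, it does not use GillesPy or any other of the group's software, and the rate constants in the usage lines are arbitrary assumptions.

```python
import random


def gillespie_birth_death(k_birth, k_decay, x0, t_end, rng):
    """Simulate 0 -> X (rate k_birth) and X -> 0 (rate k_decay * x)
    with the Gillespie direct method; returns event times and copy numbers."""
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_birth = k_birth           # propensity of the production reaction
        a_decay = k_decay * x       # propensity of the decay reaction
        a_total = a_birth + a_decay
        if a_total == 0.0:
            break                   # no reaction can fire any more
        t += rng.expovariate(a_total)   # exponential waiting time to next event
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional
        # to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


if __name__ == "__main__":
    rng = random.Random(1)
    times, counts = gillespie_birth_death(10.0, 0.1, 0, 50.0, rng)
    print(f"{len(times) - 1} reaction events, final copy number {counts[-1]}")
```

Repeated runs with different seeds give different trajectories that fluctuate around the deterministic steady state `k_birth / k_decay`, which is the kind of variability the text refers to.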