 Research Spotlight: Multiscale Modeling for Cancer


This work shows that lipids are a key player… We can see how RAS interacts in all our simulations at different angles.

– Helgi Ingólfsson

The Department of Energy and the National Cancer Institute have joined forces with researchers from multiple institutions, including LLNL, to simulate and explain interactions between cell membranes and specific proteins that induce many forms of cancer. The project, which is part of the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program and is led by Fred Streitz of LLNL and Dwight Nissley of Frederick National Laboratory, has made great strides toward this goal in just a few years.

RAS is a family of proteins whose mutations are linked to more than 30% of all human cancers. Understanding protein biology requires modeling at different spatial and temporal scales: from nanoseconds to milliseconds and from nanometers to micrometers. An LLNL-led team has developed a machine learning (ML)–based simulation for next-generation supercomputers capable of modeling the RAS protein signaling complex.

The sophisticated ML model is trained on coarse macroscale simulations before resources are spent on more detailed microscale molecular dynamics simulations. The team began by simulating the impact of the cell membrane on RAS proteins at long timescales and incorporated an ML algorithm to determine which lipid “patches” (local environments) were interesting enough to model in more detail with a molecular-level model.
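The selection step described above can be illustrated with a small sketch. It is not MuMMI's actual algorithm: it assumes each lipid patch has already been reduced to a numeric feature vector (for example, by an ML encoder) and uses greedy farthest-point selection as a stand-in for "interesting enough." The function name `select_novel_patches`, the patch identifiers, and the feature vectors are all hypothetical.

```python
import math


def select_novel_patches(candidates, already_simulated, k):
    """Pick up to k patches that are least similar to anything simulated so far.

    candidates: dict mapping a patch id to its feature vector (hypothetical encoding).
    already_simulated: list of feature vectors for patches already modeled in detail.
    This is greedy farthest-point selection, an assumed stand-in for the
    ML-driven importance ranking described in the article.
    """
    reference = list(already_simulated)
    remaining = dict(candidates)
    selected = []

    def novelty(item):
        # Novelty = distance to the nearest patch that is already covered.
        _, vec = item
        if not reference:
            return math.inf
        return min(math.dist(vec, ref) for ref in reference)

    for _ in range(min(k, len(remaining))):
        best_id, best_vec = max(remaining.items(), key=novelty)
        selected.append(best_id)
        # A chosen patch counts as covered when scoring the next pick.
        reference.append(best_vec)
        del remaining[best_id]
    return selected


if __name__ == "__main__":
    patches = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (5.0, 5.0)}
    print(select_novel_patches(patches, already_simulated=[(0.0, 0.0)], k=2))
```

The point of the design is that detailed simulation time goes to local environments unlike those already sampled, rather than to many near-duplicates of a common configuration.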

The result is the Massively parallel Multiscale Machine-Learned Modeling Infrastructure (MuMMI) framework, which scales up efficiently on large, heterogeneous high-performance computing systems like LLNL’s Sierra supercomputer. A paper describing the workflow that drives this first-of-its-kind multiscale simulation won the Best Paper Award at the 2019 International Conference for High Performance Computing, Networking, Storage and Analysis (SC19).

“The ML model lets us remove the human from the loop while still generating relevant seed data,” says computer scientist and lead author Francesco Di Natale. “The benefit to automating the process is that we can identify which lipids drive multiple RAS proteins to colocalize, which would be hard to do manually. With all this new data, we can ask questions about what realistically happens rather than guessing at valid parameters.”

A subsequent publication in the Proceedings of the National Academy of Sciences details MuMMI’s methodology. The team simulated a 1 µm × 1 µm patch on Sierra and observed how hundreds of different RAS proteins interacted with eight kinds of lipids. They created more than 100,000 molecular dynamics simulations from ML-selected snapshots of the larger macro-model simulation, enabling them to determine the probabilities of RAS binding to other proteins with a given orientation on a cell membrane. Combined with experimental results, the work demonstrates the strong link between lipids and both RAS orientation and the protein's ability to bind downstream signaling molecules.
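As a rough illustration of how probabilities can be read off a large ensemble of simulations, the sketch below bins per-snapshot orientation angles and reports the fraction of snapshots in each bin. This is a simplified assumption, not the analysis used in the publication: the function name `orientation_probabilities`, the use of a single tilt angle in degrees, and the 30-degree bin width are all hypothetical.

```python
from collections import Counter


def orientation_probabilities(tilt_angles_deg, bin_width_deg=30):
    """Estimate the empirical probability of each orientation bin.

    tilt_angles_deg: one (hypothetical) tilt angle per simulation snapshot.
    Returns a dict mapping the lower edge of each bin to the fraction of
    snapshots whose angle falls in that bin.
    """
    if bin_width_deg <= 0:
        raise ValueError("bin_width_deg must be positive")
    # Floor each angle to the lower edge of its bin and count occurrences.
    counts = Counter(
        int(angle // bin_width_deg) * bin_width_deg for angle in tilt_angles_deg
    )
    total = sum(counts.values())
    return {edge: n / total for edge, n in sorted(counts.items())}


if __name__ == "__main__":
    print(orientation_probabilities([10, 20, 40, 95]))
```

With many thousands of snapshots, frequencies of this kind become stable enough to compare across lipid environments, which is the sort of comparison the paragraph above describes.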

“This work shows that lipids are a key player. By modulating the lipids and different lipid environments, RAS changes its orientation, and we can see how RAS interacts in all our simulations at different angles,” states LLNL scientist and lead author Helgi Ingólfsson.

With MuMMI, researchers can simulate thousands of different lipid compositions derived from the macro model, and experimentalists can test new hypotheses. Knowledge gained from experiments will feed back into the ML model, creating a validation loop that will improve its accuracy over time.

Team Acknowledgments

Alongside Streitz, Di Natale, and Ingólfsson, LLNL researchers include Harsh Bhatia, Peer-Timo Bremer, Tim Carpenter, Gautham Dharuman, Jim Glosli, Felice Lightstone, Shusen Liu, Adam Moody, Tomas Oppelstrup, Tom Scogland, Shiv Sundram, Michael Surh, Brian Van Essen, Yue Yang, and Xiaohua Zhang.

The team collaborated with Frederick National Laboratory for Cancer Research; Los Alamos National Laboratory; Argonne National Laboratory; the University of California, San Francisco; IBM’s Thomas J. Watson Research Center; the National Cancer Institute; and San Jose State University. Funding also comes from the National Nuclear Security Administration’s Advanced Simulation and Computing program.