Our mission at the Data Science Institute (DSI) is to enable excellence in data science research and applications across the Laboratory's core missions.
Data science has become essential to LLNL's key program areas, and the Laboratory is home to some of the largest, most distinctive, and most interesting data and supercomputers in the world. The DSI acts as LLNL's central hub for data science activity in areas including artificial intelligence, big-data analytics, computer vision, machine learning, predictive modeling, statistical inference, and uncertainty quantification. It works to lead, build, and strengthen the data science workforce, research, and outreach in order to advance the state of the art of our nation's data science capabilities.
Data Scientist Spotlight
With a PhD in Mathematics from the University of Illinois at Urbana-Champaign, Sarah Mackay enjoys using mathematical techniques to make inferences about real-world systems. She draws on her experience in combinatorial optimization, network science, and statistics to perform risk analyses for LLNL’s Cyber and Infrastructure Resilience program. Mackay designs and implements algorithms to secure infrastructure such as power grids, gas pipelines, and communication systems. “This work involves making assumptions about the structure of the system we’re studying. It can be challenging to know if the assumptions are valid and, thus, if we can trust our conclusions,” she explains. Mackay, who also coordinates the DSI’s virtual seminar series, thrives in the Lab’s culture of interdisciplinary teamwork. She states, “The set of problems one can tackle becomes so much larger when the pool of expertise grows.”
New Research in AI: COVID-19 Risks for Cancer Patients
Using machine learning models to analyze one of the largest databases of patients with cancer and COVID-19, researchers from LLNL and the University of California, San Francisco, found previously unreported links between a rare type of cancer, as well as two cancer treatment-related drugs, and an increased risk of hospitalization from COVID-19. The findings appear in the journal Cancer Medicine. Using a logistic regression approach, the team examined de-identified electronic health record data from the UC Health COVID Research Data Set on nearly a half-million patients who underwent COVID-19 testing at all 17 UC-affiliated hospitals. The dataset included nearly 50,000 patients with cancer, more than 17,000 of whom had also tested positive for COVID-19, and contained information on patient demographics, comorbidities, lab work, cancer types, and various cancer therapies.