Our mission at the Data Science Institute (DSI) is to enable excellence in data science research and applications across the Laboratory's core missions.
Data science has become an essential discipline underpinning LLNL's key program areas, and the Laboratory is home to some of the largest, most unique, and most interesting datasets and supercomputers in the world. The DSI acts as LLNL's central hub for data science activity, spanning artificial intelligence, big-data analytics, computer vision, machine learning, predictive modeling, statistical inference, uncertainty quantification, and more. It works to lead, build, and strengthen the data science workforce, research, and outreach that advance the state of the art of the nation's data science capabilities. Read more about the DSI.
Data Scientist Spotlight
Marisol Gamboa thrives at the intersection of solving challenges in unique ways and mentoring the next generation. Over her 18-year career at LLNL, she has honed expertise in software engineering, web applications, and big-data analytics by developing solutions for numerous defense and counterproliferation programs, such as tools that help Department of Defense personnel distill, combine, relate, manipulate, and access massive amounts of data in a timely manner. “The many lessons I’ve learned over the years have positioned me to tackle any challenge knowing that I am able to learn quickly and adjust to any situation in real time,” she says. Gamboa is the Deputy Division Leader for LLNL’s Global Security Computing Applications Division as well as Computing’s Workforce Team Lead. She formerly co-directed the Data Science Summer Institute and created the annual Data Science Challenge with UC Merced. Active in outreach to young women and underrepresented minorities in STEM, including LLNL’s Women in Data Science regional events, Gamboa holds a B.S. in Computer Science from the University of New Mexico.
New Research in AI
An LLNL team proposes a framework that leverages deep learning for symbolic regression via a simple idea: use a large model (a neural network) to search the space of small models (mathematical expressions). The research was accepted as an Oral Presentation (with an acceptance rate of 1.5%) at the upcoming International Conference on Learning Representations (ICLR), ranking fifth out of approximately 3,000 scored papers.
- Deep symbolic regression: recovering mathematical expressions from data via risk-seeking policy gradients (preprint) – Brenden Petersen, Mikel Landajuela Larma, Nathan Mundhenk, Claudio Santiago, Soo Kim, and Joanne Kim
Lead author Petersen explains, “From the algorithmic perspective, our approach is not specific to the problem of symbolic regression. More broadly, our framework applies to discrete optimization problems where the user may want to incorporate some knowledge into the search. We are just now beginning to apply it to other tasks, such as finding interpretable reinforcement learning policies or optimizing amino acid sequences.”
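The risk-seeking policy gradient named in the paper's title trains the generator only on the best-performing samples in each batch, rather than on the batch average. The sketch below is a hypothetical illustration of that batch-filtering step (not the team's released code): given rewards for a batch of sampled candidate expressions, it keeps only those at or above the (1 − ε) reward quantile and uses that quantile as the baseline. The function name `risk_seeking_batch` and the toy rewards are assumptions for illustration.

```python
import numpy as np

def risk_seeking_batch(rewards, epsilon=0.05):
    """Return (indices, baseline) for a risk-seeking policy gradient step.

    Only samples whose reward meets or exceeds the (1 - epsilon) quantile
    of the batch contribute to the gradient; the quantile itself serves
    as the reward baseline, so the update focuses on improving the best
    candidates rather than the average one.
    """
    baseline = np.quantile(rewards, 1.0 - epsilon)
    keep = np.nonzero(rewards >= baseline)[0]
    return keep, baseline

# Toy illustration: rewards for a batch of 1000 sampled expressions.
rng = np.random.default_rng(0)
rewards = rng.random(1000)
keep, baseline = risk_seeking_batch(rewards, epsilon=0.05)
# Roughly 5% of the batch survives; only these samples would be used
# in the subsequent policy-gradient update of the neural network.
print(len(keep), float(baseline))
```

In a full training loop, the surviving samples' log-probabilities under the network would be weighted by `reward - baseline` and used in a standard REINFORCE-style update.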
Petersen credits the team’s expertise in optimization, mathematics, physics, deep learning, and reinforcement learning with making the approach successful. The research will also be featured at an upcoming DSI virtual seminar.