Volume 42

Dec. 4, 2024


Our mission at the Data Science Institute (DSI) is to enable excellence in data science research and applications across LLNL. Our newsletter is a compendium of breaking news, the latest research, outreach efforts, and more. Past volumes of our newsletter are available online.

Seventeen people stand as a group in front of a TV screen

Summit Sparks Cross-Pollination of DSI Programs

The DSI’s expansion this year added several new programs, and with them new ambassadors representing myriad areas of the Lab, opening the door to collaboration opportunities. On October 22, the ambassadors convened for the first DSI Summit to share their 2025 plans and discuss ways to benefit the data science community. The DSI programs are the Consulting Service, student internships (the Data Science Summer Institute, or DSSI, and the Data Science Challenge, or DSC), scientific outreach (including workshops and seminars), staff training, open data science (including the Open Data Initiative), and Women in Data Science (WiDS).

Thanks to successful “mini road shows” in 2024 to raise awareness among LLNL staff, the Consulting Service has advised on more than three dozen projects and is onboarding almost two dozen new consultants. As highlighted in previous issues of this newsletter, consultants have won appreciation awards for their work (you can read more about the program below). 

As the Consulting Service demonstrates, data scientists are crucial to the Lab’s workforce, which is why the student intern pipeline is a key DSI focus. The team discussed managing the several hundred applications received in a typical year, developing curricula based on students’ experience levels, and offering additional activities such as resume workshops and career coaching. Attending the Summit was the first task for newly appointed DSSI lead Min Priest (see story below).

The DSI’s scientific outreach efforts have made waves this year with an AI safety workshop and seminars by noted experts. Additionally, this area of the DSI aims to elevate the Lab’s accomplishments and capabilities to international audiences, such as by co-organizing an external workshop on AI for critical infrastructure. In a similar vein, the DSI’s Open Data Science program focuses on public engagement through releasable datasets as well as shareable data-analysis and -processing tools. 

For LLNL employees, the SEAM (Shared Education in Artificial intelligence and Machine learning) program completed its first year of courses for staff who want to learn more about AI/ML tools. The next session of courses, which drew more than 220 applications, is already underway. Last but not least, planning is in full swing for the 2025 WiDS Livermore datathon and conference (see story below).

By sharing each program’s wins and gaps, DSI’s ambassadors readily identified several ways to enhance existing activities, coordinate upcoming events, and reach new audiences. For example, students with relevant experience could shadow data science consultants or help clean up raw datasets before release. Another idea was to cross-promote seminars, workshops, and WiDS events to one another’s audiences. “Visibility and collaboration are critical to our overall success,” stated DSI deputy director Cindy Gonzales. “Everybody (the DSI, the Lab, and the larger data science community) wins when we don’t work in a vacuum, when we have cross-pollination like we had at the Summit.” Finally, the Summit provided the DSI with an opportunity to thank outgoing communications manager Holly Auten for her support (see story below).


Three individuals in photo, with middle person holding award

Honoring Holly Auten With Appreciation Award

Since joining the Lab in 2016, Holly Auten has built a remarkable career and earned the deep respect of leaders and peers alike. That record recently earned her some well-deserved recognition as she concludes her tenure as the DSI’s communications manager. At the DSI Summit (see story above), Holly was honored with an Appreciation Award, acknowledging her exceptional leadership and dedication to building and maintaining a vibrant data science community across LLNL and its partners.

In October 2018, Holly joined the DSI and played a pivotal role in bringing the vision for this institute to life. From Michael Goldman, DSI’s first director: “When we started DSI, we hardly knew what we were doing, and I knew even less about how to effectively communicate across and out of an organization. Holly embraced every challenge in an unknown landscape and excelled at communicating complex technical concepts to our organization and stakeholders. She helped organize first-of-their-kind workshops and seminars, maintained timely consistency of our newsletter, developed and implemented our strategic plan, created a lasting brand with large external visibility, and worked diligently to represent and promote our incredible data science workforce through spotlights, highlight videos, technical articles, social media presence, and our website. Working with Holly as the DSI grew over the five years I had the privilege to lead it will forever be one of my most valued professional experiences.”

As if helping to establish one organization’s communications function weren’t enough, Holly took on a new challenge in 2023 with the AI Innovation Incubator (AI3). She was instrumental in supporting director Brian Spears in foundational efforts. Says Spears, “Holly has an incredible vision, not only for how to communicate messages, but for the ways that well-delivered messages can change what’s possible for an organization. She has an unparalleled talent for finding the key story in a sea of activities, then using that story to uplift teams. In our AI world, the pace is high, and Holly captures lightning-fast changes to give the entire community critical situational awareness.”

In addition to the DSI, Holly supports LLNL’s Computing directorate in a role encompassing the open-source software community, the High Performance Computing Innovation Center, the RADIUSS project, the Modular Finite Element Methods team, and most recently the El Capitan communications team. As Holly returns to Computing full time, we are excited to welcome Elisa Esme Abadi as her well-qualified successor. Congratulations, Holly, on this well-deserved recognition! Your legacy at DSI will not be forgotten.


Headshots of Eric and Bhavya announcing their appointment to the DSI Council

Welcoming New Members to the DSI Council

We are thrilled to welcome two new members to the Data Science Institute (DSI) Council: Bhavya Kailkhura and Eric B. Duoss. Their selection reflects their exceptional leadership and significant contributions to the field of data science, and comes with strong recommendations from their peers.

Bhavya Kailkhura is a Staff Scientist in the Center for Applied Scientific Computing at Lawrence Livermore National Laboratory. His primary area of expertise is in developing “Safe and Trustworthy AI” for applications in scientific research and national security. A Senior Member of IEEE, Bhavya has received several prestigious awards, including the All University Doctoral Prize from Syracuse University (2017), the Deputy Director for S&T Excellence in Publication Award from LLNL (2019 and 2024), and multiple best paper awards. Additionally, he was honored with the LLNL Early and Mid-Career Recognition Award in 2024. Currently, he is leading projects aimed at improving the safety of large language models, with the goal of building trust in AI technologies across the DOE/NNSA mission space.

Eric B. Duoss is the Director of the Center for Engineered Materials and Manufacturing at Lawrence Livermore National Laboratory, where he directs research activities and maps strategic directions in the areas of advanced materials and manufacturing. At LLNL, Eric leads teams that invent novel materials and manufacturing technologies, with a focus on creating designer architectures for chemical, mechanical, thermal, and functional properties for applications in defense, climate, transportation, energy, aerospace, human health, and other fields. Eric's projects increasingly use, adapt, and apply novel data science techniques for these applications. Eric is a recipient of the Presidential Early Career Award for Scientists and Engineers (2016), and he leads a team that was honored with the Department of Energy Secretary’s Achievement Award (2019). Eric has co-authored over 90 peer-reviewed technical publications that have collectively received over 16,000 citations. He has also been awarded over 50 U.S. patents. Eric has a Ph.D. in Materials Science and Engineering from the University of Illinois at Urbana-Champaign (2009) and dual B.S. degrees in Chemistry and Mathematics from St. Norbert College (2003).

As we welcome Bhavya and Eric, we also extend our thanks to departing council member Dan Merl for his invaluable contributions to the DSI Council. Dan's leadership has greatly shaped our strategic vision and community over the last two years.

Since its inception in 2018, the DSI Council has been vital in guiding LLNL’s data science strategy and fostering collaboration. We look forward to the insights Bhavya and Eric will bring as we continue to advance our mission.


Headshot of Min with their dog and text announcing the leadership change

DSSI Welcomes Min Priest, New Program Lead

DSI is pleased to announce the selection of Min Priest as the new lead for DSSI. This well-deserved appointment is a natural extension of Min's significant contributions to the program over the last four years, including serving as a DSC mentor, a DSSI seminar lecturer, and a project mentor to five DSSI students. Min is a computing scientist in LLNL’s Center for Applied Scientific Computing, specializing in the intersection of data science and high-performance computing. Their research includes streaming and sketching algorithms, massive-scale graph algorithms, and high-dimensional and scalable statistics. As DSSI Lead, Min will plan and execute the internship program, collaborating closely with the DSSI Student Programs team: Omar DeGuchy (DSC Lead), Mary Silva (Student Outreach Lead), Kerianne Pruett (Challenge Problem Lead), Kendall Luna (Administrative Support), and Brian Gallagher (DSI Student Programs Director). We would like to thank Amanda Muyskens for her many years of service to DSSI.


A circle with three women looking forward and the text Women in Data Science Worldwide Livermore

Save the Date: Women in Data Science (WiDS) Livermore 2025

You're invited to join us on March 12, 2025, for WiDS Livermore, an inspiring event organized by LLNL and dedicated to increasing the participation of women in data science and showcasing the outstanding work of women in the field. Our annual event will take place at the UC Livermore Collaboration Center and will feature keynote speakers discussing important topics such as data ethics, privacy, healthcare, and data visualization, along with networking opportunities to connect with fellow data scientists and industry professionals.

We are also hosting our annual Datathon on February 19 in the same location. This one-day event at LLNL is designed for data science enthusiasts at beginner and intermediate levels, and provides participants with the opportunity to collaborate, innovate, and tackle a challenging data science problem.

Everyone is welcome, whether you are affiliated with LLNL, a university, a commercial company, or any other organization. We look forward to seeing you there! Registration for both events will open on January 15 on the WiDS page.


Flowchart demonstrating AMS workflow moving from AMS-driven physics code to different steps in the HPC system

Embedded Machine Learning for Smart Simulations

In virtually all mission-critical applications, the phenomena of interest depend on a wide range of length scales and timescales that cannot be fully resolved in a single simulation. Instead, the problem is typically formulated at the largest scale necessary (for example, an entire facility, device, or manufactured part), and finer scales are represented through so-called subscale models. These subscale models approximate processes such as atomic physics, grain-scale responses, and chemical kinetics, among many others. But these subscale codes themselves also use approximations of even finer scales, creating a nested hierarchy of models that are called at each time step at each grid element in the primary simulation. Scientists must routinely balance the desired fidelity of these subscale models against the available compute resources. Ultimately, the achievable accuracy is limited by the computational cost.

LLNL’s Autonomous Multiscale Simulation (AMS) Strategic Initiative has demonstrated an alternative strategy built on a combination of software engineering, trustworthy machine learning, and advanced modeling to make multiphysics simulations faster, more accurate, and portable. AMS provides the end-to-end infrastructure to automate every step of the process, from training and testing to deploying machine learning surrogate models in scientific applications. Read more about this multidisciplinary project led by DSI Council member Timo Bremer at LLNL Computing.


Stock cartoon image of two data scientists collaborating in front of a projector with graphs

DSI Consulting Service Spurs Innovation

DSI runs an innovative initiative called the DSI Consulting Service (DSICS), aimed at bridging the gap between scientific research and advanced data science techniques. This program places data science consultants directly with research teams to enhance their projects with cutting-edge statistical methods and data management strategies. As DSICS director Jason Bernstein notes, “DSICS is perfect for filling small, short-term gaps in research projects. The Laboratory’s demand for this type of service is large, and it’s growing.” The program has seen significant growth, with many early- to mid-career consultants from various fields such as Physical and Life Sciences, Engineering, and Computing joining the initiative.

The impact of DSICS is already evident through the success stories of its consultants. Tyler Alcorn, a health physicist, utilized data science techniques to revolutionize radiation protection work by creating a vector database and employing Monte Carlo simulations. His work exemplifies how data science can transform traditional fields, as he explains, “One of the ways I started applying data science techniques at Livermore was by creating a vector database of historical environmental testing and usage information to enable the tracking, trending and analysis of health physics data.” Similarly, consultants like Josh Ottaway and Mike Boyle have applied machine learning and artificial intelligence to enhance the safety, security, and reliability of the U.S. nuclear arsenal and support forensic science efforts, respectively. These achievements underscore the program's potential to drive innovation and efficiency across diverse scientific disciplines. Read the full article here.


Group photo of four individuals seated on a stage presenting and one person standing at the podium

DOE Data Days 2024: Advancing Data Management and AI

LLNL recently hosted the Department of Energy's (DOE) annual Data Days event, uniting data scientists, researchers, and policymakers to explore advancements in data management, AI, and high-performance computing. The three-day workshop, held October 22–24, focused on challenges in nuclear security, energy, and scientific discovery.

Keynote speakers, including DOE Chief Data Officer Rob King, highlighted the importance of structured data in initiatives like nuclear safety and clean energy transitions. Discussions covered cloud data management, AI governance, and the integration of AI with high-performance computing. Notable projects such as Project Alexandria and the Open Energy Data Initiative (OEDI) were showcased, emphasizing data accessibility and AI readiness.

Interactive sessions and panels addressed data governance, security, and the future of AI in energy and defense. LLNL's John Westlund introduced the Unified Storage Namespace (USN), enhancing data management across HPC systems, while Sandia National Laboratories' Tom Trodden emphasized the need for AI literacy and governance.

The event concluded with a focus on data curation and governance, featuring insights from DOE Deputy Chief Data Officer Seth Berl and LLNL experts on building interconnected data ecosystems. The discussions underscored the importance of data stewardship, FAIR principles, and the challenges of digital transformation in scientific research. 

DOE Data Days 2024 highlighted the critical role of data in driving scientific and technological advancements and in fostering collaboration across national labs and beyond. Read more about the event on the DSI site.


Headshot of Jen Caseres next to a spotlight icon and the words “Data Scientist Spotlight”

Meet an LLNL Data Scientist: Jen Caseres

Jen Caseres is a staff scientist in LLNL’s Nuclear and Chemical Sciences Division, where she works on chemical and isotopic data analysis for nuclear forensics. She joined the Lab in 2020 after completing an M.S. in Geology at the University of Minnesota and a B.S. in Geochemistry at Caltech. Caseres previously worked in the geology subfield of petrology and used mass spectrometry and electron microscopy to analyze the history and origin of rocks from various settings. Since starting at LLNL, she has developed an interest in applying data science to chemical data for nuclear forensics, which involves the same analytical techniques as petrology—a natural transition. Beyond her job duties, Caseres is also involved in the Girls Who Code program, and she served as a member of the organizing committee for Livermore’s 2024 Women in Data Science (WiDS) event after an insightful experience attending the event in 2023. “LLNL has been a great place to learn data science because of its opportunities for professional development and interdisciplinary collaboration,” she says. “I hope by working with WiDS and other programs like Girls Who Code, I can encourage students and non-data-scientists to start thinking about how data science and programming can help with their data problems.”