Volume 43

Jan. 27, 2025

DSI logo

Our mission at the Data Science Institute (DSI) is to enable excellence in data science research and applications across LLNL. Our newsletter is a compendium of breaking news, the latest research, outreach efforts, and more. Past volumes of our newsletter are available online.

Graphic with cover image from AI Safety Report and words “New Report Out Now”.

Hot Off the Press: AI Safety Report

Launched in December 2024, “Safety in Artificial Intelligence: Challenges and Opportunities for the U.S. National Labs and Beyond” is a collaborative report resulting from last year’s “Strategy Alignment on AI Safety” workshop, convened by LLNL and the University of California (UC) at the UC Livermore Collaboration Center. The report underscores the need for safeguarded AI technologies and makes recommendations for the role that national labs and others can play in developing novel evaluation methodologies that allow full consideration of the risks and threats posed by AI technologies across domains. The authors highlight the need for a multilayered solution that combines new methods and algorithmic approaches for mitigating threats with active government participation in setting high industry standards and regulations grounded in state-of-the-art technology.

The report was written by: Felipe Leno da Silva, Ruben Glatt, Brian Giera, Cindy Gonzales, Peer-Timo Bremer (LLNL); Jessica Newman (UC Berkeley); Courtney Corley (PNNL); David Stracuzzi, Philip Kegelmeyer (SNL); Francis Joseph Alexander (ANL); Yarin Gal (UK AI Safety Institute); Mark Greaves (Schmidt Sciences); Adam Gleave (FAR AI); Timothy Lillicrap (DeepMind & UCL); Jean-Pierre Falet, Yoshua Bengio (Mila and University of Montreal).

Access the full report on the Data Science Institute (DSI) website report page or on the OSTI website.


On left, DSI Deputy Cindy Gonzales, DSSI student intern Patrick McHugh, and Kevin Quinlan, his mentor; on right, headshot of Kevin

How Mentors Shape Lab Futures: Kevin Quinlan for National Mentoring Month

For National Mentoring Month, Kevin Quinlan, an Applied Statistician in LLNL’s Computational Engineering Division, reflects on the significant impact mentors have had on his career, particularly his Ph.D. advisor, among many others. Since joining the Lab in 2019, Kevin has walked in his mentors’ footsteps and has been actively involved in mentoring students through the Data Science Summer Institute (DSSI) program.

Kevin has some advice for other Lab employees who are newer in their careers: “You don’t have to have a million published papers to be a good mentor.” He invites others to join the DSSI program: mentor project submissions are due by January 15, the intern application window closes on January 31, and student selections are made by February 7. More information is available on the DSSI web page.


Group picture of Ruben Glatt, Conor Hayes, Joe Wakim, and Leno da Silva at NeurIPS conference.

Strong Representation at NeurIPS 2024

LLNL was well represented at the Conference on Neural Information Processing Systems (NeurIPS) 2024, the prestigious annual AI conference that attracts leading researchers and practitioners in machine learning and computational neuroscience. “LLNL researchers had over 12 workshops and posters accepted for the weeklong event in Vancouver. This is a huge achievement for the Lab and demonstrates recognition of our leadership in the field,” says DSI Deputy Director Cindy Gonzales. One paper in particular, “Transformers Can Do Arithmetic with the Right Embeddings,” sheds new light on a troublesome aspect of large language models.

The poor performance of transformers on arithmetic tasks seems to stem largely from their inability to keep track of the exact position of each digit within a long span of digits. In the paper, the authors address this problem by adding an embedding to each digit that encodes its position relative to the start of the number. Beyond the boost these embeddings provide on their own, the authors show that this fix enables architectural modifications such as input injection and recurrent layers to improve performance even further.

With positions resolved, the authors could study the logical extrapolation ability of transformers: can they solve arithmetic problems larger and more complex than those in their training data? They found that by training on only 20-digit numbers, with a single GPU for one day, they could reach state-of-the-art performance, achieving up to 99% accuracy on 100-digit addition problems. Finally, the authors show that these gains in numeracy also unlock improvements on other multi-step reasoning tasks, including sorting and multiplication.
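The core idea of a per-digit position signal can be illustrated with a minimal sketch. This is a hypothetical simplification for intuition only, not the paper's actual implementation (which learns embedding vectors and uses training-time tricks to generalize to longer numbers): for each token, compute its offset within the current run of digits, resetting at non-digit tokens.

```python
def digit_positions(tokens):
    """For each token, return its 1-based offset within a run of digits,
    or 0 for non-digit tokens. These indices would then select the
    per-digit embedding added to each token's representation."""
    positions = []
    offset = 0
    for tok in tokens:
        if tok.isdigit():
            offset += 1          # count position from the start of the number
        else:
            offset = 0           # reset at operators, separators, etc.
        positions.append(offset)
    return positions

# Example: the tokens of "12+345=" yield [1, 2, 0, 1, 2, 3, 0],
# so the model can tell that "3" is the first digit of its number.
```

The key point is that every digit receives an index relative to its own number rather than to the whole sequence, which is what lets the model line up corresponding digits during addition.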

Read the full paper on arXiv.


Timo Bremer at Sidekick demo at SC’24

LLNL Delivers at SC24

Lab staff were present in great numbers at the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC24) in Atlanta, Georgia, in December. Not only is it one of the most anticipated supercomputing events of the year, but it also provides researchers a platform to demonstrate the latest advances in supercomputing technology. Among its many notable activities at the event, LLNL debuted the new El Capitan system, which the TOP500 recognized as the world's most powerful system with a Linpack score of 1.742 exaflops.

Also significant was LLNL's Sidekick demo, given by Abhik Sarkar and Timo Bremer at the Department of Energy’s exhibit booth. State-of-the-art physics drivers such as LLNL’s National Ignition Facility and ITER, along with multi-million-dollar university-scale facilities, are not ideal platforms for wide-ranging exploration of digital infrastructure and AI-driven closed-loop control schemes, as they are engaged in critical scientific research with expensive equipment. To address this challenge, LLNL is developing modular, flexible, small-scale surrogate facilities, termed “sidekick facilities,” which replicate the complex non-physics aspects of closed-loop autonomous operations and enhance data generation and acquisition rates.

Read more about the event on LLNL News.


Graphic with a screenshot of a journal article and text with “as seen in Nature”

Enabling AM Part Inspection of Digital Twins

Digital twins (DTs) are an emerging capability in additive manufacturing (AM), set to revolutionize design optimization, inspection, in situ monitoring, and root cause analysis. AM DTs typically incorporate multimodal data streams, ranging from machine toolpaths and in-process imaging to X-ray CT scans and performance metrics.

Despite the evolution of DT platforms, challenges remain in effectively inspecting them for actionable insights, whether individually or in a multidisciplinary, geographically distributed team setting. Quality assurance, manufacturing departments, pilot labs, and plant operations must collaborate closely to reliably produce parts at scale. This is particularly crucial in AM, where complex structures require a collaborative, multidisciplinary approach. In addition, the large-scale data originating from different modalities, and its inherently 3D nature, pose significant hurdles for traditional 2D desktop-based inspection methods.

To address these challenges and increase the value proposition of DTs, a team of LLNL researchers has developed a novel virtual reality (VR) framework to facilitate collaborative, real-time inspection of DTs in AM. The framework includes advanced features for intuitive alignment and visualization of multimodal data, visual occlusion management, streaming of large-scale volumetric data, and collaborative tools, substantially improving the inspection of AM components and processes to fully exploit the potential of DTs in AM.

From first-author Vuthea Chheang: “Our innovative VR framework for collaborative and real-time inspection of digital twins has the potential to transform the additive manufacturing inspection process. By enhancing multimodal data visualization, overcoming the limitations of traditional inspection methods, and fostering improved collaboration, this framework goes beyond quality control. It can facilitate process and facility planning, serve as an effective training platform, expedite root cause analysis, and allow quick utilization of data.”

Read the full article in Scientific Reports.


Image of the Capitol building in Washington DC with an American flag flying

LBANN Team Makes an Impact in DC

The Livermore Big Artificial Neural Network (LBANN) group just released their parallel LLaMA model in PyTorch, optimized for AMD chips. This is possible because the HPC-centric deep learning training framework is designed to compose multiple levels of parallelism. LBANN provides model-parallel acceleration through domain decomposition to optimize for strong scaling of network training. It also allows model parallelism to be composed with both data parallelism and ensemble training methods for training and inference of large neural networks with massive amounts of data. LBANN is able to take advantage of tightly coupled accelerators, low-latency high-bandwidth networking, and high-bandwidth parallel file systems.
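The two parallelism strategies being composed can be illustrated with a toy NumPy sketch. This is a conceptual illustration, not LBANN's API: for a single linear layer, model parallelism shards the weight matrix across devices, while data parallelism shards the batch; both reproduce the single-device result.

```python
import numpy as np

def model_parallel_linear(x, weight_shards):
    """Model/tensor parallelism: each 'device' holds a column shard of the
    weight matrix and computes a slice of the output features, which are
    concatenated along the feature axis."""
    return np.concatenate([x @ w for w in weight_shards], axis=1)

def data_parallel_linear(batch_shards, weight):
    """Data parallelism: each 'device' holds the full weights and a slice
    of the batch; per-device outputs are stacked along the batch axis."""
    return np.concatenate([x @ weight for x in batch_shards], axis=0)

# Both strategies yield the same result as the unsharded product x @ w:
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 6))
w = rng.normal(size=(6, 8))
assert np.allclose(model_parallel_linear(x, np.split(w, 2, axis=1)), x @ w)
assert np.allclose(data_parallel_linear(np.split(x, 2, axis=0), w), x @ w)
```

In a real framework, the concatenations become inter-device communication, and the two shardings can be nested, which is the kind of composition LBANN exploits for large-model training.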

The model has already demonstrated useful applications for small molecule drug design; the associated paper was a finalist for the 2020 Gordon Bell Special Prize and appeared in The International Journal of High Performance Computing Applications.

Even more exciting, the team was recently invited to showcase this work in Washington, DC. They used the model, deployed on Tenaya, an early access system for the El Capitan supercomputer, to assist in a high-value demonstration of model capability to multiple U.S. government partners. Congratulations to team members Brian Van Essen, Tal Ben Nun, Pier Fiedorowicz, Tom Benson, and Nikoli Dryden, as well as Josh Kallman, Tom Stitt, Charles "CJ” Jekel, Brian Bartoldson, and Bhavya Kailkhura for assisting with the demo.  

To learn more, visit the LBANN GitHub page.  


Ippolito Imani Caradonna works in the Rapid Response Lab, which offers a high-throughput protein encoding and extraction instrument for generating and testing hundreds of computational designs

How the Lab is Revolutionizing Drug Development

LLNL's Generative Unconstrained Intelligent Drug Engineering (GUIDE) program is revolutionizing the development of medical countermeasures against biological threats. Funded by the Department of Defense's Chemical and Biological Defense Program, GUIDE integrates high-performance computing with rapid experimental validation to expedite the creation of antibody-based therapeutics.

Traditional drug development is often a lengthy and costly process. GUIDE addresses these challenges by employing advanced computational models to predict and design effective antibody candidates swiftly. These predictions are then rapidly tested and validated in laboratory settings, significantly reducing the time required to identify promising therapeutic agents.

A notable application of GUIDE's capabilities was demonstrated during the COVID-19 pandemic. The program's approach enabled the rapid development of antibody candidates targeting the SARS-CoV-2 virus, showcasing its potential to respond swiftly to emerging biological threats.

By leveraging LLNL's expertise in computational science and biotechnology, GUIDE is setting new standards in the field of biological defense. Its innovative methodology not only accelerates the drug development process but also enhances the precision and effectiveness of medical countermeasures, providing a robust defense against both current and future biological challenges.

Read the article in Science & Technology Review or watch the explainer video on YouTube.


Logo with three women looking forward and text “Women in Data Science Worldwide – Livermore”

Register for WiDS Livermore & Datathon 2025

You’re invited to join the Lab’s Women in Data Science (WiDS) conference taking place on Wednesday, March 12. This hybrid event is free and open to everyone—inside or outside the Lab, any career level, and data science experience level. The registration link and other details are posted at data-science.llnl.gov/wids.  

  • Register by Friday, February 28

  • Hosted at the University of California Livermore Collaboration Center (UCLCC) and virtually

  • Sponsored by LLNL’s Data Science Institute; Computing Principal Directorate; and Office of Inclusion, Diversity, Equity, and Accountability

  • This regional conference will include a tie-in with the LLNL Datathon (held separately on Feb. 19) as well as keynote speakers, technical talks, career-focused panel discussions, and networking opportunities.

  • More information about the speakers and panelists will be available on the event page in the near future.

This is the 8th year for WiDS Livermore, which is independently organized by LLNL as part of the mission to increase participation of women in data science and to feature outstanding women doing outstanding work. Contact WiDS-Committee [at] llnl.gov with any questions. We look forward to welcoming you on February 19 and March 12. Register for both events on the event page.


Headshot of Brian Bartoldson with graphic of Data Scientist Spotlight.

Meet an LLNL Data Scientist: Brian Bartoldson

Brian Bartoldson is an AI researcher in LLNL’s Computational Engineering Division. After beginning his time at the Lab as an intern in 2017 and 2018, he joined full-time as a postdoctoral researcher in 2020. He now works on a variety of projects centered on the efficiency, safety, and helpfulness of AI, including autonomous multi-scale, zeroth-order ML, efficient large-scale ML, localizing and explaining performance defects, and GUARD (Guaranteeing AI Risk Deterrence). “[My work advancing AI] excites me because better AI can accelerate the development of a wide array of important technologies,” Bartoldson says. “I am motivated by technological improvement’s ability to improve lives, and I pursue research in areas where I think my work will have the most positive long-term effect.” Some of Bartoldson’s research papers accepted this year have focused on improving the ability of large language models to follow instructions and perform reasoning. He is also active in the Lab community, serving as a mentor to several students every year and playing basketball with an employee group. He emphasizes the importance of his role as a mentor for students pursuing science, noting, “As a former intern, it’s easy to put myself in their shoes and see the importance of the internship opportunity. Interning is a great stepping stone to a career in science, and I want to support that.”