Nov. 21, 2022
Award-Winning Papers
LLNL’s data science community continues to receive accolades for ground-breaking research and techniques. PDFs or full-text web pages are linked where available.
The 2022 IEEE VIS Test of Time Awards recognize papers that are “still vibrant and useful today and have had a major impact and influence within and beyond the visualization community” (read more at LLNL News). The conference is the premier forum for advances in visualization and visual analytics.
- 25-year award (published in 1997): ROAMing Terrain: Real-Time Optimally Adapting Meshes – Mark Miller and collaborators
- 14-year award (published in 2008): A Practical Approach to Morse-Smale Complex Computation: Scalability and Generality – Peer-Timo Bremer and collaborators
The 2022 Director’s S&T Excellence in Publication Awards honor outstanding scientific and technical publications by LLNL staff. These papers are noted as having an especially significant impact on the Lab’s missions and/or external research community.
- Artificial Intelligence Detection Algorithms – Ryan Goldhahn, Michael Goldman, Mary Gullett, Anna Hiszpanski, Andrew Horning, Goran Konjevod, Braden Soper
- Deep Symbolic Regression: Recovering Mathematical Expressions from Data via Risk-Seeking Policy Gradients – Mikel Landajuela Larma, Terrell Mundhenk, Brenden Petersen, Claudio Santiago
- Designing Counterfactual Generators Using Deep Model Inversion – Jayaraman Thiagarajan and Vivek Narayanaswamy
- Improved Protein–Ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference – Jonathan Allen, William Bennett, Derek Jones, Hyojin Kim, Dan Kirshner, Felice Lightstone, Garrett Stevenson, Sergio Wong, Adam Zemla, and Xiaohua Zhang
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification – Jayaraman Thiagarajan
HPCwire Award for Cognitive Simulation Application
The high-performance computing (HPC) publication HPCwire announced LLNL as the winner of its Editor’s Choice award for Best Use of HPC in Energy for applying cognitive simulation (CogSim) methods to inertial confinement fusion (ICF) research. The award was presented on November 14 at SC22, the largest supercomputing conference in the world. It recognizes the team for progress in their ML-based approach to modeling ICF experiments performed at the National Ignition Facility and elsewhere, which has led to the creation of faster and more accurate models of ICF implosions. Emerging at LLNL over the past several years, the CogSim technique uses the Lab’s cutting-edge HPC machines to combine deep neural networks with the massive databases of historical ICF experiments to calibrate the models. Applying CogSim to ICF research has resulted in faster, better-performing models that can predict experimental outcomes with higher accuracy than simulations alone and with fewer experiments, according to researchers.
Members of the CogSim team include LLNL researchers Brian Spears, Timo Bremer, Luc Peterson, Kelli Humbird, Rushil Anirudh, Brian Van Essen, Shusen Liu, Jim Gaffney, Bogdan Kustowski, Gemma Anderson, Francisco Beltran, Michael Kruse, Sam Ade Jacobs, David Hysom, Jae-Sung Yeom, Peter Robinson, Jessica Semler, Ben Bay, Scott Brandon, Vic Castillo, David Domyancic, Richard Klein, John Field, Steve Langer, Joe Koning, Dave Munro, and Robert Hatarik.
Video: Understanding the Universe with Applied Statistics
In a new video posted to the Lab’s YouTube channel, statistician Amanda Muyskens describes MuyGPs, her team’s innovative and computationally efficient Gaussian Process hyperparameter estimation method for large data. The method has been applied to space-based image classification and released for open-source use in the Python package MuyGPyS. MuyGPs will help astronomers and astrophysicists working with the massive amounts of data gathered from the Vera C. Rubin Observatory Legacy Survey of Space and Time (also known as LSST), as well as numerous other laboratory and science applications.
Using Social Media Data to Inform Seismology
Researchers often mine crowdsourced data, such as images of damage posted after an earthquake, from social media platforms to better understand natural disasters and guide rescue efforts. In a new Scientific Reports paper, LLNL seismologist Qingkai Kong and UC Berkeley collaborators introduce a transfer learning method that detects damaged buildings in earthquake-aftermath images. The team manually labeled 6,500 images from social media platforms and trained a deep learning model via transfer learning to recognize damaged buildings. They also visualized the features that are important for the model to make decisions. For example, the damaged building shown at left is highlighted with important features for the model’s decision making.
The team’s model achieved good performance when tested on newly acquired images of earthquakes at different locations, and when run in near real-time on a Twitter feed after the 2020 Aegean Sea earthquake (magnitude 7.0). A future goal is for users to upload images after earthquakes to the MyShake app and for the model to identify the images containing damaged buildings, helping to keep the regional community informed about damage location and severity. The method described in the paper could also be expanded to extract social media images after other types of disasters. “Machine learning models like this will provide us with more information about natural hazards so we can prepare for the next one,” states Kong.
Recent Research
- Asilomar Conference on Signals, Systems and Computers: Bayesian Multiagent Active Sensing and Localization via Decentralized Posterior Sampling (link forthcoming from conference proceedings) – Braden Soper, Priyadip Ray, Jose Cadena, and Ryan Goldhahn
- International Journal of Greenhouse Gas Control: Deep Learning-Accelerated 3D Carbon Storage Reservoir Pressure Forecasting Based on Data Assimilation Using Surface Displacement from InSAR – Hewei Tang, Pengcheng Fu, Honggeun Jo, Su Jiang, Christopher Sherman, Joseph Morris, and collaborators
- Journal of Biomedical Informatics:
  - Continuous-Time Probabilistic Models for Longitudinal Electronic Health Records – Alan Kaplan, Uttara Tipnis, and collaborators
  - Unsupervised Probabilistic Models for Sequential Electronic Health Records – Alan Kaplan, Priyadip Ray, and collaborators
- Journal of Intelligent Manufacturing: In-Process Monitoring and Prediction of Droplet Quality in Droplet-on-Demand Liquid Metal Jetting Additive Manufacturing Using Machine Learning – Tammy Chang, Brian Giera, Nicholas Watkins, Saptarshi Mukherjee, Andrew Pascall, David Stobbe, and collaborators
- Physics Letters B: Controlling Extrapolations of Nuclear Properties with Feature Selection – Nicolas Schunck and collaborator
- Physics of Plasmas: Transfer Learning Driven Design Optimization for Inertial Confinement Fusion – Kelli Humbird and Luc Peterson
Improving Visualization of Large-Scale Datasets
Researchers are starting work on a three-year project aimed at improving methods for visual analysis of large heterogeneous datasets as part of a recent Department of Energy (DOE) funding opportunity. The joint project, titled “Neural Field Processing for Visual Analysis,” will be led at LLNL by co-PI Andrew Gillette, with colleagues from Vanderbilt University and the University of Arizona. The newly funded project will explore methods for processing implicit neural representations (INRs)—datasets that incorporate coordinate-based neural networks to represent scientific datasets efficiently and compactly. Currently, traditional processing algorithms and visual analysis techniques cannot be applied to INRs directly.
“It’s an honor to have been selected to carry out this research for the DOE,” Gillette said. “Fast and accurate visualization is essential for a wide variety of activities underway at DOE laboratories. My goal over the next three years is to partner closely with application domain specialists and demonstrate how advances in visualization methodologies can directly benefit scientific inquiry.”
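For readers unfamiliar with implicit neural representations, the idea described above can be sketched in a few lines: a small coordinate-based network is trained so that querying it at a coordinate returns the data value at that location. The example below is only an illustrative toy under stated assumptions, not the project's method. The 1D sine signal, the network size, and the function names (`init_inr`, `inr_forward`, `fit_inr`) are all hypothetical choices made for this sketch, and it assumes NumPy is available.

```python
import numpy as np

def init_inr(hidden=32, seed=0):
    """Create parameters for a one-hidden-layer coordinate network (hypothetical toy)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 4.0, size=(1, hidden)),
        "b1": rng.normal(0.0, 1.0, size=hidden),
        "W2": rng.normal(0.0, 0.1, size=(hidden, 1)),
        "b2": np.zeros(1),
    }

def inr_forward(params, coords):
    """Map coordinates of shape (n, 1) to predicted field values of shape (n, 1)."""
    h = np.tanh(coords @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"], h

def mse(params, coords, values):
    """Mean squared error between the network output and the sampled field."""
    pred, _ = inr_forward(params, coords)
    return float(np.mean((pred - values) ** 2))

def fit_inr(params, coords, values, steps=2000, lr=0.01):
    """Fit the network to the samples with plain full-batch gradient descent."""
    n = coords.shape[0]
    for _ in range(steps):
        pred, h = inr_forward(params, coords)
        grad_pred = 2.0 * (pred - values) / n      # dLoss/dpred
        grad_W2 = h.T @ grad_pred
        grad_b2 = grad_pred.sum(axis=0)
        grad_h = grad_pred @ params["W2"].T
        grad_z = grad_h * (1.0 - h ** 2)           # derivative of tanh
        grad_W1 = coords.T @ grad_z
        grad_b1 = grad_z.sum(axis=0)
        params["W1"] -= lr * grad_W1
        params["b1"] -= lr * grad_b1
        params["W2"] -= lr * grad_W2
        params["b2"] -= lr * grad_b2
    return params

# Toy "scientific dataset": 200 samples of a 1D field on [0, 1] (an assumption).
coords = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
values = np.sin(2.0 * np.pi * coords)

params = init_inr()
loss_before = mse(params, coords, values)
params = fit_inr(params, coords, values)
loss_after = mse(params, coords, values)

# The trained network can be queried at any coordinate, not only the sampled ones.
query, _ = inr_forward(params, np.array([[0.123]]))
n_params = sum(p.size for p in params.values())
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}, parameters: {n_params}")
```

In this toy, the network's weights stand in for the sampled values, and the field can be evaluated at coordinates between the original samples. Because the data now lives inside network weights rather than on a grid, algorithms that expect gridded samples do not apply directly, which is the gap the project description points to.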
Hackathon Puts Machine Learning in the Driver’s Seat
After 10 years, the “try something new” spirit of LLNL’s seasonal hackathon is alive and well. The fall 2022 event featured an Amazon Web Services (AWS) DeepRacer machine learning competition, in which participants used a cloud-based racing simulator to train an autonomous race car with reinforcement learning algorithms. Sponsored by LLNL’s Office of the Chief Information Officer and the Computing Directorate, the hackathon provided a unique opportunity to combine cloud, data science, and computing technologies.
Working in teams or individually, drivers trained their cars for a time trial—the fastest car wins—and submitted their models ahead of race day. The AWS team set up the physical track in the parking lot of the Livermore Valley Open Campus, where drivers took turns running their models with the DeepRacer car. Data scientist Mary Silva recalled, “We were cheering for each other like at a sporting event. Everyone was kind of holding their breath to see who would win.” The world record run for the event’s specific track layout is 7 seconds. The Lab’s winning time was 8.873 seconds.
Virtual Seminar Explores Data Dimensionality
In the DSI’s November virtual seminar, Alexander Cloninger of UC San Diego presented “Networks that Adapt to Intrinsic Dimensionality Beyond the Domain.” His talk focused on central questions in deep learning: the minimum size of the network needed to approximate a certain class of functions, and how the dimensionality of the data affects the number of points needed to learn such a network. He discussed his work in the context of two-sample testing, manifold autoencoders, and data generation.
Cloninger is an associate professor in the Department of Mathematical Sciences and the Halıcıoğlu Data Science Institute at UC San Diego. He received his PhD in Applied Mathematics and Scientific Computation from the University of Maryland. He researches problems in the area of geometric data analysis and applied harmonic analysis. A recording of the seminar will be posted to the YouTube playlist. The next seminar, scheduled for December 1, will be the DSI’s first in a hybrid format.
Student Internship Application Deadline
The Data Science Summer Institute (DSSI) application window is now open through January 31. The 2023 program will run for 12 weeks and is open to both undergraduate and graduate students. Visit the DSSI website for information about how to apply, including a list of FAQs—or share this link with students who may be interested in an internship.
Class of 2022 intern Jonathan Anzules said of the program, “New technological advances and the cheapening of data acquisition have vastly expanded what is possible in bioinformatics. Things like predicting protein folding and interactions, which I previously believed impossible, are not anymore. My experience at LLNL has changed what I think is possible.”