Volume 4

Oct. 26, 2020

Our mission at the Data Science Institute (DSI) is to enable excellence in data science research and applications across LLNL. Our newsletter is a compendium of breaking news, the latest research, outreach efforts, and more. Past volumes of our newsletter are available online.

Spotlight: Special Recognition for Researchers

Since Jayaraman Thiagarajan joined LLNL as a postdoc in 2013, his research has grown to include multiple related fields, ranging from deep learning–based graph analysis to machine learning (ML) and artificial intelligence (AI) solutions for computer vision, healthcare, language modeling, and scientific applications. Thiagarajan recently received an LLNL Director’s Early Career Recognition Award for his authoritative work and key contributions. He earned a PhD in Electrical Engineering from Arizona State University.

Peer-Timo Bremer has accepted the role of LLNL’s Point of Contact for AI/ML to the DOE Office of Science Advanced Scientific Computing Research (ASCR) Program. Since joining the Lab in 2006, Bremer has worked on a variety of projects related to large-scale data analysis and visualization, and he leads several ML-related research efforts. He has been heavily involved in ASCR projects ranging from base program research to large-scale SciDAC collaborations, and he most recently co-organized the ASCR workshop on Scientific Machine Learning. Bremer holds a PhD in Computer Science from the University of California, Davis.


Spotlight: Papers Accepted at Top ML Conference

LLNL researchers have had two papers accepted to the prestigious NeurIPS (Neural Information Processing Systems) conference, which promotes the exchange of research on such systems in their biological, technological, mathematical, and theoretical aspects. The NeurIPS paper submission process is extremely competitive.

  • A statistical mechanics framework for task-agnostic sample design in machine learning – Bhavya Kailkhura, Jayaraman Thiagarajan, Qunwei Li, Jize Zhang, Peer-Timo Bremer, and a colleague from the University of Utah. The team’s framework aims to understand the effect of sampling properties of training data on the generalization gap of ML algorithms.
  • Provable, scalable and automatic perturbation analysis on general computational graphs – Bhavya Kailkhura with colleagues from Northeastern University, Tsinghua University, and UCLA. The team has developed a flexible, automatic framework to enable perturbation analysis on any neural network structure.

Spotlight: Predictive Science Academic Alliance Program

LLNL will provide significant computing resources to students and faculty from nine universities that were newly selected for participation in the NNSA’s Predictive Science Academic Alliance Program (PSAAP). In this third phase of the program, the focus will be to develop and validate large-scale physics simulation codes for exascale systems.

PSAAP is funded by NNSA’s Office of Advanced Simulation and Computing (ASC) program. ASC’s acting program director Chris Clouse notes, “The promise of greatly enhanced productivity through the use of artificial intelligence, enabled with heterogeneous architectures, is something we see both our staff and PSAAP students to be highly energized over.”

The PSAAP selections required universities to exhibit excellence in advancing predictive science with applications to exascale computing, verification and validation, and uncertainty quantification. “Students will be exposed to working environments similar to what they would experience at a national laboratory, collaborating with interdisciplinary teams on high-performance computing modeling and simulation,” says Ana Kupresanin, who is the PSAAP representative from LLNL and a member of the DSI Council.


Virtual Seminar

In the DSI’s September seminar, Dr. Shirley Ho from NYU presented “Discovering Symbolic Models in Physical Systems Using Deep Learning.” She described a general approach to distilling symbolic representations from a learned deep model by introducing strong inductive biases. Ho’s team first trains graph neural networks in a supervised setting, then applies symbolic regression to the learned model to extract explicit physical relations.
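
As a rough illustration of the symbolic regression step, the Python sketch below searches a small, hand-picked list of candidate formulas for the one that best matches a black-box model's outputs. Everything in it is an assumption made for illustration: `learned_message` is a hypothetical stand-in for a trained network (no graph neural network is actually trained here), the candidate list and variable names are invented, and the brute-force search is far simpler than the symbolic regression tools used in practice. It is not the code from Ho's team.

```python
import random


def learned_message(m1, m2, r):
    """Hypothetical stand-in for a trained network's pairwise function.

    In a real workflow this would be a learned black box; here it is a
    fixed formula so the example runs without any ML library.
    """
    return 2.0 * m1 * m2 / r**2


# Hand-picked candidate expressions (an assumption, not an exhaustive search).
CANDIDATES = {
    "m1 * m2 / r": lambda m1, m2, r: m1 * m2 / r,
    "m1 * m2 / r**2": lambda m1, m2, r: m1 * m2 / r**2,
    "(m1 + m2) / r**2": lambda m1, m2, r: (m1 + m2) / r**2,
    "m1 * m2 * r": lambda m1, m2, r: m1 * m2 * r,
}


def fit_scale(features, targets):
    """Closed-form least-squares scale c minimizing sum((c*f - y)^2)."""
    num = sum(f * y for f, y in zip(features, targets))
    den = sum(f * f for f in features)
    return num / den


def symbolic_regression(model, n_samples=200, seed=0):
    """Return (expression, scale, mse) for the best-fitting candidate."""
    rng = random.Random(seed)
    # Probe the black-box model at random positive inputs.
    inputs = [
        (rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0))
        for _ in range(n_samples)
    ]
    targets = [model(*x) for x in inputs]

    best = None
    for name, expr in CANDIDATES.items():
        features = [expr(*x) for x in inputs]
        scale = fit_scale(features, targets)
        mse = sum(
            (scale * f - y) ** 2 for f, y in zip(features, targets)
        ) / n_samples
        if best is None or mse < best[2]:
            best = (name, scale, mse)
    return best


name, scale, mse = symbolic_regression(learned_message)
print(f"best fit: {scale:.3f} * ({name}), mse={mse:.2e}")
```

Because the stand-in model is itself a scaled inverse-square law, the search should select that candidate. With a real trained network the fit would be approximate and the space of candidate expressions much larger.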


Histogram of ages of women at time of positive human papillomavirus test and age at screening

Recent Research


DSSI Leadership Transition

The Data Science Summer Institute (DSSI) completed a successful virtual session with 28 students. As the application window opens in November for the class of 2021, the DSSI is pleased to announce its new co-director, Nisha Mulakken. Her work includes enhancing the Lawrence Livermore Microbial Detection Array (LLMDA) system with detection capability for all variants of SARS-CoV-2 and using ML to trace unethical use of CRISPR technology to the source lab. Mulakken will replace Marisol Gamboa, who is transitioning into other roles in LLNL’s Computing Workforce Program and Global Security Computing Applications Division. We are grateful to Gamboa for channeling her passion for outreach and education into the DSSI since its inception.


LLNL Computing Virtual Expo

The DSI joined LLNL’s first Computing Virtual Expo on Sept. 30. The event was open to all employees and the public, with handouts and videos available for 30 days afterward (registration is open through Oct. 30). The DSI’s booth featured a summary of the DSI’s five-year strategic plan and a video detailing data science contributions to LLNL’s healthcare advances. DSI director Michael Goldman also recorded a “lightning talk” about the DSI and DSSI. The booth was visited more than 130 times in the first 24 hours, and the live chatroom was active with many queries about projects and job opportunities.