Data Science Spotlight Archive

2019

Ghaleb Abdulla

Computer Scientist

Since joining LLNL in 2000, Ghaleb Abdulla has embraced projects that depend on teamwork and data sharing. His tenure includes establishing partnerships with universities seeking LLNL’s expertise in HPC and large-scale data analysis. He supported approximate queries over large-scale simulation datasets for the AQSim project and helped design a multi-petabyte database for the Large Synoptic Survey Telescope. Abdulla used machine learning (ML) to inspect and predict optics damage at the National Ignition Facility, and leveraged data management and analytics to enhance HPC energy efficiency. Recently, he led a Cancer Registry of Norway project developing personalized prevention and treatment strategies through pattern recognition, ML, and time-series statistical analysis of cervical cancer screening data. Today, Abdulla is co-PI of the Earth System Grid Federation—an international collaboration that manages a global climate database for 25,000 users on 6 continents. “The ability to move between different science domains and work on diverse data science challenges makes LLNL a great place to pursue a career in data science,” he says. Abdulla holds a PhD in computer science from Virginia Tech.

CASC ML team

Computer Scientists

As machine learning (ML) research heats up at LLNL, a team of computer scientists from the Center for Applied Scientific Computing (CASC) is leading the way. Pictured here are Harsh Bhatia, Shusen Liu, Bhavya Kailkhura, Peer-Timo Bremer (also a member of the DSI Council), Jayaraman Thiagarajan, Rushil Anirudh, and Hyojin Kim. Their research was recently featured in LLNL’s magazine, Science & Technology Review. As the cover story, “Machine Learning on a Mission,” explains, ML has important implications for scientific data analysis and for the Lab’s national security missions. This CASC team takes a bidirectional approach to ML, both advancing underlying theory and solving real-world problems—an effort that includes scaling algorithms for supercomputers and developing ways to analyze different types and varying volumes of data. Bremer states, “Commercial companies don’t solve scientific problems, just as national labs don’t optimize selections of movie reviews. So we build on commercial tools to create the techniques we need to analyze data from experiments, simulations, and other sources.”

Laura Kegelmeyer

Scientist/Engineer

Laura Kegelmeyer embraces her role as a problem solver. Since arriving at LLNL in 1988, she has brought her expertise to bear on image processing and analysis—first in biomedical applications, such as DNA mapping and breast cancer detection, and now at the National Ignition Facility (NIF), home of the world’s most energetic laser. Her Optics Inspection team combines large-scale database integration with custom machine learning algorithms and other data science techniques to analyze images captured throughout NIF’s 192 beamlines. This inspection process informs an automated “recycle loop” that extends optic lifetimes. Based on this work and previous involvement with Women in Data Science (WiDS) events, Kegelmeyer was invited to speak at the 2019 WiDS conference. “It’s an amazing opportunity to present an example of applying machine learning to ‘big science.’ NIF’s exploration of physical phenomena under extreme conditions has far-reaching impact across the globe and for future generations,” she says. “I hope to inspire data scientists to use their skills to address challenges in exciting scientific areas.” Kegelmeyer holds degrees in Biomedical Engineering and Electrical Engineering from Boston University.

Brenden Petersen

Research Scientist

Brenden Petersen isn’t content merely applying advanced data science methods to real-world problems. He’d rather tackle challenges where, he says, “the state-of-the-art doesn’t cut it.” Since joining LLNL’s Computational Engineering Division in 2016, he has pursued deep reinforcement learning (RL) solutions for many fields, including cybersecurity, energy, and healthcare (see DSI workshop slides [PDF]). Whereas deep learning traditionally addresses prediction problems, RL solves control problems. He explains, “RL provides a framework for learning how to behave in a task-completion scenario. Working in the field feels very goal-oriented, even competitive. Each application is a new personal challenge.” Petersen recently launched an RL reading group to help other LLNL staff get started in the field. “At the first meeting, I recognized only about 20% of the attendees, which was awesome! A major goal of the group, and DSI as a whole, is to connect researchers across the Lab,” he states. Petersen earned his biomedical engineering PhD through a joint program at UC Berkeley and UC San Francisco.

2018

Bhavya Kailkhura

Computer Scientist

Kailkhura thrives on solving challenging problems in data science, focusing on improving the reliability and the safety of machine learning systems. “Reliability and safety in AI should not be an option but a design principle,” he states. “The better we can address these challenges, the more successful we will be in developing useful, relevant, and important ML systems.” Kailkhura also pursues mathematical solutions to open optimization problems, including a novel sphere-packing theory. He is building provably safe, explainable deep neural networks to enable reliable learning in applications for materials science, autonomous drones, and inertial confinement fusion. Thanks to his efforts with gradient-free algorithms and experiment designs, LLNL is the only national lab with research accepted at two high-profile venues—NIPS and JMLR—in 2018. Prior to joining LLNL’s Center for Applied Scientific Computing, Kailkhura attended Syracuse University where his PhD dissertation won an all-university prize. Recently, he co-authored the book Secure Networked Inference with Unreliable Data Sources.

Marisa Torres

Senior Bioinformatics Software Developer

Since joining LLNL in 2002, Torres has combined her love of biology with coding. She serves as lead bioinformatics software developer on biosecurity projects supporting the Global Security Program. Her team is building the Gene Surprise Toolkit, which determines biothreat severity and detects potential genetic engineering of pathogens. In addition, Torres contributes to the Accelerating Therapeutics for Opportunities in Medicine consortium. The project aims to accelerate the drug discovery pipeline by building predictive, data-driven pharmaceutical models. In March 2018, Torres organized a regional symposium in conjunction with Stanford University’s Women in Data Science conference. She also encourages local middle school students to explore computer science through the Girls Who Code program and mentors student interns for LLNL’s Data Science Summer Institute (DSSI). “I’m interested in collaborating across domains with similar data analysis needs,” says Torres. “I look forward to strengthening networking and educational opportunities through DSI, especially for the DSSI.”

T. Nathan Mundhenk

Computer Scientist

Mundhenk enjoys “nerding around” in LLNL’s Computational Engineering Division, especially when it comes to research aimed at improving people’s lives. With a PhD in computer science from the University of Southern California, he works on projects that use LLNL’s powerful computing capabilities to advance neural network technologies. Mundhenk recently co-authored a paper, “Improvements to Context Based Self-Supervised Learning,” which was accepted to the 2018 Computer Vision and Pattern Recognition conference. His team is developing a state-of-the-art technique for refining unsupervised deep learning. In their method of self-supervision, a deep neural network can be pre-trained on a large generic dataset before training on a small labeled dataset, resulting in better accuracy (e.g., of image recognition) in the latter. “The entire field of artificial intelligence is bursting with new innovation,” says Mundhenk. “It’s challenging to keep up with the extraordinary pace of research, but also very exciting to be part of it.”

Rushil Anirudh

Research Scientist

With a PhD in computer vision and machine learning, Anirudh joined LLNL’s Center for Applied Scientific Computing in 2016. He enjoys the challenges of an exponentially growing field, noting, “Something on a whiteboard today is likely to end up being used by someone within a few months.” Anirudh develops convolutional neural networks that can complete computed tomography (CT) images when the scanned object is only partially visible. His team’s paper, “Lose the Views: Limited Angle CT Reconstruction via Implicit Sinogram Completion,” is one of only 7% selected for a spotlight presentation at the 2018 Computer Vision and Pattern Recognition conference. Anirudh’s related work with generative adversarial networks was recently featured in NVIDIA’s developer blog. “I am very glad the Lab has the DSI,” says Anirudh. “A central institute that brings together everyone working on similar ideas is a great step toward becoming a leader in artificial intelligence and machine learning.”

Jose Cadena Pico

Postdoctoral Researcher

Cadena Pico enjoys the discovery process when analyzing new data sets, despite the difficulties in preparing data before building machine learning models. “Often a data set is incomplete or contains errors from different sources. Sometimes its size makes it difficult to extract knowledge,” he says. “Solving these challenges and knowing that I’m helping other researchers advance their work is very gratifying.” Once a PhD student at Virginia Tech, Cadena Pico now contributes to LLNL’s brain-on-a-chip project by studying complex networks among brain cells. He also investigates ways to detect anomalous activity in networks, and his recent work—developing a method for finding clusters of under-vaccinated populations to inform public health resources—was presented at the 24th KDD Conference. Formerly a three-time LLNL summer intern, Cadena Pico values ongoing education: “I like to keep learning about different research domains while developing a data science skill set applicable to many problems of global importance.”

Kassie Fronczyk

Applied Statistics Group Leader

Fronczyk is a “total nerd” whose multifaceted job makes her an ideal panelist for the Women in Statistics and Data Science conference, where she recently discussed research opportunities at national labs. Fronczyk leads LLNL’s Applied Statistics Group while providing statistical analysis and uncertainty quantification for several projects, including a warhead life-extension program and the U.S. Nuclear Detection System. “I love learning new things and tackling interesting problems,” states Fronczyk. “Standard approaches rarely work on real-world data, so finding the right tool for the job often means exploring new methods and combining or modifying others.” She brings this creative mentality to on- and offsite collaborations, such as with the Innovations and Partnerships Office and the Institute of Makers of Explosives Science Panel. She also sits on LLNL’s Engineering Science & Technology Council, manages two seminar series (including DSI’s), and co-organized DSI’s inaugural workshop. Fronczyk holds a PhD in statistics and stochastic modeling from UC Santa Cruz.

DSSI class of 2018

Aspiring Data Scientists

The DSSI class of 2018, 26 students in all, was selected from a highly competitive applicant pool of more than a thousand. While at LLNL, the students participated in Grand Challenge team exercises and displayed their research posters at the DSI’s summer workshop. These bright students are among the next generation of promising data scientists, and we look forward to seeing their careers develop.