Publications

Jacobs, S. A., Moon, T., McLoughlin, K., et al. (2021). “Enabling Rapid COVID-19 Small Molecule Drug Design through Scalable Deep Learning of Generative Models.” The International Journal of High Performance Computing Applications. []

We improved the quality of machine-learned models for use in small molecule antiviral design and reduced the time needed to produce them. Our globally asynchronous, multi-level parallel training approach strong scales to all of Sierra with up to 97.7% efficiency. We trained a novel, character-based Wasserstein autoencoder that produces a higher quality model trained on 1.613 billion compounds in 23 minutes, whereas the previous state of the art takes a day to train on 1 million compounds. Reducing training time from a day to minutes shifts the model-creation bottleneck from computer job turnaround time to human innovation time. Our implementation achieves 318 PFLOPS, 17.1% of half-precision peak. We will incorporate this model into our molecular design loop, enabling the generation of more diverse compounds; this improves the search for novel candidate antiviral drugs and reduces the time to synthesize compounds for testing in the lab.
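
To make the modeling idea concrete, the following is a minimal sketch, in PyTorch, of a character-based Wasserstein autoencoder over SMILES strings, using an MMD penalty to pull the encoded latents toward a Gaussian prior. The vocabulary, network sizes, and toy molecules are illustrative assumptions; the paper's actual model and its asynchronous multi-level parallel training on Sierra are not reproduced here.

```python
# Minimal sketch (not the paper's implementation) of a character-based
# Wasserstein autoencoder over SMILES strings, with an MMD penalty pulling
# the latent distribution toward a Gaussian prior. The vocabulary, sizes,
# and toy molecules below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = sorted(set("CNOFScnos()[]=#@+-H123456789 "))   # toy SMILES character set
stoi = {ch: i + 1 for i, ch in enumerate(VOCAB)}       # index 0 is reserved for padding

def encode_smiles(s, max_len=64):
    ids = [stoi.get(ch, 0) for ch in s[:max_len]]
    return ids + [0] * (max_len - len(ids))

class CharWAE(nn.Module):
    def __init__(self, vocab_size, emb=64, hidden=256, latent=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.to_latent = nn.Linear(hidden, latent)
        self.from_latent = nn.Linear(latent, hidden)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, vocab_size)

    def forward(self, x):
        _, h = self.encoder(self.embed(x))
        z = self.to_latent(h[-1])                      # one latent code per molecule
        h0 = self.from_latent(z).unsqueeze(0)
        dec_in = self.embed(F.pad(x[:, :-1], (1, 0)))  # teacher forcing, shifted right
        dec, _ = self.decoder(dec_in, h0)
        return self.readout(dec), z

def mmd_penalty(z, prior, sigma=1.0):
    """RBF-kernel MMD between encoded latents and samples from the prior."""
    def k(a, b):
        return torch.exp(-(a.unsqueeze(1) - b.unsqueeze(0)).pow(2).sum(-1) / (2 * sigma**2))
    return k(z, z).mean() + k(prior, prior).mean() - 2 * k(z, prior).mean()

model = CharWAE(vocab_size=len(VOCAB) + 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.tensor([encode_smiles(s) for s in ["CCO", "c1ccccc1", "CC(=O)O"]])

logits, z = model(batch)                               # one (tiny) training step
recon = F.cross_entropy(logits.transpose(1, 2), batch, ignore_index=0)
loss = recon + 10.0 * mmd_penalty(z, torch.randn_like(z))
loss.backward()
opt.step()
```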

Zhang, J., Kailkhura, B., and Han, T. Y.-J. (2021). “Leveraging Uncertainty from Deep Learning for Trustworthy Material Discovery Workflows.” ACS Omega. []

In this paper, we leverage the predictive uncertainty of deep neural networks to answer challenging questions that materials scientists usually encounter in machine learning-based materials application workflows. First, we show that by leveraging predictive uncertainty, a user can determine the training data set size required to achieve a certain classification accuracy. Next, we propose uncertainty-guided decision referral to detect and refrain from making decisions on confusing samples. Finally, we show that predictive uncertainty can also be used to detect out-of-distribution test samples. We find that this scheme is accurate enough to detect a wide range of real-world shifts in data, e.g., changes in the image acquisition conditions or changes in the synthesis conditions. Using microstructure information from scanning electron microscope (SEM) images as an example use case, we show that leveraging uncertainty-aware deep learning can significantly improve the performance and dependability of classification models.
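
As a concrete illustration of uncertainty-guided decision referral, the sketch below (a reader's approximation, not the authors' code) averages the softmax outputs of a small deep ensemble, scores each sample by predictive entropy, and abstains on the most uncertain fraction. The model sizes, referral quantile, and synthetic inputs are assumptions.

```python
# Illustrative sketch of uncertainty-guided decision referral with a deep
# ensemble: average the ensemble's softmax outputs, use predictive entropy as
# the uncertainty score, and abstain on the most uncertain samples.
import torch
import torch.nn as nn

def make_classifier(in_dim=32, n_classes=4):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))

ensemble = [make_classifier() for _ in range(5)]   # independently initialized members
x = torch.randn(128, 32)                           # stand-in for SEM image features

with torch.no_grad():
    probs = torch.stack([m(x).softmax(dim=-1) for m in ensemble]).mean(dim=0)

entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)   # predictive uncertainty
threshold = entropy.quantile(0.8)                                # refer the top 20%
refer = entropy > threshold                                      # send these to a human
predictions = probs.argmax(dim=-1)

print(f"classified {int((~refer).sum())} samples, referred {int(refer.sum())}")
```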

Nguyen, P., Loveland, D., Kim, J. T., et al. (2021). “Predicting Energetics Materials’ Crystalline Density from Chemical Structure by Machine Learning.” Journal of Chemical Information and Modeling. []

To expedite the development of new molecular compounds, a long-sought goal within the chemistry community has been to predict molecules’ bulk properties of interest a priori to synthesis, from chemical structure alone. In this work, we demonstrate that machine learning methods can indeed be used to directly learn the relationship between chemical structures and bulk crystalline properties of molecules, even in the absence of any crystal structure information or quantum mechanical calculations. We focus specifically on predicting the crystalline density of a class of organic compounds categorized as energetic materials, namely high explosives (HE). An ongoing challenge within the chemistry machine learning community is how best to featurize molecules as inputs to machine learning models: do expert handcrafted features or molecular representations learned via graph-based neural network models yield better results, and why? We evaluate both types of representations in combination with a number of machine learning models to predict the crystalline densities of HE-like molecules curated from the Cambridge Structural Database, and we report the performance and the pros and cons of our methods. Our message passing neural network (MPNN) based models with learned molecular representations generally perform best, outperforming current state-of-the-art methods at predicting crystalline density and performing well even when tested on a data set not representative of the training data. However, these models are traditionally considered black boxes and are less easily interpretable. To address this common challenge, we also provide a comparative analysis between our MPNN-based model and models with fixed feature representations that offers insights into which features the MPNN learns in order to accurately predict density.
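
The two featurization routes compared in the paper can be sketched as follows: a fixed descriptor vector fed to a plain regressor versus a small message passing network that learns a representation directly from node features and an adjacency matrix. This is a simplified, hypothetical illustration; the dimensions, the random example graph, and the pooling choice are assumptions, not the authors' MPNN.

```python
# Sketch of (a) handcrafted-descriptor regression and (b) a simple message
# passing network over a molecular graph, each predicting a scalar density.
import torch
import torch.nn as nn

class SimpleMPNN(nn.Module):
    def __init__(self, node_dim=16, hidden=64, steps=3):
        super().__init__()
        self.embed = nn.Linear(node_dim, hidden)
        self.message = nn.Linear(hidden, hidden)
        self.update = nn.GRUCell(hidden, hidden)
        self.readout = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 1))
        self.steps = steps

    def forward(self, nodes, adj):
        # nodes: (n_atoms, node_dim); adj: (n_atoms, n_atoms) 0/1 bond matrix
        h = torch.relu(self.embed(nodes))
        for _ in range(self.steps):
            msgs = adj @ self.message(h)       # sum messages from bonded neighbors
            h = self.update(msgs, h)           # GRU-style node state update
        return self.readout(h.mean(dim=0))     # mean-pool atoms -> predicted density

# (a) fixed handcrafted descriptors -> plain MLP regressor
descriptor_model = nn.Sequential(nn.Linear(200, 64), nn.ReLU(), nn.Linear(64, 1))

# (b) learned representation from the graph itself
mpnn = SimpleMPNN()
atoms = torch.randn(9, 16)                     # e.g., 9 atoms with 16 features each
bonds = (torch.rand(9, 9) > 0.7).float()
bonds = ((bonds + bonds.T) > 0).float()        # symmetrize the toy adjacency matrix
print(mpnn(atoms, bonds))                      # scalar crystalline-density estimate
```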

Hatfield, P. W., Gaffney, J. A., Anderson, G. J., et al. (2021). “The Data-Driven Future of High-Energy-Density Physics.” Nature. []

High-energy-density physics is the field of physics concerned with studying matter at extremely high temperatures and densities. Such conditions produce highly nonlinear plasmas, in which several phenomena that can normally be treated independently of one another become strongly coupled. The study of these plasmas is important for our understanding of astrophysics, nuclear fusion and fundamental physics; however, the nonlinearities and strong couplings present in these extreme physical systems make them very difficult to understand theoretically or to optimize experimentally. Here we argue that machine learning models and data-driven methods are in the process of reshaping our exploration of these extreme systems that have hitherto proved far too nonlinear for human researchers. From a fundamental perspective, our understanding can be improved by the way in which machine learning models can rapidly discover complex interactions in large datasets. From a practical point of view, the newest generation of extreme physics facilities can perform experiments multiple times a second (as opposed to approximately daily), thus moving away from human-based control towards automatic control based on real-time interpretation of diagnostic data and updates of the physics model. To make the most of these emerging opportunities, we offer proposals for the community in terms of research design, training, best practice and support for synthetic diagnostics and data analysis.

Djordjević, B. Z., Kemp, A. J., Kim, J., et al. (2021). “Modeling Laser-Driven Ion Acceleration with Deep Learning.” Physics of Plasmas. []

Developments in machine learning promise to ameliorate some of the challenges of modeling complex physical systems through neural-network-based surrogate models. High-intensity, short-pulse lasers can be used to accelerate ions to mega-electronvolt energies, but modeling such interactions requires computationally expensive techniques such as particle-in-cell simulations. Multilayer neural networks allow one to take a relatively sparse ensemble of simulations and generate a surrogate model that can be used to rapidly search the parameter space of interest. In this work, we created an ensemble of over 1,000 simulations modeling laser-driven ion acceleration and developed a surrogate to study the resulting parameter space. A neural-network-based approach allows for rapid feature discovery that is not possible with traditional parameter scans, given their computational cost. A notable observation made during this study was the dependence of ion energy on the pre-plasma gradient length scale. While this methodology holds great promise for ion acceleration, it has ready application to any topic in which large-scale parameter scans are restricted by significant computational cost or by relatively large, but sparse, domains.
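
The surrogate-modeling workflow described above can be illustrated with a short sketch: fit a small fully connected network to (input parameters, output) pairs from an ensemble of runs, then sweep the cheap surrogate over a dense parameter grid. The toy "simulation" function, parameter names, and network sizes below are placeholders, not the particle-in-cell setup used in the paper.

```python
# Sketch of a neural-network surrogate for an expensive simulation ensemble:
# train on sparse (parameters, output) pairs, then scan a dense grid cheaply.
import torch
import torch.nn as nn

def toy_simulation(params):                       # placeholder for a PIC run
    intensity, thickness, scale_len = params.unbind(dim=-1)
    return (intensity * (1 + 0.5 * scale_len) / (1 + thickness)).unsqueeze(-1)

inputs = torch.rand(1000, 3)                      # ~1,000 "simulations"
targets = toy_simulation(inputs)

surrogate = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

for _ in range(200):                              # fit the surrogate to the ensemble
    opt.zero_grad()
    loss = nn.functional.mse_loss(surrogate(inputs), targets)
    loss.backward()
    opt.step()

# rapid parameter scan: evaluate a dense grid that would be infeasible to simulate
grid = torch.stack(torch.meshgrid(torch.linspace(0, 1, 50),
                                  torch.linspace(0, 1, 50),
                                  torch.linspace(0, 1, 50), indexing="ij"), dim=-1)
with torch.no_grad():
    energies = surrogate(grid.reshape(-1, 3))
print("best predicted ion energy (arbitrary units):", energies.max().item())
```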

Anirudh, R., Lohit, S., and Turaga, P. (2021). “Generative Patch Priors for Practical Compressive Image Recovery.” 2021 IEEE Winter Conference on Applications of Computer Vision. []

In this paper, we propose the generative patch prior (GPP), which defines a generative prior for compressive image recovery based on patch-manifold models. Unlike learned, image-level priors that are restricted to the range space of a pre-trained generator, GPP can recover a wide variety of natural images using a pre-trained patch generator. Additionally, GPP retains the benefits of generative priors, such as high reconstruction quality at extremely low sensing rates, while also being much more generally applicable. We show that GPP outperforms several unsupervised and supervised techniques on three different sensing models: linear compressive sensing with known and with unknown calibration settings, and the non-linear phase retrieval problem. Finally, we propose an alternating optimization strategy using GPP for joint calibration-and-reconstruction, which performs favorably against several baselines on a real-world, uncalibrated compressive sensing dataset.
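
A toy version of recovery with a patch-level generative prior is sketched below: a stand-in patch generator maps per-patch latent codes to 8x8 tiles, and the latents are optimized so that the assembled image agrees with linear measurements y = Ax. The generator here is untrained and all sizes are assumptions; GPP itself uses a pre-trained patch generator and also handles calibration, which this sketch omits.

```python
# Toy sketch of compressive recovery with a patch-level generative prior.
import torch
import torch.nn as nn

patch, grid, latent_dim = 8, 4, 20                 # 4x4 grid of 8x8 patches = 32x32 image
G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                  nn.Linear(128, patch * patch), nn.Tanh())   # stand-in patch generator

def assemble(z):                                   # latents (grid*grid, latent_dim) -> image
    patches = G(z).view(grid, grid, patch, patch)
    return patches.permute(0, 2, 1, 3).reshape(grid * patch, grid * patch)

n, m = (grid * patch) ** 2, 300                    # 1024 pixels, 300 measurements
A = torch.randn(m, n) / m ** 0.5                   # compressive sensing matrix
x_true = torch.rand(grid * patch, grid * patch)
y = A @ x_true.reshape(-1)                         # observed measurements

z = torch.randn(grid * grid, latent_dim, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)
for _ in range(500):                               # fit the latents to the measurements
    opt.zero_grad()
    loss = ((A @ assemble(z).reshape(-1) - y) ** 2).mean()
    loss.backward()
    opt.step()

x_hat = assemble(z).detach()
print("recovery MSE:", ((x_hat - x_true) ** 2).mean().item())
```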

Shanthamallu, U. S., Thiagarajan, J. J., and Spanias, A. (2021). “Uncertainty-Matching Graph Neural Networks to Defend Against Poisoning Attacks.” 35th AAAI Conference on Artificial Intelligence. []

Graph Neural Networks (GNNs), a generalization of neural networks to graph-structured data, are often implemented using message passing between the entities of a graph. While GNNs are effective for node classification, link prediction and graph classification, they are vulnerable to adversarial attacks, i.e., a small perturbation to the structure can lead to a non-trivial performance degradation. In this work, we propose Uncertainty Matching GNN (UM-GNN), which aims to improve the robustness of GNN models, particularly against poisoning attacks on the graph structure, by leveraging epistemic uncertainties from the message passing framework. More specifically, we propose to build a surrogate predictor that does not directly access the graph structure, but systematically extracts reliable knowledge from a standard GNN through a novel uncertainty-matching strategy. Interestingly, this uncoupling makes UM-GNN immune to evasion attacks by design and achieves significantly improved robustness against poisoning attacks. Using empirical studies with standard benchmarks and a suite of global and targeted attacks, we demonstrate the effectiveness of UM-GNN compared to existing baselines, including the state-of-the-art robust GCN.
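
The uncertainty-matching idea can be caricatured as follows: a dropout-equipped GCN (the teacher, with graph access) provides predictions and Monte Carlo uncertainty estimates, and a structure-free surrogate MLP is trained to match the teacher's predictions with a weight that decays where the teacher is uncertain. This mirrors the spirit of UM-GNN rather than its exact objectives; the random graph, network sizes, and weighting scheme are assumptions.

```python
# Condensed sketch: uncertainty-weighted distillation from a GCN teacher into a
# structure-free surrogate MLP (the GCN would be trained first in practice).
import torch
import torch.nn as nn
import torch.nn.functional as F

n, d, c = 200, 32, 5
feats = torch.randn(n, d)
adj = (torch.rand(n, n) < 0.02).float()
adj = ((adj + adj.T + torch.eye(n)) > 0).float()
deg_inv_sqrt = adj.sum(1).pow(-0.5)
adj_norm = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]   # symmetric normalization

class GCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.w1, self.w2 = nn.Linear(d, 64), nn.Linear(64, c)
    def forward(self, x, a):
        h = F.dropout(torch.relu(a @ self.w1(x)), 0.5, self.training)
        return a @ self.w2(h)

gnn, surrogate = GCN(), nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, c))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)

gnn.train()                                        # keep dropout active for MC sampling
with torch.no_grad():
    mc = torch.stack([gnn(feats, adj_norm).softmax(-1) for _ in range(20)])
teacher_probs = mc.mean(0)
uncertainty = -(teacher_probs * teacher_probs.clamp_min(1e-12).log()).sum(-1)
weight = torch.exp(-uncertainty)                   # trust confident teacher nodes more

for _ in range(100):                               # structure-free surrogate training
    opt.zero_grad()
    log_q = surrogate(feats).log_softmax(-1)
    kl = F.kl_div(log_q, teacher_probs, reduction="none").sum(-1)
    loss = (weight * kl).mean()
    loss.backward()
    opt.step()
```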

Gokhale, T., Anirudh, R., Kailkhura, B., Thiagarajan, J. J., Baral, C., and Yang, Y. (2021). “Attribute-Guided Adversarial Training for Robustness to Natural Perturbations.” 35th AAAI Conference on Artificial Intelligence. []

While existing work in robust deep learning has focused on small pixel-level ℓp norm-based perturbations, this may not account for perturbations encountered in several real-world settings. In many such cases, although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. but deviates from the training domain. While this deviation may not be exactly known, its broad characterization is specified a priori, in terms of attributes. We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attribute space, without having access to data from the test domain. Our adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial perturbations and the outer minimization finding model parameters by optimizing the loss on the adversarial perturbations generated from the inner maximization. We demonstrate the applicability of our approach on three types of naturally occurring perturbations: object-related shifts, geometric transformations, and common image corruptions. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations. We demonstrate the usefulness of the proposed approach by showing the robustness gains of deep neural networks trained using our adversarial training on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.
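
A schematic of the min-max training loop follows, under simplifying assumptions: the "attribute space" is reduced to per-image contrast and brightness parameters, the inner loop takes a few gradient-ascent steps on those parameters to maximize the classifier loss, and the outer step updates the model on the worst-case transformed batch. The data, model, step sizes, and attribute ranges are placeholders, not the paper's attribute-conditioned generation.

```python
# Schematic attribute-space min-max training step (contrast a, brightness b).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(64, 1, 28, 28)                      # stand-in for an MNIST batch
y = torch.randint(0, 10, (64,))

# inner maximization over attribute parameters applied as a*x + b
a = torch.ones(64, 1, 1, 1, requires_grad=True)
b = torch.zeros(64, 1, 1, 1, requires_grad=True)
for _ in range(5):
    adv_loss = loss_fn(model((a * x + b).clamp(0, 1)), y)
    ga, gb = torch.autograd.grad(adv_loss, [a, b])
    with torch.no_grad():                          # ascend, then keep attributes in range
        a.add_(0.05 * ga.sign()).clamp_(0.8, 1.2)
        b.add_(0.05 * gb.sign()).clamp_(-0.2, 0.2)

# outer minimization: update the classifier on the worst-case attribute setting
opt.zero_grad()
loss = loss_fn(model((a.detach() * x + b.detach()).clamp(0, 1)), y)
loss.backward()
opt.step()
```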

Thiagarajan, J. J., Narayanaswamy, V., Anirudh, R., Bremer, P.-T., and Spanias, A. (2021). “Accurate and Robust Feature Importance Estimation under Distribution Shifts.” 35th AAAI Conference on Artificial Intelligence. []

With increasing reliance on the outcomes of black-box models in critical applications, post-hoc explainability tools that do not require access to the model internals are often used to help humans understand and trust these models. In particular, we focus on the class of methods that can reveal the influence of input features on the predicted outputs. Despite their widespread adoption, existing methods are known to suffer from one or more of the following challenges: computational complexity, large uncertainties, and, most importantly, an inability to handle real-world domain shifts. In this paper, we propose PRoFILE, a novel feature importance estimation method that addresses all of these challenges. Through the use of a loss estimator jointly trained with the predictive model and a causal objective, PRoFILE can accurately estimate the feature importance scores even under complex distribution shifts, without any additional re-training. To this end, we also develop learning strategies for training the loss estimator, namely contrastive and dropout calibration, and find that it can effectively detect distribution shifts. Using empirical studies on several benchmark image and non-image datasets, we show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
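
The loss-estimator mechanism can be sketched as follows: a small auxiliary network g is trained alongside the classifier f to regress f's per-sample loss from the input and f's logits, and feature importance at explanation time is scored by how much the estimated loss grows when a feature is masked, so no labels are needed. This is a simplified reconstruction; PRoFILE's contrastive and dropout calibration strategies and its causal objective are not shown, and all names and sizes are assumptions.

```python
# Simplified sketch of joint training of a classifier f and a loss estimator g,
# with label-free feature importance scored via the estimated loss.
import torch
import torch.nn as nn

d, c = 20, 3
f = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, c))          # predictor
g = nn.Sequential(nn.Linear(d + c, 64), nn.ReLU(), nn.Linear(64, 1))      # loss estimator
opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss(reduction="none")

x = torch.randn(512, d)
y = torch.randint(0, c, (512,))

for _ in range(300):                          # joint training on labeled data
    opt.zero_grad()
    logits = f(x)
    per_sample = ce(logits, y)
    est = g(torch.cat([x, logits], dim=-1)).squeeze(-1)
    loss = per_sample.mean() + nn.functional.mse_loss(est, per_sample.detach())
    loss.backward()
    opt.step()

def feature_importance(x_test):
    """Estimated-loss increase when each feature is masked (no labels used)."""
    with torch.no_grad():
        base = g(torch.cat([x_test, f(x_test)], dim=-1)).squeeze(-1)
        scores = []
        for i in range(d):
            masked = x_test.clone()
            masked[:, i] = 0.0
            est = g(torch.cat([masked, f(masked)], dim=-1)).squeeze(-1)
            scores.append((est - base).mean())
        return torch.stack(scores)

print(feature_importance(torch.randn(100, d)))
```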