UC has top-notch researchers, students, and faculty across the subject matter domains relevant to LLNL’s data science mission. We want to leverage our proximity and affiliation as best we can.
—Michael Goldman, DSI director
The University of California (UC) system has 10 campuses across the state and more than a quarter of a million students. LLNL has cultivated a long-standing partnership with UC campuses and UC National Laboratories (UCNL), a division of the UC Office of the President. UCNL works closely with Lawrence Livermore, Lawrence Berkeley, and Los Alamos national laboratories to share expertise and technology, provide research opportunities, and develop the Labs’ workforce pipeline. Accordingly, the DSI continues these existing partnerships and is building close collaborations with many UC campuses, with the goal of reaching all of them in the coming years.
Soon after the DSI was launched in 2018, we co-hosted a workshop with UC's National Laboratories division. The vision for the workshop revolved around the potential of established relationships. “We wanted to share our successes, failures, experiences, knowledge and problems in data science with UC campuses and other UC-affiliated labs to help foster interest in what we do at LLNL,” explained DSI director Michael Goldman in his opening address.
Kim Budil—then UC’s vice president for national laboratories and currently LLNL's director—welcomed attendees with a charge to drive the field forward. “You name it, someone in this room is researching it and applying data science to it,” she said. In addition to LLNL attendees, workshop participants hailed from Los Alamos and Lawrence Berkeley national laboratories, Bay Area research organizations, and several UC campuses: Berkeley, Davis, Irvine, Los Angeles, Merced, and Santa Cruz.
Our second co-sponsored workshop took place in Livermore in 2019 with more than 200 attendees. UC Santa Cruz professor Abel Rodriguez kicked off the workshop with a keynote address on emerging ethical issues in mainstream data analytics. The 2018 and 2019 workshop agendas are available on this website.
We continued the workshop tradition in 2021 with a three-day event focused on artificial intelligence in healthcare. Spread out over three weeks, the all-virtual, invite-only workshop featured panelists and speakers from clinical settings, academia, industry, government entities, and UC campuses (San Francisco and San Diego).
Data Science Challenge
Our annual Data Science Challenge is a three-week course in which LLNL mentors guide UC students through solving a unique problem in a scientific discipline. This intensive virtual training program provides challenging exercises and assignments, virtual tours, and seminars. Students learn from experts, network with peers, develop skills for future internships, and get a taste of day-to-day life at a national lab. The program launched in 2019 with UC Merced students and expanded in 2021 to include a UC Riverside cohort.
"My hope is that this is a really good experience for them and that they go back and tell their friends what a cool place Livermore is to work," explains computer scientist Brian Gallagher, who served as a mentor for the 2020 Challenge and co-organized the 2021 Challenge. The 2020 and 2021 Challenges were held virtually due to the COVID-19 pandemic. Topics have ranged from identifying novel therapeutic strategies for cancer treatment to detecting, distinguishing, and characterizing asteroids that may pass near Earth in the future.
Open Data Initiative
The DSI’s Open Data Initiative (ODI) enables us to share LLNL’s rich, challenging, and unique datasets with the larger data science community. Our goal is for these datasets to support curriculum development, raise awareness of LLNL’s data science efforts, foster new collaborations, and serve other learning opportunities. These datasets represent a wide variety of key LLNL mission areas and range in complexity from dense, featureful, labeled datasets with well-understood solutions to those that are sparse, noisy, and largely unexplored.
We have partnered with the UC San Diego Library and the Halıcıoğlu Data Science Institute to enable library patrons to access and analyze ODI datasets.
ODI director Rushil Anirudh notes, "The ODI is a Laboratory-wide effort to make our rich data ecosystem available to the broader data science community. Open datasets have been a crucial factor in the past decade’s progress in machine learning. Our open datasets will help drive the next decade of advances while addressing unique challenges in scientific machine learning." Datasets include videos for two-photon lithography, drug compounds and therapeutic agents, trace files from HPC simulations, and much more.