Sponsored by the Data Science Institute (DSI), LLNL’s winter hackathon took place on February 16–17. Hackathons are 24-hour events that encourage collaborative programming and creative problem solving. In addition to traditional hacking, the hackathon included a special datathon competition in anticipation of the Women in Data Science (WiDS) conference on March 7. Hackathon and datathon participants presented their projects on the second day.
The datathon challenged participants to create a predictive model using a climate dataset curated by the global WiDS organization, and teams had the option of submitting their results to the worldwide competition. LLNL organizers were Cindy Gonzales, Amar Saini, Mary Silva, and Jennifer Bellig.
LLNL data scientists Olivia Miano and Juanita Ordoñez led a Jupyter Notebooks tutorial on how to analyze the datathon dataset with Pandas. They demonstrated applying various machine learning models to the data, including linear regression, random forest, and gradient boosting regression, and using root mean squared error to evaluate the predictions.
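For readers who want a feel for that kind of workflow, here is a minimal sketch in Python. It is not the tutorial's notebook: the data are synthetic, the column names (`pressure`, `humidity`, `wind_speed`, `temperature`) are hypothetical stand-ins rather than fields from the WiDS dataset, and the use of scikit-learn is an assumption. It loads a table into a Pandas DataFrame, fits the three regression models named above, and scores each one with root mean squared error on held-out rows.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data. This is NOT the WiDS datathon dataset, and the
# column names below are hypothetical.
rng = np.random.default_rng(0)
n_rows = 400
df = pd.DataFrame({
    "pressure": rng.normal(1013.0, 8.0, n_rows),
    "humidity": rng.uniform(10.0, 90.0, n_rows),
    "wind_speed": rng.gamma(2.0, 2.0, n_rows),
})
# Made-up target: a linear mix of the features plus Gaussian noise.
df["temperature"] = (
    15.0
    + 0.05 * (df["pressure"] - 1013.0)
    - 0.10 * df["humidity"]
    + 0.30 * df["wind_speed"]
    + rng.normal(0.0, 1.0, n_rows)
)

# Separate features from the target and hold out 25% of rows for evaluation.
X = df.drop(columns="temperature")
y = df["temperature"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear regression": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gradient boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model on the training rows and compute RMSE on the held-out rows.
rmse_by_model = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    rmse_by_model[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

# Report models from lowest (best) to highest RMSE.
for name, rmse in sorted(rmse_by_model.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {rmse:.3f}")
```

Because root mean squared error is expressed in the same units as the target, the scores can be compared directly across models. On real data, the ranking would depend on the dataset and on how each model is tuned.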
The datathon was designed to accommodate participants of any skill level. “The tutorial helped those with no experience get up and running fast,” stated Gonzales. “And the problem was challenging enough that those with data science experience could collaborate with others and still not necessarily place in the global datathon leaderboard.”
Continuing the datathon theme, LLNL researchers Gemma Anderson and Aaron Donahue presented an overview of global climate models, explaining how the Lab uses deep learning to extract complex patterns and connections from climate data. Deep learning and other techniques help improve the resolution and accuracy of climate predictions, which are important for assessing the impacts of climate change and building resiliency.
Anderson and Donahue also described improvements in high-resolution climate simulations that are run on supercomputers. In contrast to weather models, which can capture fine-scale detail for only near-term periods, climate models are more accurate with longer forecast lead times. Although uncertainties exist in these predictive models, climate scientists are confident about the overall trends.
The datathon subject matter will carry over into Livermore’s regional WiDS event, where Miano, Ordoñez, and Anderson will lead a workshop. According to Gonzales, WiDS activities provide an important opportunity for professional curiosity and growth. “Getting into the data sciences is more than learning about the fundamentals. It’s about applying that foundational knowledge to a real-world problem. The WiDS datathon provides just that—a challenging problem for those looking to explore the data sciences in a relaxed environment,” she said. “These experiences can be a pivotal moment for anyone exploring the field.”