The Department of Energy (DOE) has joined the larger scientific community in promoting data management as a means to higher-quality, more efficient research and analysis, and as a critical component of data science. Tools and platforms to support data management and analysis are rapidly evolving and provide enormous opportunities. They also pose challenges that, while sometimes specific to DOE, are common across DOE mission areas and organizations.
The DOE Data Day workshop, abbreviated to D3, was born from this critical work. D3’s primary goals are:
- Bring DOE institutions together to share their data management use cases, challenges, and solutions;
- Identify potential synergies and efficiencies; and
- Establish proactive channels for future collaborations.
The event crosses program boundaries and mission areas, with participants exploring best practices and the latest technologies to help DOE researchers leverage new techniques, respond to data security threats, and advance fundamental science in valuable ways.
Read about the 2024 event via LLNL News: Data Days workshop gathers DOE national labs to discuss future of data management
Workshop dates: October 22–24, 2024
Hosted by: LLNL
Themes:
- Cloud and Hybrid Data Management
- Data Intensive Computing
- Data Curation and Governance
Resources:
- 2024 Agenda (PDF 223KB)
- 2024 Photo album
Tuesday, October 22
- Welcome and LLNL perspective
- NNSA HQ perspectives and goals
Cloud and Hybrid Data Management
- Keynote: DOE’s Enterprise Data Management Strategy
- ARMFlow: An Event-Driven Workflow and Data Management System for the Atmospheric Radiation Measurement (ARM) Program
- The Spatial Platform for Advanced Research and Collaboration (sPARC) – The Energy of Visualization
- Cloud-Based Jupyter Notebooks for Enabling In-Situ Data Analysis and Subsetting
- Data Management for Clean Energy Demonstration Projects at Scale
- Project Alexandria: A Data Platform
Wednesday, October 23
- DOE Leadership Panel
Data Intensive Computing
- Keynote: Data and HPC Adjacent AI/ML Workloads
- AskOEDI: The Open Energy Data Initiative's New AI Research Assistant
- FusionSci: Augmented Intelligence for Cross-Disciplinary Scientific Discovery
- Safety, Security, and Trustworthiness of Data in Generative AI Ecosystems
- High Performance Data Facility: Status and Plans
- ESGF2-US Data Proximate Computing and Services to Accelerate Data Intensive Climate Science
Poster Session
- Advancing Data Intensive Computing via Data Compression/Reduction at Extreme-Scale
- PerSSD: Persistent, Shared, and Scalable Data with Node-Local Storage for Scientific Workflows in Cloud Infrastructure
- Creating a Cross-Lab Curation Portal Featuring ML/AI Metadata Extraction
- Driving DOE Forward: Scaling Data Governance and Stewardship for Strategic Success
- Empowering the Marine Energy Community with AI-Ready Data from the Portal and Repository for Information on Marine Renewable Energy (PRIMRE)
- Project Alexandria – Managing and Enabling Discovery of DNN R&D Data
- Got Data? Discovery Is the Key
- AI for Automated Citation Metadata Extraction in an Open Data Repository
- OWL & SCROLL: Natural Language Models for Knowledge Preservation and Workforce Development
- Bernie-AI and Beyond
- Standards and Quality Control Processes for Earth Science Datasets
- Data Governance and Stewardship for AI: An Effective Data Lifecycle Using Distributed Computing, Ontologies and Workflows for NeuroSymbolic AI
- AI-Driven Knowledge Discovery Framework for Renewable Energy
- Statistical Analysis of Convection in Variable Stars Using Realistic Hydrodynamic Simulations
- Breaking Free from the Human Chain: Automating Data for Impact
- Capturing and Reporting NNSA Data Usage for the Berkeley Nuclear Data Cloud
- Need-to-Know in a Data Virtualization Application
- Building the Technical Ecosystem for the Data Archive (Darc): APIs, Analytics, Reflector
- LANL Weapons Mission Technology Leverage of AI/ML for Big Data Ingestion, Indexing and Search
- The World Data System: Representing the U.S. on the International Data Stage
- Accelerating Deep Learning Training via Inter-Node Access Coordination Over Node-Local Storages
- Preparing LANL’s Data for National Security AI Applications: The Ambitious Vision of the Mission Data Stewardship (Midas) Alliance
- LLNL Data Curation: Current State, Issues, and Solutions
Thursday, October 24
Data Curation and Governance
- Keynote: Curated Data Pipelines for Advanced Analytics and AI
- Towards a DOE Metadata Schema for Generalist Open Data Repositories
- Merits of an Interconnected and Interoperable Repository Ecosystem
- Implementing Data Governance across DNN R&D
- From Chaos to Clarity: Actionable Insights for Supporting Data Stewards as You Mature Data Governance
- Challenges in Managing the Digital Thread in HPC Centric Modeling and Simulation Workflows
- Three Data Building Blocks to a Better NSE
- LLNL's Open Data Initiative
Read about the 2023 event via LLNL News: Data Days brings Department of Energy labs together for discussions on data management and more
Workshop dates: October 24–26, 2023
Hosted by: LLNL
Themes:
- Data Intensive Computing
- Cloud and Hybrid Data Management
- Data Access, Sharing, and Sensitivity
- Data Curation and Metadata Standards
- Data Governance and Policy
Report: 2023 Report (PDF 997KB)
Tuesday, October 24
Wednesday, October 25
Thursday, October 26
Time | Topics, talks, and activities |
---|---|
7:00am | Check-in and hospitality |
8:15am | Session 5: Cloud and Hybrid Data Management |
2:30pm | Adjourn |
Workshop dates: June 1–3, 2022
Hosted by: LLNL
Themes:
- Cloud and hybrid data management
- Data-intensive computing
- Data access, sharing, and sensitivity
- Data policy and ethics
Resources:
- 2022 Report (PDF 985KB)
- 2022 Agenda (PDF 208KB)
- 2022 Proceedings
Workshop dates: October 5–7, 2020
Hosted by: LLNL, virtual only
Themes:
- Data curation and standards: legacy data, existing data, and future data
- Data-intensive computing, high performance computing (HPC), and data science tools for DOE's computing communities
- Data access, sharing, and sensitivity
- Cloud, HPC, and hybrid data management
A companion hackathon on a topic selected from the themes above was held prior to D3. There was no fee to participate in D3, and an abstract was not required to attend the workshop.
Resources:
- 2020 Report (PDF 3.09MB)
- 2020 Agenda (PDF 738KB)
Workshop dates: September 25–26, 2019
Hosted by: LLNL
Themes:
- Data curation and standards
- Data-intensive computing
- Data management in the cloud
- Data access, sharing, and sensitivity
Resources:
- 2019 Report (PDF 1.87MB)
- LLNL news coverage