Data Challenges Facing the Genesis Mission

February 03, 2026   |  Ian Foster

A Globus Response

A recent HPCwire article by Ali Azhar situates the U.S. Department of Energy’s Genesis Mission within a broader realization that AI for science will succeed or fail not on model capability alone, but on the infrastructure that allows data and computation to be shared, governed, and trusted at scale.

This thoughtful and timely piece highlights issues that many of us see every day as we work to move AI for science from promising demonstrations to sustained, multi-institutional practice. What is striking is how consistently the challenges described in this article trace back not to models or accelerators, but to the practical mechanics of using data and computation across boundaries.

What is often underappreciated is that much of the infrastructure needed to address these issues is not hypothetical—it is already deployed and operating across the DOE complex and beyond. Globus, in particular, is in active use across national laboratories and at hundreds of partner institutions today, supporting precisely the kinds of cross-facility workflows that the Genesis Mission brings into focus.

At the governance layer, Globus Auth directly addresses the federated access problem described by Azhar. DOE laboratories, universities, and agency partners already use Auth to integrate local identity systems, enforce site-specific policies, and enable delegated authorization without requiring centralized ownership. This is not an aspirational model; it is production infrastructure that has scaled across organizations with distinct regulatory and security constraints.

The same is true at the data layer. Globus Transfer and Sharing are already used to move and share data among leadership-class HPC systems, experimental facilities, campus clusters, and cloud platforms across DOE labs. These services allow data to be reused across institutional boundaries while preserving local control and scientific context—avoiding the false choice between centralization and fragmentation that often stalls large initiatives.

The discussion of discovery and reuse aligns closely with the role of Globus Search, which is being used today to expose lab- and project-specific metadata in a federated manner. Rather than forcing uniform schemas, Search enables incremental interoperability, allowing communities to make data discoverable on their own terms while still supporting cross-institutional AI workflows.

DOE teams are likewise already addressing, in day-to-day operation, the concerns raised around traceability and reproducibility. Globus Flows is used to automate and record multi-step scientific processes that encompass data movement, analysis, and model execution, capturing provenance as workflows run rather than attempting to reconstruct it later. This is exactly the kind of “trust by construction” that AI-driven science at scale requires.
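
To make the idea of capturing provenance as a workflow runs more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: it does not use the Globus Flows API, and the names `run_with_provenance` and `_digest`, along with the three toy steps, are hypothetical stand-ins for data movement, analysis, and model execution.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, List, Tuple


def _digest(value: Any) -> str:
    """Return a short, stable SHA-256 digest of a JSON-serializable value."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_with_provenance(
    steps: List[Tuple[str, Callable[[Any], Any]]], data: Any
) -> Tuple[Any, List[Dict[str, Any]]]:
    """Run steps in order, appending one provenance record as each step executes."""
    log: List[Dict[str, Any]] = []
    for name, func in steps:
        started = time.time()
        result = func(data)
        # The record is written at execution time, not reconstructed afterwards.
        log.append({
            "step": name,
            "input_digest": _digest(data),
            "output_digest": _digest(result),
            "started_at": started,
            "duration_s": time.time() - started,
        })
        data = result
    return data, log


# Hypothetical pipeline: toy functions standing in for real workflow steps.
steps = [
    ("transfer", lambda xs: list(xs)),
    ("analyze", lambda xs: [x * 2 for x in xs]),
    ("model", lambda xs: sum(xs)),
]
result, provenance = run_with_provenance(steps, [1, 2, 3])
```

Because each record stores digests of a step's input and output, a reader of the log can check that the output of one step matches the input of the next, which is one simple way a record made during execution can support later audit.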

Finally, the challenge of bridging HPC, cloud, and AI workflows is not theoretical. Globus Compute is in active use across DOE facilities to dispatch computation across heterogeneous environments using a common execution model. By reducing the need for environment-specific rewrites, it helps ensure that progress is driven by scientific insight rather than by which site happens to have the most flexible infrastructure.

Seen in this light, the Genesis Mission is less about inventing entirely new systems and more about building on proven, deployed capabilities to operate at greater scale and with tighter coordination. If Genesis succeeds, it will be because it leverages infrastructure that already works across the DOE ecosystem to reduce friction, respect institutional autonomy, and allow AI workflows to move with scientific intent. Globus is already playing that role today.