Globus Federated AI Stack

An Interoperable Ecosystem for AI in Science


Globus provides the foundation for developing, deploying, and researching machine learning and artificial intelligence across diverse scientific disciplines, using distributed science cyberinfrastructure (CI). Globus services enable managed remote computation and data management, indexing and search, and sophisticated automated workflows across a federated ecosystem of CI. Globus also provides an overarching identity and access management platform that secures interactions between users, applications, services, and distributed computing resources.

The Globus AI Stack builds on the Globus platform to offer a suite of services that directly facilitate model training, publication, and inference, as well as AI dataset management. It relies on the platform's robust, scalable, and secure data and compute management, search, and automation capabilities, as well as its standards-compliant federated identity and access management fabric.


Why Globus?

  • Globus and the Globus AI Stack implement a hybrid deployment model, unlike traditional approaches
  • The platform can manage large-scale data and computational resources
  • Globus services can orchestrate operations on remote CI, so models can be trained, published, and used directly on existing resources, from laptops to supercomputers
  • The Globus AI Stack provides programmatic APIs and SDKs that let the community apply it in specific domains, build customized services and platforms, and extend the stack itself

Garden

Model publication and inference

Garden is a service that supports publishing machine learning models alongside data, grouped by scientific domain and use case. It simplifies wrapping a model with all of its dependencies so that users can readily run it for inference. Inference can be launched on any Globus Compute endpoint or in the cloud, enabling researchers to discover a published model and then use it on their local resources.
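As a rough illustration of the remote inference pattern described above, the sketch below submits a prediction function to an executor and collects the results in order. It relies only on Python's standard concurrent.futures interface, with a local thread pool standing in for a remote endpoint; the Globus Compute SDK offers an Executor modeled on that same interface, but no Globus call is made here. The names predict and run_inference, and the toy linear model, are hypothetical and are not part of Garden.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def predict(features):
    """Hypothetical stand-in for a published model: a fixed linear scorer."""
    weights = [0.5, -0.25, 2.0]
    return sum(w * x for w, x in zip(weights, features))


def run_inference(executor: Executor, model_fn, batch):
    """Submit one task per input and return the results in input order."""
    futures = [executor.submit(model_fn, item) for item in batch]
    return [f.result() for f in futures]


# Assumption: a local thread pool plays the role of the remote endpoint.
with ThreadPoolExecutor(max_workers=2) as pool:
    scores = run_inference(pool, predict, [[1.0, 2.0, 3.0], [0.0, 4.0, 0.5]])
```

Because run_inference depends only on the submit/result contract, the same function could in principle be handed any executor that honors it, which is what makes the "publish once, run on local resources" pattern practical.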

Diamond

Model training

Diamond is a service for managing the training and fine-tuning of models on high-performance computing clusters. It provides an accessible cloud-hosted service through which users can discover models (e.g., OpenFold, Llama), create a container for a specific target resource (e.g., TACC Frontera), deploy the training job on that target resource, and monitor and manage the training process via various training statistics.

APPFL

Federated Learning

APPFL/APPFLx is a powerful Python library and hosted service for managing federated machine learning training and inference across disparate computing resources. APPFLx provides a REST API and web interface to create “federations” of participating devices and to then deploy a machine learning training process across those federations. It uses Globus Compute to launch local training on each device and to launch aggregation of local models into a global model. The resulting global model incorporates aspects of the local models and can then be used by the federation for inference.
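The round structure described above (local training on each device, then aggregation into a global model) can be sketched in a few lines. The text does not say which aggregation rule is used, so this example assumes a simple sample-weighted average of local weights, in the spirit of federated averaging. The functions local_update and aggregate, the one-parameter model, and the toy data are all hypothetical; this is not the APPFL API.

```python
def local_update(global_weights, data, lr=0.1):
    """One gradient step of least-squares fitting y = w * x on a device's data."""
    w = global_weights[0]
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return [w - lr * grad]


def aggregate(local_models, sample_counts):
    """Average local weights, weighting each device by its sample count."""
    total = sum(sample_counts)
    size = len(local_models[0])
    return [
        sum(m[i] * n for m, n in zip(local_models, sample_counts)) / total
        for i in range(size)
    ]


# Hypothetical federation of two devices, each holding private (x, y) pairs.
devices = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
]

global_model = [0.0]
for _ in range(50):
    # Each device trains locally, starting from the current global model.
    local_models = [local_update(global_model, d) for d in devices]
    # Only model weights are combined; raw data never leaves a device.
    global_model = aggregate(local_models, [len(d) for d in devices])
```

In this toy setup every device's data follows y = 2x, so the global weight approaches 2 even though no device shares its data with the others.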

Foundry

Data, Models, Science

Foundry hosts structured, ML-ready data of any size that can be accessed programmatically via a Python SDK. It provides a schema via which publishers can describe datasets, making it easy for users to discover and then use those datasets. Datasets can be loaded directly in Python and used as input to model training or inference.
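To make the role of a dataset schema concrete, here is a minimal sketch of how a description of which columns are inputs and which are targets lets a consumer turn records into training arrays without inspecting the data by hand. It does not use the Foundry SDK; the Key class, the split_by_schema helper, the column names, and the values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Key:
    """Hypothetical schema entry: a column name and its role in ML."""
    name: str
    role: str  # "input" or "target"


def split_by_schema(records, schema):
    """Build (X, y) lists from the columns marked as inputs and targets."""
    inputs = [k.name for k in schema if k.role == "input"]
    targets = [k.name for k in schema if k.role == "target"]
    X = [[row[name] for name in inputs] for row in records]
    y = [[row[name] for name in targets] for row in records]
    return X, y


# Illustrative schema and records; not taken from any real dataset.
schema = [
    Key("temperature", "input"),
    Key("pressure", "input"),
    Key("yield", "target"),
]
records = [
    {"temperature": 300.0, "pressure": 1.0, "yield": 0.42},
    {"temperature": 350.0, "pressure": 2.0, "yield": 0.57},
]
X, y = split_by_schema(records, schema)
```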


National Artificial Intelligence Research Resource (NAIRR)

The National Artificial Intelligence Research Resource (NAIRR) is a concept for a shared national research infrastructure to connect U.S. researchers to responsible and trustworthy Artificial Intelligence (AI) resources, as well as the needed computational, data, software, training, and educational resources to advance research, discovery, and innovation.

Globus and NAIRR

Many resource providers in NAIRR already run Globus services, all of them hold data, and some run Globus Compute. These providers can layer the Globus AI Stack on top of their existing deployments.
