Beyond SGX: Embracing GPU TEE for Decentralized AI (dAGI)
2024-08-01
TL;DR
Phala Network integrates Trusted Execution Environment (TEE) technology, using Intel’s SGX, to secure blockchain and AI computations. By adding confidential GPUs, like NVIDIA's H100, Phala enhances security and performance, addressing the limitations of CPU TEEs. This hybrid GPU + CPU TEE model tackles challenges in AI and blockchain, such as data security, model integrity, scalability, and traceability, while protecting the proprietary data sent to LLMs for inference.
Phala’s chain of trust ensures secure, verifiable computing through onchain registration, CPU/GPU attestations, and endorsements from Intel SGX and NVIDIA GPU. This approach guarantees efficient and tamper-proof confidential AI computation, addressing LLM privacy concerns.
Introduction
In our previous blog post, we explored how Phala Network leverages Trusted Execution Environment (TEE) technology to revolutionize the integration of blockchain and AI. As an AI coprocessor, Phala Network utilizes Intel’s Software Guard Extensions (SGX) to ensure secure, isolated computations, thereby setting new standards for decentralized computing. We also discussed how our hybrid blockchain TEE design positions Phala as a leader in secure and verifiable computing solutions.
Building on this foundation, we introduced the integration of confidential GPUs to further enhance confidential computation. This addition aims to address the shortcomings of existing CPU TEE technologies and unlock new possibilities for secure AI processing. In this blog, we will delve deeper into the hybrid GPU + CPU TEE approach, highlighting its significance in the AI era, particularly for ensuring data privacy for AGI.
The Critical AI Challenges in Blockchain and Decentralized Systems
Before we dive into our proposed hybrid GPU + CPU TEE confidential computation model, let's address the key challenges currently facing LLM and AI model computation on blockchains:
- Data Security and Privacy: AI models often require access to sensitive user data for training and inference. Protecting this data from unauthorized access and ensuring its confidentiality is paramount.
- Model Integrity and Trust: Ensuring that AI models operate correctly and have not been tampered with is essential for maintaining trust in decentralized systems.
- Scalability and Performance: AI tasks are computationally intensive and require robust hardware to handle large-scale data processing efficiently.
- Resistance to Attacks: Decentralized systems are susceptible to various types of attacks, including denial-of-service (DoS) and Sybil attacks, which can undermine the system’s reliability and security.
- Traceability: In the blockchain space, verifying what happens inside GPUs is challenging. The lack of traceability and privacy can expose projects to risks such as Sybil attacks.
Confidential Computing with GPUs
Phala Network is pioneering the integration of confidential GPUs, further enhancing the capabilities and security of its TEE infrastructure. Confidential GPUs, such as NVIDIA's H100 Tensor Core GPUs, provide a hardware-based solution that can operate alongside any cryptographic algorithm. This technology enhances the integrity and confidentiality of compute-intensive tasks such as AI model training and inference, protecting sensitive data and AI models from unauthorized access and tampering, and thus ensuring data privacy.
With confidential computing enabled, the host OS and hypervisor (think of it as a manager of virtual machines) are blocked from accessing the confidential virtual machine (VM) in system memory and from reading GPU memory. Imagine a building manager who has keys to every office, but now there are special secure offices that the manager cannot access without permission. This ensures that only authorized entities can access sensitive data and computations.
Visualizing Phala’s Hybrid CPU + GPU TEE Confidential Computation Model
Phala’s TEE Chain of Trust Illustrated
The chain of trust in Phala Network's hybrid GPU + CPU TEE approach ensures a secure and verifiable computational environment. By combining the strengths of both CPU and GPU TEEs, Phala Network provides a robust platform for confidential computing, particularly suited for AI and other compute-intensive applications in the blockchain space. This integrated approach addresses the limitations of CPU-only TEEs, offering a more scalable and secure solution for data privacy in the AGI era.
The “Chain of Trust” is exemplified by the integration of Llama3, a large language model, with PyTorch running on the hybrid TEE setup. This setup demonstrates how AI models can be securely executed within a trusted environment, leveraging both CPU and GPU capabilities for enhanced performance and security, and protecting the proprietary data sent to the LLM for inference.
Simplified Steps of the Chain of Trust Process
To make this process easier to understand, let’s break down the steps:
- Onchain Registration: The process starts with the onchain registration of the image hash, either directly using pruntime or via a smart contract on the Phala blockchain. This registration ensures that the computational environment and the software components are reproducible and verifiable. The image hash is a cryptographic representation of the software build, ensuring its integrity.
Think of this as a secure digital signature. The image hash (a unique digital fingerprint of the software) is registered on the blockchain. This ensures that the software and its environment are verified and have not been altered.
- CPU TEE (Intel TDX / AMD SEV): The next step involves the CPU TEE, which includes Intel TDX or AMD SEV technologies. The CPU TEE performs critical measurements, including:
  - CPU/bootloader version
  - VM/Docker image hash
These measurements are essential to verify the integrity of the computational environment.
Think of it as a security guard checking the ID and credentials of everyone entering a secure area. The CPU attestation, endorsed by Intel, ensures that the hardware and software configurations are secure and trustworthy.
- GPU Integration (NVIDIA H100 Tensor Core GPUs): The diagram also shows the integration of NVIDIA's H100 GPUs in TEE mode. These GPUs, when combined with the CPU TEE, enhance the computational capabilities and security. The NVIDIA driver is initialized in a secure manner, ensuring that the GPU operates within a trusted environment.
Imagine adding a powerful new engine to a car. The GPU, combined with the secure CPU, boosts performance and security. The NVIDIA driver is securely initialized to ensure the GPU works in a trusted environment.
- GPU TEE Measurements: Similar to the CPU TEE, the GPU TEE performs its own set of measurements, including:
  - GPU firmware
  - Program hash
These measurements are verified through GPU attestation, endorsed by NVIDIA, to ensure the integrity and confidentiality of the computations performed by the GPU.
Just like the CPU TEE, the GPU performs its own security checks. It verifies details like the GPU firmware and the software running on it. NVIDIA then endorses these checks to confirm their accuracy and security.
- Endorsement and Attestation: The endorsements and attestation from Intel and NVIDIA provide an additional layer of trust, verifying that both the CPU and GPU components are operating securely and as intended.
These endorsements from Intel and NVIDIA act like stamps of approval, confirming that both the CPU and GPU are secure and working as intended.
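The steps above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not Phala's implementation: the names `image_hash`, `Report`, `make_report`, `verify_report`, and `verify_chain` are hypothetical, the "onchain" registry is a plain Python set, and vendor endorsements are simulated with HMAC over shared demo keys. Real CPU and GPU attestations use vendor-signed quotes verified against certificate chains.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical stand-ins for vendor endorsement keys (assumption for this demo).
# Real TEEs rely on asymmetric certificate chains, not shared secrets.
VENDOR_KEYS = {"cpu": b"demo-cpu-vendor-key", "gpu": b"demo-gpu-vendor-key"}


def image_hash(image_bytes: bytes) -> str:
    """Digital fingerprint of a software build (SHA-256, hex-encoded)."""
    return hashlib.sha256(image_bytes).hexdigest()


@dataclass(frozen=True)
class Report:
    """Simplified attestation report: measurements plus a simulated endorsement."""
    kind: str            # "cpu" or "gpu"
    measurements: tuple  # sorted (name, value) pairs
    endorsement: str     # hex HMAC over the measurements


def _endorse(kind: str, measurements: tuple) -> str:
    payload = repr(measurements).encode()
    return hmac.new(VENDOR_KEYS[kind], payload, hashlib.sha256).hexdigest()


def make_report(kind: str, measurements: dict) -> Report:
    """Produce a report the way a (simulated) TEE would."""
    items = tuple(sorted(measurements.items()))
    return Report(kind, items, _endorse(kind, items))


def verify_report(report: Report, expected: dict) -> bool:
    """Check the endorsement, then compare measurements to expected values."""
    endorsed = hmac.compare_digest(
        report.endorsement, _endorse(report.kind, report.measurements)
    )
    return endorsed and dict(report.measurements) == expected


def verify_chain(registry: set, cpu: Report, gpu: Report,
                 expected_cpu: dict, expected_gpu: dict) -> bool:
    """Registered image hash + valid CPU report + valid GPU report."""
    registered = expected_cpu.get("image_hash") in registry
    return (registered
            and verify_report(cpu, expected_cpu)
            and verify_report(gpu, expected_gpu))


# Step 1: register the image hash (a set simulates the blockchain here).
registry = set()
img = image_hash(b"docker image bytes")
registry.add(img)

# Steps 2-4: CPU and GPU TEEs report their measurements.
expected_cpu = {"bootloader_version": "1.0", "image_hash": img}
expected_gpu = {"gpu_firmware": "fw-1", "program_hash": image_hash(b"model program")}
cpu = make_report("cpu", expected_cpu)
gpu = make_report("gpu", expected_gpu)

# Step 5: a verifier accepts the chain only if every link checks out.
print(verify_chain(registry, cpu, gpu, expected_cpu, expected_gpu))  # True

# A report for a different image fails, even with a valid endorsement.
bad_cpu = make_report("cpu", {**expected_cpu, "image_hash": image_hash(b"altered")})
print(verify_chain(registry, bad_cpu, gpu, expected_cpu, expected_gpu))  # False
```

The point of the sketch is the shape of the check: a verifier trusts the result only when the registered image hash, the CPU measurements, and the GPU measurements all match, and each report carries an endorsement that can be checked independently.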
Phala Network’s innovative use of TEEs, particularly with advanced GPUs like the NVIDIA H100, sets it apart from other players in the industry. This approach ensures that AI models are not only efficient but also secure and tamper-proof, addressing data privacy concerns for AGI.
NVIDIA Remote Attestation in Action
A demonstration of NVIDIA remote attestation shows how Phala ensures the traceability, verifiability, and confidentiality of data using its blockchain TEE.
Got Questions About How It All Works?
We're here to help! Reach out to us at [email protected] or [email protected] for more information. Let's get your curiosity satisfied!