Docs

Discover how Phala's AI Agent Contract offers the essential tools to develop and profit from intelligent applications.

Explore Now

GPU TEE is Launched on Phala Cloud for Confidential AI

2025-07-17

Build AI People Can Trust: Why Confidential AI?

AI becomes the backbone of more businesses, privacy and verifiability are non-negotiable. Whether you’re building for finance, healthcare, enterprise SaaS, or even consumer apps, your users and clients demand not just promises, but proof that their data and IP stay private. Regulators and enterprise customers are asking for it; smart teams are demanding it.

Today, every AI builder and enterprise faces a stark reality: cloud GPUs are powerful, but not private. You trust that your model, data, and results are safe—but you can’t prove it.

Phala Cloud changes that, making your AI private, verifiable, and trusted.

With the launch of GPU TEE support, Phala introduces the world’s first open, real-time attested, hardware-enforced confidential AI platform on NVIDIA H100/H200. For the first time, you can deploy, scale, and prove the privacy of any AI workload—no blind trust required.


What’s New: Three Breakthrough Features

We built this platform for real-world builders, enterprises, and communities—anyone who needs private, powerful AI.

What You Get: Core Capabilities at a Glance

  • Instant Confidential AI Model Deployment: Launch leading open-source or private models (Qwen, Llama3, DeepSeek, and more) in a GPU TEE, via UI, CLI, or API.
  • Transparent GPU TEE Marketplace: Choose your region, GPU class, and price—see live capacity and launch in seconds.
  • OpenAI-Compatible Confidential AI API: Serve and scale models using drop-in endpoints, with every response cryptographically attested.
  • End-to-End Encryption & Live Verification: Hardware-level isolation, encrypted comms, and real-time attestation for every workload, every call.

Confidential AI Model Deployment

Feature Intro:

Deploy any supported large language model (Qwen, DeepSeek, Llama3, and more) directly into a confidential GPU TEE with just a click (UI), a command (CLI), or an API call. Every model, prompt, and inference is hardware isolated and shielded from cloud providers and even Phala itself.

How to Verify:

Each running model provides a real-time attestation—accessible via our “Chat and Verify” tool. This is not just a status badge; it’s cryptographic proof that your workload is running inside an authentic TEE.

Why it matters:

From startups to the world’s largest enterprises, anyone can now guarantee that AI models and sensitive data are never exposed—not to us, not to the cloud, not to any admin. This opens doors for regulated industries, B2B SaaS, and anyone serious about data privacy.


GPU TEE Marketplace

Feature Intro:

Choose your ideal GPU configuration (H200/H100, region, VRAM, vCPU) from our transparent, self-serve marketplace. You see real-time availability and pricing—then launch confidential GPU compute in seconds.

How to Verify:

All launched workloads receive hardware-backed attestation, so you always know your compute is running on certified, confidential nodes.

Why it matters:

No more guessing, no more waiting, no more opaque pricing. Whether you’re deploying for a day, a project, or at scale, the marketplace gives you the control and trust you need to build the next wave of AI—securely.


Confidential AI API

Feature Intro:

Plug in to a fully OpenAI-compatible API endpoint, but with a twist: every call is processed inside a hardware-enforced TEE. No code changes, no trust needed, just drop-in confidential AI.

How to Verify:

Every response from the Confidential AI API comes with an attestation token. You (and your users) can independently verify every inference, every time.

Why it matters:

You can finally offer your users or customers cryptographic proof that their data stays private—perfect for SaaS, consumer-facing AI, or compliance-critical applications.


How Does End-to-End Encryption Happen?

The secret to Phala’s privacy is our unique, open-source orchestration—engineered for verifiable, end-to-end encrypted AI.

  1. Confidential GPU Node Selection
  2. When you launch a workload, Phala orchestrator picks a node with Intel TDX CPUs and NVIDIA Confidential GPUs. A Confidential VM (enclave) boots, fully hardware-isolated.

  1. Encrypted Container & Key Management
  2. Models and workloads are encrypted at rest and decrypted only inside the TEE after attestation. Keys and secrets are managed by an in-enclave KMS—unseen by anyone outside.

  1. Attested Network & Encrypted API
  2. The enclave issues its own TLS certificates, ensuring all inbound and outbound API traffic is encrypted directly into the TEE. You connect to the enclave itself, not a proxy.

  1. Attestation on Demand
  2. Every job, every API call, every deployment is paired with a real-time remote attestation—proof, not promise, that your code is private and unaltered.

  1. Secure Lifecycle
  2. When a workload ends, keys are destroyed and memory is wiped. No residual data remains—your secrets stay yours.

The result:

No one—not the cloud, not the operator, not even Phala—can see your data, code, or model outside the enclave. Privacy is enforced by hardware, with cryptographic proof you can share or audit.


What Can You Build? (Real Use Cases)

This isn’t theory. These are the workflows already running on Phala Cloud today.

  • Enterprise AI with Compliance: Deploy internal copilots, customer-facing chat, or analytics—meeting the strictest privacy regulations.
  • AI SaaS with Proof: Attract and retain high-value customers with verifiable privacy for every AI API call.
  • Consumer Apps with Trust: Give users privacy guarantees in chatbots, wellness tools, and creative apps—proof included.
  • IP-Safe Model Serving: Serve or license proprietary AI without the risk of model theft or data leaks.

For Developers:

  • Ship private AI chatbots, confidential data analytics, or secure model fine-tuning—prove privacy to your users.
  • Integrate in minutes with our OpenAI-compatible API, Python/JS/Rust SDKs, or directly via CLI.

For Startups:

  • Win enterprise deals by providing customer-verifiable privacy for every API call.
  • Build SaaS for regulated industries—turn compliance from a blocker into your unique selling point.

For Enterprise:

  • Deploy copilots and AI automation for proprietary or regulated data (health, finance, legal, code).
  • Scale multi-tenant AI with guaranteed isolation and customer-verifiable execution.

How Phala Stands Apart

CapabilityOpenAI/ClaudeOn-PremisePhala Cloud
Hardware Privacy
Cloud Scalability
Real-time Verification
Open SourceVaries
Setup ComplexityLowHighLow
Vendor Lock-inHighNoneNone
Performance OverheadNoneNone<5%
  • No Black Boxes: Fully open-source stack, auditable orchestration, and zero vendor lock-in.
  • True Verifiability: Every workload, every call, every model—cryptographic attestation in real time.
  • Simple, Scalable, and Fast: Use UI, CLI, or API. Launch in minutes. Scale as you grow.

Ready to Build the Future of Trustless AI?

Try it today:

  • Deploy a Confidential AI Model
  • Launch a secure GPU in the Marketplace
  • Integrate via the Confidential AI API

About Phala

Phala Network is a decentralized cloud that offers secure and scalable computing for Web3.

With Phat Contracts, an innovative programming model enabling trustless off-chain computation, developers can create new Web3 use cases.

Get the latest Phala Content Straight To Your Inbox.