Confidential AI Models

Others claim privacy. We prove it. Access frontier AI models in the cloud, with proof that your data is protected end-to-end.

Phala Confidential AI
No storage · No logs · End-to-end encryption

Your requests flow encrypted from you to every model behind the gateway:

  • Google: Gemma 3 27B (Encrypted)
  • OpenAI: gpt-oss-20b (Encrypted)
  • OpenAI: GPT OSS 120B (Encrypted)
  • Qwen: Qwen3 Coder (Encrypted)
  • Qwen: Qwen2.5 VL 72B Instruct (Encrypted)
  • DeepSeek: DeepSeek V3 0324 (Encrypted)
  • Qwen2.5 7B Instruct (Encrypted)
  • Meta: Llama 3.3 70B Instruct (Encrypted)

Win Your Users' Trust

Differentiate with verifiable privacy, build customer confidence with audit-ready cryptographic proofs, and enter regulated markets faster.

Traditional AI vs. Confidential AI

Integrate in Minutes

The easiest way to add cryptographic privacy to your AI applications. Drop-in replacement for OpenAI, Anthropic, and other major providers.

Supported providers:

OpenAI · Anthropic · Google · Meta
Traditional AI: api.openai.com/v1/chat/completions
Phala Confidential AI: encrypted-ai.phala.com/v1/chat/completions
  • Real-time proof generation
  • Audit-ready documentation
  • Customer-facing dashboard

Enterprise features

  • SLA Support
  • Custom Models
  • Volume Discounts
  • Priority Access
Pricing (per 1M tokens, input / output):

  • GPT OSS 120B: $0.14 / $0.49
  • Qwen3 Coder: $0.90 / $1.50
  • Llama 3.3 70B: $0.10 / $0.25
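
Per-request cost follows directly from these per-million-token rates. A quick sketch of the arithmetic, using the GPT OSS 120B rates above; the token counts are assumptions for illustration only:

```python
# Estimate the cost of one request to GPT OSS 120B at $0.14 / $0.49 per 1M tokens.
# The token counts below are illustrative assumptions, not measurements.
input_rate, output_rate = 0.14, 0.49      # USD per 1M tokens (from the pricing above)
input_tokens, output_tokens = 2_000, 500  # assumed request size

cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
print(f"≈ ${cost:.6f} per request")       # ≈ $0.000525
```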

Drop-in Replacement

Simply replace your API endpoint. No other code changes required. Works with your existing SDKs and frameworks.
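
For example, with the standard OpenAI Python SDK the switch is a single base-URL change. A minimal sketch, assuming a hypothetical PHALA_API_KEY environment variable; the endpoint and the phala/gpt-oss-120b model ID are the ones shown on this page:

```python
# Minimal sketch: only the base URL differs from a stock OpenAI integration.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://encrypted-ai.phala.com/v1",  # Phala Confidential AI endpoint
    api_key=os.environ["PHALA_API_KEY"],           # hypothetical env var holding your key
)

response = client.chat.completions.create(
    model="phala/gpt-oss-120b",                    # listed under Available Models below
    messages=[{"role": "user", "content": "Summarize this contract clause for a customer."}],
)
print(response.choices[0].message.content)
```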

Built-in Trust Center

Every request generates cryptographic proof. Show customers exactly how their data is protected with our Trust Center. View demo →
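
This page doesn't document the proof format or a verification API, so the sketch below is purely hypothetical: the response header name is an assumption used only to illustrate how a per-request proof reference could feed your audit logs and the Trust Center.

```python
# Hypothetical sketch only: the X-Phala-Proof-Id header is assumed for illustration
# and is not documented on this page.
import os

import requests

resp = requests.post(
    "https://encrypted-ai.phala.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['PHALA_API_KEY']}"},
    json={
        "model": "phala/gpt-oss-120b",
        "messages": [{"role": "user", "content": "hello"}],
    },
)
proof_id = resp.headers.get("X-Phala-Proof-Id")  # assumed header carrying the proof reference
print(f"Attach to your audit log and look up in the Trust Center: {proof_id}")
```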

Enterprise Ready

Competitive pricing with enterprise features. Scale with confidence knowing costs won't surprise you.

Available Models

Access the latest frontier AI models with cryptographic privacy protection

Google: Gemma 3 27B

Encrypted · phala/gemma-3-27b-it

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128K tokens, understands over 140 languages, and offers improved math, reasoning, and chat capabilities, including structured outputs and function calling. Gemma 3 27B is Google's latest open-source model and the successor to [Gemma 2](google/gemma-2-27b-it).

54K context | $0.11/M input tokens | $0.40/M output tokens
OpenAI: gpt-oss-20b

Encrypted · phala/gpt-oss-20b

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI’s Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.

131K context | $0.10/M input tokens | $0.40/M output tokens
OpenAI: GPT OSS 120B

Encrypted · phala/gpt-oss-120b

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.

131K context | $0.14/M input tokens | $0.49/M output tokens
Qwen: Qwen3 Coder

Encrypted · phala/qwen3-coder

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over repositories. The model features 480 billion total parameters, with 35 billion active per forward pass (8 out of 160 experts). Pricing for the Alibaba endpoints varies by context length: requests with more than 128K input tokens are billed at the higher rate.

262K context | $0.90/M input tokens | $1.50/M output tokens
Qwen: Qwen2.5 VL 72B Instruct

Encrypted · phala/qwen2.5-vl-72b-instruct

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.

128K context | $0.59/M input tokens | $0.59/M output tokens
DeepSeek: DeepSeek V3 0324

Encrypted · phala/deepseek-chat-v3-0324

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team.

164K context | $0.49/M input tokens | $1.14/M output tokens
Qwen2.5 7B Instruct

Encrypted · phala/qwen-2.5-7b-instruct

Qwen2.5 7B belongs to the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2:

  • Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
  • Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. More resilient to diverse system prompts, enhancing role-play implementation and condition-setting for chatbots.
  • Long-context support up to 128K tokens, with generation of up to 8K tokens.
  • Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

Usage of this model is subject to the [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

33K context | $0.04/M input tokens | $0.10/M output tokens
Meta: Llama 3.3 70B Instruct

Encrypted · phala/llama-3.3-70b-instruct

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in / text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. [Model Card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md)

131K context | $0.10/M input tokens | $0.25/M output tokens
Enterprise

Build Your Own PCC

Go beyond shared APIs. With our Confidential GPUs, you can deploy private, fully audited AI clouds tailored to your business or product. It's the same technology behind Apple's Private Cloud Compute (PCC), but more open and transparent, and now available for your own models and workloads.

Talk to Experts

Private Infrastructure

Private dedicated infrastructure for your AI workloads.

Custom Models

Deploy your own custom AI models securely.

Full Audit Trails

Complete compliance and audit documentation.

24/7 Support

Dedicated enterprise support team.

Fast Performance

Optimized for speed and efficiency.

Secure Processing

Hardware-protected confidential computing.

Compliance Ready

Meet regulatory requirements easily.

Scalable Solution

Grow with your business needs.

Frequently Asked Questions

Everything you need to know about Confidential AI

Ready to Build AI People Trust?

Join 500+ teams deploying trustworthy AI in production

No credit card required. Deploy your first model in 5 minutes.

© 2025 Phala. All rights reserved. Privacy · Terms