20 May 2026 5 min read

Announcing PREM Confidential APIs

Today marks a pivotal milestone in our mission to build Private Super Intelligence. We are excited to announce the Beta of the Prem Confidential API, a foundational block in a new stack designed to ensure AI remains private, verifiable, and sovereign.

The Sovereign Alternative

Standard AI APIs process your most sensitive data as prompts, files, and conversations in plaintext on their servers. While HTTPS protects data in transit, it offers no protection at the endpoint.

Prem changes this structural reality. Our Confidential API allows you to run state-of-the-art models inside hardware-sealed Trusted Execution Environments (TEEs). No one, not us, not the cloud provider and not even someone with physical access to the hardware can access your data.

Introducing the Core Capabilities

We are launching three essential AI functionalities, all unified under a single product and with the same security and privacy guarantees:

Confidential Chat Completions: Seamlessly switch from OpenAI with our compatible API to run leading open-source models
Image Understanding: Perform secure vision tasks, including OCR and image description, without exposing proprietary visual data
Encrypted Transcriptions with Deepgram: We’ve partnered with Deepgram to self-host their proprietary STT models within our confidential infrastructure. This allows for high-accuracy, diarised transcription in a verifiable, private environment

Access a variety of models securely using the Prem API

Confidential Computing: foundational trust

What builds trust: a TEE is a hardware-isolated region that keeps code and data protected from the hypervisor, host OS, and infrastructure operators. The trust boundary shifts from policies, data agreements and operator promises to a cryptographic guarantee, rooted in the silicon. The cloud provider can only start and schedule the workloads, but can’t look inside.

AMD Secure Encrypted Virtualization’s guest memory pages encryption has been present since the first generation. SEV-SNP introduced a per-page authentication tag and prevents the hypervisor from manipulating memory mappings, defeating both tampering and rollback attacks, known as Reverse Map Table (RMP).

The RMP is a data structure that syncs the domains with respective owners. The hypervisor can schedule the VM, but cannot see or silently rewrite its memory.

This situation is analogous to a hotel manager lacking master keys and being unaware of the activities of the guests within their assigned rooms.

Intel Trust Domain eXtensions takes a similar VM-scale approach but adds a new CPU mode called SEAM (SEcure Arbitration Mode). A TDX module running in SEAM brokers all communication between the virtual machine manager and the Trust Domain. TD is a private memory encrypted with a per-TD AES-XTS key of 128 or 256 bit length. TDX Connect closes the gap by letting a Confidential VM extend its trust boundaries to specific devices. If the Confidential VM trusts the device, it opens an encrypted and integrity-protected channel between the CPU and the device.

Modern AI workloads need GPUs as accelerators; a CPU enclave is useless if the GPU is untrusted. Although other techniques exist, they don’t match the same degree of confidentiality or are not able to provide similar speeds. NVIDIA extends the trust boundary onto the accelerator through its Secure Gen AI lineup.

The Hopper architecture pioneered the hardware-based GPU TEEs, anchored in an on-die root of trust, becoming a foundation for subsequent generations: Blackwell and Rubin.

While the GPU is in Confidential Compute Mode, it establishes an integrity-protected and encrypted session with the CPU Confidential Virtual Machine, allowing the Confidential Virtual Machine to move data between CPU and GPU.

None of this matters if you can't verify what you're talking to is genuine, correctly configured hardware. Each vendor emits a signed attestation report describing the silicon, firmware versions, and launch measurements.

Trust is not assumed but mathematically enforced: the workload produces cryptographic hardware evidence signed by the vendor, a relying party verifies it and only then are keys or secrets released into the enclave. That's the whole idea in one line: don't trust the operator, trust the signature.

Open-Sourcing "Reticle": Don't Trust, Verify

At Prem, "we promise we won't look" is not enough; we believe in "don't trust, verify". To support this, we are open-sourcing Reticle, our hardware attestation stack entirely written in Rust.

Reticle allows developers to cryptographically verify the integrity of the execution environment directly from the end customer side, whether it’s a browser environment or an application. It provides hardware-signed proof that your data is being processed by:

NVIDIA GPUs in confidential compute mode
AMD SEV-SNP or Intel TDX CPUs

By making our verification stack open-source, we ensure you don't have to take our word for anything; the hardware manufacturer's signature is validated on your own machine.

Verification is a continuous, auditable process. Our security posture goes further: we transparently publish our report validation policies on GitHub. These strictly governed policies are designed to pre-empt known TEE attacks, misbehaviours and older versions (eg, CPU microcode), ensuring a dynamic and verifiable defence layer.

Building the Future of Private Super Intelligence

The Confidential API is more than just a tool; it is the infrastructure layer for a broader vision: giving humanity back its digital sovereignty.

As AI agents increasingly manage our health, finances, and work, the privacy stakes compound.

This release establishes the core of the Prem Stack, and its foundational element is the Prem API: The developer bridge for private inference.

We are building a future where intelligence is a proprietary asset you own, architecturally guaranteed by verifiable proof, not a service you rent at the cost of your data.

Explore comprehensive documentation for the Infrastructure and API at docs.prem.io.