Architecture

PQ Signer adds a post-quantum ML-DSA signature alongside a custodian's existing ECDSA signature on every transaction. The design pairs a Trusted Execution Environment with a cloud KMS: the TEE holds plaintext key material for the brief windows it is needed, and the KMS acts as the policy engine that decides when that material may be released, gating it on a cryptographic attestation of the TEE's measured code. The current implementation targets AWS Nitro Enclaves and AWS KMS; the same TEE+KMS pattern is intended to extend to additional cloud providers and TEE platforms over time. This page walks through how the signing service is structured, where keys live, what each component is allowed to see, and how a request flows end-to-end.

A signing service split into four trust zones

The whole design is organised around a single rule: plaintext ML-DSA private keys exist only inside a Nitro Enclave, and only briefly, while a key is being generated or a signature is being produced. Everything else, including the parent EC2 instance the enclave runs on, is treated as untrusted for plaintext key material.

That gives four trust zones:

| Zone | Components | Sees plaintext PQ keys? |
| --- | --- | --- |
| Untrusted | Public internet | No |
| Authenticated client | Custodian service, holding an mTLS client certificate | No |
| Semi-trusted parent | Gateway EC2, Nitro host EC2, DynamoDB, IAM credentials | No |
| Trusted enclave | Nitro Enclave vCPU and memory, NSM | Yes, briefly during key generation and signing |

Authorization sits with the custodian. By the time a request reaches PQ Signer, the custodian's own policy engine has already approved it and produced its own ECDSA signature on the same message. PQ Signer is a co-signer: it adds an ML-DSA signature, identified by an opaque user_id the custodian supplies. The enclave does no ECDSA verification of its own.

End-to-end flow

The deployment splits the parent into two EC2 instances joined by a private mTLS link. The Gateway holds DynamoDB credentials and no KMS access. The Nitro host holds KMS relay credentials and no DynamoDB access. The enclave has no network interface of its own; the only external service it reaches is KMS, through a vsock-proxy on the parent.

╭──────────────────────╮
│   Custodian client   │
╰──────────┬───────────╯
           │ HTTPS POST /v1/pq-signer (mTLS)
           ▼
╭──────────────────────╮
│  Gateway EC2         │  IAM: dynamodb:*   NO kms:*
│  validate, resolve   │
│  user_id, DDB I/O    │
╰──────────┬───────────╯
           │ private mTLS, TCP 8443
           ▼
╭──────────────────────╮
│  Nitro host EC2      │  IAM: kms:Decrypt, kms:GenerateDataKey (relay)
│  host-agent          │
╰──────────┬───────────╯
           │ vsock
           ▼
╭─────────────────────────────────────────────────────────╮
│                    KMS ↔ TEE channel                    │
│                                                         │
│  ╭──────────────────╮         ╭─────────────────────╮   │
│  │  Nitro Enclave   │ ──────▶ │ vsock-proxy (parent)│   │
│  │  (EIF)           │ ◀────── │ TLS pass-through    │   │
│  │  no net, no disk │         ╰──────────┬──────────╯   │
│  │  ML-DSA, NSM     │                    │              │
│  ╰──────────────────╯                    ▼              │
│                                ╭───────────────────╮    │
│                                │ AWS KMS           │    │
│                                │ Recipient flow:   │    │
│                                │ - PCR check       │    │
│                                │ - EncCtx check    │    │
│                                ╰───────────────────╯    │
│                                                         │
│  TLS terminates inside the enclave.                     │
│  Parent sees only ciphertext.                           │
╰─────────────────────────────────────────────────────────╯

The KMS↔TEE box is the critical piece. The vsock-proxy is just a TCP forwarder with a static YAML allowlist; it has no certificates, no credentials, and no view into what passes through it. TLS to kms.<region>.amazonaws.com is negotiated and terminated inside the enclave, SigV4 signing happens inside the enclave using credentials fetched on demand from the parent over a separate vsock channel, and the CiphertextForRecipient blob that KMS returns is wrapped to an ephemeral RSA-2048 key that the enclave generated for that single call and zeroes immediately after. Decryption of that envelope, recovery of the plaintext data key, and every downstream cryptographic operation all happen inside the enclave. The parent observes only opaque ciphertext on the wire, never a plaintext data key, never an ML-DSA private key, and never the KMS session that produced them.

What each component does

Gateway

The Gateway terminates mTLS from the custodian, validates the request shape, and routes by request_type. For key generation it can mint a user_id (UUIDv4) if the client omits one. For signing it forwards whatever user_id the client supplied and looks up the matching row in DynamoDB. It then relays everything to the host-agent over a private mTLS link and returns the response.

The Gateway role is granted no KMS permissions on the wrapping CMK, and an explicit IAM deny backs that up. The Gateway performs no signature verification.
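The user_id handling described above can be sketched as a small function. This is an illustration only: the request is assumed to arrive as a parsed JSON object, and the name `resolve_user_id` and the rejection of a sign request without a user_id are assumptions, not part of the documented API.

```python
import uuid

# The two request types named in this page.
VALID_REQUEST_TYPES = {"key_generation", "sign"}


def resolve_user_id(request: dict) -> str:
    """Return the user_id the Gateway would forward for this request."""
    request_type = request.get("request_type")
    if request_type not in VALID_REQUEST_TYPES:
        raise ValueError(f"unsupported request_type: {request_type!r}")

    user_id = request.get("user_id")
    if user_id:
        # A client-supplied user_id is forwarded verbatim.
        return user_id
    if request_type == "key_generation":
        # Only key generation may mint a fresh identifier (UUIDv4).
        return str(uuid.uuid4())
    # Assumption: a sign request without a user_id has no row to look up.
    raise ValueError("sign requests must carry a user_id")
```

Keeping the minting rule in one place makes it easy to see that signing can never create a new identity by accident.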

Host-agent

The host-agent runs on the Nitro-capable parent EC2 instance. It is the only local vsock peer for the enclave's request channel. It accepts Gateway frames over rustls server-side mTLS on TCP 8443 and relays each frame to the enclave over vsock. KMS calls are authorised by the Nitro host IAM role, but the enclave issues them itself through the vsock-proxy, and what KMS returns is only CiphertextForRecipient plus the CMK-wrapped data key. A plaintext data key never crosses the parent; the enclave unwraps it itself.

Enclave

The enclave is stateless. No disk, no network, vsock only. It generates the ML-DSA keypair, generates an ephemeral RSA-2048 keypair per KMS call so KMS can wrap a data key to it, requests attestation documents from the NSM, performs AES-GCM with AAD that binds user_id, alg, and the ciphertext schema version, and zeroes every plaintext buffer the moment it is done.
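One way to build the AAD that binds user_id, alg, and the schema version is sketched below. The page does not specify the byte encoding, so the length-prefixed layout and the name `build_aad` are assumptions chosen for illustration; the point is that plain concatenation of variable-length fields can be ambiguous, while length prefixes are not.

```python
import struct


def build_aad(user_id: str, alg: str, schema_version: int) -> bytes:
    """Encode the three bound fields as AES-GCM associated data.

    Assumed layout: each field is prefixed with its 2-byte big-endian length,
    so ("ab", "c") and ("a", "bc") can never produce the same bytes.
    """
    parts = [
        user_id.encode("utf-8"),
        alg.encode("utf-8"),
        bytes([schema_version]),  # single byte, matching the 1-byte version tag
    ]
    return b"".join(struct.pack(">H", len(p)) + p for p in parts)
```

Because AES-GCM authenticates the AAD together with the ciphertext, decrypting a row under a different user_id, algorithm label, or schema version fails the tag check.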

Supporting daemons on the parent

The AWS vsock-proxy utility forwards TCP-over-vsock to kms.<region>.amazonaws.com:443 on a fixed port (8000) using a YAML allowlist. A small credential forwarder fetches the parent's IAM role credentials via IMDSv2 (HttpTokens=required, hop limit 1) and forwards them to the enclave on a dedicated vsock port (8001) with refresh before expiry. Neither daemon can inspect KMS traffic; TLS terminates inside the enclave.
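The "refresh before expiry" behaviour of the credential forwarder can be sketched as a simple time check. This assumes the forwarder reads the `Expiration` timestamp that IMDS returns with role credentials; the five-minute margin and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: refresh this long before the credentials actually expire.
REFRESH_MARGIN = timedelta(minutes=5)


def parse_expiration(value: str) -> datetime:
    """Parse an IMDS-style timestamp such as '2030-01-01T12:00:00Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def needs_refresh(expiration: str, now: datetime,
                  margin: timedelta = REFRESH_MARGIN) -> bool:
    """True once `now` is inside the safety margin before expiry."""
    return now >= parse_expiration(expiration) - margin
```

A forwarder loop would call `needs_refresh` periodically and push a fresh credential set to the enclave on vsock port 8001 whenever it returns True.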

DynamoDB

One table per custodian, keyed by user_id (string). Every row holds ciphertext or non-sensitive metadata:

| Attribute | Type | Notes |
| --- | --- | --- |
| user_id (PK) | string | Opaque identifier supplied at keygen |
| wrapped_dk | binary | Output of kms:GenerateDataKey |
| ct_mldsa_priv | binary | AES-256-GCM ciphertext: schema_version (1B) ‖ nonce (12B) ‖ ciphertext ‖ tag (16B) |
| mldsa_pubkey | binary | ML-DSA public key |
| alg | string | ML-DSA-44 |
| birth_attestation | binary | COSE_Sign1 document from the NSM at keygen time |
| enclave_version | string | Build-time release label baked into the EIF |
| kms_key_arn | string | CMK ARN used to wrap the data key |
| enc_ctx | string | {"user_id": "<value>"}, for audit |
| created_at | number | Unix epoch |

The table is safe to back up, replicate, and grant read-only access to operators. There is nothing here that compromises key material on its own.
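The ct_mldsa_priv layout listed in the table (schema_version ‖ nonce ‖ ciphertext ‖ tag) can be split back into its fields as follows. This is a minimal sketch of the parsing step only; the names `SealedKey` and `split_ct_mldsa_priv` are illustrative, and no decryption is performed.

```python
from typing import NamedTuple

NONCE_LEN = 12  # bytes, per the table
TAG_LEN = 16    # bytes, per the table


class SealedKey(NamedTuple):
    schema_version: int
    nonce: bytes
    ciphertext: bytes
    tag: bytes


def split_ct_mldsa_priv(blob: bytes) -> SealedKey:
    """Split schema_version (1B) || nonce (12B) || ciphertext || tag (16B)."""
    min_len = 1 + NONCE_LEN + TAG_LEN
    if len(blob) < min_len:
        raise ValueError(f"blob too short: {len(blob)} < {min_len} bytes")
    return SealedKey(
        schema_version=blob[0],
        nonce=blob[1:1 + NONCE_LEN],
        ciphertext=blob[1 + NONCE_LEN:len(blob) - TAG_LEN],
        tag=blob[len(blob) - TAG_LEN:],
    )
```

Reading the version byte first lets a future schema change the remaining layout without ambiguity.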

KMS CMK

A customer-managed symmetric (AES-256) key. The key policy is the heart of the trust model:

  • It pins kms:RecipientAttestation:ImageSha384 (= PCR0), :PCR1, :PCR2, and :PCR8 to the custodian-built and custodian-signed EIF.
  • It requires EncryptionContext.user_id on every Decrypt and GenerateDataKey.
  • It denies Decrypt and GenerateDataKey for the Nitro host role when Recipient is absent, blocking direct parent-role calls even though IAM allows the API.
  • Automatic annual rotation is enabled; previously wrapped data keys remain decryptable.
  • ScheduleKeyDeletion is wrapped in dual-control change management.

NSM

The Nitro Security Module is a silicon-level chip on the Nitro card. It produces COSE_Sign1 attestation documents whose signing certificate chains to the AWS Nitro Attestation PKI root (G1), and it populates the platform configuration registers (PCRs) that KMS pins against.

Key hierarchy

There are three layers of keys, and each exists in plaintext only in a tightly bounded set of places:

| Key | Purpose | Where in plaintext |
| --- | --- | --- |
| HSM Backing Key | Root of the custodian's CMK | Inside the KMS HSM only, ever |
| Data Key | Wraps one user's ML-DSA private key | KMS HSM (generation, unwrap); enclave memory (briefly) |
| ML-DSA private key | Signs the request | Enclave memory (briefly, during key generation and signing) |

This is classic envelope encryption: the ML-DSA private key is wrapped under a per-user AES-256 data key; the data key is wrapped under the custodian's CMK. Per-user data keys give containment. Compromising one user's stored ciphertext does not affect any other user.

How key generation works

When the custodian calls POST /v1/pq-signer with request_type=key_generation:

  1. The Gateway checks the request shape and resolves user_id. If the client passed one, it is forwarded verbatim. If not, the Gateway mints a UUIDv4.
  2. The request is relayed to the host-agent over mTLS, then to the enclave over vsock.
  3. The enclave generates an ML-DSA keypair, generates an ephemeral RSA-2048 keypair, and asks the NSM for an attestation document that carries the ephemeral RSA public key.
  4. The enclave calls kms:GenerateDataKey with KeySpec=AES_256, EncryptionContext={"user_id": user_id}, and Recipient=AttestationDocument. KMS performs two independent checks before responding: the attestation document's PCRs must match the policy, and the encryption context must be present. It returns { wrapped_dk, CiphertextForRecipient(DK) }.
  5. The enclave unwraps the plaintext data key from the CMS envelope using its ephemeral RSA private key, AES-GCM-encrypts the ML-DSA private key under it with AAD = user_id ‖ alg ‖ schema_version, and asks the NSM for a birth attestation whose user_data commits sha256(mldsa_pubkey) ‖ sha256(wrapped_dk) ‖ user_id ‖ kms_key_arn ‖ alg ‖ timestamp.
  6. The enclave reads its compiled-in enclave_version constant (PCR2 is a one-way hash, so the label cannot be recovered from the measurement and only the enclave can report it in the clear), zeroes every plaintext buffer, and returns the response.
  7. The Gateway writes the row to DynamoDB with a conditional PutItem (attribute_not_exists(user_id)) so a duplicate user_id cannot silently overwrite an existing record.
  8. The client receives { user_id, mldsa_pubkey, birth_attestation, enclave_version }.
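The conditional write in step 7 can be sketched as the parameters of a DynamoDB PutItem call. The function below only builds the request dictionary in DynamoDB's typed-attribute format and does not talk to AWS; the function name, the table name in the usage note, and the choice to show only a subset of the attributes are assumptions.

```python
def build_put_item_request(table_name: str, row: dict) -> dict:
    """Build PutItem parameters that refuse to overwrite an existing user_id."""
    item = {
        "user_id": {"S": row["user_id"]},
        "wrapped_dk": {"B": row["wrapped_dk"]},
        "ct_mldsa_priv": {"B": row["ct_mldsa_priv"]},
        "mldsa_pubkey": {"B": row["mldsa_pubkey"]},
        "alg": {"S": row["alg"]},
        # DynamoDB transports numbers as strings.
        "created_at": {"N": str(row["created_at"])},
    }
    return {
        "TableName": table_name,
        "Item": item,
        # The write succeeds only if no row with this key exists yet.
        "ConditionExpression": "attribute_not_exists(user_id)",
    }
```

With boto3, such a dictionary could be passed as `client.put_item(**request)`; when the condition is not met DynamoDB rejects the write with ConditionalCheckFailedException, which the Gateway can surface as a duplicate-user_id error.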

The birth attestation is the keypair's permanent provenance. Anyone holding it can later prove the keypair was produced by a specific measured enclave for a specific user_id, without trusting the Gateway, the host-agent, or any auxiliary store.
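The user_data commitment from step 5 can be sketched with the standard library. The field order follows the text; the byte encoding is not specified on this page, so the length prefixes on the variable-length fields and the 8-byte big-endian timestamp are assumptions, as is the name `build_birth_user_data`.

```python
import hashlib
import struct


def build_birth_user_data(mldsa_pubkey: bytes, wrapped_dk: bytes, user_id: str,
                          kms_key_arn: str, alg: str, timestamp: int) -> bytes:
    """sha256(pubkey) || sha256(wrapped_dk) || user_id || kms_key_arn || alg || timestamp."""
    # Two fixed-size digests commit to the public key and the wrapped data key.
    fixed = hashlib.sha256(mldsa_pubkey).digest() + hashlib.sha256(wrapped_dk).digest()
    # Assumption: variable-length fields carry a 2-byte big-endian length prefix.
    variable = b"".join(
        struct.pack(">H", len(field)) + field
        for field in (user_id.encode("utf-8"),
                      kms_key_arn.encode("utf-8"),
                      alg.encode("utf-8"))
    )
    # Assumption: Unix epoch seconds as an unsigned 64-bit big-endian integer.
    return fixed + variable + struct.pack(">Q", timestamp)
```

Hashing the public key and wrapped data key keeps the commitment compact regardless of the ML-DSA parameter set, while the remaining fields stay readable to a verifier.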

How signing works

When the custodian calls POST /v1/pq-signer with request_type=sign:

  1. The Gateway validates the request shape and reads the matching row from DynamoDB by user_id.
  2. It relays { request_type, payload, wrapped_dk, ct_mldsa_priv } to the host-agent over mTLS, then to the enclave over vsock.
  3. The enclave calls kms:Decrypt with CiphertextBlob=wrapped_dk, EncryptionContext={"user_id": payload.user_id}, and Recipient=AttestationDocument. KMS rejects on either an attestation mismatch or an encryption-context mismatch, with an opaque InvalidCiphertextException that does not reveal which check failed.
  4. The enclave unwraps the data key from the CMS envelope, AES-GCM-decrypts ct_mldsa_priv (an AAD mismatch fails authentication), and runs ML-DSA-sign(payload.message). ML-DSA signs the full message rather than a caller-supplied hash, so the complete payload must reach the enclave.
  5. The enclave zeroes the data key, the ML-DSA private key, the ephemeral RSA private key, and every intermediate buffer.
  6. The client receives { mldsa_signature, mldsa_public_key }.

Three independent checks happen before the plaintext ML-DSA private key appears: KMS verifies the enclave attestation, KMS verifies the encryption-context binding, and AES-GCM verifies the AAD. If any one of them fails, the signing operation aborts.
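The kms:Decrypt call in step 3 carries the two inputs that KMS checks. The sketch below builds the request parameters using the field names of the KMS Decrypt API; it does not sign or send anything, and the function name is illustrative. In the real service the enclave would serialise and SigV4-sign this itself.

```python
def build_decrypt_request(wrapped_dk: bytes, user_id: str,
                          attestation_doc: bytes) -> dict:
    """Parameters for an attested kms:Decrypt of one user's wrapped data key."""
    return {
        "CiphertextBlob": wrapped_dk,
        # Must equal the context used at GenerateDataKey time.
        "EncryptionContext": {"user_id": user_id},
        "Recipient": {
            # KMS re-encrypts the plaintext to the public key in the document.
            "KeyEncryptionAlgorithm": "RSAES_OAEP_SHA_256",
            "AttestationDocument": attestation_doc,
        },
    }
```

Because the response then carries CiphertextForRecipient rather than a plaintext field, only the holder of the ephemeral RSA private key can recover the data key.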

How updates and rollouts work

The vendor publishes a new release with source, a reproducible build recipe, and expected PCR values. The custodian reviews the diff, rebuilds deterministically, verifies the PCRs match, and signs the new EIF with their own certificate. They add the new PCR set to the CMK key policy alongside the previous values under dual-control change management, deploy the new EIF, and drain old enclaves. After cutover they can optionally tighten the policy so that it lists only the new PCRs.

Wrapped data keys in DynamoDB are bound to the CMK and to user_id via EncryptionContext, not to any specific enclave version. Once the new PCRs are listed in the policy, the new enclave can decrypt every existing row without re-wrapping or migration.
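The add-then-tighten rollout can be modelled as operations on a set of accepted releases. This is a sketch under assumptions: the release labels, function names, and the choice to emit one condition block per release are illustrative. Grouping per release matters because listing several values under one condition key would let KMS accept any mix of old and new PCR values.

```python
def add_release(accepted: dict, label: str, pcrs: dict) -> dict:
    """Return a new mapping that also accepts `label`; older releases stay valid."""
    updated = dict(accepted)
    updated[label] = dict(pcrs)
    return updated


def tighten_to(accepted: dict, label: str) -> dict:
    """After cutover, keep only the named release."""
    return {label: accepted[label]}


def attestation_conditions(accepted: dict) -> list:
    """One condition block per release, so PCR tuples are never mixed."""
    return [
        {
            "StringEqualsIgnoreCase": {
                "kms:RecipientAttestation:ImageSha384": pcrs["pcr0"],
                "kms:RecipientAttestation:PCR1": pcrs["pcr1"],
                "kms:RecipientAttestation:PCR2": pcrs["pcr2"],
                "kms:RecipientAttestation:PCR8": pcrs["pcr8"],
            }
        }
        for _, pcrs in sorted(accepted.items())
    ]
```

During the rollout window the policy would carry two blocks, one per release; after `tighten_to` it carries one, and existing rows remain decryptable because nothing in them refers to a PCR value.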

Birth attestations across upgrades

A birth attestation is meant to remain verifiable for the lifetime of the keypair, which can outlive any specific enclave version. The architecture achieves this without an auxiliary store: the PCR values measured at keygen are committed inside the birth_attestation COSE_Sign1 payload itself and signed by the AWS Nitro Attestation PKI, which is independent of any specific enclave image.

Any verifier (a relying party, an auditor, or a future enclave) does this:

  1. Fetch birth_attestation and optionally enclave_version from DynamoDB.
  2. Verify the COSE_Sign1 signature against the Nitro Attestation PKI root.
  3. Read (pcr0, pcr1, pcr2, pcr8) from the attestation payload.
  4. Match the tuple against the custodian's record of accepted enclave releases. The enclave_version column is a fast pre-check; the COSE_Sign1 payload is the ground truth.
  5. Confirm user_data matches the keygen metadata that was bound when the attestation was produced.
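Steps 3 and 4 of the verification can be sketched as a lookup against the custodian's release record. The sketch assumes the COSE_Sign1 signature has already been verified and the PCRs extracted as hex strings; the type and function names are hypothetical, and treating a disagreement between the enclave_version column and the attested PCRs as an error is an assumption.

```python
from typing import Dict, NamedTuple, Optional


class PcrSet(NamedTuple):
    pcr0: str
    pcr1: str
    pcr2: str
    pcr8: str


def _normalise(pcrs: PcrSet) -> PcrSet:
    # Hex digests are compared case-insensitively.
    return PcrSet(*(value.lower() for value in pcrs))


def match_release(attested: PcrSet, accepted: Dict[str, PcrSet]) -> Optional[str]:
    """Return the label of the accepted release with this exact PCR tuple."""
    target = _normalise(attested)
    for label, pcrs in accepted.items():
        if _normalise(pcrs) == target:
            return label
    return None


def check_birth_pcrs(attested: PcrSet, accepted: Dict[str, PcrSet],
                     enclave_version: Optional[str] = None) -> str:
    """The attested PCRs are ground truth; enclave_version is only a cross-check."""
    label = match_release(attested, accepted)
    if label is None:
        raise ValueError("attested PCR tuple is not an accepted release")
    if enclave_version is not None and enclave_version != label:
        raise ValueError("enclave_version column disagrees with attested PCRs")
    return label
```

Matching the whole tuple at once, rather than each PCR independently, prevents a document that combines values from two different releases from being accepted.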

KMS key policy in one place

The custodian-managed CMK policy must include:

  • kms:RecipientAttestation:ImageSha384 equal to PCR0.
  • kms:RecipientAttestation:PCR1, :PCR2, :PCR8 equal to the manifest values.
  • kms:EncryptionContextKeys exactly user_id.
  • kms:EncryptionContext:user_id present.
  • Distinct principal statements: Nitro host/enclave role gets Decrypt and GenerateDataKey through the attested-recipient path only; operator/admin gets key management but no Decrypt; the Gateway role has an explicit IAM deny on the CMK.
  • A Deny on the Nitro host/enclave role for Decrypt and GenerateDataKey when Recipient is absent.
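The attested-use and deny-without-Recipient requirements above might translate into key policy statements along the following lines. This is a hedged sketch, not the product's policy: the Sid values and function name are invented, the role ARN in the usage note is a placeholder, and expressing "Recipient is absent" as a Null test on the attestation condition key is an assumption that should be checked against the AWS KMS documentation before use.

```python
import json

ATTESTED_ACTIONS = ["kms:Decrypt", "kms:GenerateDataKey"]


def attested_use_statements(role_arn: str, pcrs: dict) -> list:
    """Allow attested use with a user_id context; deny calls with no attestation."""
    allow = {
        "Sid": "AllowAttestedEnclaveUse",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": ATTESTED_ACTIONS,
        "Resource": "*",
        "Condition": {
            "StringEqualsIgnoreCase": {
                "kms:RecipientAttestation:ImageSha384": pcrs["pcr0"],
                "kms:RecipientAttestation:PCR1": pcrs["pcr1"],
                "kms:RecipientAttestation:PCR2": pcrs["pcr2"],
                "kms:RecipientAttestation:PCR8": pcrs["pcr8"],
            },
            # The context may contain no key other than user_id ...
            "ForAllValues:StringEquals": {"kms:EncryptionContextKeys": ["user_id"]},
            # ... and user_id itself must be present.
            "Null": {"kms:EncryptionContext:user_id": "false"},
        },
    }
    deny = {
        "Sid": "DenyUnattestedUse",
        "Effect": "Deny",
        "Principal": {"AWS": role_arn},
        "Action": ATTESTED_ACTIONS,
        "Resource": "*",
        # Assumption: the attestation key is absent when no Recipient is sent.
        "Condition": {"Null": {"kms:RecipientAttestation:ImageSha384": "true"}},
    }
    return [allow, deny]
```

Calling `json.dumps(attested_use_statements(role_arn, pcrs), indent=2)` renders the statements for review; the operator and Gateway statements from the list above would be added alongside them.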

Why these PCRs

| PCR | Content | Pinned? | Why |
| --- | --- | --- | --- |
| PCR0 | EIF hash | Yes | Pins the exact enclave image. Primary code-identity control. |
| PCR1 | Linux kernel and bootstrap hash | Yes | Prevents kernel-level tampering. |
| PCR2 | User application hash | Yes | Pins the application layer independently of the kernel. |
| PCR3 | Parent IAM role ARN | No | Varies across accounts and regions; would break scaling and DR. Principal authorization is already enforced by the policy statement. |
| PCR4 | Parent instance ID | No | Changes on every launch; would tie key release to a single instance. |
| PCR8 | EIF signing certificate hash | Yes | Pins the custodian's signing authority. Stops a rogue image signed by a different key from accessing the CMK. |

PCR0/1/2 pin the full software stack, and PCR8 pins the signing authority. PCR3/4 are intentionally omitted to keep the deployment portable.