Benchmarks

We treat post-quantum (PQ) readiness as a set of measurable bottlenecks (bytes, sync points, and latency distributions) and rerun the same workloads every time the protocol changes. This page lists what we measure; specific numbers live in the dated benchmark reports.

Consensus metrics

  • Time to notarization / finalization (p50 / p95 / p99).
  • End-to-end bytes per view — total bytes a validator transmits and receives to advance one view.
  • Bytes broadcast per validator per view (the slope of the byte budget as the validator set grows).
  • Durability sync points — WAL / fsync impact on latency tails. Larger PQ artifacts can make these more visible.
  • Sign-time tails under load — particularly for Falcon (rejection-sampling based).
  • Certificate size vs validator count, with and without thresholding. The headline metric for the PQ scaling cliff.
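
As a rough illustration of how the p50 / p95 / p99 figures above can be derived from raw per-view latency samples, here is a minimal sketch using the nearest-rank percentile definition. The function names and the sample values are hypothetical and are not taken from the benchmark harness.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile: the smallest sample such that at least
    q percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(samples_ms):
    """Summarize a latency distribution as p50 / p95 / p99."""
    return {f"p{q}": percentile(samples_ms, q) for q in (50, 95, 99)}


# Hypothetical finalization latencies in milliseconds:
# 90 fast views, 8 moderately slow views, 2 very slow views.
samples = [100.0] * 90 + [250.0] * 8 + [900.0, 1500.0]
summary = latency_summary(samples)
```

With this made-up sample set the median stays at the fast-path value while p95 and p99 pick up the slow views, which is why all three are reported together.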

User layer metrics

  • Transaction byte size — pk-in-every-tx vs KeyVault key-id reference. The KeyVault-detached model reduces ML-DSA-44 wire size by ~35%.
  • Composite signature verification throughput — primary + cosigner, across the four supported schemes.
  • Precompile verification throughput — KeyVault, NonceManager, CryptoSwitchboard, ML-DSA verifier, Falcon verifier.
  • Block propagation vs tx size — as average tx size grows, how does propagation latency scale?
  • Cold-vault tx overhead — ML-DSA primary + SLH-DSA cosigner, vs a standard composite (ML-DSA primary + optional P256 / ECDSA cosigner).
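
The transaction byte size comparison can be approximated with simple arithmetic. The sketch below uses the ML-DSA-44 public key and signature sizes from FIPS 204 and counts only authentication material. The 32-byte key-id width is an assumption made for illustration, not the actual KeyVault encoding, and other transaction fields are ignored.

```python
# ML-DSA-44 sizes as specified in FIPS 204.
ML_DSA_44_PUBLIC_KEY_BYTES = 1312
ML_DSA_44_SIGNATURE_BYTES = 2420

# Hypothetical width of a KeyVault key-id reference (assumption).
KEY_ID_BYTES = 32


def inline_auth_bytes():
    """Authentication bytes when the public key travels in every tx."""
    return ML_DSA_44_PUBLIC_KEY_BYTES + ML_DSA_44_SIGNATURE_BYTES


def key_id_auth_bytes(key_id_bytes=KEY_ID_BYTES):
    """Authentication bytes when the tx carries only a key-id reference."""
    return key_id_bytes + ML_DSA_44_SIGNATURE_BYTES


def reduction(key_id_bytes=KEY_ID_BYTES):
    """Fractional saving of the key-id model relative to pk-in-every-tx."""
    inline = inline_auth_bytes()
    return (inline - key_id_auth_bytes(key_id_bytes)) / inline
```

Under these assumptions the saving is about 34%, and it approaches 35% as the key-id shrinks toward zero bytes, which is consistent with the ~35% figure quoted above. The saving on a full transaction would be somewhat smaller once payload bytes are included.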

Crypto metrics

  • Sign / verify microbenchmarks across CPU targets.
  • Implementation variance and tail latency — same scheme, different implementations, different hardware.
  • Mithril threshold signing latency under realistic custody workflows (TEE coordination, policy checks, network round-trips).

PQ Wallet Layer metrics

  • PQ smart wallet gas costs and overhead vs an equivalent ECDSA UserOp.
  • Verifier contract gas costs per supported chain (Stylus on Arbitrum: ~374K gas; pure-EVM verifier costs on chains without Stylus are higher).
  • End-to-end UX latency — sign → submit → finalize.
  • Tooling friction for Foundry / Hardhat and common dev stacks.

P2P transport evaluation (post-mainnet)

If and when we evaluate ML-KEM for P2P key establishment, we will measure:

  • Handshake size and fragmentation behavior.
  • Handshake CPU costs and tail latency.
  • Connection churn impacts — reconnect storms, NAT traversal, mobile links.
  • Operational debugging complexity and failure modes.
  • Compatibility with existing network stacks and observability tooling.

We do not plan a hybrid KEM design at this stage. The intent is to decide on a clean single-lane approach if and when the measured data justifies it.
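
To make the handshake size and fragmentation question concrete, the sketch below estimates how many packets each ML-KEM handshake message would need. The key and ciphertext sizes come from FIPS 203. The 1200-byte per-packet payload budget and the 64 bytes of framing overhead are hypothetical values chosen for illustration, not properties of the actual transport.

```python
import math

# Encapsulation key and ciphertext sizes in bytes, per FIPS 203.
ML_KEM_SIZES = {
    "ML-KEM-512": {"encapsulation_key": 800, "ciphertext": 768},
    "ML-KEM-768": {"encapsulation_key": 1184, "ciphertext": 1088},
    "ML-KEM-1024": {"encapsulation_key": 1568, "ciphertext": 1568},
}

PAYLOAD_BUDGET_BYTES = 1200    # hypothetical usable payload per packet
FRAMING_OVERHEAD_BYTES = 64    # hypothetical handshake framing per message


def packets_needed(message_bytes, budget=PAYLOAD_BUDGET_BYTES):
    """Number of packets required to carry one handshake message."""
    return math.ceil(message_bytes / budget)


def handshake_packets(parameter_set):
    """Packets for the initiator message (encapsulation key) and the
    responder message (ciphertext), each with framing overhead added."""
    sizes = ML_KEM_SIZES[parameter_set]
    return {
        part: packets_needed(size + FRAMING_OVERHEAD_BYTES)
        for part, size in sizes.items()
    }
```

Under these assumed budgets, the ML-KEM-768 encapsulation key no longer fits in a single packet while its ciphertext still does, which is the kind of asymmetry a fragmentation measurement would need to capture.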

Localnet baselines

Measured on a current dev laptop with a debug build, empty blocks, localnet init --nodes 4:

  • Total RSS ≈ 10 GB (≈ 2.5 GB per node × 4).
  • Total CPU < 1 core.
  • Steady-state at empty blocks.

Real transaction load increases these numbers. The repository's dated benchmark reports (e.g., docs/coding-egress-bench-2026-04-29.md) cover outbound bandwidth at N = 4 and N = 8 under both the Standard and Coding marshal variants. Re-run them before drawing conclusions: the protocol moves, and the numbers move with it.

How to read benchmark numbers honestly

A few notes that apply to every chart we publish.

  • A devnet number is a lower bound on cost. L1 data costs are zero on devnet, whereas on Arbitrum, Optimism, or mainnet, calldata pricing dominates the real user-layer cost.
  • Single-thread microbench ≠ end-to-end. Sign / verify microbenchmarks ignore scheduling, allocator pressure, and the cost of moving bytes between layers. We publish microbenchmarks for context, not for cost forecasting.
  • Tail latency is the metric that matters. Falcon signing has occasional rejection-sampling retries; Mithril threshold signing has more. p95 / p99 numbers tell a different story than p50 — both go in the report.
  • Validator count matters more than block size. Without thresholding, quorum proofs scale linearly with validator count: double the validators, double the per-view byte budget. This is the leverage threshold signatures are meant to unlock.
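
The linear scaling in the last note can be sketched numerically. The example below assumes a quorum of n - f validators with f = floor((n - 1) / 3), a certificate that simply concatenates one ML-DSA-44 signature (2420 bytes, per FIPS 204) per quorum member, and a threshold certificate the size of a single signature. All three are simplifying assumptions for illustration and not the protocol's actual certificate format.

```python
ML_DSA_44_SIGNATURE_BYTES = 2420  # per FIPS 204


def quorum_size(validators):
    """Assumed BFT quorum: n - f, where f = floor((n - 1) / 3)."""
    if validators < 1:
        raise ValueError("need at least one validator")
    faulty = (validators - 1) // 3
    return validators - faulty


def individual_cert_bytes(validators, sig_bytes=ML_DSA_44_SIGNATURE_BYTES):
    """Certificate made of one signature per quorum member
    (signer indices and other framing are ignored)."""
    return quorum_size(validators) * sig_bytes


def threshold_cert_bytes(validators, sig_bytes=ML_DSA_44_SIGNATURE_BYTES):
    """Assumed threshold certificate: one signature, independent of n."""
    return sig_bytes


# Compare both models at a few hypothetical validator counts.
for n in (4, 8, 100, 200):
    print(n, quorum_size(n), individual_cert_bytes(n), threshold_cert_bytes(n))
```

Going from 100 to 200 validators doubles the quorum from 67 to 134 under this model, and the concatenated certificate doubles with it, while the assumed threshold certificate stays constant.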