NVIDIA cuPQC

NVIDIA cuPQC is an SDK of GPU-optimized cryptographic math libraries for building both classical and next-generation high-performance cryptographic applications.



Key Features

Cryptographic Math Libraries

Provides direct access to a growing suite of GPU-optimized cryptographic math libraries, including hash functions, Merkle trees, and public-key operations. Production-ready libraries enable developers to build solutions for both traditional and next-generation cryptographic applications with maximum control and flexibility.

Unified Modular Component APIs

Delivers modular component-level APIs whose mathematical building blocks work consistently across diverse cryptographic domains and applications. The same patterns and components can be used whether building classical or next-generation cryptographic systems, reducing complexity and accelerating development.

Support for Emerging Cryptographic Technologies

Powers a growing range of emerging cryptographic technologies—from post-quantum cryptography to zero-knowledge proofs and beyond. Domain-specific optimizations enable developers to efficiently build advanced cryptographic systems with specialized functions for each application.

Link-Time Optimization for Peak Performance

Applies link-time optimization to automatically select optimal GPU kernels for your specific configuration and parameters without manual tuning.

Broad GPU Platform Support

Delivers optimized performance across a broad range of NVIDIA GPU architectures—from NVIDIA Jetson™ edge devices to data center GPUs. Cryptographic solutions can be seamlessly deployed across any environment.

Exceptional Performance at Scale

Optimized for both low-latency single operations and high-throughput batch processing. Scales seamlessly from real-time responses to massive parallel workloads with GPU-accelerated performance.


Libraries

Composable GPU-accelerated cryptographic math libraries give developers direct access to high-performance primitives for building custom cryptographic solutions.

Available Libraries:

  • Hash Functions and Merkle Tree

  • Public Key Operations

Hash Functions and Merkle Tree

The cuPQC-Hash library delivers GPU-accelerated cryptographic hashing and Merkle tree operations for high-performance security applications. It provides high throughput for both single and large-scale hash computations, as needed for data integrity verification, proofs of membership, hash-based signatures, and secure authentication protocols. The library supports industry-standard algorithms alongside emerging cryptographic hash functions, providing flexibility for both traditional and next-generation security implementations.

Explore the cuPQC-Hash User Guide

Hash Functions

cuPQC-Hash supports GPU-accelerated implementations of the SHA-2, SHA-3, SHAKE, and Poseidon 2 algorithm families, optimized for different security levels and performance requirements.
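When validating GPU output, a CPU reference digest for the same input is useful. The sketch below uses Python's standard hashlib module, not the cuPQC-Hash API, to compute SHA-2, SHA-3, and SHAKE digests of one message. Poseidon 2 is not available in hashlib and is omitted. The helper name reference_digests is hypothetical.

```python
import hashlib


def reference_digests(message: bytes, shake_len: int = 32) -> dict[str, str]:
    """Return CPU reference digests (hex) for cross-checking GPU results.

    shake_len is the number of output bytes requested from the SHAKE
    extendable-output functions; 32 is an arbitrary choice for this sketch.
    """
    return {
        "SHA-256": hashlib.sha256(message).hexdigest(),
        "SHA-512": hashlib.sha512(message).hexdigest(),
        "SHA3-256": hashlib.sha3_256(message).hexdigest(),
        "SHAKE128": hashlib.shake_128(message).hexdigest(shake_len),
        "SHAKE256": hashlib.shake_256(message).hexdigest(shake_len),
    }


if __name__ == "__main__":
    for name, digest in reference_digests(b"abc").items():
        print(f"{name:9s} {digest}")
```

Comparing these hex strings against digests copied back from the GPU is a simple correctness check before measuring throughput.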

The hash library processes thousands of hash computations simultaneously, delivering throughput ranging from hundreds to thousands of GB/s depending on the algorithm. This makes cuPQC-Hash well suited to applications that require rapid cryptographic operations at scale.

Hash Function Performance: On NVIDIA RTX PRO™ 6000 (Blackwell Server Edition), cuPQC-Hash achieves throughput ranging from 388 GB/s to 891 GB/s for 8 KB input messages, representing 14x to 103x higher throughput across the SHA-2, SHA-3, and SHAKE algorithm families compared to an AMD EPYC 9124 (32 threads) CPU baseline.
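To put the quoted throughput in per-message terms, the short calculation below converts GB/s into messages hashed per second. It assumes 1 GB = 10^9 bytes and 8 KB = 8192 bytes; the text does not state which convention the benchmark uses, so treat the results as approximate.

```python
def messages_per_second(throughput_gb_s: float, message_bytes: int) -> float:
    """Convert a byte throughput into a message rate (assumes 1 GB = 1e9 bytes)."""
    return throughput_gb_s * 1e9 / message_bytes


MESSAGE_BYTES = 8 * 1024  # assumption: "8 KB" means 8192 bytes

low = messages_per_second(388, MESSAGE_BYTES)
high = messages_per_second(891, MESSAGE_BYTES)
print(f"{low / 1e6:.1f}M to {high / 1e6:.1f}M messages/s")
```

Under these assumptions the quoted range corresponds to roughly 47 to 109 million 8 KB messages hashed per second.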

Merkle Tree

GPU-accelerated Merkle tree construction enables rapid proof generation and verification for large datasets. Parallelizing tree building drastically speeds up data integrity and authentication applications. Merkle trees are fundamental building blocks for zero-knowledge proof systems, hash-based signature schemes, and secure data verification protocols, enabling applications such as privacy-preserving authentication, verifiable computation, and confidential data verification.
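As a reference for the data structure itself, here is a minimal CPU sketch of Merkle tree construction, proof generation, and proof verification. It uses SHA-256 from Python's hashlib rather than the cuPQC-Hash API, requires a power-of-two leaf count, and adds one-byte domain-separation prefixes for leaves and inner nodes; all of these are choices made for this sketch, and the function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree: leaf hashes first, root level last."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("leaf count must be a power of two")
    levels = [[_h(b"\x00" + leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Every pair in a level is independent of the others, which is
        # the parallelism a GPU implementation can exploit.
        levels.append(
            [_h(b"\x01" + prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)]
        )
    return levels


def prove(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hash at each level on the path from leaf to root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index >>= 1
    return path


def verify(root: bytes, leaf: bytes, index: int, path: list[bytes]) -> bool:
    """Recompute the root from a leaf and its sibling path."""
    node = _h(b"\x00" + leaf)
    for sibling in path:
        if index & 1:
            node = _h(b"\x01" + sibling + node)
        else:
            node = _h(b"\x01" + node + sibling)
        index >>= 1
    return node == root


if __name__ == "__main__":
    data = [bytes([i]) for i in range(8)]
    tree = build_tree(data)
    proof = prove(tree, 5)
    print(verify(tree[-1][0], data[5], 5, proof))
```

A membership proof contains one sibling hash per level, so its size grows with the logarithm of the leaf count.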

Merkle Tree Generation Performance: The NVIDIA RTX PRO 6000 (Blackwell Server Edition) GPU achieves a 48x to 147x speedup over an AMD EPYC 9124 (32 threads) CPU for tree generation using the Poseidon 2 BabyBear-16 hash function, across tree sizes from 2^16 to 2^22 leaves with 128 field element leaf inputs. GPU tree generation time ranges from 0.84 ms to 11.5 ms, with acceleration improving for larger trees (147x at 2^22 leaves), making it ideal for large-scale zero-knowledge proof applications.
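The quoted figures can be turned into leaf rates and implied CPU times with a few lines of arithmetic. The sketch reads the tree sizes as 2^16 and 2^22 leaves and assumes that the 0.84 ms time and 48x speedup belong to the smallest tree and the 11.5 ms time and 147x speedup to the largest. The text confirms only that 147x applies to the largest tree, so the other pairings are assumptions.

```python
# Figures quoted in the text; the pairing of time and speedup with a
# tree size is an assumption except for 147x at the largest tree.
CASES = {
    "2^16 leaves": {"leaves": 2**16, "gpu_ms": 0.84, "speedup": 48},
    "2^22 leaves": {"leaves": 2**22, "gpu_ms": 11.5, "speedup": 147},
}


def leaves_per_second(leaves: int, gpu_ms: float) -> float:
    """Leaves processed per second of GPU tree generation."""
    return leaves / (gpu_ms / 1e3)


def implied_cpu_ms(gpu_ms: float, speedup: float) -> float:
    """CPU time implied by a GPU time and a GPU-over-CPU speedup."""
    return gpu_ms * speedup


for name, c in CASES.items():
    rate = leaves_per_second(c["leaves"], c["gpu_ms"])
    cpu = implied_cpu_ms(c["gpu_ms"], c["speedup"])
    print(f"{name}: {rate / 1e6:.1f}M leaves/s on GPU, implied CPU time {cpu:.1f} ms")
```

Under these assumptions the largest tree takes about 1.7 s on the CPU baseline versus 11.5 ms on the GPU.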

Public Key Operations

The cuPQC-PK library delivers GPU-accelerated implementations of NIST-standardized ML-KEM-512/768/1024 and ML-DSA-44/65/87 with exceptional performance across diverse workloads, from low-latency single operations to high-throughput batch processing. cuPQC-PK allows seamless deployment across any NVIDIA GPU platform, from edge devices to data center infrastructure. Ideal for TLS handshakes, VPN tunneling, code signing, certificate authorities, and encrypted communications at scale.
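For capacity planning around these parameter sets, the object sizes matter as much as the operation rates. The table below lists the byte sizes specified in FIPS 203 (ML-KEM) and FIPS 204 (ML-DSA); the helper functions and their names are hypothetical and are not part of cuPQC-PK.

```python
# Sizes in bytes as specified in FIPS 203 and FIPS 204.
ML_KEM_SIZES = {
    "ML-KEM-512": {"encaps_key": 800, "decaps_key": 1632, "ciphertext": 768},
    "ML-KEM-768": {"encaps_key": 1184, "decaps_key": 2400, "ciphertext": 1088},
    "ML-KEM-1024": {"encaps_key": 1568, "decaps_key": 3168, "ciphertext": 1568},
}
ML_DSA_SIZES = {
    "ML-DSA-44": {"public_key": 1312, "private_key": 2560, "signature": 2420},
    "ML-DSA-65": {"public_key": 1952, "private_key": 4032, "signature": 3309},
    "ML-DSA-87": {"public_key": 2592, "private_key": 4896, "signature": 4627},
}
SHARED_SECRET_BYTES = 32  # identical for all three ML-KEM parameter sets


def kem_exchange_bytes(name: str) -> int:
    """Bytes on the wire for one key exchange: encapsulation key plus ciphertext."""
    s = ML_KEM_SIZES[name]
    return s["encaps_key"] + s["ciphertext"]


def signature_with_key_bytes(name: str) -> int:
    """Bytes needed to ship one signature together with its public key."""
    s = ML_DSA_SIZES[name]
    return s["public_key"] + s["signature"]


if __name__ == "__main__":
    for name in ML_KEM_SIZES:
        print(f"{name}: {kem_exchange_bytes(name)} bytes per exchange")
    for name in ML_DSA_SIZES:
        print(f"{name}: {signature_with_key_bytes(name)} bytes per signed key")
```

Multiplying these sizes by a target operation rate gives a first estimate of the host-to-device and network bandwidth a batch workload needs.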

Explore the cuPQC-PK User Guide

ML-KEM-768 Performance: On NVIDIA H100 GPU, cuPQC-PK achieves 13.5M keygen/s, 9.0M encap/s, and 8.5M decap/s, delivering speedups of 175x, 120x, and 135x vs. single-core AMD EPYC 7313P CPU.
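Dividing each GPU rate by its quoted speedup gives the single-core CPU rate the comparison implies. The snippet below does that arithmetic for the ML-KEM-768 figures above; the derived CPU rates are not stated in the text and are only as precise as the rounded inputs.

```python
# GPU rate (operations per second) and quoted speedup for ML-KEM-768 on H100.
ML_KEM_768 = {
    "keygen": (13.5e6, 175),
    "encap": (9.0e6, 120),
    "decap": (8.5e6, 135),
}


def implied_cpu_rate(gpu_ops_per_s: float, speedup: float) -> float:
    """Single-core CPU rate implied by a GPU rate and its speedup."""
    return gpu_ops_per_s / speedup


for op, (gpu_rate, speedup) in ML_KEM_768.items():
    cpu = implied_cpu_rate(gpu_rate, speedup)
    print(f"{op}: about {cpu / 1e3:.1f}k ops/s per CPU core")
```

The implied baseline is on the order of 60k to 80k operations per second per core, which is a quick plausibility check on the quoted speedups.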
ML-DSA-65 Performance: On NVIDIA H100 GPU, cuPQC-PK achieves 6.5M keygen/s, 0.6M sign/s, and 5.0M verify/s, delivering speedups of 290x, 65x, and 215x vs. single-core AMD EPYC 7313P CPU.


Partners Adopting NVIDIA cuPQC

“cuPQC’s safe and high-performance algorithms make transitioning to post-quantum cryptography achievable for enterprises with high-throughput security applications.”

- Hart Montgomery, Linux Foundation

  • Evolution

  • Open Quantum Safe

  • PQShield

  • QuSecure

  • SandboxAQ

Resources