Status: Complete
Preprint: "Parallelizing the Variational Quantum Eigensolver: From JIT Compilation to Multi-GPU Scaling" — arXiv:2601.09951 (with Ashton Steed)
GitHub: github.com/rylanmalarchick/QuantumVQE
Overview
A GPU-accelerated Variational Quantum Eigensolver (VQE) for quantum chemistry simulations. A four-phase optimization pipeline delivers a 117× speedup, cutting the H2 potential energy surface computation from 593.95s to 5.04s while maintaining near-exact accuracy.
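For readers new to VQE, the loop this project accelerates can be sketched on a toy problem. The example below is a minimal illustration only, not the project's code: it assumes a one-qubit Hamiltonian H = Z and an RY(theta) ansatz rather than the H2 Hamiltonian, and the function names (`energy`, `parameter_shift_grad`, `run_vqe`) are hypothetical. It uses plain gradient descent with the parameter-shift rule to estimate the gradient.

```python
import math


def energy(theta: float) -> float:
    """Expectation value <psi|Z|psi> for |psi> = RY(theta)|0>."""
    # RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
    a0 = math.cos(theta / 2)
    a1 = math.sin(theta / 2)
    # Z has eigenvalue +1 on |0> and -1 on |1>
    return a0 * a0 - a1 * a1


def parameter_shift_grad(theta: float) -> float:
    """Gradient of the energy from two shifted circuit evaluations."""
    shift = math.pi / 2
    return 0.5 * (energy(theta + shift) - energy(theta - shift))


def run_vqe(theta0: float = 0.5, lr: float = 0.4, steps: int = 100):
    """Classical optimization loop: evaluate, differentiate, update."""
    theta = theta0
    for _ in range(steps):
        theta -= lr * parameter_shift_grad(theta)
    return theta, energy(theta)


theta_opt, e_min = run_vqe()
print(f"theta = {theta_opt:.4f}, energy = {e_min:.6f}")
```

The optimizer drives the energy toward the lowest eigenvalue of the toy Hamiltonian (-1, reached at theta = pi). In a real VQE run each `energy` call is a full circuit simulation, which is why the evaluation cost dominates and benefits from JIT compilation and GPU execution.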
Key Results
| Metric | Value |
|---|---|
| Total Speedup | 117× |
| Original Time | 593.95s |
| Optimized Time | 5.04s |
| GPU Advantage (26 qubits) | 80.5× |
| Parallel Efficiency (4 GPUs) | 99.4% |
| Max Qubits (single H100) | 29 (before OOM) |
| Ground State Energy | -1.137 Ha (at equilibrium) |
| Bond Lengths Computed | 100 |
Optimization Pipeline
Phase 1: JIT Compilation
- Just-in-time compilation of quantum circuit operations
- Eliminates Python interpreter overhead
- Uses PennyLane Catalyst for quantum-aware JIT
Phase 2: GPU Acceleration
- Single-GPU parallelization of quantum state evolution
- CUDA-accelerated matrix operations via lightning.gpu
- Speedup scales from 10.5× (4 qubits) to 80.5× (26 qubits)
Phase 3: Multi-GPU Scaling
- Distributed computation across 4× NVIDIA H100 GPUs
- Efficient memory management for large state vectors
- 99.4% parallel efficiency maintained
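To put the efficiency figure in context, the sketch below applies the usual strong-scaling definition, efficiency = speedup / number of devices. This definition is an assumption, since the page does not state which one was used, and the helper names are hypothetical.

```python
def parallel_efficiency(t_single: float, t_parallel: float, n_devices: int) -> float:
    """Strong-scaling efficiency: achieved speedup divided by device count."""
    speedup = t_single / t_parallel
    return speedup / n_devices


def implied_speedup(efficiency: float, n_devices: int) -> float:
    """Speedup over one device implied by a given efficiency."""
    return efficiency * n_devices


# 99.4% efficiency on 4 GPUs, as reported above
s = implied_speedup(0.994, 4)
print(f"Implied speedup on 4 GPUs: {s:.3f}x")
```

Under that definition, 99.4% efficiency on 4 GPUs corresponds to a speedup of roughly 3.98× over a single GPU.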
Phase 4: MPI Parallelization
- OpenMPI distribution across 192 AMD EPYC cores
- Hybrid CPU-GPU workload balancing
- Data-parallel VQE across bond lengths
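The data-parallel step can be sketched as a static partition of bond-length indices across ranks, since each bond length is an independent VQE run. This is a minimal illustration under assumptions, not the project's scheduler: the round-robin scheme, the helper name `tasks_for_rank`, and the rank count of 8 are all hypothetical. In an mpi4py program, `rank` and `size` would come from `MPI.COMM_WORLD.Get_rank()` and `MPI.COMM_WORLD.Get_size()`.

```python
def tasks_for_rank(n_tasks: int, rank: int, size: int) -> list[int]:
    """Round-robin assignment: rank r takes tasks r, r + size, r + 2*size, ..."""
    return list(range(rank, n_tasks, size))


N_BOND_LENGTHS = 100  # number of points on the potential energy surface
SIZE = 8              # illustrative rank count, not the project's configuration

# Simulate the assignment every rank would compute for itself
assignment = {
    rank: tasks_for_rank(N_BOND_LENGTHS, rank, SIZE) for rank in range(SIZE)
}

for rank, tasks in assignment.items():
    print(f"rank {rank}: {len(tasks)} bond lengths, first index {tasks[0]}")
```

Because every rank derives its own task list from `rank` and `size`, no communication is needed until the energies are collected at the end, for example with a gather onto one rank.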
Scaling Study Results
| Qubits | CPU Time | GPU Time | Speedup |
|---|---|---|---|
| 4 | 0.21s | 0.02s | 10.5× |
| 10 | 0.89s | 0.04s | 22.3× |
| 18 | 12.4s | 0.31s | 40.0× |
| 26 | 891.2s | 11.1s | 80.5× |
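The speedup column is the ratio of CPU time to GPU time. The short check below recomputes it from the table. The timings shown are rounded, so a recomputed ratio can differ slightly from the listed value (the 26-qubit row gives about 80.3× from the rounded times). The variable and function names are illustrative.

```python
# (qubits, cpu_seconds, gpu_seconds, listed_speedup) copied from the table
ROWS = [
    (4, 0.21, 0.02, 10.5),
    (10, 0.89, 0.04, 22.3),
    (18, 12.4, 0.31, 40.0),
    (26, 891.2, 11.1, 80.5),
]


def recomputed_speedups(rows):
    """Return (qubits, cpu_time / gpu_time) for each table row."""
    return [(qubits, cpu / gpu) for qubits, cpu, gpu, _ in rows]


for (qubits, ratio), (_, _, _, listed) in zip(recomputed_speedups(ROWS), ROWS):
    print(f"{qubits:>2} qubits: recomputed {ratio:.1f}x, listed {listed:.1f}x")
```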
Hardware
Computed on ERAU Vega HPC Cluster:
- 4× NVIDIA H100 GPUs (80GB each)
- 192 AMD EPYC CPU cores
- High-bandwidth NVLink interconnect
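A rough sense of how qubit count translates into GPU memory comes from the size of the state vector alone: 2^n amplitudes per n-qubit state. The sketch below assumes double-precision complex amplitudes (16 bytes each), and the function names are hypothetical. It gives only a lower bound, since a simulator also needs working buffers on top of the state itself, and this page does not break down what else occupies memory.

```python
BYTES_PER_AMPLITUDE = 16  # assumption: complex128 (two 64-bit floats)


def statevector_bytes(n_qubits: int) -> int:
    """Bytes needed to hold one full state vector of n_qubits."""
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE


def statevector_gib(n_qubits: int) -> float:
    """Same quantity expressed in GiB."""
    return statevector_bytes(n_qubits) / 2 ** 30


for n in (26, 29, 30):
    print(f"{n} qubits: {statevector_gib(n):.0f} GiB per state vector")
```

Each additional qubit doubles the requirement, which is why the practical qubit ceiling on a fixed-memory device is sharp.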
Application
Computed the H2 molecular potential energy surface: the ground state energy of molecular hydrogen at 100 different bond lengths. This is a standard benchmark for quantum chemistry methods and demonstrates the practical utility of VQE for molecular simulation.
The 117× speedup enables interactive exploration of quantum chemistry problems that would otherwise require batch processing.
Technology Stack
| Category | Technology |
|---|---|
| Quantum Framework | PennyLane, Catalyst |
| GPU Acceleration | JAX, CUDA, lightning.gpu |
| Parallelization | OpenMPI (mpi4py) |
| HPC | PBS, SLURM |
| Hamiltonians | Random Pauli, TFIM, Heisenberg |