Leveraging TPU for Homomorphic Encryption
Speaker
Jianming Tong
PhD candidate at Georgia Tech.
Title
"Leveraging TPU for Homomorphic Encryption"
Abstract
Cloud-based services are making the outsourcing of sensitive client data increasingly common. Although homomorphic encryption (HE) offers strong privacy guarantees, it requires substantially more resources than computing on plaintext, often leading to unacceptably large latencies in getting the results. Dedicated HE accelerators have emerged to mitigate this latency, but they come with the high cost of custom ASICs.
In this paper we show that HE primitives can be converted to AI operators and accelerated on existing ASIC AI accelerators, like TPUs, which are already widely deployed in the cloud. Adapting such accelerators for HE requires (1) support for modular multiplication, (2) high-precision arithmetic in software, and (3) efficient mapping onto matrix engines. We introduce the CROSS compiler, which (1) adopts Barrett reduction to provide modular reduction using only multipliers and adders, (2) uses Basis Aligned Transformation (BAT) to convert high-precision multiplication into low-precision matrix-vector multiplication, and (3) uses Matrix Aligned Transformation (MAT) to convert vectorized modular operations with reduction into matrix multiplications that can be processed efficiently on 2D spatial matrix engines.
Our evaluation of CROSS on a Google TPUv4 demonstrates significant performance improvements, with up to 161× and 4.5× speedup compared to state-of-the-art many-core CPUs and GPUs, respectively.
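For readers unfamiliar with the first two ingredients named in the abstract, the following is a minimal Python sketch, not the CROSS implementation. It assumes a 32-bit modulus and 8-bit limbs, and all function names (barrett_setup, barrett_reduce, to_limbs, limb_matrix, matvec_mul, modmul) are hypothetical. It shows Barrett reduction computing x mod q with only multiplies, shifts, and subtractions, and a wide multiplication rewritten as a small-integer matrix-vector product, in the spirit of what the abstract describes for BAT.

```python
def barrett_setup(q):
    """Precompute the Barrett constant mu = floor(2^(2k) / q), k = bit length of q."""
    k = q.bit_length()
    mu = (1 << (2 * k)) // q
    return k, mu


def barrett_reduce(x, q, k, mu):
    """Return x mod q for 0 <= x < 2^(2k) without a division by q."""
    t = (x * mu) >> (2 * k)   # estimate of floor(x / q), low by at most 1
    r = x - t * q             # so r lies in [0, 2q)
    if r >= q:                # one conditional subtraction finishes the job
        r -= q
    return r


def to_limbs(x, n, bits=8):
    """Split x into n little-endian limbs of `bits` bits each."""
    mask = (1 << bits) - 1
    return [(x >> (bits * i)) & mask for i in range(n)]


def limb_matrix(a, n, bits=8):
    """Build the (2n-1) x n Toeplitz matrix M with M[i][j] = limb_{i-j}(a)."""
    al = to_limbs(a, n, bits)
    return [[al[i - j] if 0 <= i - j < n else 0 for j in range(n)]
            for i in range(2 * n - 1)]


def matvec_mul(a, b, n=4, bits=8):
    """Compute a * b as a low-precision matrix-vector product plus recombination."""
    M = limb_matrix(a, n, bits)
    v = to_limbs(b, n, bits)
    # Every entry of M and v fits in 8 bits; each partial sum stays below
    # n * 255^2, so it fits comfortably in a 32-bit accumulator.
    partial = [sum(M[i][j] * v[j] for j in range(n)) for i in range(2 * n - 1)]
    # Recombine the partial sums with their positional weights 2^(bits * i).
    return sum(p << (bits * i) for i, p in enumerate(partial))


def modmul(a, b, q):
    """Modular multiply: limb-based product followed by Barrett reduction."""
    k, mu = barrett_setup(q)
    return barrett_reduce(matvec_mul(a, b), q, k, mu)


if __name__ == "__main__":
    q = 4294967291  # an example 32-bit modulus (assumption for illustration)
    a, b = 123456789, 987654321
    print(modmul(a, b, q) == (a * b) % q)
```

The point of the limb decomposition is that the inner loop only ever multiplies 8-bit values and accumulates small integers, which is the kind of work a low-precision matrix engine is built for; the wide-integer shifts appear only in the final recombination step.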
About Speaker
Jianming Tong (https://jianmingtong.github.io/) is a PhD candidate at Georgia Tech, advised by Dr. Tushar Krishna. He is also a visiting researcher at MIT and a student researcher at Google. His primary research area is computer architecture, with a major interest in software (MLSys'24), system (MLSys'23, IEEE Micro'23), and hardware (ISCA'24) full-stack optimizations for privacy-preserving and performance-oriented AI workloads, that is, making both AI and privacy-preserving AI faster and more efficient. He has extensive prototyping experience with both ASICs (TOC, TVLSI, GLSVLSI) and FPGAs (FPT, SC), and has interned at Alibaba DAMO Academy, Pacific Northwest National Lab, and Rivos. His research has been recognized by the Qualcomm Innovation Fellowship, MLCommons Rising Star, and DAC 2024 Young Fellows.