Brandon Reagen
-
Assistant Professor
Brandon Reagen is an Assistant Professor in the Department of Electrical and Computer Engineering with affiliation appointments in the Computer Science. He earned a PhD in computer science from Harvard in 2018 and received his undergraduate degrees in computer systems engineering and applied mathematics from the University of Massachusetts, Amherst, in 2012.
A computer architect by training, Brandon has a research focus on designing specialized hardware accelerators for applications including deep learning and privacy preserving computation. He has made several contributions to ease the use accelerators as general architectural constructs including benchmarking, simulation infrastructure, and System on a Chip (SoC) design. He has led the way in highly efficient and accurate deep learning accelerator design with his studies of principled unsafe optimizations, and his work has been published in conferences ranging from computer architecture, machine learning, computer aided design, and circuits.
Prior to joining NYU, he was a research scientist with Facebook’s AI Infrastructure Research working on privacy preserving machine learning and systems for neural recommendation. During his PhD he was a Siebel Scholar (2018) and was selected as a 2018 Rising Star in Computer Architecture by Georgia Tech.
Research News
DeepReDuce: ReLU Reduction for Fast Private Inference
This research was led by Brandon Reagen, assistant professor of computer science and electrical and computer engineering, with Nandan Kumar Jha, a Ph.D. student under Reagen, and Zahra Ghodsi, who obtained her Ph.D. at NYU Tandon under Siddharth Garg, Institute associate professor of electrical and computer engineering.
Concerns surrounding data privacy are having an influence on how companies are changing the way they use and store users’ data. Additionally, lawmakers are passing legislation to improve users’ privacy rights. Deep learning is the core driver of many applications impacted by privacy concerns. It provides high utility in classifying, recommending, and interpreting user data to build user experiences and requires large amounts of private user data to do so. Private inference (PI) is a solution that simultaneously provides strong privacy guarantees while preserving the utility of neural networks to power applications.
Homomorphic data encryption, which allows inferences to be made directly on encrypted data, is a solution that addresses the rise of privacy concerns for personal, medical, military, government and other sensitive information. However, the primary challenge facing private inference is that computing on encrypted data levies an impractically high penalty on latency, stemming mostly from non-linear operators like ReLU (rectified linear activation function).
Solving this challenge requires new optimization methods that minimize network ReLU counts while preserving accuracy. One approach is minimizing the use of ReLU by eliminating uses of this function that do little to contribute to the accuracy of inferences.
“What we are to trying to do there is rethink how neural nets are designed in the first place,” said Reagen. “You can skip a lot of these time and computationally-expensive ReLU operations and still get high performing networks at 2 to 4 times faster run time.”
The team proposed DeepReDuce, a set of optimizations for the judicious removal of ReLUs to reduce private inference latency. The researchers tested this by dropping ReLUs from classic networks to significantly reduce inference latency while maintaining high accuracy.
The team found that, compared to the state-of-the-art for private inference DeepReDuce improved accuracy and reduced ReLU count by up to 3.5% (iso-ReLU count) and 3.5× (iso-accuracy), respectively.
The work extends an innovation, called CryptoNAS. Described in an earlier paper whose authors include Ghodsi and a third Ph.D. student, Akshaj Veldanda, CryptoNAS optimizes the use of ReLUs as one might rearrange how rocks are arranged in a stream to optimize the flow of water: it rebalances the distribution of ReLUS in the network and removes redundant ReLUs.
The investigators will present their work on DeepReDuce at the 2021 International Conference on Machine Learning (ICML) from July 18-24, 2021.