Subverting Privacy-Preserving GANs: Hiding Secrets in Sanitized Images


This research was led by Siddharth Garg, Institute Associate Professor of electrical and computer engineering, and included Benjamin Tan, a research assistant professor of electrical and computer engineering, and Kang Liu, a Ph.D. student.  

Machine learning (ML) systems are being proposed for use in domains that affect our day-to-day lives, including facial expression recognition systems. Because of the need for privacy, users will look to privacy-preservation tools, typically produced by a third party. To this end, generative adversarial networks (GANs) have been proposed for generating or manipulating images. Versions of these systems called “privacy-preserving GANs” (PP-GANs) are designed to sanitize sensitive data (e.g., images of human faces) so that only application-critical information is retained while private attributes, such as the identity of a subject, are removed: for example, by preserving facial expressions while replacing other identifying information.
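
To make the setup concrete, the sketch below (an illustration only, not the architecture studied in the paper) treats a PP-GAN sanitizer as a learned image-to-image generator applied to private face images before they are shared; the Sanitizer module, layer sizes, and image shapes are assumptions made for this example.

```python
# Minimal sketch, assuming a PyTorch-style setup: a PP-GAN "sanitizer" is a
# learned image-to-image generator trained so its output keeps the useful
# attribute (e.g., facial expression) but no longer reveals identity.
# All module names and sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class Sanitizer(nn.Module):
    """Toy encoder-decoder generator standing in for a PP-GAN sanitizer."""
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
        )
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),  # 16x16 -> 32x32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),   # 32x32 -> 64x64
            nn.Tanh(),  # sanitized image with pixel values in [-1, 1]
        )

    def forward(self, x):
        # Expression-relevant content should survive; identity should not.
        return self.decode(self.encode(x))

# A (pretrained, possibly third-party) sanitizer is applied before images are shared.
sanitizer = Sanitizer().eval()
private_faces = torch.randn(8, 3, 64, 64)       # stand-in batch of face images
with torch.no_grad():
    sanitized_faces = sanitizer(private_faces)  # only these leave the user's device
```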

Such ML-based privacy tools have potential applications in other privacy-sensitive domains, such as removing location-relevant information from vehicular camera data, obfuscating the identity of a person who produced a handwriting sample, or removing barcodes from images. Because GANs are complex to train, achieving PP-GAN functionality in practice often means outsourcing GAN training to a third party.

To measure the privacy-preserving performance of PP-GANs, researchers typically use empirical metrics of information leakage to demonstrate the (in)ability of deep learning (DL)-based discriminators to identify secret information from sanitized images. Noting that such empirical metrics depend on the discriminators’ learning capacities and training budgets, Garg and his collaborators argue that these privacy checks lack the necessary rigor to guarantee privacy.
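
The sketch below illustrates, under assumed names, model sizes, and data shapes rather than the paper’s exact protocol, what such an empirical privacy check looks like in practice: an adversarial DL classifier is trained to predict the secret attribute (here, subject identity) from sanitized images, and accuracy close to chance is taken as evidence of privacy. As the authors note, that evidence is only as strong as the classifier’s capacity and training budget.

```python
# Hedged sketch of an empirical privacy check (all names and sizes are assumptions).
import torch
import torch.nn as nn

NUM_IDENTITIES = 10  # assumed number of subjects in the check

# DL-based discriminator that tries to recover the secret from sanitized images.
attacker = nn.Sequential(
    nn.Conv2d(3, 16, 4, 2, 1), nn.ReLU(),   # 64x64 -> 32x32
    nn.Conv2d(16, 32, 4, 2, 1), nn.ReLU(),  # 32x32 -> 16x16
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, NUM_IDENTITIES),
)
optimizer = torch.optim.Adam(attacker.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-ins for (sanitized image, true identity) pairs used by the check.
sanitized = torch.randn(64, 3, 64, 64)
identities = torch.randint(0, NUM_IDENTITIES, (64,))

for _ in range(5):  # a real check trains much longer and evaluates on held-out data
    optimizer.zero_grad()
    loss = loss_fn(attacker(sanitized), identities)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    accuracy = (attacker(sanitized).argmax(dim=1) == identities).float().mean().item()
# The check "passes" if accuracy stays near chance (1 / NUM_IDENTITIES) on held-out images.
print(f"attacker accuracy: {accuracy:.2f} (chance = {1 / NUM_IDENTITIES:.2f})")
```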

In the paper “Subverting Privacy-Preserving GANs: Hiding Secrets in Sanitized Images,” the team formulated an adversarial setting to “stress-test” whether empirical privacy checks are sufficient to guarantee protection against private data recovery from data that has been “sanitized” by a PP-GAN. In doing so, they showed that PP-GAN designs can, in fact, be subverted to pass privacy checks, while still allowing secret information to be extracted from sanitized images.

While the team’s adversarial PP-GAN passed all existing privacy checks, it actually hid secret data pertaining to the sensitive attributes, even allowing for reconstruction of the original private image. The researchers showed that these results have both foundational and practical implications, and that stronger privacy checks are needed before PP-GANs can be deployed in the real world.

“From a practical standpoint, our results sound a note of caution against the use of data sanitization tools, and specifically PP-GANs, designed by third parties,” explained Garg.

The study, which will be presented at the virtual 35th AAAI Conference on Artificial Intelligence, provides background on PP-GANs and associated empirical privacy checks; formulates an attack scenario to ask whether empirical privacy checks can be subverted; and outlines an approach for circumventing them.

  • The team provides the first comprehensive security analysis of privacy-preserving GANs and demonstrates that existing privacy checks are inadequate to detect leakage of sensitive information.

  • Using a novel steganographic approach, they adversarially modify a state-of-the-art PP-GAN to hide a secret (the user ID) in purportedly sanitized face images (a simplified illustration of such hiding appears after this list).

  • They show that their proposed adversarial PP-GAN can successfully hide sensitive attributes in “sanitized” output images that pass privacy checks, with a 100% secret recovery rate.
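
For intuition, the sketch below shows how a secret user ID can be hidden in, and exactly recovered from, an apparently sanitized image using classic least-significant-bit steganography. This is an illustrative stand-in only: the paper’s adversarial PP-GAN instead learns its own embedding inside the generator so that the hidden secret also survives the privacy checks; the functions and parameters here are assumptions.

```python
# Illustrative only: classic least-significant-bit (LSB) steganography, used here
# purely to show how an image that looks "sanitized" can still carry an exactly
# recoverable user ID. Not the paper's learned encoding; all parameters are assumptions.
import numpy as np

ID_BITS = 16  # assume user IDs fit in 16 bits

def embed_id(image_u8: np.ndarray, user_id: int) -> np.ndarray:
    """Hide user_id in the least-significant bits of the first ID_BITS pixel values."""
    stego = image_u8.copy().reshape(-1)
    bits = [(user_id >> i) & 1 for i in range(ID_BITS)]
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # overwrite only the lowest bit
    return stego.reshape(image_u8.shape)

def extract_id(stego_u8: np.ndarray) -> int:
    """Recover the hidden user ID from the least-significant bits."""
    flat = stego_u8.reshape(-1)
    return sum(int(flat[i] & 1) << i for i in range(ID_BITS))

# A stand-in "sanitized" face image and a secret user ID.
sanitized = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
secret_id = 4242

stego = embed_id(sanitized, secret_id)
assert extract_id(stego) == secret_id                                    # exact (100%) recovery
assert np.max(np.abs(stego.astype(int) - sanitized.astype(int))) <= 1    # visually indistinguishable
```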


“Our experimental results highlighted the insufficiency of existing DL-based privacy checks, and potential risks of using untrusted third-party PP-GAN tools,” said Garg in the study.