Compression and Modern Data Processing

NYU Community Event

Speaker: Thomas Courtade

Host Faculty: Professor Elza Erkip


At first glance, modern applications of data processing -- such as clustering, querying, and search -- bear little resemblance to the classical Shannon-theoretic problem of lossy compression. However, the ultimate goal is the same for modern and classical settings; both demand algorithms which strike a balance between the complexity of the algorithm output and the utility that it provides. Thus, when we attempt to establish fundamental performance limits for these "modern" data processing problems, elements of classical rate distortion theory naturally emerge.

Inspired by the challenges associated with extracting useful information from large datasets, I will discuss compression under logarithmic loss. Logarithmic loss is a penalty function which measures the quality of beliefs a user can generate about the original data upon observing the compressor's output. In this context, we characterize the tradeoff between the degree to which data can be compressed and the quality of beliefs an end user can produce. Notably, our results for compression under logarithmic loss extend to distributed systems and yield solutions to two canonical problems in multiterminal source coding.

I will also briefly discuss recent work on compression for identification, where we seek to compress data in a manner that preserves the ability to reliably answer queries of a certain form. This setting stands in stark contrast to the traditional compression paradigm, where the goal is to reproduce the original data (either exactly or approximately) from its compressed form. Under certain assumptions on the data sources, we characterize the tradeoff between compression rate and the reliability at which queries can be answered.

About the Speaker

Thomas Courtade received the B.S. degree in Electrical Engineering from Michigan Technological University in 2007, and the M.S. and Ph.D. degrees in Electrical Engineering from UCLA in 2008 and 2012, respectively. In 2012, he was awarded the Inaugural Postdoctoral Research Fellowship through the Center for Science of Information. He currently holds this position, and resides at Stanford University. His recent honors include a Distinguished Ph.D. Dissertation award and an Excellence in Teaching award from the UCLA Department of Electrical Engineering and a Best Student Paper Award at the 2012 International Symposium on Information Theory.