Building a Computational Toolkit for Data Science

Monday, March 5, 2018 - 10:00am EST

Speaker: Christopher Musco, MIT


      Access to increasingly massive datasets is a primary driver of advances in machine learning and data science. However, while more data allows us to fit richer models and detect more complex patterns, it presents a major computational challenge for even basic statistical methods.
      In this talk, I will discuss my efforts to address this challenge by designing highly scalable, general purpose algorithmic tools for processing large datasets. I will highlight my research on fast dimensionality reduction algorithms, which accelerate statistical methods by first compressing data to a small set of highly informative features. By building a deeper theoretical understanding of dimensionality reduction, this work has led to faster algorithms for fundamental problems like linear regression, principal component analysis, and clustering. These algorithms are provably accurate, simple to implement, and effective in practice.
      I will also discuss the important challenge of understanding dimensionality reduction for nonlinear methods in machine learning. I will present exciting open directions and recent progress on accelerating some of these powerful techniques. Finally, I will explain how research on algorithmic tools can have impact beyond faster computation, informing how we explore, interpret, and ultimately learn from large datasets.
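
      As a rough illustration of the compress-first idea in the abstract, the sketch below applies a Gaussian random projection (in the style of the Johnson-Lindenstrauss lemma) to a few high-dimensional points and checks how well pairwise distances survive. This is a generic textbook technique chosen for illustration and is not necessarily one of the algorithms presented in the talk. The sizes `d`, `k`, and `n`, the synthetic data, and all function names are assumptions made for this example.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """Return a k x d matrix with i.i.d. Gaussian entries of variance 1/k."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Compress a length-d vector x to length k (matrix-vector product)."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distortion_ratios(points, matrix):
    """Ratio of compressed distance to original distance for every pair."""
    compressed = [project(matrix, p) for p in points]
    ratios = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            original = distance(points[i], points[j])
            reduced = distance(compressed[i], compressed[j])
            ratios.append(reduced / original)
    return ratios


# Hypothetical sizes: 10 synthetic points in 1000 dimensions, compressed to 200.
rng = random.Random(0)
d, k, n = 1000, 200, 10
points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
matrix = random_projection_matrix(d, k, rng)

ratios = distortion_ratios(points, matrix)
print(f"distance ratios range from {min(ratios):.3f} to {max(ratios):.3f}")
```

      Ratios close to 1 mean the compressed points keep roughly the same geometry as the originals, which is why distance-based methods such as clustering can often be run on the smaller representation. The projection here ignores the data entirely; methods tailored to a specific problem may compress more aggressively.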

      Christopher Musco is a Ph.D. candidate in computer science at MIT, advised by Professor Jonathan Kelner. He studies scalable algorithms for core problems in data science and machine learning, with a particular interest in large-scale linear algebra and graph processing. His work is interdisciplinary, combining tools from theoretical computer science, scientific computing, optimization, and statistics to develop state-of-the-art algorithms in both theory and practice. Christopher received his undergraduate degree in Computer Science and Applied Mathematics from Yale University.