Effective and Efficient Information Access Systems for Making Sense of Big Data

Lecture / Panel
For NYU Community

Speaker: Jimmy Lin, University of Maryland

It would not be an exaggeration to say that big data has transformed many aspects of science, engineering, and commerce over the past few years. In this context, my recent work is driven by two questions:

  1. How do we build effective and efficient information access systems that help users make sense of massive quantities of data?
  2. How do we build scalable infrastructure to support these applications?

In this talk, I will present a number of search techniques that attempt to deliver high-quality results and to do so quickly. These techniques span a broad spectrum of the software stack, from low-level architectural optimizations that minimize branch mispredictions and use cache-conscious memory layouts to "higher-up" machine-learning approaches for inducing ranking models that are sensitive to execution cost. Many of these techniques are inspired by real-world problems and production systems, from which my students and I develop generalized solutions in a rigorous manner.

I will conclude this talk with some thoughts about the future of big data and the evolving role of academic and industrial research.


Jimmy Lin is an Associate Professor and the Associate Dean of Research in the College of Information Studies (The iSchool) at the University of Maryland, with a joint appointment in the Institute for Advanced Computer Studies (UMIACS) and an affiliate appointment in the Department of Computer Science. He graduated with a Ph.D. in Electrical Engineering and Computer Science from MIT in 2004. Lin's research lies at the intersection of information retrieval and natural language processing; his current work focuses on large-scale distributed algorithms and infrastructure for data analytics. From 2010 to 2012, Lin spent an extended sabbatical at Twitter, where he worked on services designed to surface relevant content to users and on analytics infrastructure to support data science.