Domain Adaptation Theory and Algorithms

Lecture / Panel
For NYU Community


Mehryar Mohri
Courant Institute of Mathematical Sciences
New York University


Early learning theory and algorithms were designed for an ideal world. Modern-day large-scale data sets in search engine design, computational biology, natural language processing, computer vision, and many other applications do not comply with the ideal assumptions originally made: training instances are often poorly labeled, the sample can be biased, the distributions may drift with time, and the points may not be i.i.d. These issues must be addressed for learning to be effective; ignoring them often leads to poor performance. A new theory and new algorithms are required.

This talk will address the specific problem of single-source domain adaptation, which commonly arises when the distribution of the labeled source data differs somewhat from that of the target domain. It will present novel theoretical results for adaptation and provide algorithmic solutions derived from that theory. It will also report the results of several preliminary experiments.

Joint work with Corinna Cortes, Yishay Mansour, and Afshin Rostami.

About the Speaker

Mehryar Mohri is a professor of computer science at the Courant Institute of Mathematical Sciences at NYU and a research consultant at Google. His current topics of interest include machine learning, theory and algorithms, text and speech processing, and computational biology.

Prior to joining the Courant Institute, Mohri worked for about ten years at AT&T Research, formerly AT&T Bell Labs (1995-2004), where, in his last few years, he was the Head of the speech algorithms department and served as a Technology Leader, overseeing research projects in machine learning, text and speech processing, and the design of general algorithms. His work on speech recognition and that of his team form the algorithmic foundation for most spoken-dialog applications commercially deployed in the U.S. Professor Mohri is also the author of several software libraries widely used in research and academic institutions around the world.