Design and Empirical Evaluation of Interactive and Interpretable Machine Learning

Lecture / Panel
For NYU Community

Speaker: Forough Poursabzi, Microsoft Research in NYC


Machine learning is ubiquitous in domains such as criminal justice, credit, lending, and medicine. Traditionally, machine learning models are evaluated based on their predictive performance on held-out data sets. However, to convince non-experts that these models are trustworthy and reliable in these critical domains, we need to go beyond the traditional setup, in which models are treated as black boxes that cannot be interacted with.

I will talk about my research on designing and evaluating machine learning systems that facilitate human interaction. I will start by introducing an interactive system that uses machine learning techniques such as topic models and active learning to help non-experts label document collections and make sense of them. I will demonstrate that effective interaction between users and machines leads to a better and faster understanding of the documents. Then, I will discuss the necessity of empirically evaluating interpretability with humans in the loop. I will introduce a framework for isolating and measuring the effect of different properties of models on people's behavior. I will walk through a series of large-scale, randomized, pre-registered experiments that examine the effect of the number of input features and of model transparency on people's behavior during a specific task. Our findings emphasize the importance of studying how models are presented to people and empirically verifying that interpretable models achieve their intended effects on end users.


Forough is a post-doctoral researcher at Microsoft Research in New York City. She works in the interdisciplinary area of interpretable and interactive machine learning. Forough collaborates with psychologists to study human behavior when interacting with machine learning systems. She uses these insights to design machine learning models that humans can use effectively. She is also interested in several aspects of fairness, accountability, and transparency in machine learning and their effect on people's decision-making process. Forough holds a BE in computer engineering from the University of Tehran and a PhD in computer science from the University of Colorado at Boulder.