Get Another Label? Improving Data Quality and Machine Learning Using Multiple, Noisy Labelers

Lecture / Panel
For NYU Community

Speaker: Foster Provost, NYU Stern School of Business



I will discuss the repeated acquisition of "labels" for data items when the labeling is imperfect. Labels are values provided by humans for specified variables on data items, such as "PG-13" for "Adult Content Rating on this Web Page." With the increasing popularity of micro-outsourcing systems, such as Amazon's Mechanical Turk, it is often possible to obtain less-than-expert labeling at low cost. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus especially on the improvement of training labels for supervised induction. We present repeated-labeling strategies of increasing complexity, and show several results, including: (i) repeated labeling can improve label quality and model quality (per unit data-acquisition cost), but not always; (ii) simple strategies can give considerable advantage; and (iii) carefully selecting the set of data points for labeling does even better (we present and evaluate several techniques). The bottom line: the results show clearly that when labeling is not perfect, selective acquisition of multiple labels is a strategy that data modelers should have in their repertoire. I illustrate the results with a real-life application from on-line advertising: using Mechanical Turk to help classify web pages as being objectionable to advertisers.

This is joint work with Panos Ipeirotis, Victor S. Sheng, and Jing Wang.
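
As a rough illustration of result (i) in the abstract, the sketch below computes how often a simple majority vote over several noisy labels is correct. It is not taken from the talk: majority voting, binary labels, independent labelers, and a shared per-label accuracy p are all assumptions made here for illustration, and the function name majority_vote_quality is hypothetical.

```python
from math import comb


def majority_vote_quality(p: float, n: int) -> float:
    """Probability that the majority of n labels is correct.

    Assumptions (illustrative only, not from the talk): binary labels,
    independent labelers, each label correct with the same probability p,
    and an odd n so that ties cannot occur.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number of labels")
    # Sum the binomial probabilities of more than half the labels being correct.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    # Compare labelers who are worse than, equal to, and better than chance.
    for p in (0.4, 0.5, 0.7):
        qualities = [round(majority_vote_quality(p, n), 3) for n in (1, 3, 5, 11)]
        print(f"p={p}: {qualities}")
```

Under these assumptions, extra labels raise the quality of the voted label only when p is above 0.5. At p = 0.5 they leave it unchanged, and below 0.5 they make it worse, which is one simple way to read "but not always."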


Foster Provost is Professor, NEC Faculty Fellow, and Paduano Fellow of Business Ethics (Emeritus) at the NYU Stern School of Business. He just retired as Editor-in-Chief of the journal Machine Learning, and in 2001 he co-chaired the program of the ACM KDD conference. He is Chief Scientist for Coriolis Ventures, a NYC-based early stage venture and incubation firm. His publications include many papers on the focused intervention of human resources for data mining. He was an organizer of the First and Second Workshops on Human Computation (HCOMP), which focused on various forms of crowdsourcing and micro-outsourcing. His other main research interest these days is predictive modeling with social network data, for which he won the 2009 INFORMS Design Science Award. Foster has applied these ideas in practice to applications including on-line advertising, fraud detection, network diagnosis, targeted marketing, and counterterrorism.