Speaker: Philip Bohannon, Facebook
Each Facebook user can express his or her story and interests through the lens of a rich, structured entity graph of places, concepts, and things. With the recent announcement of Graph Search, Facebook users can also explore the graph through a natural-language query experience. Maximizing the breadth, depth, and connection quality of this graph improves the user experience of profile completion, check-in, liking, and tagging, and, of course, the quality of search results. In this talk, I will introduce the Facebook entity graph and describe a variety of the problems faced, and the technologies employed, in managing and searching the graph.
Managing quality in the entity graph exposes many familiar problems, such as entity matching, data categorization, normalization, and a variety of prediction tasks, all of which must be solved at an extreme scale. Two factors further complicate the picture. First, a significant fraction of nodes in the entity graph serve as communication channels, allowing individuals, businesses, and community groups to interact with individuals who have expressed an interest. Second, the use of the graph for self-expression makes it profoundly dynamic, as users create nodes and links in a variety of contexts on a daily basis. We will call out several ways that these differences affect the traditional definitions of these problems and outline current solution approaches, including an extensive crowdsourcing infrastructure for place data.
To let users explore the social and entity graph, Facebook recently introduced Graph Search, a structured query facility with a natural-language user interface and expressive semantics. I will briefly introduce the technology stack that serves Graph Search queries and point out some ways that the entity graph had to adapt to support high-quality query answers.
Philip Bohannon manages the Entities Team at Facebook, a combined engineering and applied-science team responsible for several aspects of data quality at Facebook, including entity matching, crowdsourcing, and attribute enrichment. Previously, he was a Sr. Principal Research Scientist at Yahoo Research, managing the Knowledge Management Research Team, and prior to that, a Member of Technical Staff at Bell Labs in Murray Hill, New Jersey. He received his PhD from Rutgers in 2000. He has published over 35 research papers and is currently interested in structured data extraction, human-database feedback loops, and entity matching.