Speaker: Kenny Zhu, Shanghai Jiao Tong University
There is a massive amount of human language text, both online and offline. Such text is an extremely rich source of information that enables applications in many domains. Structured knowledge is crucial to understanding text. My research has focused on extracting structured knowledge from unstructured text and then using that knowledge for text understanding and other natural language processing tasks. This talk introduces two of my projects in this endeavor.
In the first part, I present our solution to a generalized word disambiguation problem called Wikification. Wikification converts a piece of free text into a Wikipedia-style document, with links from noun-phrase terms in the free text to their corresponding Wikipedia articles. Our key insight is to iteratively enrich the links in the original Wikipedia corpus to obtain comprehensive link co-occurrence knowledge, and then to use that knowledge to disambiguate terms in new text. In the second part, I present a method that automatically extracts verbs and their arguments from a large text corpus and generalizes the arguments into a number of abstract concepts, thereby creating an "action concept" lexicon for all verbs. Examples of action concepts include "play game", "play character", "wear jewelry", and "wear color". Such action concepts provide more refined semantics than the semantic roles in FrameNet, while still allowing generalization across different verb uses and enabling automatic computation in many applications.
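To make the Wikification idea from the first part concrete, here is a minimal sketch of disambiguation by link co-occurrence. It is an illustration under stated assumptions, not the method presented in the talk: the function names, the toy corpus, and the scoring rule (sum of co-occurrence counts with unambiguous context links) are all hypothetical, and the iterative link-enrichment step is not modeled.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(linked_docs):
    """Count how often two article titles are linked in the same document."""
    counts = Counter()
    for links in linked_docs:
        # Sort so that each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(links)), 2):
            counts[(a, b)] += 1
    return counts


def cooc(counts, a, b):
    """Look up the co-occurrence count of an unordered pair (0 if unseen)."""
    return counts[tuple(sorted((a, b)))]


def disambiguate(mentions, candidates, counts):
    """For each mention, pick the candidate article that co-occurs most
    often with the articles of the unambiguous mentions in the same text."""
    context = [candidates[m][0] for m in mentions if len(candidates[m]) == 1]
    result = {}
    for m in mentions:
        result[m] = max(
            candidates[m],
            key=lambda c: sum(cooc(counts, c, x) for x in context if x != c),
        )
    return result


# Toy corpus (hypothetical): each inner list holds the link targets of one document.
linked_docs = [
    ["Apple Inc.", "IPhone", "Steve Jobs"],
    ["Apple Inc.", "Steve Jobs"],
    ["Apple", "Fruit", "Orchard"],
]
# Hypothetical candidate articles for each term found in the new text.
candidates = {
    "apple": ["Apple", "Apple Inc."],
    "Steve Jobs": ["Steve Jobs"],
}
mentions = ["apple", "Steve Jobs"]

counts = build_cooccurrence(linked_docs)
result = disambiguate(mentions, candidates, counts)
print(result)
```

In this toy setting the ambiguous term "apple" resolves to the article that is most often linked alongside "Steve Jobs". A richer link corpus, as the abstract suggests, would give such counts better coverage.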
Kenny Q. Zhu is a Distinguished Research Professor (regular track) in the Department of Computer Science and Engineering at Shanghai Jiao Tong University. He graduated with a B.Eng (Hons) in Electrical Engineering in 1999 and a PhD in Computer Science in 2005 from the National University of Singapore. He was a postdoctoral researcher and lecturer at Princeton University from 2007 to 2009. Prior to that, he was a software design engineer at Microsoft in Redmond, WA. From Feb 2010 to Aug 2010, he was a visiting professor at Microsoft Research Asia in Beijing. Kenny's main research interests include information extraction, knowledge discovery, and domain-specific languages. His research has been supported by NSF China, MOE China, Microsoft, Google, Oracle, and AstraZeneca. He is also the winner of a 2013 Google Faculty Research Award.