Interactive Online Data Exploration and Analytics
Speaker: Feifei Li, University of Utah
With the continued growth of data from an ever-increasingly connected world, an important challenge is to support interactive exploration over big data. But retrieving all records that satisfy a query condition can be expensive for big data, and the CPU and communication cost of performing an analytical task on top of that adds further overhead. As a result, waiting for exact analytical results may not be an option, especially if users want to explore data interactively. We propose a system with a knob that allows users to adjust the trade-off between query cost and approximation quality. Furthermore, this knob is online: users can tune it in real time as needed. We achieve this objective by introducing a data summary layer that provides mergeable and interactive data summaries. A basic approach is to produce online random samples for arbitrary queries and turn these samples into online estimators for various analytical tasks. We show how to do this efficiently and effectively for big data. In addition, we will briefly review our work on building main-memory database systems that support ACID-style OLTP transactions, and on a secure analytical engine for data analytics over encrypted data.
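As a rough illustration of the idea of turning online random samples into an online estimator, the following Python sketch keeps a running mean with an approximate 95% confidence interval and stops once the interval is narrow enough. It is not the system described in the talk: the names OnlineMeanEstimator and estimate_mean, the rejection-sampling loop over an in-memory list, and the target_half_width parameter (standing in for the "knob") are all assumptions made for this example.

```python
import math
import random


class OnlineMeanEstimator:
    """Running mean and variance over a stream of samples (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        """Return a new summary equivalent to having seen both sample sets."""
        merged = OnlineMeanEstimator()
        merged.n = self.n + other.n
        if merged.n == 0:
            return merged
        delta = other.mean - self.mean
        merged.mean = self.mean + delta * other.n / merged.n
        merged.m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / merged.n
        return merged

    def half_width(self, z=1.96):
        """Half-width of an approximate confidence interval (normal approximation)."""
        if self.n < 2:
            return math.inf
        variance = self.m2 / (self.n - 1)
        return z * math.sqrt(variance / self.n)


def estimate_mean(records, predicate, target_half_width, max_draws=100_000,
                  min_samples=30, rng=None):
    """Hypothetical helper: sample records uniformly with replacement, keep the
    ones matching the query predicate, and stop when the interval is tight enough."""
    rng = rng or random.Random()
    est = OnlineMeanEstimator()
    for _ in range(max_draws):
        x = records[rng.randrange(len(records))]
        if predicate(x):
            est.add(x)
            if est.n >= min_samples and est.half_width() <= target_half_width:
                break
    return est
```

A smaller target_half_width buys a tighter estimate at the cost of more samples, which is the cost/quality trade-off the abstract describes. The merge method illustrates, under the same assumptions, what "mergeable" means here: summaries built over separate samples can be combined without revisiting the data.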
Bio:
Feifei Li is an Associate Professor at the School of Computing, University of Utah. He obtained his B.S. in computer engineering from Nanyang Technological University, Singapore in 2002 (transferred from Tsinghua University, China) and his PhD in computer science from Boston University in 2007. His research focuses on data analytics, systems, and security problems in databases and big data management. He received an NSF CAREER Award in 2011, two HP IRP awards in 2011 and 2012 respectively, a Google App Engine award, the IEEE ICDE best paper award in 2004, and the IEEE ICDE 10+ Years Most Influential Paper Award in 2014. He is/was the demo PC chair for VLDB 2014, general co-chair for SIGMOD 2014, PC area chair for ICDE 2014 and SIGMOD 2015, and an associate editor for IEEE TKDE and the Journal of Computer Science and Technology. His research is actively supported by NSF, NSFC (NSF China), local and state government agencies, and industry. He is currently the PI or co-PI for 5 NSF projects totaling nearly 5 million dollars; in particular, he is the PI for two NSF projects on big data systems and foundations totaling over 2 million dollars.