Bootstrap and Uncertainty Propagation: New Theory and Techniques in Approximate Query Processing
Date: 2016/9/14

Speaker: Kai Zeng

Time: Sept. 14, 2:30pm - 3:30pm.

Location: Room 410, Teaching Center


Sampling is one of the most commonly used techniques in Approximate Query Processing (AQP), an area of research made more critical by the need for timely and cost-effective analytics over “Big Data”. The sheer volume of data and the complexity of analytics pose new challenges to sampling-based AQP and call for innovations in several technical aspects. These include: How can we estimate the errors of general SQL queries with ad hoc user-defined functions when they are computed on samples? How can approximate query results be better presented to the user? How can database engines be built to better suit approximate query processing?


In this talk, I will present a series of my work that answers the questions above. We will see: (1) Bootstrap, a fully automated statistical technique, can be fully integrated with relational algebra theory and database systems to provide accuracy estimation for general OLAP queries. (2) By combining bootstrap error estimation with a novel uncertainty propagation theory, OLAP query processing can shift to an incremental execution engine that provides a smooth trade-off between query accuracy and latency and fulfills a full spectrum of user requirements, from approximate but timely query execution to more traditional, accurate query execution.
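
For readers unfamiliar with the bootstrap, the following is a minimal sketch of the general idea behind point (1): resample a sample with replacement, recompute the aggregate on each resample, and read an error interval off the spread of the results. It is a textbook percentile bootstrap in plain Python, not the speaker's system or its integration with relational algebra. The function name `bootstrap_error`, its parameters, and the synthetic data are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_error(sample, aggregate, n_resamples=1000, confidence=0.95, seed=0):
    """Return (point_estimate, lower, upper) for aggregate(sample).

    Hypothetical helper: a plain percentile bootstrap, assumed here only
    to illustrate error estimation for an aggregate computed on a sample.
    """
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Draw n rows with replacement from the sample and re-run the aggregate.
        resample = rng.choices(sample, k=n)
        estimates.append(aggregate(resample))
    estimates.sort()
    # Take the empirical quantiles of the resampled estimates as the interval.
    alpha = (1.0 - confidence) / 2.0
    lower = estimates[int(alpha * (n_resamples - 1))]
    upper = estimates[int((1.0 - alpha) * (n_resamples - 1))]
    return aggregate(sample), lower, upper


# Synthetic stand-in for a large table column (illustrative data only).
data_rng = random.Random(42)
population = [data_rng.gauss(100.0, 15.0) for _ in range(100_000)]

# An AQP engine would answer the query on a small uniform sample instead.
sample = data_rng.sample(population, 1_000)

# Approximate answer to something like SELECT AVG(x), with an error interval.
point, lower, upper = bootstrap_error(sample, statistics.fmean)
print(f"AVG is approximately {point:.2f}, 95% interval [{lower:.2f}, {upper:.2f}]")
```

Because `aggregate` is an arbitrary function of the resampled rows, the same loop works unchanged for aggregates that have no closed-form error formula, which is the property that makes the bootstrap attractive for queries with user-defined functions.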



Kai Zeng is a senior scientist at the Cloud and Information Services Lab, Microsoft. His research interests lie in large-scale data-intensive systems. He received his PhD in databases from UCLA in 2014 and previously worked as a postdoctoral researcher at the AMPLab, UC Berkeley. He has won several awards, including the SIGMOD 2012 Best Paper Award and the SIGMOD 2014 Best Demo Award.

SIST-Seminar 16067