Innovation @ HP Labs

Insights on research, innovation, and emerging technology from HP Labs researchers around the world.

Learn more at www.hpl.hp.com

BlinkDB: a query engine for making perfect decisions in the absence of perfect answers

Contributed by Vanish Talwar, Principal Research Scientist, and Indrajit Roy, Senior Researcher in HP Labs’ Intelligent Infrastructure Lab

 

Editor’s note: This is another in a series of posts featuring research projects presented by visiting speakers at HP Labs Palo Alto.

 

BlinkDB is an ongoing project at UC Berkeley’s AMPLab that supports queries with bounded errors and bounded response times on very large data sets. Sameer Agarwal, the lead graduate student on the project, recently gave a talk on BlinkDB at HP Labs. Sameer is a fourth-year PhD student advised by Prof. Ion Stoica. He and his team have implemented BlinkDB as a sampling-based approximate query engine. It maintains a set of multi-dimensional, multi-resolution samples of the original data, which are updated over time. When a query comes in, BlinkDB generates an error-latency profile for the query across different sample sizes and uses it to dynamically select an appropriately sized sample on which to execute the query. Sameer presented various results showing the effectiveness of BlinkDB. In particular, on a 17 TB trace, BlinkDB was 100x faster than Hive with an error of 2-10%.
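To make the idea of picking "an appropriately sized sample" concrete, here is a minimal Python sketch. It is not BlinkDB's code or API: the function names, the nested uniform samples, and the normal-approximation confidence interval are all assumptions chosen for illustration. It estimates an average on progressively larger samples and stops at the first one whose estimated relative error meets the requested bound.

```python
import math
import random
import statistics


def build_samples(data, sizes, seed=0):
    """Draw nested uniform samples of increasing size.

    Hypothetical stand-in for multi-resolution samples; the real system's
    sampling strategy is not described in this post.
    """
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sorted(sizes)}


def approx_mean(sample, z=1.96):
    """Return (estimate, relative half-width of an approximate 95% interval).

    Assumes the sample mean is non-zero and the sample has at least two rows.
    """
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, z * stderr / abs(mean)


def answer_within_error(samples, max_rel_error):
    """Use the smallest sample whose estimated error meets the bound.

    Falls back to the largest sample if none qualifies.
    """
    chosen = None
    for n in sorted(samples):
        estimate, rel_error = approx_mean(samples[n])
        chosen = (n, estimate, rel_error)
        if rel_error <= max_rel_error:
            break
    return chosen


# Synthetic data: 100,000 values around 100 with a standard deviation of 20.
data_rng = random.Random(42)
data = [data_rng.gauss(100.0, 20.0) for _ in range(100_000)]
samples = build_samples(data, sizes=[100, 1_000, 10_000])

size_used, estimate, rel_error = answer_within_error(samples, max_rel_error=0.01)
print(f"used {size_used} rows, estimate={estimate:.2f}, rel_error={rel_error:.4f}")
```

In this toy setup a 1% error bound is only met by the largest of the three samples, while a looser bound would be satisfied by a smaller and therefore faster one.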

 

This approach supports interactive queries and is attractive in scenarios where perfect answers are not always needed. In such cases, approximation can return an answer within a user-specified time bound, a user-specified error bound, or both, trading query accuracy against response time.
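The time-bound side of this tradeoff can be sketched in the same spirit. The Python below is illustrative only: the linear latency model, the per-row cost, the fixed overhead, and the coefficient of variation are assumed numbers, not measurements from BlinkDB. It picks the largest sample predicted to finish within the budget and reports the error one would roughly expect for an average at that size.

```python
import math


def predicted_latency(n_rows, seconds_per_row, fixed_overhead_s):
    """Assumed linear cost model: fixed startup cost plus a per-row scan cost."""
    return fixed_overhead_s + n_rows * seconds_per_row


def largest_sample_within(time_budget_s, sample_sizes, seconds_per_row,
                          fixed_overhead_s=0.0):
    """Return the largest sample size predicted to fit the budget, or None."""
    feasible = [
        n for n in sample_sizes
        if predicted_latency(n, seconds_per_row, fixed_overhead_s) <= time_budget_s
    ]
    return max(feasible) if feasible else None


def expected_rel_error(n_rows, coeff_of_variation, z=1.96):
    """Approximate relative error of a mean estimated from n_rows uniform rows."""
    return z * coeff_of_variation / math.sqrt(n_rows)


# Hypothetical sample sizes and cost parameters.
sizes = [10_000, 100_000, 1_000_000]
chosen = largest_sample_within(0.5, sizes, seconds_per_row=2e-6,
                               fixed_overhead_s=0.05)
error = expected_rel_error(chosen, coeff_of_variation=0.2)
print(f"budget 0.5 s -> {chosen} rows, expected relative error ~{error:.4f}")
```

Under these assumed numbers, a tighter budget forces a smaller sample and a larger expected error, and a budget smaller than the fixed overhead leaves no feasible sample at all.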

 

BlinkDB is available as open source at http://blinkdb.org. It is an ongoing research project, and more details are available on the project’s website.

The opinions expressed above are the personal opinions of the authors, not of HP.