Every time you surf the web, check your credit card balance, or even sign up for a class at UC Merced, you’re using a relational database.
Relational databases sit behind most software applications, storing information and returning the results that match your search, or query. Getting those results back faster, a task known as query optimization, is the focus of computer science and engineering Professor Florin Rusu and his third-year graduate students Yesdaulet Izenov and Asoke Datta.
“Every single company that allows you to create an account is likely using a relational database. That’s pretty much how things work on the internet,” Rusu said. “The information you’re seeing, that’s displayed to you, corresponds to what you searched for, but you only see what appears on your screen. What is happening under the hood is that information is extracted from a relational database.”
Working in code, or “under the hood,” is Rusu’s specialty. He and his graduate students look for ways to return query results faster when people use databases. Rusu received a $499,870 grant from the National Science Foundation last fall to fund his research on COMPASS, a novel query optimization paradigm that uses sketch data synopses.
“Our work is to optimize processing time by using a special type of data structure called sketches,” Izenov explained. Sketches compress large amounts of data into small arrays of randomized counters, and they can be built very quickly.
Queries vary in complexity. For instance, if you’re using a database to search for an engineering class at a specific time in a specific part of campus, the database must process three pieces of data and combine, or join, them before displaying the results you see. That processing takes time, and adding more data sources makes the database work harder. The model Rusu’s team created processes the joins with Fast-AGMS sketches, making the sketches the only statistic the database must contend with, which improves both the speed and accuracy of results.
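To make the idea of sketch-based joins concrete, here is a minimal Python illustration of how a Fast-AGMS-style sketch can estimate how many rows a join would produce without performing the join. It is a simplified example under stated assumptions, not the COMPASS code: the class name `FastAGMSSketch`, the `rows` and `buckets` defaults, and the BLAKE2-based hashing are all hypothetical choices made for this example.

```python
import hashlib
from statistics import median


def _hash(key, seed, salt):
    # Deterministic 64-bit hash of (salt, seed, key). Assumption: this stands in
    # for the independent hash families a real implementation would use.
    data = f"{salt}:{seed}:{key}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


class FastAGMSSketch:
    """Hypothetical, simplified Fast-AGMS-style sketch of one join column."""

    def __init__(self, rows=5, buckets=64):
        self.rows = rows
        self.buckets = buckets
        # A small, fixed-size grid of counters, however large the input is.
        self.counters = [[0] * buckets for _ in range(rows)]

    def add(self, key, count=1):
        # Each row hashes the key to one bucket and adds +count or -count there.
        for r in range(self.rows):
            bucket = _hash(key, r, "bucket") % self.buckets
            sign = 1 if _hash(key, r, "sign") & 1 else -1
            self.counters[r][bucket] += sign * count

    def join_size(self, other):
        # Both sketches must share dimensions (and therefore hash functions).
        if (self.rows, self.buckets) != (other.rows, other.buckets):
            raise ValueError("sketches must have the same dimensions")
        # Per row: dot product of the two counter arrays; then take the median.
        per_row = [
            sum(a * b for a, b in zip(row_a, row_b))
            for row_a, row_b in zip(self.counters, other.counters)
        ]
        return median(per_row)


# Usage: sketch the course-ID column of two hypothetical tables.
enrolled = FastAGMSSketch()
offered = FastAGMSSketch()
for course_id in [101, 101, 101]:
    enrolled.add(course_id)
for course_id in [101, 101, 101, 101]:
    offered.add(course_id)
print(enrolled.join_size(offered))  # 12 here, because only one key is involved
```

With a single shared key the estimate is exact (3 × 4 matching pairs). With many distinct keys, different keys can land in the same bucket, so the result becomes an approximation whose quality depends on the number of buckets and rows.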
“The way you access the data can be a problem. You want to find an optimal access plan that makes your query fast in a quick amount of time. That’s our goal,” Datta said.
COMPASS operates 1.35 to 11.28 times faster than four other well-known database systems and is available on GitHub, a code-hosting platform where developers share software with one another.
Rusu published his findings with Izenov, Datta and UC Merced alumnus Jun Hyung Shin in the paper “COMPASS: Online Sketch-based Query Optimization for In-Memory Databases.” They also presented their findings at ACM SIGMOD, the International Conference on Management of Data, in June.