M.S. Final Oral Exam: Kun Ji

Event
Speaker: 
Kun Ji
Wednesday, November 10, 2021 - 9:00am
Event Type: 

Faster Linear-Inequality Feasibility Checking Algorithms for Intersection-Tree Construction

Rank-aware queries let users specify functions to score the records in a relation. An input to the functions produces a score for each record, which can be used for purposes such as ranking. While this feature makes rank-aware queries essential to a whole range of applications such as data analysis, the involvement of scoring functions makes them difficult to process. For example, to find the record at a particular rank under a function input, the current common approach is to compute the score for every record and sort them out.

Our prior research discovers that given a set of records and their scoring functions, the domain of these functions can be partitioned into a number of disjointed subdomains such that the functions can be sorted according to their scores in each subdomain. This theorem, referred to as rank con- sistency, suggests that we can compute the subdomains and sort the records for each subdomain to support highly efficient query processing. In light of this, we have designed a novel structure called Intersection-tree (I-tree), which computes the subdomains and and sorts the functions during the process of the tree construction. Unlike other techniques that are designed for only one particular query type, the I-tree can support various types of rank-aware queries such as top k and range queries. Our extensive evaluation also confirms the efficiency of the I-tree in query processing – it is more efficient than other techniques when compared head to head.

The I-tree, unfortunately, is computation-intensive to construct for a large number of records. A set of n d-variable linear functions, each used to score one record, can create a total of O(n2) in- tersections, which together can partition their domain into O(n2d) subdomains. These subdomains are computed and indexed while the intersections are inserted one by one to the I-tree. A major computation in this insertion process is the feasibility checking, i.e., given a set of inequalities I and an equation f(X) = 0, check if there exists two inputs X and X0 such that they satisfy all inequalities in I, f(X) < 0 and f(X0) > 0. The original implementation applies the classic Sim-plex algorithm to perform these checkings in two separate steps and thus incur a large amount of repetitive computations. In this thesis, we study the characteristics of the I-tree and propose two approaches to minimize of cost of tree construction: 1) reduce the complexity of feasibility checking, and 2) reuse computation results for next feasibility checking along the insertion path. We implement the proposed methods and evaluate their performance under various settings. Our extensive results show the new approaches outperform the original one to a large extent.

Committee: Ying Cai (major professor), Wallapak Tavanapong, and Qi Li

Join on WebEx: https://iastate.webex.com/iastate/j.php?MTID=m6b85c9e5d26c55de63f4a19296...