Ph.D. Preliminary Oral Exam: Masoud Nosrati

Ph.D. Preliminary Oral Exam: Masoud Nosrati

Oct 13, 2022 - 8:30 AM
to , -

Speaker:Masoud Nosrati

Leveraging Untrustworthy Cloud to Build Authentication Data Structures for Analytic Queries

This research studies the problem of extending users with the capability to verify if the result of an analytic query they receive from a potentially untrustworthy cloud is indeed correct. Here analytic query refers to the broad category of queries such as top-k query that involve scoring functions in query processing. Enabling users to perform query result authentication is crucial to data owners who consider clouds a cost-effective solution for large-scale data management but are concerned of their trustworthiness in query processing. Given a set of records and their scoring functions defined on a domain D, it has been shown that a data owner can partition D into a number of subdomains, sort the records based on their scores, and then build an authentication data structure (ADS), by which a cloud can construct a verification object to prove the correctness of analytic query result. Nevertheless, the existing work has largely ignored the costs of subdomain computation and verification object construction. To address this problem, we have developed a new ADS called Intersection and Function Merkle Hash-tree (IFMH-tree), an integration and extension of the concepts of Intersection-tree (I-tree) and Merkle Hash-tree (MH-tree). Our performance study, through both analysis and experiments, shows that the IFMH-tree supports highly efficient construction of verification objects. This technique, however, is preliminary at its best, because the IFMH-tree is prohibitively expensive to build. Given a database of n records ranked by d-variable linear functions, building such a tree takes O(n^(2d)) × O(n^k) of CPU time and needs storage that is O(n^(2d+2)) times of the original database size. Such resource demand is a major challenge even to cloud service providers, let alone the data owners who opt for outsourcing because of the lack of sufficient computing capability. 

The goal of this research is to enable data owners with minimal resources to build the ADS for analytic queries. The main cost incurred building the ADS is due to the fact that the process involves large numbers of feasibility checking, a procedure that takes a pair of functions and a domain as its input and determines if the two functions create an intersection that partitions the domain into two subdomains. Intuitively, finding solutions to systems of equations and constraints usually take significantly more computation than verifying the found solutions themselves. In light of this observation, we propose to investigate a new concept called Verifiable ADS (V-ADS), which is to have the cloud to perform most computation and the data owner to verify the computation results. Using V-ADS, we expect to develop two query result verification techniques, baseline and advanced. The former offloads most computation from the data owner to cloud to reduce the CPU cost from O(n^(2d)) × O(n^k) to O(n^(2d)) × O(1), whereas the latter reduces the storage and network cost from O(n^(2d)) + O(n^(d+2)) to at most O(n^(2d)). We plan to implement the two solutions and study the performance via testing real-world and simulated data.

Committee: Ying Cai (major professor), Qi Li, Wallapak Tavanapong, Cassandra Jo Dorius, and Shawn Fletcher Dorius

Join on Zoom: https://iastate.zoom.us/j/2763123224?pwd=UVdhTVRWMUZrSnBVUDlWRzBCcWNlUT09