M.S. Final Oral Exam: Urjoshi Sinha

Urjoshi Sinha
Monday, December 19, 2022 - 9:00am
Atanasoff Hall

Analyzing the Impact of Configurations in Data-driven Software Applications

Highly configurable systems, which offer users a large array of options to choose from, come with many challenges. Users of these software systems often want to optimize a particular objective, such as improving a functional outcome or increasing system performance. However, many applications today are data-driven, meaning they depend on inputs or data that can be complex and varied. Examples of data-driven systems include tools such as search engines and scientific software such as DNA alignment tools. In any data-driven system, the output of the application is directly influenced by the data fed to it, which in turn depends on the end user's objectives. For instance, in a search engine the output depends on the search query string, that is, on how the user frames the query. Hence, when trying to optimize these systems, the optimization search must be run (and re-run) for every input, making optimization a heavyweight and potentially impractical process. The second part of the problem is the challenge of devising effective ways to test these large configuration spaces, given their data-driven nature. In this work, we explore these issues in data-driven, highly configurable scientific applications in two domains: bio-informatics and cyber-physical systems.

In the first part of this thesis, we ask whether it is possible to use an optimization algorithm to find configurations that improve functional objectives. We also look for patterns of best configurations across all input data and examine whether sampling can be used to approximate the results. For the data-driven application studied in the bio-informatics domain, we find that the default configuration is best only 34% of the time, while clear patterns of other best configurations emerge. We also find evidence that it is possible to use lightweight optimization approaches for this problem. Finally, we demonstrate that sampling the input data helps find patterns at a lower cost.

The second part of this thesis studies configurable autopilot tools used in cyber-physical systems such as drones. We present an approach to exploring configurability in autopilot tools using model-based testing, with feature modeling and classification trees among its key components. Using our method, we discover that it is possible to detect patterns of poor or sub-optimal configurations for a given user scenario. This work thus aims to present efficient approaches for studying data-driven configurable tools and optimizing configuration spaces with regard to specific user objectives.

Committee: Myra Cohen (major professor), Samik Basu, and Jennifer Newman

Join on Zoom: https://iastate.zoom.us/j/93261626832, or go to https://iastate.zoom.us/join and enter meeting ID 932 6162 6832