Ph.D. Final Oral Exam: Ibrahim Mesecan
Speaker: Ibrahim Mesecan
A Framework to Adapt Genetic Improvement to New Domains
Finding and fixing software faults is a challenging and expensive process. Companies spend billions of dollars on this process every year. To help reduce these expenses, numerous tools have been developed to automatically detect and fix software issues. Automated Program Repair (APR) tools, in particular, have become a focus of research and are increasingly being adopted in the industry.
However, common APR tools make several assumptions. For example, they assume that there is a fault somewhere in the target program and that it was previously detected, giving us a set of both passing and failing test cases. Many common genetic mutation operators also assume that a solution can be found within the target program. These assumptions, however, may not hold in new domains. For example, information can leak from a program without an actual programming fault. And in non-traditional programming domains, the program repair paradigm may not have a direct mapping, making it difficult to apply. Genetic Improvement (GI) does not assume there are faults in the code; however, it may be limited by the same operator assumption as APR.
In this dissertation, we discuss a framework for adapting APR and GI to new domains by improving programs without requiring faults and by adding operators that are customized for these domains. We demonstrate the feasibility of this framework on two domains where, to the best of our knowledge, APR and GI have not been previously used: Information Flow Control (IFC) and molecular programs written in the form of Chemical Reaction Networks (CRN).
This dissertation also provides an investigative study on automated test set generation tools and explores the use of multi-objective optimization and meta multi-objectivization in the IFC domain. We introduce two new mutation operators for use in APR/GI: NewIF and NewFor. These operators create new statements by leveraging identifiers extracted from within the target program, assisting in finding solutions when they are not directly available in the target program.
Our experimental results show that we were able to reduce information leakage, a security flaw, on 89% (25 / 28) of the test subjects we used. Overall, we achieved a reduction in information leakage while retaining functionality of patches 42.6% of the time. Furthermore, our multi-objective optimization experiments revealed that SPEA2 helped find the best median results in the IFC domain. Finally, the results in the Chemical Reaction Networks (CRN) domain indicate that we were able to fix 84.2% (16 / 19) of the test subjects that we used.
Committee: Myra Cohen (major professor), James Lathrop, Robyn Lutz, Wensheng Zhang, and Philip Dixon.
Join on Zoom: https://iastate.zoom.us/j/98382556614?pwd=IptR6eSp0Sj82a9jC784lO5pU1MBde.1