Title: Improving Testing Performance by Dynamic Prioritization of Tests based on Method Invocation Orders
Date/Time: July 12th, 2017 @ 11:00 AM
Place: 223 Atanasoff Hall
Major Professor: Samik Basu
Committee Members: Pavankumar Aduri and Carl Chang
We present a dynamic test prioritization technique whose objective is to speed up the discovery of updates to existing software and, therefore, to increase the rate at which faulty software can be debugged. Our technique utilizes two types of data: the results of executing tests on a prior version of the software, and the results of executing tests on the new version, which determine the next test to be executed.
The contributions of the thesis are two-fold: understanding what constitutes an effective ordering of tests, and developing an algorithm that can efficiently generate such an ordering.
At its core, the proposed dynamic ordering technique relies on two basic conjectures. First, tests that are closely related are likely to uncover similar updates/faults, while tests that are unrelated are likely to widen the search for updates/faults. In other words, if a test uncovers updates in the software, i.e., its execution behavior (in terms of coverage) differs considerably between the prior and current versions of the software, then selecting a test closely related to it next is likely to be beneficial. Similarly, if a test does not uncover updates in the software, it is better to select an unrelated test to execute next to increase the chances of broader coverage. The relationship between tests is determined from the execution of tests while testing prior versions of the software. The second conjecture is that selecting tests in the above order will speed up uncovering bugs in the software.
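The selection rule behind the first conjecture can be sketched as follows; this is a minimal illustration, not the thesis algorithm, and all names (`next_test`, `relatedness`) are illustrative. It assumes a relatedness score between test pairs has already been computed from call sequences on the prior version.

```python
def next_test(last, uncovered_update, remaining, relatedness):
    """Pick the next test to execute.

    last             -- the test that just finished executing
    uncovered_update -- True if `last` uncovered updates (behavior changed)
    remaining        -- tests not yet executed
    relatedness      -- dict mapping (test_a, test_b) to a similarity score,
                        assumed precomputed from prior-version executions
    """
    score = lambda t: relatedness[(last, t)]
    if uncovered_update:
        # Last test uncovered a change: stay close, pick the most related test.
        return max(remaining, key=score)
    # Last test saw no change: widen the search, pick the least related test.
    return min(remaining, key=score)
```

For example, if test `t1` uncovered an update and `t2` is far more related to `t1` than `t3` is, the rule selects `t2` next; had `t1` uncovered nothing, it would select `t3`.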
We develop a baseline ordering using complete knowledge of the results of executing tests on two different versions of the software. The baseline ordering arranges the tests in descending order of the amount of change they uncover between the prior and new versions of the software. We evaluate the effectiveness of this ordering (i.e., the validity of the conjectures) by computing the rate at which the order identifies (seeded) bugs in the software; this measurement is referred to as APFD (Average Percentage of Faults Detected). The baseline ordering produces high APFD values, indicating that it is indeed effective. However, note that the baseline ordering can only be obtained if the tests have already been executed on two versions of the software; the challenge is to identify the ordering before executing the tests on the version being tested.
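For reference, APFD can be computed from a test order with the standard formula APFD = 1 - (TF_1 + ... + TF_m)/(n*m) + 1/(2n), where n is the number of tests, m the number of faults, and TF_i the position of the first test that detects fault i. A small sketch (function and variable names are illustrative):

```python
def apfd(first_detecting_positions, num_tests):
    """Average Percentage of Faults Detected for a given test order.

    first_detecting_positions -- for each fault, the 1-based position in the
                                 order of the first test that detects it
    num_tests                 -- total number of tests in the order
    """
    m = len(first_detecting_positions)
    n = num_tests
    return 1.0 - sum(first_detecting_positions) / (n * m) + 1.0 / (2 * n)

# Example: 10 tests, 4 seeded faults first detected at positions 1, 2, 2, 5
print(apfd([1, 2, 2, 5], 10))  # -> 0.8
```

Higher APFD means faults are detected earlier in the order, which is exactly the property the baseline ordering is designed to maximize.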
We have developed an algorithm that estimates the baseline ordering. We evaluate the quality of the estimates using a rank relationship measure referred to as the Order-Relationship Measure (ORM). We find that the ORM is high when call sequences resulting from executing tests are used for estimation. We also find that low ORM implies low APFD values for the estimate. We have evaluated our algorithm on two non-trivial software repositories. We have investigated the role of two important parameters (thresholds capturing the closeness relationship between tests) in identifying high-quality (high-APFD) orderings, and we outline how these parameters can be statically determined based on executing tests on prior versions of the software. Finally, we have shown that applying our algorithm to generate test orders dynamically incurs an overhead of close to 3%.