Ph.D. Final Oral Exam: Samantha Syed Khairunnesa
Speaker: Samantha Syed Khairunnesa
Context-aware Contract
Application programming interfaces (APIs) are indispensable to modern software development. APIs let us avoid reinventing the wheel; however, developers must use them correctly to write reliable code. Contracts specify the proper use of APIs (and of methods in general). A contract tells the client of a method which obligations it is expected to satisfy. In return, the callee method fulfills the part of the contract that it promises to the client (e.g., the caller method). Formal contracts, and even informal documentation, can help developers understand the functionality of APIs and the right way to use them. Thus, it is crucial to understand and collect the contracts for the APIs that developers use to construct software.
In addition, machine learning (ML) APIs are increasingly used to solve conventional algorithmic problems. The key reason is the robustness of these techniques. However, ML APIs are not free of errors either. Although a rich body of literature discusses contracts for non-ML APIs, contracts for ML APIs have not been studied. This is the first problem the thesis addresses. We present the first empirical study to understand and document the types of contracts required for ML APIs and to investigate whether they differ from those of traditional software APIs. One key insight of this study is that the software engineering (SE) community can apply some current contract mining approaches to ML APIs, because of the similarity in the types of contracts that ML APIs share with traditional APIs. Still, it may be necessary to combine behavioral and temporal contract mining approaches, which have been developed independently so far.
In addition, contracts for ML APIs also need to capture context information, e.g., output data labels or the level of a layer in the neural network under construction. We see that, along with system failures, breaches of such contracts result in incorrect functionality, performance issues, etc. As a result, there is an immediate need to formalize contracts for ML APIs that capture this type of context information, so that the specification language can support reasoning about it. Interestingly, this problem arises in several application domains, including ML. As a solution, the second part of the thesis focuses on formalizing contracts for APIs (or methods in general) that show context dependency. Although the idea of context is similar to that in context-oriented programming (COP), this work focuses on writing contracts for (API) methods that may behave differently depending on the state of the context. Furthermore, the context-dependent methods in question also require maintaining a shared context. At the time of writing this thesis, state-of-the-art contract languages offer no explicit way to describe this type of context information directly. We propose a context-aware contract language that allows reasoning about methods that involve context information. We present the applicability of such a language and discuss its design in terms of syntax, operational semantics, and type rules.
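To make the idea of a context-dependent contract concrete, here is a minimal Python sketch. It is not the contract language proposed in the thesis; `ModelContext`, `add_output_layer`, `ContractViolation`, and the specific rule tying label kind to output-layer shape are all hypothetical, chosen only to illustrate a precondition that varies with a shared context.

```python
from dataclasses import dataclass, field


class ContractViolation(Exception):
    """Raised when a call breaks the contract for the current context."""


@dataclass
class ModelContext:
    # Hypothetical shared context: the kind of labels in the training data
    # and the layers added to the network so far.
    label_kind: str  # "binary" or "multiclass"
    num_classes: int = 2
    layers: list = field(default_factory=list)


def add_output_layer(ctx: ModelContext, units: int, activation: str) -> None:
    """Append the final layer; the precondition depends on ctx.label_kind."""
    if not ctx.layers:
        raise ContractViolation("output layer needs at least one earlier layer")
    # Illustrative rule (an assumption): the expected shape of the output
    # layer is selected by the label kind stored in the shared context.
    if ctx.label_kind == "binary":
        expected = (1, "sigmoid")
    elif ctx.label_kind == "multiclass":
        expected = (ctx.num_classes, "softmax")
    else:
        raise ContractViolation(f"unknown label kind: {ctx.label_kind!r}")
    if (units, activation) != expected:
        raise ContractViolation(
            f"{ctx.label_kind} labels expect units={expected[0]}, "
            f"activation={expected[1]!r}"
        )
    # The call also updates the shared context that later calls rely on.
    ctx.layers.append(("output", units, activation))
```

The same call, `add_output_layer(ctx, 3, "softmax")`, is accepted when `ctx` describes three-class labels and rejected when it describes binary labels, which is the kind of behavior a context-free precondition cannot express directly.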
Finally, we describe an automatic contract mining technique to address the sparse usage problem in usage-based precondition mining. Our insight is to leverage the knowledge that can be understood through the language’s constructs and semantics. To capture this knowledge, our approach analyzes the data and control flow in the program that leads to API calls and infers conditions implicitly present in the code.
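As a rough illustration of inferring conditions from the code that leads to an API call, the sketch below collects the `if` conditions guarding each call site using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It is a much simplified stand-in that looks only at enclosing branches, not at data flow, and it is not the technique presented in the thesis; the function name `mine_guards` is hypothetical.

```python
import ast


def mine_guards(source: str, api_name: str) -> list[list[str]]:
    """For each call to api_name, list the if-conditions that guard it."""
    results = []

    def visit(node, guards):
        if isinstance(node, ast.If):
            cond = ast.unparse(node.test)
            visit(node.test, guards)  # calls inside the condition itself
            for child in node.body:
                visit(child, guards + [cond])
            for child in node.orelse:
                # The else branch runs only when the condition is false.
                visit(child, guards + [f"not ({cond})"])
            return
        if isinstance(node, ast.Call) and ast.unparse(node.func) == api_name:
            results.append(guards)
        for child in ast.iter_child_nodes(node):
            visit(child, guards)

    visit(ast.parse(source), [])
    return results
```

Given client code that calls `model.fit(data)` inside `if data is not None:` and `if len(data) > 0:`, `mine_guards(source, "model.fit")` reports both conditions for that call site. Such guards are candidate preconditions even when only a few usages of the API are available.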
Committee: Hridesh Rajan (major professor), Carl Chang, Gary Leavens, Wei Le, and Robyn Lutz
Join on Zoom: https://iastate.zoom.us/j/95168065881?pwd=eEFjOXJTL002eUlOcDZvbExvV2Rjdz09