Reasoning about Deep Learning Models for Trustworthy Prediction
Deep Learning (DL) techniques are increasingly being incorporated into critical software systems, yet DL software is buggy too. Recent work in SE has characterized these bugs, studied fix patterns, and proposed detection and localization strategies. Moreover, DL models are trained under certain assumptions about the data during the development stage and then used for prediction in the deployment stage, so it is important to assess the trustworthiness of a model's predictions on unseen data during deployment. First, we introduced design by contract for DL libraries: we document the properties of DL libraries by writing contracts on DL APIs, giving developers a mechanism to identify bugs during development and thereby improving the reliability of deep learning software. We developed 15 sample DL contracts targeting common performance-related DL bugs and found that they effectively prevented common classes of structural bugs and training problems. Second, we proposed a novel technique that uses rules derived from a neural network's computations to infer data preconditions for a DNN model and thus determine the trustworthiness of its predictions. We evaluated this approach on 29 fully connected deep neural network models over four real-world tabular datasets and found significant improvements over prior work in both effectiveness and efficiency at detecting correct and incorrect predictions during the deployment stage.
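The contract idea above can be sketched minimally. Everything here is an illustrative assumption, not the actual tool from this work: the decorator name (dl_contract), the stand-in layer constructor (dense_layer), and the specific contract (a softmax output layer needs more than one unit, a common structural bug) are hypothetical.

```python
# Hypothetical sketch of design by contract on a DL API call.
# dl_contract and dense_layer are illustrative names, not the talk's tool.

def dl_contract(precondition, message):
    """Decorator that checks a precondition on arguments before the API call."""
    def wrap(fn):
        def checked(*args, **kwargs):
            if not precondition(*args, **kwargs):
                raise ValueError(f"Contract violated: {message}")
            return fn(*args, **kwargs)
        return checked
    return wrap

# Example contract: a softmax output layer with a single unit is a
# structural bug, so the contract rejects it at development time.
@dl_contract(
    lambda units, activation="linear": not (activation == "softmax" and units < 2),
    "softmax output layer needs at least 2 units",
)
def dense_layer(units, activation="linear"):
    # Stand-in for a real DL library layer constructor.
    return {"type": "dense", "units": units, "activation": activation}

layer = dense_layer(10, activation="softmax")  # satisfies the contract
```

A call like dense_layer(1, activation="softmax") would raise a ValueError, surfacing the bug during development rather than as a silent training failure.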
Third, building on these preliminary works, we propose to extend data-precondition inference to models that use convolution layers and extract features from raw input data, such as images. Fourth, we propose a novel approach that leverages the inferred data preconditions to determine feature importance and explain the model's predictions locally and globally by identifying key features. Next, we intend to identify out-of-distribution features in the training set that the model learned during training: to ensure the DL system's robustness as a postcondition, we plan to use the inferred data preconditions to find unjustified out-of-distribution features, which are undesirable in the DL model's behavior. Finally, we propose data debugging from inferred data preconditions, detecting problems in the training dataset based on what the model learned about the data during training and where it fails during deployment, to improve the trustworthiness of the DL model's predictions.
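The precondition-checking idea that this proposal extends can be sketched at a toy scale. Everything here is an illustrative assumption rather than the talk's actual technique: the tiny fixed weight matrix, and the rule that each first-layer pre-activation must stay within the range it took over the training data.

```python
# Hypothetical sketch of data preconditions derived from a network's
# computations: record the range of each first-layer pre-activation
# (w.x + b) over the training set, then flag deployment inputs whose
# pre-activations leave those ranges as potentially untrustworthy.
import numpy as np

# Stand-in first-layer parameters (4 input features -> 3 neurons).
W = np.array([[1.0, -0.5, 0.2],
              [0.3, 0.8, -1.0],
              [-0.7, 0.4, 0.5],
              [0.2, -0.3, 0.9]])
b = np.array([0.1, -0.2, 0.0])

def preactivations(X):
    return X @ W + b

# The training data defines one interval precondition per neuron.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 4))
Z = preactivations(X_train)
lo, hi = Z.min(axis=0), Z.max(axis=0)

def satisfies_precondition(x):
    """True if every neuron's pre-activation stays in its training range."""
    z = preactivations(x)
    return bool(np.all((z >= lo) & (z <= hi)))

in_dist = satisfies_precondition(X_train[0])         # a training point
out_dist = satisfies_precondition(np.full(4, 50.0))  # an extreme input
```

At deployment, a violated precondition signals that the model is computing on values it never saw during training, so its prediction should not be trusted; the proposal extends this reasoning to convolutional feature extractors.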
Committee: Hridesh Rajan (major professor), Hongyang Gao, Qi Li, Wallapak Tavanapong and Pavan Aduri
On Campus Location: Atanasoff 223
Join on Zoom: https://iastate.zoom.us/j/97690393920