Ph.D. Preliminary Oral Exam: Benjamin Steenhoek

Dec 8, 2023 - 9:00 AM

Understanding and Improving Deep Learning Models for Vulnerability Detection

Static analysis has become critically important for maintaining software development velocity while detecting and preventing security vulnerabilities. Deep learning (DL) has shown promising performance for vulnerability detection and can outperform some static analyzers on open-source datasets of software vulnerabilities. However, existing vulnerability detection evaluations are often limited in scope and dimension, producing neural network systems that are poorly understood, inefficient, and fail to generalize. The end products of our work are a deeper understanding of DL models for vulnerability detection and a body of techniques for improving them. First, we reproduced nine state-of-the-art (SOTA) DL models on widely used datasets of open-source software vulnerabilities. We investigated research questions about model capabilities, training data, and model interpretation, and we provide guidance on understanding model results, preparing training data, and improving model robustness. Building on the findings of this empirical study, we developed DeepDFA, a DL model based on a graph neural network (GNN) inspired by dataflow analysis. We demonstrate that DeepDFA outperforms several baseline models and improves upon the state of the art, is efficient in its use of computational resources and training data, and generalizes to novel software applications better than other SOTA approaches.
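
To give a sense of the intuition behind a dataflow-analysis-inspired GNN, the minimal sketch below (illustrative only, not the DeepDFA implementation) runs a classic reaching-definitions analysis over a small, hypothetical control-flow graph; the CFG, GEN/KILL sets, and variable names are made up for the example. A dataflow-inspired GNN layer performs an analogous computation, aggregating learned state vectors from CFG predecessors instead of propagating sets of definitions.

    # Illustrative sketch (not the DeepDFA implementation): reaching-definitions
    # analysis over a hypothetical control-flow graph (CFG). A dataflow-inspired
    # GNN layer performs an analogous step, aggregating learned state vectors
    # from CFG predecessors instead of propagating sets of definitions.

    from collections import defaultdict

    # Hypothetical CFG: node -> list of successor nodes.
    cfg_successors = {
        0: [1],      # int x = read_index();
        1: [2, 3],   # if (x > LIMIT)
        2: [4],      #     x = LIMIT;
        3: [4],      # else log(x);
        4: [],       # buf[x] = 0;  <- potentially out-of-bounds write
    }

    # GEN/KILL sets: definitions created / overwritten at each node.
    gen_sets  = {0: {"x@0"}, 1: set(), 2: {"x@2"}, 3: set(), 4: set()}
    kill_sets = {0: set(),   1: set(), 2: {"x@0"}, 3: set(), 4: set()}

    # Messages flow along CFG edges, so collect each node's predecessors.
    predecessors = defaultdict(list)
    for node, succs in cfg_successors.items():
        for succ in succs:
            predecessors[succ].append(node)

    # Fixed-point iteration: IN[n] = union of OUT[p] over predecessors p,
    # OUT[n] = GEN[n] | (IN[n] - KILL[n]). A GNN layer replaces the union with
    # a learned aggregation and the transfer function with a neural update.
    in_sets  = {n: set() for n in cfg_successors}
    out_sets = {n: set() for n in cfg_successors}
    changed = True
    while changed:
        changed = False
        for node in cfg_successors:
            new_in = set()
            for pred in predecessors[node]:
                new_in |= out_sets[pred]
            new_out = gen_sets[node] | (new_in - kill_sets[node])
            if new_in != in_sets[node] or new_out != out_sets[node]:
                in_sets[node], out_sets[node] = new_in, new_out
                changed = True

    print("Definitions reaching the buffer write:", sorted(in_sets[4]))
    # -> ['x@0', 'x@2']

Running the sketch shows that both definitions of x may reach the final buffer write; propagated facts of this kind are what a dataflow-inspired GNN can learn to encode in its node representations.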

We propose to study several follow-up problems. First, we plan to probe whether deep learning vulnerability detection models can learn vulnerability semantics. Second, we propose to study the capabilities of large language models (LLMs) for vulnerability detection and to investigate the most effective techniques and common error patterns. Finally, we propose to integrate interprocedural context into the DL models, providing a more accurate evaluation of the models’ performance and filling a critical gap on the road from benchmarks to real-world application.