Ph.D. Final Oral Exam: Benjamin Steenhoek

Speaker: 
Benjamin Steenhoek
Thursday, October 3, 2024 - 1:30pm
Location: 
Atanasoff 223

Understanding and Improving Deep Learning Models for Vulnerability Detection

Vulnerability detection tools are essential for ensuring security while maintaining software development velocity. Deep Learning (DL) has shown potential in this domain, often surpassing static analyzers on certain open-source datasets. However, current DL-based vulnerability detection systems are limited: the resulting models are poorly understood, inefficient, and generalize poorly, and their applicability in practical settings is unclear. In this thesis, we comprehensively evaluate state-of-the-art (SOTA) DL vulnerability detection models, including Graph Neural Networks (GNNs), fine-tuned transformer models, and Large Language Models (LLMs), yielding a deeper understanding of their benefits and limitations and a body of approaches for improving DL-based vulnerability detection using static and dynamic analysis.

First, we empirically study the capabilities, training data, and interpretation of fine-tuned graph neural networks and transformer models, and provide guidance on understanding model results, preparing training data, and improving model robustness. We find that state-of-the-art models are limited in their ability to leverage vulnerability semantics, a critical aspect of vulnerability detection.

Building on these findings, we developed DeepDFA and TRACED, which integrate static and dynamic analysis into DL model architectures and training. DeepDFA is a GNN architecture and learning approach inspired by dataflow analysis; it outperforms several SOTA models, is efficient in both computational resources and training data, and generalizes to novel software applications better than other SOTA approaches. TRACED is a transformer model pre-trained on a combination of source code, executable inputs, and execution traces. TRACED improves upon statically pre-trained code models at predicting program coverage and variable values, and outperforms them on two downstream tasks: code clone retrieval and vulnerability detection.
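For readers unfamiliar with the dataflow analysis that inspires DeepDFA, the sketch below illustrates the classic reaching-definitions analysis: propagating sets of variable definitions around a control-flow graph to a fixed point. This is a standard textbook formulation, not the thesis's implementation; the `cfg`, `gen`, and `kill` inputs and the toy program are hypothetical examples.

```python
def reaching_definitions(cfg, gen, kill):
    """Iterate the reaching-definitions dataflow equations to a fixed point.

    cfg:  dict mapping each node to its list of predecessor nodes
    gen:  dict of definitions generated at each node
    kill: dict of definitions overwritten (killed) at each node
    """
    nodes = list(cfg)
    in_sets = {n: set() for n in nodes}
    out_sets = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            # IN[n] = union of OUT[p] over all predecessors p of n
            preds = cfg[n]
            new_in = set().union(*(out_sets[p] for p in preds)) if preds else set()
            # OUT[n] = GEN[n] ∪ (IN[n] − KILL[n])
            new_out = gen[n] | (new_in - kill[n])
            if new_in != in_sets[n] or new_out != out_sets[n]:
                in_sets[n], out_sets[n] = new_in, new_out
                changed = True
    return in_sets, out_sets


# Toy program:  node 1: x = 0;  node 2: if (...);  node 3: x = 1;  node 4: use(x)
cfg = {1: [], 2: [1], 3: [2], 4: [2, 3]}
gen = {1: {"x@1"}, 2: set(), 3: {"x@3"}, 4: set()}
kill = {1: {"x@3"}, 2: set(), 3: {"x@1"}, 4: set()}
ins, outs = reaching_definitions(cfg, gen, kill)
# Both definitions of x ("x@1" and "x@3") reach the use at node 4.
```

DeepDFA's contribution, at a high level, is learning this kind of iterative propagation over program graphs rather than hand-coding the transfer functions.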

Additionally, we evaluate LLMs for vulnerability detection using SOTA prompting techniques. We find that their performance is hindered by weaknesses in code understanding and logical reasoning, and we suggest directions for improvement in these areas.

Finally, we introduce DeepVulGuard, an IDE-integrated tool based on DL models for vulnerability detection and fixing. Through a real-world user study with professional developers, we identify promising aspects of in-IDE DL integration, along with critical issues such as high false-positive rates and non-applicable fixes that must be addressed for practical deployment.

Committee: Wei Le (major professor), Myra Cohen, Qi Li, Samik Basu, and Hongyang Gao