PhD Preliminary Oral Exam: Md Mahbubur Rahman
Learning code semantics for software engineering tasks
Deep learning has shown promising results on software engineering tasks, but current models often rely on surface-level textual patterns rather than the program semantics needed for accurate reasoning, which limits their robustness and generalization. Our empirical studies of state-of-the-art (SOTA) deep learning models and LLMs for vulnerability detection (VD) show high prediction variability, weak alignment with bug-relevant semantics, and persistent difficulty with multi-step code reasoning. Even SOTA LLMs achieve only 50–55% balanced accuracy on VD, regardless of model size, training data, or fine-tuning strategy. To address these limitations, we develop two approaches that explicitly add semantic understanding to models. CausalVul brings causal learning to VD: it identifies spurious features through semantics-preserving perturbations and applies do-calculus with the backdoor criterion to reduce the model’s reliance on them, improving accuracy, robustness, and out-of-distribution generalization across multiple SOTA models and datasets. ConceptCoder introduces code concepts (human-understandable semantic properties of code) together with a concept-supervised multi-task fine-tuning framework that trains LLMs to first recognize these concepts and then reason with them, much as a human inspects code. ConceptCoder achieves SOTA performance on VD, outperforming strong baselines including DeepDFA, TRACED, GPT-5.2, and Claude-Opus-4.5. It also generalizes beyond VD to branch prediction, showing that concept-based fine-tuning is effective across distinct code reasoning tasks. Building on this foundation, we are extending concept-based learning into a Mixture-of-Experts architecture for input/output prediction reasoning. Together, these contributions move deep learning models toward causal, semantically grounded code reasoning and away from reliance on spurious correlations.
Committee Members: Wei Le (Major Professor), Hongyang Gao, Christopher Quinn, Bowen Weng, and Myra Cohen