PhD Final Oral Exam: Ashutosh Kumar Nirala
Toward Reliable AI: Practical and Certified Defenses for Adversarially Robust Vision Systems
Deep learning has achieved remarkable success across numerous domains, yet its vulnerability to adversarial perturbations, first revealed in 2013, has significantly limited its applicability in critical and safety-sensitive applications. Beyond the practical concerns, the phenomenon is intrinsically intriguing and warrants dedicated study in its own right. Adversarial attacks, though ingeniously crafted, demonstrate that modern deep networks often exploit superficial statistical correlations rather than developing a true semantic understanding of the tasks they perform. This has drawn substantial attention from both theoretical and applied perspectives.
This dissertation develops a unified framework for improving the reliability of vision models against adversarial threats, spanning both practical black-box defenses and formal certification methods. The central goal is to ensure that vision systems can maintain correct predictions under a wide range of perturbation models, deployment settings, and task complexities.
First, we present AlignFix, a practical black-box defense for image classification inspired by top-down feedback in biological perception. AlignFix exploits complementary feature biases in naturally trained and adversarially trained models to detect and correct adversarial perturbations on the fly, providing robustness against score-based, decision-based, and transfer attacks, i.e., all major families of practical black-box attacks.
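To make the intuition concrete, the following is a minimal hypothetical sketch of an agreement-based correction between the two model types. It is illustrative only and does not reproduce the actual AlignFix procedure; `nat_model` and `adv_model` are assumed callables returning class probabilities.

```python
import numpy as np

def agreement_predict(x, nat_model, adv_model):
    """Illustrative sketch (not the actual AlignFix algorithm):
    cross-check a naturally trained and an adversarially trained model.
    Agreement suggests a clean input; disagreement suggests a perturbation
    that exploited the natural model's superficial feature biases, so we
    fall back to the robust model's prediction."""
    p_nat = np.asarray(nat_model(x))  # probabilities from the natural model
    p_adv = np.asarray(adv_model(x))  # probabilities from the robust model
    if np.argmax(p_nat) == np.argmax(p_adv):
        return int(np.argmax(p_nat))  # models agree: accept the label
    return int(np.argmax(p_adv))      # disagreement: trust the robust model
```

The design choice here mirrors the abstract's observation that the two training regimes induce complementary feature biases, so their disagreement is itself a usable attack signal.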
Second, we introduce Open Vocabulary Certification (OVC), a fast certification framework for open-vocabulary vision-language models such as CLIP. OVC leverages pre-computed certificates for related prompts, along with embedding caching and distributional approximations, to accelerate randomized smoothing by up to two orders of magnitude while retaining tight robustness guarantees for novel prompts.
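For context, the randomized smoothing guarantee that OVC accelerates certifies an L2 radius from a lower bound on the top-class probability under Gaussian noise (Cohen et al., 2019). A minimal sketch of that radius computation, assuming a Monte Carlo lower bound `p_a_lower` has already been estimated:

```python
from statistics import NormalDist

def certified_radius(p_a_lower: float, sigma: float) -> float:
    """L2 certified radius from randomized smoothing:
    R = sigma * Phi^{-1}(p_A), where p_A lower-bounds the probability
    of the top class under Gaussian noise N(0, sigma^2 I).
    Returns 0 when the bound is too weak to certify anything."""
    if p_a_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_a_lower)
```

The cost OVC attacks is not this formula but the thousands of noisy forward passes needed to estimate `p_a_lower` per prompt; reusing cached embeddings and certificates for related prompts is what yields the reported speedup.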
Finally, we propose one of the first direct certification frameworks for object detection, applied to YOLO-based runway detection in aviation settings. Using Interval Bound Propagation, our method jointly certifies classification, objectness, and localization heads, enabling provable robustness guarantees for both labels and bounding boxes under bounded perturbations.
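Interval Bound Propagation, the core primitive here, pushes elementwise input intervals through the network layer by layer. A minimal sketch for one affine layer followed by ReLU (the layer shapes and variable names are illustrative, not taken from the YOLO pipeline):

```python
import numpy as np

def ibp_affine(lower, upper, W, b):
    """Propagate the box [lower, upper] through y = W x + b.
    The output center moves by W; the output radius grows by |W|."""
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    y_center = W @ center + b
    y_radius = np.abs(W) @ radius   # worst-case spread per output unit
    return y_center - y_radius, y_center + y_radius

def ibp_relu(lower, upper):
    """ReLU is monotone, so it maps interval endpoints directly."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)
```

Certification then reduces to checking that, at the output bounds, the correct class (or objectness/box constraint) holds for every point in the interval, which is what allows labels and bounding boxes to be certified jointly.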
Together, these contributions advance the development of vision systems that are both practically robust in realistic black-box settings and certifiably correct under formal threat models, enabling a path toward reliable, trustworthy AI in high-stakes visual perception tasks.
Committee: Soumik Sarkar (major professor), Jin Tian, Qi Li, Chris Quinn, and James Lathrop