Ph.D. Research Proficiency Exam: Ashutosh Kumar Nirala
Speaker: Ashutosh Kumar Nirala
Error Correcting by Agreement Checking for Adversarial Robustness
We present a novel method called Error Correcting by Agreement Checking (ECAC) to boost the adversarial robustness of an adversarially trained model. Inspired by the observation that the feed-forward part of biological perception (in humans and primates) is also vulnerable to adversarial attacks, we argue for an error-correction mechanism to achieve robustness. We exploit the observation that naturally trained and adversarially trained models rely on different sets of features for classification, and that it is difficult to corrupt both sets of features concurrently. Thus, if the two models disagree on a prediction, the disagreement can be used as an error signal. Specifically, each model's input is nudged towards the class predicted by the other model using a targeted PGD attack. If nudged towards the correct class, the two models tend to agree on the prediction, revealing the true class; nudging the input in this way is akin to undoing the adversarial attack. We also craft a series of adaptive attacks to reliably evaluate the robustness of our method. Through extensive experiments on different datasets (CIFAR and ImageNet) and architectures (ResNet and ViT), we demonstrate that ECAC improves white-box, black-box, and natural accuracy across architectures. In the practical black-box setting, ECAC achieves accuracy above 80% on CIFAR-10 using ResNet-18 when evaluated with SQUARE and SPSA attacks.
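To make the agreement-checking idea concrete, the following is a minimal PyTorch sketch of the mechanism described in the abstract, not the speaker's implementation. The function names (ecac_predict, targeted_pgd), the choice of which model supplies the gradient for each nudge, the fallback behavior, and all hyperparameters (eps, alpha, steps) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def targeted_pgd(model, x, target, eps=8 / 255, alpha=2 / 255, steps=10):
    """Nudge input x toward `target` with a targeted PGD step.
    Illustrative helper; hyperparameters are assumptions, not from the talk."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Descend the targeted loss so the prediction moves toward `target`.
        x_adv = x_adv.detach() - alpha * grad.sign()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = torch.clamp(x + torch.clamp(x_adv - x, -eps, eps), 0, 1)
    return x_adv.detach()


@torch.no_grad()
def _predict(model, x):
    return model(x).argmax(dim=1)


def ecac_predict(model_nat, model_adv, x):
    """Sketch of Error Correcting by Agreement Checking for a single example
    (batch of one). If the naturally and adversarially trained models disagree,
    the input is nudged toward each candidate class; the nudge that restores
    agreement is taken to reveal the true class."""
    y_nat = _predict(model_nat, x)
    y_adv = _predict(model_adv, x)
    if torch.equal(y_nat, y_adv):
        return y_adv  # models agree: no correction needed
    # Disagreement is an error signal: nudge the input toward the class
    # predicted by the other model and check whether agreement is restored.
    for target, grad_model in ((y_adv, model_nat), (y_nat, model_adv)):
        x_nudged = targeted_pgd(grad_model, x, target)
        if torch.equal(_predict(model_nat, x_nudged), _predict(model_adv, x_nudged)):
            return _predict(model_adv, x_nudged)  # agreement restored
    return y_adv  # fall back to the adversarially trained model
```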
Committee: Jin Tian (major professor), Hongyang Gao, Qi Li, Chris Quinn, and James Lathrop
Join on Zoom: https://iastate.zoom.us/j/97717403173