PhD Preliminary Oral Exam: Shaila Sharmin

May 7, 2026 - 1:00 PM

From Detection to Mitigation: Numerical Instability Analysis for Machine Learning Systems

Machine learning systems are increasingly deployed in safety-critical and resource-constrained environments, yet their numerical reliability remains poorly understood. This thesis presents a systematic study of numerical instability in machine learning systems — from training programs to quantized neural networks to large language models — addressing three interconnected problems: detection, analysis, and repair.

Our first contribution introduces Soft Assertions, a technique for automatically detecting silent numerical instability in ML programs. Unlike existing tools that catch only NaN and Inf values, Soft Assertions learn instability conditions from unit tests of 61 unstable library functions and guide a fuzzer toward triggering failures. Evaluated on 79 benchmark programs and 15 real-world GitHub applications, the technique outperforms all five baseline tools and discovers 13 previously unknown bugs.

Our ongoing work, QuantDIR, addresses numerical instability introduced by post-training quantization (PTQ) in CNNs. We confirm that accuracy degradation under low-bit quantization is frequently caused by numerical instability that originates in specific layers and propagates through the network. QuantDIR performs offline dynamic analysis to classify layers, then applies a closed-form per-channel repair only to the layers identified as unstable. Evaluated on seven CNN architectures, QuantDIR achieves accuracy comparable to state-of-the-art PTQ methods.

We plan to extend this methodology to large language models, where we aim to detect which transformer blocks introduce instability under quantization, trace how errors propagate through attention and MLP layers, and apply targeted repair without full fine-tuning.

Together, these contributions form a unified framework that detects where instability originates, traces how it propagates, and applies targeted repair across programs, quantized CNNs, and large language models.

Committee: Wei Le (major professor), Chris Quinn, Myra Cohen, Yang Li, and Amit Siker