Understanding and Reasoning About Fairness in the Machine Learning Pipeline
Machine learning (ML) models are increasingly used in consequential decision-making software, such as approving bank loans, recommending criminal sentences, and hiring employees. In recent years, many incidents have been reported in which ML models exhibited unfairness or societal bias against people based on protected attributes such as race, sex, and age. Research has been conducted to measure bias effectively and to mitigate it using fairness-aware algorithms. Our research aims to identify the root causes of unfairness and to verify fairness constraints in complex ML systems. First, we conducted an empirical study on real-world ML models to evaluate fairness under different criteria, identify successful mitigation techniques, and assess their impacts. We found several software constructs and fairness patterns that call for further research in the area. Second, we observed that bias is often ingrained in data, so the classifier is not the only component in the ML pipeline responsible for unfairness. We leveraged causal methods to isolate the fairness impact of the data preprocessing stages in the pipeline. Finally, we propose to provide predicate transformer semantics for white-box fairness verification. The verifier computes the weakest precondition of a specific fairness requirement to certify whether the model is fair. We also propose to leverage these semantics for real-time fairness monitoring and to reason about fairness composition in multi-decision ML pipelines.
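To make the fairness criteria mentioned above concrete, here is a minimal sketch (illustrative only, not taken from the work itself) of two widely used group-fairness metrics, statistical parity difference and disparate impact, computed over a binary classifier's predictions split by a protected attribute. All variable names (`y_pred`, `protected`, group labels) are assumptions for the example.

```python
def group_rates(y_pred, protected):
    """Favorable-outcome rate (prediction == 1) for each protected group."""
    groups = {}
    for pred, grp in zip(y_pred, protected):
        cnt, fav = groups.get(grp, (0, 0))
        groups[grp] = (cnt + 1, fav + (1 if pred == 1 else 0))
    return {g: fav / cnt for g, (cnt, fav) in groups.items()}

def statistical_parity_difference(y_pred, protected, privileged, unprivileged):
    """Difference in favorable-outcome rates; 0.0 indicates parity."""
    r = group_rates(y_pred, protected)
    return r[unprivileged] - r[privileged]

def disparate_impact(y_pred, protected, privileged, unprivileged):
    """Ratio of favorable-outcome rates; 1.0 indicates parity."""
    r = group_rates(y_pred, protected)
    return r[unprivileged] / r[privileged]

# Toy data: 0 = unprivileged group, 1 = privileged group.
y_pred    = [1, 0, 1, 1, 0, 1, 1, 1]
protected = [0, 0, 0, 0, 1, 1, 1, 1]
print(statistical_parity_difference(y_pred, protected, 1, 0))  # -> 0.0
print(disparate_impact(y_pred, protected, 1, 0))               # -> 1.0
```

In practice these metrics are computed per pipeline stage, which is what makes it possible to attribute unfairness to data preprocessing rather than to the classifier alone.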
Committee: Hridesh Rajan (major professor), Wei Le, Andrew Miner, Simanta Mitra, and Kevin Liu
Join on Zoom: Please click this URL to start or join. https://iastate.zoom.us/j/99640782761?pwd=c1lYNFJ1OU5laWxYWWxHMGszenladz09 Or, go to https://iastate.zoom.us/join and enter meeting ID: 996 4078 2761 and password: 706220