Ph.D. Preliminary Oral Exam: Muhammad Arbab Arshad
Multimodal Deep Learning for Agricultural Decision Support
This thesis advances multimodal deep learning methods through four interconnected studies focused on agricultural applications. We first optimize Neural Radiance Fields for efficient 3D reconstruction, achieving a 50% reduction in computational requirements while maintaining reconstruction quality. We then develop AgEval, a benchmark for evaluating vision-language models on specialized agricultural tasks, demonstrating performance improvements from 46.24% to 73.37% in few-shot scenarios. Building on this, we introduce an assisted few-shot learning approach that raises model performance to 80.45% through strategic example selection. Finally, we present an integrated system that combines multiple deep learning modalities into a unified interface for practical agricultural applications. Although demonstrated in agriculture, our methodological advances in computational efficiency, few-shot learning, and multimodal integration extend to other domains that require specialized adaptation of deep learning technologies.
Committee: Soumik Sarkar (major professor), Baskar Ganapathysubramanian, Robyn Lutz, Adarsh Krishnamurthy, and Aditya Balu