MS Final Oral Exam: Ankit Jyothish
Jul 14, 2025 - 3:00 PM
Location: Atanasoff 235
Adapting MoE for Efficient Inference and Applications
The explosive growth of large-scale models has renewed interest in Mixture-of-Experts (MoE) as a principled way to decouple model capacity from per-token compute. In this work, we discuss a predictive replication engine that substantially reduces MoE inference latency, techniques for merging MoE models, and the use of MoE for improved representation learning on graphs.
Committee: Ali Jannesari (major professor), Chenglin Miao, and Lin Yan