Recent Advances in Stochastic Algorithms for Deep Learning
Stochastic algorithms such as stochastic gradient descent (SGD) have proven remarkably effective in training a variety of deep neural networks. However, there is still a lack of theoretical understanding of how and why SGD can train these complex networks to a global minimum. On the practical side, there remains high demand for further speeding up stochastic algorithms in deep learning. This talk will present our recent progress toward addressing these issues. The first part of the talk will describe our establishment of the global convergence of SGD in training deep neural networks. By exploiting the star-convexity property that we observed in experiments, our analysis shows that SGD, although long regarded as a randomized algorithm, converges in an intrinsically deterministic manner to a global minimum. The second part of the talk will focus on stochastic algorithms with variance reduction, and will present our new designs that further improve on the state of the art in various scenarios.
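To give a concrete flavor of the variance-reduction idea mentioned in the second part, below is a minimal SVRG-style sketch on a simple least-squares problem. This is an illustrative assumption on our part, not the speaker's specific designs; the problem setup, step size, and variable names are all hypothetical.

```python
import numpy as np

# Illustrative least-squares problem: min_w 0.5 * mean((A w - b)^2).
# (Hypothetical setup; not from the talk.)
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
b = A @ w_true

def grad_i(w, i):
    # Gradient of the i-th component loss 0.5 * (a_i . w - b_i)^2
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    # Full-batch gradient, used once per snapshot
    return A.T @ (A @ w - b) / n

def loss(w):
    return 0.5 * np.mean((A @ w - b) ** 2)

w = np.zeros(d)
lr = 0.01
for epoch in range(20):
    w_snap = w.copy()        # snapshot point
    mu = full_grad(w_snap)   # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        # Variance-reduced stochastic gradient: unbiased for the full
        # gradient, with variance shrinking as w approaches w_snap.
        g = grad_i(w, i) - grad_i(w_snap, i) + mu
        w -= lr * g

print(loss(w))
```

Compared with plain SGD, the correction term `grad_i(w, i) - grad_i(w_snap, i) + mu` keeps the stochastic gradient unbiased while driving its variance to zero near the solution, which is what allows a constant step size and a faster convergence rate on problems like this one.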
Bio: Dr. Yingbin Liang is currently a Professor in the Department of Electrical and Computer Engineering at The Ohio State University (OSU). She received her Ph.D. in Electrical Engineering from the University of Illinois at Urbana-Champaign in 2005, and served on the faculty of the University of Hawaii and Syracuse University before joining OSU. Dr. Liang's research interests include machine learning, optimization, information theory, and statistical signal processing. She received the National Science Foundation CAREER Award and the State of Hawaii Governor Innovation Award, both in 2009, and the EURASIP Best Paper Award in 2014. She served as an Associate Editor for Shannon Theory of the IEEE Transactions on Information Theory from 2013 to 2015.