M.S. Final Oral Exam: Hao Jiang
Speaker: Hao Jiang
Seeded Transfer Learning for Road Roughness Regression
Road roughness affects riding comfort and vehicle operating cost. To maintain good road quality, federal and state transportation agencies must survey road roughness routinely. Traditional survey methods require specialized equipment and training. To reduce the cost, more efficient methods have recently been developed that collect measurement data from sensors in off-the-shelf mobile/smart devices (such as Android phones and iPhones). These sensor data, along with labelled road roughness index values, are used to construct machine learning models that infer road roughness from the sensor data. Because the sensor data generated by different types/models of devices have different characteristics, in current practice a model is often constructed from the data of a single type/model of device, or a small set of them, and accepts only data of the same configuration for prediction. Hence, despite the potentially large amount of sensor data available from a variety of devices, labelled data may still be lacking when machine learning is applied to construct a model for a specific setting.
Transfer learning aims to extract knowledge from a source domain and apply it to a related target domain. It has been used to reduce the training/labeling cost in the target domain. Building on extensive existing research, in this work we propose a clustering-based seeded transfer learning approach to address the above problem in road roughness modeling and prediction. Specifically, we develop a complete solution for transferring data from a source domain (sensor data collected from devices of type/model A) to a target domain (sensor data collected from devices of type/model B) for model training. Our contributions include a data normalization step, an implementation that matches source clusters to target seeds, and an exploration of how the clustering method, the number of clusters, and the seeds percentage affect transfer learning.
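As a rough illustration of the pipeline outlined above (normalize each domain, cluster the source data, match source clusters to labelled target seeds, then train on the transferred data), the following minimal Python sketch may help. It is not the implementation from the thesis: the synthetic data, the per-domain z-score normalization, the small NumPy k-means, the matching rule (keep a source cluster when its mean label is within a tolerance `tol` of the label of the seed nearest its centroid), the linear regressor, and all function names are assumptions made only for this example.

```python
import numpy as np

def zscore(X):
    """Normalize each feature to zero mean and unit variance within one domain."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return (X - X.mean(axis=0)) / std

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the clustering step)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    # Recompute assignments so labels agree with the returned centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers

def select_source_samples(Xs, ys, X_seed, y_seed, k=3, tol=1.0):
    """Hypothetical matching rule: keep source clusters that agree with the nearest seed."""
    labels, centers = kmeans(Xs, k)
    keep = np.zeros(len(Xs), dtype=bool)
    for j in range(k):
        members = labels == j
        if not members.any():
            continue
        nearest = np.linalg.norm(X_seed - centers[j], axis=1).argmin()
        if abs(ys[members].mean() - y_seed[nearest]) <= tol:
            keep |= members
    return keep

def fit_linear(X, y):
    """Least-squares linear regression with a bias term."""
    A = np.c_[X, np.ones(len(X))]
    return np.linalg.lstsq(A, y, rcond=None)[0]

def predict_linear(w, X):
    return np.c_[X, np.ones(len(X))] @ w

def demo(n=200, seed_fraction=0.1):
    rng = np.random.default_rng(42)
    # Synthetic stand-ins: the same underlying signal seen by two device types
    # with different scale and offset (made-up data, not field measurements).
    z_s, z_t = rng.normal(size=(n, 2)), rng.normal(size=(n, 2))
    ys = 3.0 * z_s[:, 0] + rng.normal(scale=0.1, size=n)
    yt = 3.0 * z_t[:, 0] + rng.normal(scale=0.1, size=n)
    Xs, Xt = zscore(2.0 * z_s + 5.0), zscore(0.5 * z_t - 1.0)

    # A small labelled subset of the target domain acts as the seeds.
    n_seed = int(seed_fraction * n)
    seed_idx = rng.choice(n, size=n_seed, replace=False)
    rest = np.setdiff1d(np.arange(n), seed_idx)

    keep = select_source_samples(Xs, ys, Xt[seed_idx], yt[seed_idx])
    X_train = np.vstack([Xs[keep], Xt[seed_idx]])
    y_train = np.concatenate([ys[keep], yt[seed_idx]])

    w = fit_linear(X_train, y_train)
    rmse = np.sqrt(np.mean((predict_linear(w, Xt[rest]) - yt[rest]) ** 2))
    return keep, rmse

if __name__ == "__main__":
    keep, rmse = demo()
    print(f"transferred {keep.sum()} source samples, target RMSE = {rmse:.3f}")
```

In this sketch, `seed_fraction` plays the role of the seeds percentage and `k` the number of clusters, so varying them gives a rough feel for the kind of exploration the abstract describes.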
We evaluate the performance of our approach using a sensor data set collected from the field and several public data sets. The results show that K-means clustering is less stable than hierarchical clustering, that a moderate seeds percentage is preferred, and that the number of clusters does not significantly affect model prediction accuracy.
Committee: Wensheng Zhang (major professor), Halil Ceylan and Ali Jannesari
Join on WebEx: https://iastate.webex.com/iastate/j.php?MTID=mc889724781b170d3b3b60defb79ed8de
Meeting number: 2622 153 9728 Password: yvUWb5csn32