The US National Science Foundation has awarded a 3-year, $449,999 grant to a team led by Professor Vasant Honavar of Computer Science to develop algorithms and software for collaborative, integrative analysis of large, semantically heterogeneous data sets.

Advances in networks, sensors, storage, computing, and high-throughput data acquisition have led to a proliferation of autonomous, distributed data sources in many areas of human activity. New discoveries in the biological, physical, and social sciences and in engineering are being driven by our ability to discover, share, integrate, and analyze disparate types of data. Statistically based machine learning algorithms offer some of the most cost-effective approaches to discovering experimentally testable predictive models and hypotheses from data. However, the large size, distributed nature, and autonomy of the data sources (and the attendant differences in access, queries allowed, processing capabilities, structure, organization, and underlying data models and data semantics) present hurdles to the effective use of machine learning. This research aims to overcome these hurdles by developing efficient, resource-aware distributed algorithms and software services to support collaborative, integrative knowledge acquisition in such a setting.

This project builds on the results of a collaboration involving Dr. Honavar's former Ph.D. students Dr. Jie Bao (currently a postdoctoral research associate at the Center for Computational Intelligence, Learning, and Discovery at Iowa State University), Dr. Doina Caragea (currently an assistant professor of Computer Science at Kansas State University), Dr. Jun Zhang (currently a research scientist at Fair Isaac), and Dr. Jyotishman Pathak (currently a research scientist at Mayo Clinic). The project will provide enhanced opportunities for research-based training of several Ph.D. students in the Artificial Intelligence Research Laboratory, including Cornelia Caragea, Oksana Yakhnenko, Neeraj Koul, and Kewei Tu. The project will also provide research opportunities for MS and undergraduate students.

The research team, led by Professor Vasant Honavar at Iowa State University and Professor Doina Caragea at Kansas State University, will implement, deploy, and evaluate the resulting algorithms using benchmark data sets, associated data models and ontologies, and user-specified inter-ontology mappings on a distributed test-bed of networked databases and services at the two institutions. The resulting open-source software can potentially transform collaborative e-science in the same way that the Web has transformed information sharing. The project web site provides access to additional information about the project.
Vasant Honavar Receives an NSF Grant to Develop Algorithms and Software for Collaborative, Integrative Analysis of Large, Distributed, Semantically Heterogeneous Data Sets
August 16, 2007