Artificial Intelligence Research Laboratory
Department of Computer Science
Iowa State University
An Agent-Based Environment for Integrating and Analyzing Plant Genomic Databases
Personnel
Project Summary
Recent advances in high-throughput (and often automated or semi-automated)
data acquisition technologies,
digital storage technologies, computers, and communications have made it possible
to gather and store large amounts of data on plants.
As a result, research groups are generating large volumes of
data. Translating these advances in high-throughput data acquisition
into fundamental gains in scientific understanding
calls for the development of software tools that
allow integrated retrieval and analysis of the individual
databases. The design and implementation of such tools must address several
challenges in computer and information sciences:
-
In many instances, the data sources are physically distributed (e.g., at multiple laboratories and data repositories). This calls
for the use of information assistants or software agents for
intelligent, selective, and context-sensitive
data gathering, information extraction, and data assimilation prior to large-scale data analysis. Since scientific data sources are dynamic (i.e., they change
rapidly as data items are added or modified), there is a need to monitor
the data sources, propagate the changes, and trigger the necessary updates in
the affected data and knowledge repositories.
For example, as experimental data about mRNA expression and localization are gathered using microarray technology and in situ hybridization, it would be extremely beneficial to annotate the corresponding DNA sequences in an automated or semi-automated fashion.
Since the information of interest is user-, problem-, and context-dependent, such tools have to be customizable.
-
Given the large volumes of data involved, it is desirable to perform as much analysis
as feasible at the sites where the data is located and transmit only the results of analysis rather than flooding the network with data. This calls for the use of mobile software agents that can transport themselves to appropriate sites,
carry out the computation on site, and return with useful results.
-
Since the data sources are often autonomously owned and operated,
and reside on heterogeneous hardware and software platforms, their effective
use requires a sufficient degree of interoperability among the different
data sources (despite their heterogeneity). For example, specific bioinformatics applications might have to access and use data from multiple genome or protein data banks.
-
The data sources contain multiple types of data (text, images, relational
databases, sequence data, spectrograms, protein structures, etc.). This
calls for sophisticated tools for extracting, transforming,
and assimilating relevant information from heterogeneous data sources into a data warehouse where it can be further analyzed
to facilitate knowledge discovery.
-
The large volumes of data, the range of scientifically
relevant but complex interrelationships that need to be discovered, and
the diversity of data sources challenge state-of-the-art
approaches to data mining and knowledge discovery. In particular,
current
statistical and artificial intelligence tools have to be extended and new
efficient algorithms developed to handle data-driven
knowledge acquisition and incremental theory refinement from multiple
heterogeneous, structured as well as semi-structured scientific
data and knowledge sources.
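The idea of performing analysis where the data resides and transmitting only the results can be illustrated with a minimal Python sketch. It is not the project's implementation: the site names, the `SiteSummary` type, and the GC-content statistic are all hypothetical choices made for illustration, and a plain function call stands in for a mobile agent executing at a remote site.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SiteSummary:
    """Compact result an agent carries back instead of the raw records."""
    n_sequences: int
    total_bases: int
    gc_bases: int


def summarize_on_site(sequences: List[str]) -> SiteSummary:
    """Runs where the data lives; only the returned summary leaves the site."""
    total = sum(len(s) for s in sequences)
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in sequences)
    return SiteSummary(len(sequences), total, gc)


def merge(summaries: List[SiteSummary]) -> SiteSummary:
    """Combines per-site summaries; the counts are additive across sites."""
    return SiteSummary(
        sum(s.n_sequences for s in summaries),
        sum(s.total_bases for s in summaries),
        sum(s.gc_bases for s in summaries),
    )


def gc_fraction(summary: SiteSummary) -> float:
    """Fraction of G/C bases, or 0.0 when no bases were seen."""
    return summary.gc_bases / summary.total_bases if summary.total_bases else 0.0


# Hypothetical sequence data held at two separate laboratories.
SITES: Dict[str, List[str]] = {
    "lab_a": ["ATGC", "GGCC"],
    "lab_b": ["ATAT", "ccgg", "AT"],
}

# Each site is summarized locally; only three integers per site are merged.
per_site = [summarize_on_site(seqs) for seqs in SITES.values()]
combined = merge(per_site)
print(combined.n_sequences, round(gc_fraction(combined), 3))
```

Because the summary consists of additive counts, merging the per-site results gives the same answer as analyzing the pooled data, while the raw sequences never leave their sites. Statistics that are not additive in this way would need a different summary or an incremental learning scheme.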
This research brings together scientists
with complementary expertise in computer science, molecular biology, and computational biology
to design, implement, and evaluate
a modular, flexible, and
extensible multi-agent system for selective information retrieval,
information extraction, information assimilation, and data-driven
scientific knowledge discovery using heterogeneous, distributed
data and knowledge sources. A number of carefully chosen
bioinformatics problems in plant genomics are being
collectively used to identify the functionality required of the
multi-agent tools as well as
to evaluate and further develop the tools.
This research is closely integrated with the training
of graduate students in Bioinformatics and Computer Science at Iowa State University.
Funding
Publications
-
Honavar, V. (1999). Distributed Knowledge Networks. Invited Talk.
Artificial Intelligence for Distributed Information Networks
(AiDIN '99) Workshop held during the 1999 National Conference
on Artificial Intelligence (AAAI 99), Orlando, Florida. July 1999.
-
Caragea, D., Silvescu, A., and Honavar, V. (1999). Incremental Distributed Support Vector Induction. To appear.
-
Vander Velden, K., Andreotti, A., Dobbs, D., Honavar, V., and Miller, M. (1999).
Protein Secondary Structure Prediction: New Results and Some Observations. To appear.
-
Yang, J., Parekh, R., Honavar, V., and Dobbs, D. (1999). Data-Driven Theory
Refinement Algorithms for Bioinformatics. In: Proceedings of the
International Joint Conference on Neural Networks. Washington, D.C.
-
Balakrishnan, K. and Honavar, V. (1998).
Intelligent Diagnosis Systems.
Journal of Intelligent Systems.
-
Yang, J. and Honavar, V. (1998).
DistAl:
An Inter-Pattern Distance Based Constructive Neural Network Learning Algorithm.
Intelligent Data Analysis.
-
Yang, J. and Honavar, V. (1998).
Feature Subset
Selection Using a Genetic Algorithm. IEEE Intelligent Systems
(Special Issue on Feature Transformation and Subset Selection).
vol. 13. pp. 44-49.
-
Yang, J. (1999). Adaptive Agents For Information Retrieval and Data-Driven
Knowledge Acquisition. Doctoral Dissertation. Department of Computer Science. Iowa State University.
-
Honavar, V., Miller, L. and Wong, J. (1998).
Distributed Knowledge Networks. In:
Proceedings of the IEEE Information Technology Conference. Syracuse, NY.
-
Helmer, G., Wong, J., Honavar, V. and Miller, L. (1998). Intelligent Agents for
Intrusion Detection. In: Proceedings of the IEEE Information Technology
Conference. Syracuse, NY.
-
Miller, L., Honavar, V. and Wong, J. (1998). Object-Oriented Data Warehouse for
Information Fusion from Heterogeneous Data and Knowledge Sources.
In: Proceedings of the IEEE Information Technology Conference. Syracuse, NY.
-
Yang, J., Pai, P., Honavar, V., and Miller, L. (1998).
Mobile Intelligent Agents
for Document Classification and Retrieval: A Machine Learning Approach.
In: Proceedings of the European Symposium on Cybernetics and Systems Research.
-
Yang, J., Havaldar, R., Honavar, V., Miller, L. and Wong, J. (1998).
Coordination and Control of Distributed Knowledge
Networks Using the Contract Net Protocol. In: Proceedings of the IEEE
Information Technology Conference. Syracuse, NY.
Additional Information
To appear.
Dr. Vasant Honavar
Artificial Intelligence Research Laboratory
Department of Computer Science
Iowa State University
Atanasoff Hall, Ames, IA 50011-1040 USA
phone: +1-515-294-1098, +1-515-294-4377; fax: +1-515-294-0258
© Vasant Honavar, 1999.