Artificial Intelligence Research Seminar
Artificial Intelligence Research Laboratory
Department of Computer Science
Iowa State University
The Artificial Intelligence Research Seminar, Com S 610 (VH), Fall 2000, will meet once a week. The AI seminar will be coordinated by Adrian Silvescu. The seminar topics for Fall 2000 will be drawn from among the following:
OCT. 2: Xiaosi Zhang and Neeraj Koul.
Xiaosi will talk about data mining of yeast genome expression data, focusing on cluster analysis of gene expression patterns. Clustering spotted DNA microarray data groups together genes with similar expression patterns. The coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not currently available.
The available public gene expression data includes data collected during the diauxic shift, the mitotic cell division cycle, sporulation, and temperature and reducing shocks, using microarrays containing essentially every ORF.
The data can be downloaded from:
http://cmgm.stanford.edu/pbrown/explore/index.html
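As a rough illustration of the coexpression idea described above, the sketch below scores a poorly characterized ORF against genes of known function using Pearson correlation, a similarity measure commonly used for expression profiles. The gene names and expression values are made up for illustration; they do not come from the yeast data set, and this is not necessarily the analysis presented in the talk.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation undefined, treat as uncorrelated
    return cov / math.sqrt(var_x * var_y)

def closest_known_gene(profile, known):
    """Return (name, correlation) of the known gene most correlated with profile."""
    scored = ((name, pearson(profile, p)) for name, p in known.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical log-ratio profiles over five time points (made-up numbers).
known_genes = {
    "geneA_respiration": [-1.0, -0.5, 0.0, 1.0, 2.0],
    "geneB_ribosomal": [1.5, 1.0, 0.2, -0.8, -1.6],
}
novel_orf = [-0.9, -0.4, 0.1, 0.8, 1.9]

best_name, best_r = closest_known_gene(novel_orf, known_genes)
print(best_name, round(best_r, 3))
```

A full cluster analysis would compute this similarity for every pair of genes and then merge the most similar groups step by step; the nearest-neighbor lookup here only shows the similarity measure at work.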
Neeraj will present material from the following papers:
SEPT. 25: Neurobiology Talk on Gene Expression.
SEPT. 18: Research interests presentation
SEPT. 11: Vasant Honavar Algorithmic Approaches to Gene Expression Analysis
Modern biology rests on the premise (often referred to as the central dogma) that the functional state of an organism is largely determined by the gene expression pattern. This premise implies that understanding the nature of complex biological processes such as development, cellular differentiation, carcinogenesis, etc., requires determining the spatio-temporal expression patterns of thousands of genes, and, more importantly, seeking out the organizing principles that allow biological processes to function in a coherent manner under different environmental conditions. The recent advent of DNA microarray technology provides biologists with the ability to measure the expression levels of thousands of genes in a single experiment. Initial experiments by Eisen et al. (1998) using microarray technology suggest that sets of genes with related functions can be detected on the basis of similar gene expression patterns. With the increasing use of DNA microarray and related technologies for gathering gene expression data from plants and animals, there is a growing need for sophisticated computational tools for extracting biologically significant information from gene expression data, assigning functions to genes, and identifying signalling pathways and control circuits (e.g., signal transduction pathways and genetic regulatory networks). In this talk, I will present an overview of algorithmic approaches that have been used for large-scale gene expression analysis. I will also point out some of the limitations of currently used approaches. I will conclude with a proposal for a gene expression analysis toolkit consisting of a suite of algorithms that overcome the limitations of the current techniques.
References
WEDNESDAY MEETINGS: 3:30-5:00pm, 217 Atanasoff Hall.
OCT. 11: Doina Caragea
We will finish the discussion about how to learn a Boolean function using its Fourier representation, and then we will see how this theory can be applied to learn a decision tree. We will also learn how to construct a decision tree in practice given its Fourier representation (how to go from the Fourier representation to information gain). The first part of the talk is based on the paper: Learning Boolean Functions via the Fourier Transform, Yishay Mansour, 1994 (http://www.math.tau.ac.il/~mansour/cv.htm), and the second part is based on the paper: Collective Data Mining: A New Perspective Toward Distributed Data Mining, Kargupta, H., Park, B., Hershberger, D., and Johnson, E. (1999) (http://www.eecs.wsu.edu/~hillol/).
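One way to see the link between a Fourier representation and information gain is the sketch below. It relies on a standard identity: the mean of a ±1-valued function over all inputs that agree with a partial assignment equals the sum of the coefficients whose support lies inside the fixed variables, each weighted by its parity function. The coefficient table is a hypothetical example (it is the spectrum of `x0 AND x1` on three bits), inputs are assumed to be uniformly distributed over the whole domain, and the code is an illustration rather than the exact procedure of either cited paper.

```python
import math

def chi(z, x):
    """Parity basis function chi_z(x) = (-1)^(z . x) for bit sequences z, x."""
    return -1 if sum(a & b for a, b in zip(z, x)) % 2 else 1

def subcube_mean(coeffs, fixed, n):
    """Mean of f over all x that agree with `fixed` ({index: bit}).

    Only coefficients whose support lies inside the fixed indices contribute;
    every other parity averages to zero over the free variables.
    """
    x = [fixed.get(i, 0) for i in range(n)]
    return sum(c * chi(z, x) for z, c in coeffs.items()
               if all(i in fixed for i, bit in enumerate(z) if bit))

def entropy(p):
    """Binary entropy (in bits) of a class with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(coeffs, attr, n):
    """Gain of splitting on bit `attr`, assuming uniformly distributed inputs."""
    p_root = (1 + subcube_mean(coeffs, {}, n)) / 2
    children = [(1 + subcube_mean(coeffs, {attr: v}, n)) / 2 for v in (0, 1)]
    return entropy(p_root) - 0.5 * sum(entropy(p) for p in children)

# Hypothetical spectrum: f(x) = +1 if x0 AND x1 else -1, on n = 3 bits.
coeffs = {(0, 0, 0): -0.5, (1, 0, 0): -0.5, (0, 1, 0): -0.5, (1, 1, 0): 0.5}

for attr in range(3):
    print(attr, round(information_gain(coeffs, attr, 3), 4))
```

Bits 0 and 1 receive a positive gain while the irrelevant bit 2 receives zero, so a tree builder working only from the coefficients would split on a relevant variable first.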
OCT. 4: Doina Caragea
In the seminar today I will talk about how to learn a Boolean function using the Fourier transform. The presentation will be structured as follows: a description of concept learning; an introduction to the Fourier transform (Fourier basis); the connection between learning and the Fourier transform; algorithms for learning a Boolean function using Fourier theory; and how these algorithms can be applied to learning a Boolean decision tree.
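For readers new to the Fourier basis mentioned above, the following minimal sketch computes the exact Fourier coefficients of a small Boolean function by enumerating all inputs. It assumes the common convention that the function takes values in {-1, +1} and that the basis functions are the parities chi_z(x) = (-1)^(z·x). The target function is a hypothetical example, and the names `chi`, `fourier_coefficients`, and `reconstruct` are illustrative.

```python
from itertools import product

def chi(z, x):
    """Parity basis function chi_z(x) = (-1)^(z . x) for bit tuples z, x."""
    return -1 if sum(a & b for a, b in zip(z, x)) % 2 else 1

def fourier_coefficients(f, n):
    """Exact coefficients f_hat(z) = 2^-n * sum over x of f(x) * chi_z(x)."""
    points = list(product((0, 1), repeat=n))
    return {z: sum(f(x) * chi(z, x) for x in points) / len(points)
            for z in points}

def reconstruct(coeffs, x):
    """Evaluate f(x) from its Fourier expansion sum_z f_hat(z) * chi_z(x)."""
    return sum(c * chi(z, x) for z, c in coeffs.items())

# Hypothetical target concept: +1 when x0 AND x1 holds, else -1 (x2 is irrelevant).
def target(x):
    return 1 if x[0] and x[1] else -1

coeffs = fourier_coefficients(target, 3)
nonzero = {z: c for z, c in coeffs.items() if c != 0}
print(nonzero)
```

Only four of the eight coefficients are nonzero, and none involves the irrelevant bit. In a learning setting the sums over all inputs would be replaced by averages over labeled examples, so the coefficients would be estimates rather than exact values.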
Reference:
SEPT. 27: Adrian Silvescu
This talk will present some recent developments in Statistical Learning Theory based on Vladimir Vapnik's book.
SEPT. 20: Carson Andorf.
Recent advances in data storage and data acquisition technologies have made it possible to produce large data sets. Many of these large data sets are physically distributed and due to their large size it is very expensive, in terms of both network bandwidth and time, to assemble them at a central location. Other data sets have security issues so only summaries can be made available. These types of data sets require algorithms that learn from distributed data without actually collecting the data. Currently, there are a lot of batch learning algorithms and many of these can be mapped into a distributed learning environment.
In this talk, I will discuss learning decision trees on distributed data sets. I will give a brief overview of different methods of learning from distributed data sets, different types of distributed data, and how decision trees work in a batch environment. Most of my talk will focus on the work of Taru Sharma in mapping the decision tree algorithm ID3 (Quinlan, 1986) to an environment that deals with both vertically and horizontally distributed data, and on my own work, in collaboration with Dr. Honavar, in mapping the algorithms IREP and IREP* (Furnkranz and Widmer, 1994) to an environment of both vertically and horizontally distributed data.
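To make the idea of learning without collecting the data concrete, here is a minimal sketch for the horizontally distributed case, where every site holds different rows over the same attributes. Each site reports only counts of (attribute value, class) pairs, and a coordinator sums them to compute information gain, the split criterion used by ID3. The site data, attribute names, and function names are hypothetical, and this is an illustration under those assumptions rather than the specific method presented in the talk.

```python
import math
from collections import Counter

def site_counts(rows, attr, label):
    """Run locally at a site: count (attribute value, class) pairs."""
    return Counter((row[attr], row[label]) for row in rows)

def entropy(counts):
    """Entropy (in bits) of a class distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def information_gain(joint):
    """joint: Counter of (value, class) -> count, summed over all sites."""
    total = sum(joint.values())
    by_class = Counter()
    by_value = {}
    for (value, cls), c in joint.items():
        by_class[cls] += c
        by_value.setdefault(value, Counter())[cls] += c
    remainder = sum(sum(cnt.values()) / total * entropy(cnt.values())
                    for cnt in by_value.values())
    return entropy(by_class.values()) - remainder

# Hypothetical horizontally fragmented data: same columns, different rows.
site_a = [{"outlook": "sunny", "play": "no"},
          {"outlook": "sunny", "play": "yes"},
          {"outlook": "rain", "play": "yes"}]
site_b = [{"outlook": "sunny", "play": "no"},
          {"outlook": "rain", "play": "yes"}]

# The coordinator only ever sees the summed counts, never the rows themselves.
merged = site_counts(site_a, "outlook", "play") + site_counts(site_b, "outlook", "play")
central = site_counts(site_a + site_b, "outlook", "play")
print(merged == central, round(information_gain(merged), 4))
```

Because the counts are additive across sites, the gain computed from the merged counts is identical to the gain a centralized learner would compute, while only a small table travels over the network. Vertically distributed data, where sites hold different attributes of the same records, needs a different exchange and is not covered by this sketch.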
References
SEPT. 13: Statistical Learning Theory: Adrian Silvescu
This talk will introduce some recent results in statistical learning theory developed by Vladimir Vapnik in his recent books on this topic.
SEPT. 6: Vasant Honavar
Cumulative Multi-Task Learning from Distributed, Dynamic Data and Knowledge Sources
Abstract
A fundamental question in computational studies of learning is: How do living systems learn over a period of time, across multiple tasks, without losing the ability to perform the tasks that they have already mastered? A closely related problem involves the design and analysis of algorithms that enable autonomous agents to learn from multiple, distributed, dynamic data and knowledge sources as well as other agents in open-ended environments. While there has been a great deal of research on batch learning algorithms that learn from a given data set, there is relatively little work on algorithms for learning in open-ended environments.
In this talk, I will introduce a class of learning problems that arise in open-ended, dynamic environments consisting of multiple, distributed, possibly autonomous data and knowledge sources and agents and review some of the work that is being done in our lab on addressing these problems.
Much of this talk is based on work that has been done in collaboration with Doina Caragea, Adrian Silvescu, and Carson Andorf, all of whom are graduate students in the AI lab.
References
If you are interested in receiving seminar announcements, please send email to honavar@cs.iastate.edu to get on our mailing list, or periodically check this page for the schedule of talks.
For additions and updates to this page, please contact: silvescu@cs.iastate.edu.
Artificial Intelligence Research Laboratory
Department of Computer Science
Iowa State University
Atanasoff Hall, Ames, IA 50011-1040 USA
phone: +1-515-294-4377, fax: +1-515-294-0258