Computer Science Colloquium Series - Fall 1997

Colloquium Series - FALL 1997


The Computer Science Colloquium Series is a forum for invited speakers, faculty, and graduate students to share research ideas. Everyone is invited to attend and participate. An up to date listing of the speakers and abstracts of their talks will be posted here.


Distinguished Lecture

Speaker James E. Smith
Affiliation Department of Electrical and Computer Engineering
  University of Wisconsin, Madison
Title Next Generation Processor Microarchitectures (tentative)
Date Thursday, March 26, 1998


Listing of talks

Title Speaker Affiliation Date Time Location
Neural Architectures for Database Query Processing, Syntax Analysis, Knowledge Representation, and Inference Chun-Hsien Chen Department of Computer Science, Iowa State University Thursday, September 11 3:30 pm B 29 Atanasoff Hall
Web Mining: Pattern Discovery on the World Wide Web Bamshad Mobasher Department of Computer Science, University of Minnesota, Minneapolis Thursday, September 18 3:30 pm B 29 Atanasoff Hall
Learning DFA from Simple Examples Rajesh Parekh Department of Computer Science, Iowa State University, Ames Thursday, September 25 3:30 pm B 29 Atanasoff Hall
Emerging Technological Trends in Education Jim Schlosser Department of Computer Science, Iowa State University, Ames Thursday, October 2 3:30 pm B 29 Atanasoff Hall
A Machine Learning Approach to Natural Language Understanding John Zelle Department of Mathematics and Computer Science, Drake University, Des Moines Thursday, October 9 3:30 pm B 29 Atanasoff Hall
Connectivity and Sparse Wavelength Conversion in Wavelength-Routing Networks Arun Somani Department of Electrical and Computer Engineering Iowa State University, Ames Thursday, October 16 3:30 pm B 29 Atanasoff Hall
Weak Completeness and Strong Hypotheses Jack Lutz Department of Computer Science, Iowa State University, Ames Thursday, October 23 3:30 pm B 29 Atanasoff Hall
Technology and Legal Issues: Ownership and Copyright on the Web (video presentation) Roger Cepeda   Thursday, November 6 3:30 pm B 29 Atanasoff Hall
GP-Automata: one method for automatic creation of systems of rules Dan Ashlock Department of Mathematics, Iowa State University, Ames Thursday, November 13 3:30 pm B 29 Atanasoff Hall
Conformational Search Techniques for Assisting Chemists in Drug Design Steve Lavalle Department of Computer Science, Iowa State University, Ames Thursday, December 4 3:30 pm B 29 Atanasoff Hall


Abstracts

  1. Next Generation Processor Microarchitectures

    By - James E. Smith

    A fundamentally new generation of processor organization, or microarchitecture, appears only once in about 20 years, and each generation enables substantially higher levels of pro- cessor performance. Thus far, processors have passed from being serial to pipelined to superscalar. Commercial pro- ducts belonging to the next generation will likely appear in the middle of the next decade. This talk begins with a brief discussion of technology trends that will drive the development of next generation microarchitectures; key tech- nology trends are very high transistor counts and the domi- nance of on-chip wire delays.

    Overall characteristics of next generation processors will be discussed; these processors will likely rely on multiple processing elements with a new level of control hierarchy to manage them. A number of supporting microarchitectural technologies must be developed to enable next generation processors. Several of these technologies, including high level control flow prediction, trace caching, instruction pre-processing, and data value prediction will be described. These technologies, and others, will likely dominate microarchitecture research for the next several years.

  2. Neural Architectures for Database Query Processing, Syntax Analysis, Knowledge Representation, and Inference

    By - Chun-Hsien Chen

    Artificial neural networks (ANN), due to their inherent parallelism, potential for fault tolerance, and adaptation through learning, offer an attractive computational paradigm for a variety of applications in computer science and engineering, artificial intelligence, robotics, and cognitive modeling. Despite the success in the application of ANN to a broad range of numeric tasks in pattern classification, control, function approximation, and system identification, the integration of ANN and symbolic computing is only beginning to be explored. This talk is about our research which explores to integrate ANN and some essential symbolic computations for content-based associative symbolic processing. This offers an opportunity to explore the potential benefits of ANN's inherent parallelism in the design of high performance computing systems for real time content-based symbolic processing applications. We develop methods to systematically design massively parallel architectures for pattern-directed symbol processing using neural associative memories as key components. In particular, we propose neural architectures for content-based as well as address-based data storage and recall, information retrieval and database query processing, elementary logical inference, sequence processing, and syntax analysis. Their potential advantages over conventional serial computer implementations of the same functions are examined in the talk.


  3. Web Mining: Pattern Discovery on the World Wide Web

    By - Bamshad Mobasher

    With the explosive growth of information sources available on the World Wide Web, it has become increasingly necessary for users to utilize automated tools in order to find, extract, filter, and evaluate the desired information and resources. In addition, with the transformation of the Web into the primary tool for electronic commerce, it is imperative for organizations and companies, who have invested millions in Internet and Intranet technologies, to track and analyze user access patterns. These factors give rise to the necessity of creating server-side and client-side intelligent systems that can effectively mine for knowledge both across the Internet and in particular Web localities.

    Web mining can be broadly defined as the discovery and analysis of useful information from the World Wide Web. This broad definition on the one hand describes the automatic search and retrieval of information and resources available from millions of sites and on-line databases, i.e., Web content mining and on the other hand, the discovery and analysis of user access patterns from one or more Web servers or on-line services, i.e., Web usage mining.

    The purpose of this talk is to introduce Web mining and to discuss problems and proposed techniques associated with data mining on the Web. The primary focus of the talk is on Web usage mining and on related problems, unique to the Web paradigm. These include the necessity of integrating various data sources such as server access logs, referrer logs, and user profile information; resolving difficulties in the identification of users due to missing unique key attributes in collected data; and the importance of identifying user sessions or transactions from usage data, site topologies, and models of user behavior. We also introduces a general architecture for Web usage mining, and briefly discuss the WEBMINER, a system based on the proposed architecture.

  4. Learning DFA from Simple Examples

    By - Rajesh Parekh

    Grammar inference, the process of inducing an unknown grammar from examples (and possibly other sources of information), finds applications in many areas including pattern recognition, intelligent agents, language acquisition, and computational biology. intelligent agents, language acquisition, and computational biology. Regular languages constitute one of the simplest class of formal languages and hence induction of regular grammars, or equivalently finite automata (DFA), is a key problem in machine learning. Abundance of impossibility or intractability results derived using a number of different learning models makes it a particularly challenging problem in computational learning theory. There exists no algorithm that can in polynomial time return a minimum state deterministic finite automaton that is consistent with a given sample of positive and negative examples of an unknown regular language. Kearns and Valiant have shown that even approximate learning of DFA in the PAC sense is unlikely to be easy because it would entail algorithms for solution of several known hard problems such as breaking the RSA cryptosystem.

    Against this background, Pitt, in a seminal paper, posed the following open research problem: ``Are DFAs PAC-identifiable if examples are drawn from the uniform distribution, or some other known simple distribution?'' This talk presents our recent research that answers a variant of this question in the affirmative. In particular, we describe a framework for efficient learning of DFA from simple examples, i.e., examples drawn according to the Universal distribution conditioned on the knowledge of the target. We describe an efficient learning algorithm for exact learning of the target DFA with high probability when a bound on the number of states (N) of the target DFA is known in advance. When N is not known, we show how this algorithm can be used for efficient probably approximate learning of DFAs. This work naturally extends the teachability results due to Goldman and Mathias and the equivalent model of learning grammars from characteristic samples due to Gold to a probabilistic framework. We will also summarize the current state of the art in grammar inference in general, and regular grammar inference in particular, and outline some directions for future research in this area.

    This talk is based on the paper: ``Learning DFA from Simple Examples,'' Rajesh Parekh and Vasant Honavar, In: Proceedings of the Eighth International Workshop on Algorithmic Learning Theory (ALT'97), Berlin: Springer-Verlag, In press.

  5. Emerging Technological Trends in Education

    By - Jim Schlosser

    Keeping abreast of technological trends in both industry and education is becoming a full time job. Traditional information delivery paradigms will have to be modified to accommodate new hardware, new software, and techno-savy customers(users).

    Just as hardware and software become obsolete, information delivery techniques (i.e. teaching methods) must be modified to offer the student the most for their educational dollar.

    Topics covered will include:
    Distance Learning implementations, Java, Internet Security, Bandwidth considerations, Resource (human and otherwise) issues, Streaming Media, Encryption, Privacy, PUSH/PULL models of information delivery, and User types.

  6. A Machine Learning Approach to Natural Language Understanding

    By - John Zelle

    Computer understanding of natural (i.e. human) language is a long-standing, difficult and largely unsolved problem in artificial intelligence. Traditional work in this area has primarily focused on inventing mechanisms for representing and reasoning with the vast amounts of linguistic and "real-world" knowledge that is required to achieve effective understanding. Despite progress, the development of natural language interfaces is still a difficult, expertise-intensive task and produces systems that tend to be inefficient, incomplete, and generally "brittle."

    Recently some researchers have turned to empirical approaches for constructing natural language interfaces. Rather than relying on hand-coded rules, these systems apply methods from machine learning to automatically acquire language models from examples of language use. In this talk, I will describe research into a system (CHILL) that uses techniques from a subfield of machine learning known as inductive logic programming to automatically construct natural language interfaces. Experimental evidence suggests that such approaches may lead to significant advances in simplifying the construction of high-quality natural language applications.

  7. Connectivity and Sparse Wavelength Conversion in Wavelength-Routing Networks

    By - Arun Somani

    We will describe and discuss issues in design of circuit-switched all-optical wavelength-routing networks. In particular, we study the effects of topological connectivity and wavelength conversion in such network. We will first describe wavelength division multiplexing concept and then discuss the role of wavelength converters in such network. An analytical framework for accurate analysis of networks with arbitrary topology will be presented. We then introduce a model for networks with a variable number of converters and analyze the effect of wavelength converter density on blocking probability. This framework is applied to regular network topologies that have varying levels of connectivity. It is shown that that the conversion does not offer a significant advantage in most cases. The benefits of conversion are largely dependent on the network load, the number of available wavelengths, and the connectivity of the network. This result is very significant as converters are expensive components. Finally, the tradeoff between physical connectivity, wavelength conversion, and the number of available wavelengths is studied through networks with random topologies and under dynamic non-Poisson traffic.

  8. Weak Completeness and Strong Hypothesis

    By - Jack Lutz

    Despite extensive research over the past 25 years, the theory of computing has utterly failed to provide reasonable estimates of the amount of time required to solve important computational problems. The most famous instances of this failure are the NP-complete problems. A tremendous amount of experimental and theoretical evidence indicates that these problems require exponential time, but there is still no proof that they require more than linear time!

    Thiss talk surveys recent research on weak completeness--a new approach to establishing exponential lower bounds on the time required to solve computational problems. Weak completeness--a condition defined using efficiently computed betting strategies--has the following two important properties.

    • Weakly complete problems are provably intractable in a very strong sense (exponential lower time bounds on exponentially many instances).
    • Almost every problem solvable in exponential time is weakly complete, but is not complete, and is thus not provably intractable by standard methods.

    Taken together, these facts imply that weak completeness greatly extends the collection of problems that we can prove are intractable. Of course, the next step is to find natural examples of this phenomenon, an open problem that will be discussed.

    It is possible that the NP-complete problems are themselves weakly complete. The hypothesis that this is so (roughly that NP contains a "non-negligible subset" of exponential time, a condition that implies that NP is not P) has been shown in recent years to give plausible resolutions to many questions that are not known to be resolved by "traditional" complexity-theoretic hypotheses such as the separation of the polynomial-time hierarchy. The talk will survey the explanatory power of this hypothesis and discuss recent arguments for and against its plausibility.

    This talk is intended for nonspecialists.

  9. Technology & Legal Issues: Ownership and Copyright on the Web (video presentation)

    By - Roger Cepeda

    This is a 55 minutes long video tape of a course presented by Roger Cepeda. It explores the issues surrounding ownership and copyright of materials accessible through the Internet. It explains the legal challenges and pitfalls that organizations must face in using the Internet, and provides guidelines on how to deal with these challenges. The particular topics covered are:

    • Copyright law: copyright myths, types of infringement, case study of RTC vs Netcom.
    • Avoiding copyright problems: Web page development, NII white paper, international copyright treaty.
    • Trademark law: types of trademark infringement, protectable marks, domain name disputes.

  10. GP-Automata: one method for automatic creation of systems of rules

    By - Dan Ashlock

    This talk will consist of a brief introduction to genetic algorithms and genetic programming, followed by an introduction to a technique for automatically producing a system of rules for solving a problem out of test cases of the problem. The problem used as the primary example will be the Tartarus task of Astro Teller.

    The talk is intended for a general audience within computer science.

  11. Conformational Search Techniques for Assisting Chemists in Drug Design

    By - Steve LaValle

    A key problem in pharmaceutical drug design is scanning a database for molecules that can achieve low energy conformations while satisfying geometric relationships between specific atoms to enable protein docking. This talk introduces the basic problems from a computational perspective, including pharmacophore identification (finding common geometric substructures in molecules), conformational search, clustering, and database screening. Particular focus will be given to the problem of searching while satisfying geometric constraints. Practical considerations will be discussed regarding our implemented system, which is currently used by chemists.

    This research is a joint collaboration with Paul Finn from Pfizer Limited, Lydia Kavraki from Rice University, and Jean-Claude Latombe from Stanford University.

(Top)

Contacts

Thank you for visiting this page.Please send your suggestions and comments to one of us in the Computer Science colloquium commitee.

(Top)
mayuresh@cs.iastate.edu
Last modified: August 25, 1999.