The field of Grammatical Inference enjoys the uncanny distinction of being faced with a variety of open problems that remain to be solved. This page is provided to make researchers and practitioners aware of the different open problems and to motivate them to tackle these problems. It is impossible for us to catalog all the open problems in Grammatical Inference but this page is designed to allow people to contribute to the growing list.
-- Colin de la Higuera, November 6th, 1998
Several different algorithms have been proposed to learn Stochastic Deterministic Finite State Automata (SDFA) (e.g. Carrasco & Oncina, Ron, Singer & Tishby). Having learned the structure of the automaton, the transition probabilities are computed based on the frequency with which the training examples use each edge. This does not allow for an identification in the limit, as any new example modifies (slightly) the frequencies, and hence the probabilities. Can a (fast) probability assigning algorithm be designed that computes the probabilities in another way, so that the general algorithm identifies sdfa in the limit with probability one?
-- David G. Stork , February 1st, 1999
What problems in computational linguistics are primarily or entirely limited by lack of empirical data? That is, suppose we had a perfect oracle that could generate an extremely large number of samples (labelled sentences, spelling variants, etc., as appropriate) that could be used with existing inference or learning techniques. What problems would then be solved? I am especially interested in citable articles that give evidence for such a claim.
I am compiling such a list of areas in a range of disciplines (optical character recognition, path planning, etc.) for a review article on the status of pattern recognition and machine intelligence.
Colin de la Higuera, cdlh@univ-st-etienne.fr
Rajesh Parekh, rparekh@bluemartini.com