Study Guide
Computational Models of Learning
Note: Links to each week's lecture notes are usually not in place until a
week after the lecture.
Week 1 (January 11, 1999)
Overview of Machine Learning. Mistake Bound Learning Model.
Required readings
-
Introduction. Vasant Honavar.
-
Chapters 1 and 7. Machine Learning. Tom Mitchell, 1997.
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Balakrishnan, K. and Honavar, V. (1997).
Intelligent Diagnosis Systems. Journal of Intelligent Systems. In press.
-
Honavar, V. (1998). Machine Learning. Invited article. In: Encyclopedia of Electrical and Electronic Engineering. Webster, J. (Ed.), New York: Wiley. In press.
-
Langley, P. Elements of Machine Learning. Palo Alto, CA: Morgan Kaufmann.
-
Mitchell, T. (1997). Does machine learning really work? AI Magazine, Vol. 18, No. 3, pp. 11-20.
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
Additional Information
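As a small illustration of the mistake bound model covered this week, the sketch below runs the Halving algorithm on a finite hypothesis class. The threshold hypotheses, the target, and the example sequence are made up for illustration and are not taken from the course materials. Each mistake removes at least half of the remaining hypotheses, so the number of mistakes cannot exceed log2 |H|.

```python
import math


def halving_predict(version_space, x):
    """Predict by majority vote of the hypotheses still in the version space."""
    votes_for_true = sum(1 for h in version_space if h(x))
    return 2 * votes_for_true >= len(version_space)


def run_halving(hypotheses, examples):
    """Process labeled examples online and return the number of mistakes."""
    version_space = list(hypotheses)
    mistakes = 0
    for x, label in examples:
        if halving_predict(version_space, x) != label:
            mistakes += 1
        # Discard every hypothesis that disagrees with the revealed label.
        version_space = [h for h in version_space if h(x) == label]
    return mistakes


# Hypothetical class: thresholds on the integers 0..15, h_t(x) = (x >= t).
hypotheses = [lambda x, t=t: x >= t for t in range(17)]
target = hypotheses[11]
examples = [(x, target(x)) for x in [8, 12, 10, 11, 3, 15, 9, 0]]

mistakes = run_halving(hypotheses, examples)
bound = math.log2(len(hypotheses))
print(f"mistakes = {mistakes}, log2|H| = {bound:.2f}")
```

The bound assumes the target concept is in the hypothesis class, so the version space never becomes empty.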
Week 2 (January 18, 1999)
Mistake Bound Learning Model (Continued). Weighted Majority Model. Introduction to PAC Learning Model.
Required readings
Recommended readings
Additional Information
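The Weighted Majority model named above can be sketched in a few lines. This is a minimal version assuming binary predictions and a penalty factor beta = 1/2; the three toy experts and the label sequence are hypothetical. With beta = 1/2, the number of mistakes is at most 2.4 (k + log2 n), where k is the number of mistakes of the best of the n experts (Mitchell, Chapter 7).

```python
import math


def weighted_majority(expert_predictions, labels, beta=0.5):
    """Run Weighted Majority; return (mistakes of the master, final weights).

    expert_predictions[t][i] is the 0/1 prediction of expert i at round t.
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, label in zip(expert_predictions, labels):
        weight_for_one = sum(w for w, p in zip(weights, preds) if p == 1)
        weight_for_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if weight_for_one >= weight_for_zero else 0
        if guess != label:
            mistakes += 1
        # Multiply the weight of every expert that was wrong by beta.
        weights = [w * beta if p != label else w
                   for w, p in zip(weights, preds)]
    return mistakes, weights


# Hypothetical experts: always 1, always 0, and one that is always right.
labels = [1, 0, 0, 1, 0, 1, 1, 0]
expert_predictions = [(1, 0, y) for y in labels]

mistakes, weights = weighted_majority(expert_predictions, labels)
bound = 2.4 * (0 + math.log2(3))  # best expert makes k = 0 mistakes
print(mistakes, weights, round(bound, 2))
```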
Week 3 (January 25, 1999)
Sample Complexity of PAC Learning. Consistent Learners. Efficient PAC Learning. Examples of Concept Classes that are Efficiently PAC Learnable Using Consistent Learners.
Required readings
Recommended readings
Additional Information
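For a finite hypothesis space H, the sample complexity of a consistent learner is bounded by m >= (1/epsilon) (ln |H| + ln (1/delta)). The helper below simply evaluates that bound; the function name is an illustrative choice. The example uses conjunctions of up to 10 Boolean literals, for which |H| = 3^10 because each variable appears positively, negatively, or not at all.

```python
import math


def consistent_learner_sample_bound(hypothesis_space_size, epsilon, delta):
    """Examples sufficient for any consistent learner over a finite H."""
    return math.ceil(
        (math.log(hypothesis_space_size) + math.log(1.0 / delta)) / epsilon
    )


# Conjunctions over n = 10 Boolean variables, error 0.1, confidence 95%.
m = consistent_learner_sample_bound(3 ** 10, epsilon=0.1, delta=0.05)
print(m)
```

The bound grows only linearly in the number of variables, which is why conjunctions are efficiently PAC learnable by a consistent learner.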
Week 4 (February 1, 1999)
More on PAC Learning. k-term DNF is not efficiently PAC learnable using the k-term DNF hypothesis space (unless RP = NP). k-term DNF is efficiently PAC learnable using the k-CNF hypothesis space. Occam learning.
Required readings
Recommended readings
Additional Information
Week 5 (February 8, 1999)
Occam learning. A General Framework for the Design of Occam Algorithms. Examples of Occam learning algorithms. Occam learning of conjunctive concepts. Occam learning of k-decision lists defined using conjunctions. An Occam Algorithm for Rule Induction (RIPPER). A Neural Network learning algorithm inspired by Occam Learning (DistAl).
Required readings
-
Occam Learning. Vasant Honavar.
-
Chapter 7. Machine Learning. Tom Mitchell, 1997.
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Cohen, W.
Learning Trees and Rules with Set Valued Attributes. In Proceedings of AAAI-96.
-
Cohen, W.
Fast effective rule induction.
In: Proceedings of the Twelfth International Conference on
Machine Learning, Lake Tahoe, California, 1995.
-
Fürnkranz, J. (1999).
Separate-and-Conquer Rule Learning.
Artificial Intelligence Review 13(1), 1999. In press.
-
Fürnkranz, J. (1997).
Pruning Algorithms for Rule Learning.
Machine Learning 27(2):139-171, May 1997.
-
Frank, E. and Witten, I.H. (1998). Generating accurate rule sets without global optimization.
Proc. International Conference on Machine Learning. Morgan Kaufmann.
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
-
Yang, J., Parekh, R. and Honavar, V. (1999). DistAl: An Inter-Pattern Distance Based Constructive Learning Algorithm. Intelligent Data Analysis. In press.
Additional Information
Week 6 (February 15, 1999)
PAC Learning of Infinite Concept Classes. VC Dimension of Infinite Concept Classes. Bounds on the VC dimension of Infinite Concept Classes. Upper and Lower Bounds on Sample Complexity of PAC learning of Infinite Concept Classes. Examples of Infinite concept classes. Sample Complexity of Perceptrons and Multi-Layer Perceptrons.
Required readings
-
Occam Learning. Vasant Honavar.
-
Chapter 7. Machine Learning. Tom Mitchell, 1997.
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Blumer, A., Ehrenfeucht, A., Haussler, D., and Warmuth, M. Learnability and the Vapnik-Chervonenkis Dimension. Journal of the ACM, 36(4):929-965, October 1989.
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
-
Vidyasagar, M. (1997). Theory of Learning and Generalization. Berlin: Springer-Verlag.
Additional Information
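For infinite concept classes, the VC dimension replaces ln |H| in the sample complexity bound. The sketch below evaluates the upper bound m >= (1/epsilon) (4 log2 (2/delta) + 8 VC(H) log2 (13/epsilon)) due to Blumer et al. (1989), as stated in Mitchell, Chapter 7. The function name is an illustrative choice, and the example assumes a single perceptron with two real inputs and a bias, whose VC dimension is 3.

```python
import math


def vc_sample_bound(vc_dim, epsilon, delta):
    """Number of examples sufficient for PAC learning a class of given VC dimension."""
    return math.ceil(
        (4 * math.log2(2 / delta) + 8 * vc_dim * math.log2(13 / epsilon))
        / epsilon
    )


# A linear threshold unit over n real inputs has VC dimension n + 1.
n_inputs = 2
m = vc_sample_bound(vc_dim=n_inputs + 1, epsilon=0.1, delta=0.05)
print(m)
```

This is a sufficient number of examples, not a tight estimate; in practice far fewer examples are often enough.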
Week 7 (February 22, 1999)
Weak Learners and Strong Learners. Boosting and Bagging.
Required readings
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
-
N. Duffy and D. Helmbold.
A Geometric Approach to Leveraging Weak Learners.
In: Proceedings of EuroColt 99. Springer Verlag.
-
M. Warmuth and M. Herbster.
Tracking the Best Expert.
Journal of Machine Learning,
Vol. 32(2), August 1998.
-
M. Warmuth and P. Auer.
Tracking the Best Disjunct.
Journal of Machine Learning,
Vol. 32(2), August 1998.
-
M. Herbster and M. Warmuth.
Tracking the Best Regressor.
Proc. 12th Annu. Conf. on Comput. Learning Theory
pp. 24-31, July 1998.
-
D.P. Helmbold and M.K. Warmuth. On Weak Learning.
Journal of Computer and System Sciences, 50(3):551-573, June 1995.
-
Freund, Y. 1999.
Boosting a weak learning algorithm by majority.
Information and Computation. To appear.
-
Robert E. Schapire.
The strength of weak learnability.
Machine Learning, 5(2):197-227, 1990.
-
Robert E. Schapire.
Theoretical Views of Boosting.
In Computational Learning Theory: Fourth European
Conference, EuroCOLT'99, 1999.
-
Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee.
Boosting the margin: A new explanation for the
effectiveness of voting methods.
The Annals of Statistics, to appear.
-
Yoav Freund and Robert E. Schapire.
A decision-theoretic generalization of on-line learning and an
application to boosting.
Journal of Computer and System Sciences, 55(1):119-139, 1997.
-
Llew Mason, Peter Bartlett and Jonathan Baxter.
Improved generalization through explicit optimization of margins.
Technical Report, Department of Systems Engineering, Australian National University, 1998.
-
L. Breiman, 1997.
Arcing Classifiers.
-
L. Breiman, 1994.
Bagging Predictors
Additional Information
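To make the idea of boosting a weak learner concrete, here is a minimal sketch in the style of AdaBoost (Freund and Schapire, 1997, listed above), using one-dimensional threshold stumps as the weak learner. The data set, the stump learner, and all function names are illustrative assumptions, not the course's implementation. No single stump classifies this data correctly, but a weighted vote of three stumps does.

```python
import math


def stump_predict(x, threshold, sign):
    """Threshold stump: output sign when x >= threshold, otherwise -sign."""
    return sign if x >= threshold else -sign


def train_stump(xs, ys, weights):
    """Weak learner: return (error, threshold, sign) with lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, sign) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best


def adaboost(xs, ys, rounds):
    """Return a list of (alpha, threshold, sign) weighted stumps."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, sign = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, sign))
        # Raise the weight of misclassified examples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, sign))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of the stumps; labels are +1 and -1."""
    score = sum(alpha * stump_predict(x, threshold, sign)
                for alpha, threshold, sign in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data with labels in {+1, -1}.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost(xs, ys, rounds=3)
training_errors = sum(1 for x, y in zip(xs, ys) if predict(ensemble, x) != y)
print(ensemble, training_errors)
```

Bagging differs in that each weak hypothesis is trained on a bootstrap resample with uniform weights and the final vote is unweighted.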
Week 8 (March 1, 1999)
Weak Learners and Strong Learners. Boosting and Bagging. Continued.
Required readings
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
-
N. Duffy and D. Helmbold.
A Geometric Approach to Leveraging Weak Learners.
In: Proceedings of EuroColt 99. Springer Verlag.
-
M. Warmuth and M. Herbster.
Tracking the Best Expert.
Journal of Machine Learning,
Vol. 32(2), August 1998.
-
M. Warmuth and P. Auer.
Tracking the Best Disjunct.
Journal of Machine Learning,
Vol. 32(2), August 1998.
-
M. Herbster and M. Warmuth.
Tracking the Best Regressor.
Proc. 12th Annu. Conf. on Comput. Learning Theory
pp. 24-31, July 1998.
-
D.P. Helmbold and M.K. Warmuth. On Weak Learning.
Journal of Computer and System Sciences, 50(3):551-573, June 1995.
-
Freund, Y. 1999.
Boosting a weak learning algorithm by majority.
Information and Computation. To appear.
-
Robert E. Schapire.
The strength of weak learnability.
Machine Learning, 5(2):197-227, 1990.
-
Robert E. Schapire.
Theoretical Views of Boosting.
In Computational Learning Theory: Fourth European
Conference, EuroCOLT'99, 1999.
-
Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee.
Boosting the margin: A new explanation for the
effectiveness of voting methods.
The Annals of Statistics, to appear.
-
Yoav Freund and Robert E. Schapire.
A decision-theoretic generalization of on-line learning and an
application to boosting.
Journal of Computer and System Sciences, 55(1):119-139, 1997.
-
Llew Mason, Peter Bartlett and Jonathan Baxter.
Improved generalization through explicit optimization of margins.
Technical Report, Department of Systems Engineering, Australian National University, 1998.
-
L. Breiman, 1997.
Arcing Classifiers.
-
L. Breiman, 1994.
Bagging Predictors
Additional Information
Week 9 (March 8, 1999)
Weak Learners and Strong Learners. Boosting and Bagging. Continued.
Required readings
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
-
N. Duffy and D. Helmbold.
A Geometric Approach to Leveraging Weak Learners.
In: Proceedings of EuroColt 99. Springer Verlag.
-
M. Warmuth and M. Herbster.
Tracking the Best Expert.
Journal of Machine Learning,
Vol. 32(2), August 1998.
-
M. Warmuth and P. Auer.
Tracking the Best Disjunct.
Journal of Machine Learning,
Vol. 32(2), August 1998.
-
M. Herbster and M. Warmuth.
Tracking the Best Regressor.
Proc. 12th Annu. Conf. on Comput. Learning Theory
pp. 24-31, July 1998.
-
D.P. Helmbold and M.K. Warmuth. On Weak Learning.
Journal of Computer and System Sciences, 50(3):551-573, June 1995.
-
Freund, Y. 1999.
Boosting a weak learning algorithm by majority.
Information and Computation. To appear.
-
Robert E. Schapire.
The strength of weak learnability.
Machine Learning, 5(2):197-227, 1990.
-
Robert E. Schapire.
Theoretical Views of Boosting.
In Computational Learning Theory: Fourth European
Conference, EuroCOLT'99, 1999.
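-
Robert E. Schapire, Yoav Freund, Peter Bartlett and Wee Sun Lee.
Boosting the margin: A new explanation for the
effectiveness of voting methods.
The Annals of Statistics, to appear.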
-
Yoav Freund and Robert E. Schapire.
A decision-theoretic generalization of on-line learning and an
application to boosting.
Journal of Computer and System Sciences, 55(1):119-139, 1997.
-
Llew Mason, Peter Bartlett and Jonathan Baxter.
Improved generalization through explicit optimization of margins.
Technical Report, Department of Systems Engineering, Australian National University, 1998.
-
L. Breiman, 1997.
Arcing Classifiers.
-
L. Breiman, 1994.
Bagging Predictors
Additional Information
SPRING BREAK
Week 10 (March 22)
Kolmogorov Complexity and Learning. Introduction to Kolmogorov Complexity.
Kolmogorov Complexity and Occam Learning.
Required readings
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
-
M. Li and P.M.B. Vitanyi.
Learning simple concepts under simple distributions.
SIAM J. Computing, 20(5):911-935, 1991.
-
Li, M. and Vitanyi, P. Kolmogorov Complexity and its Applications.
Additional Information
Week 11 (March 22)
Solomonoff-Levin Universal Distributions.
Learning Under Simple Distributions. Log-term DNF are PAC-learnable under Simple Distributions.
Required readings
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
-
M. Li and P.M.B. Vitanyi.
Learning simple concepts under simple distributions.
SIAM J. Computing, 20(5):911-935, 1991.
-
Li, M. and Vitanyi, P. Kolmogorov Complexity and its Applications.
Additional Information
Week 12 (March 30)
DFA Learning. Search Space for DFA Learning. DFA Learning From Characteristic
Samples. RPNI Algorithm. Learning Simple DFA from Simple Examples.
Required readings
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
-
Honavar, V. and Slutzki, G. (Eds.) (1998). Proceedings of the Fourth
International Colloquium on Grammatical Inference. (LNCS Vol. 1433).
Berlin: Springer-Verlag.
-
Parekh, R. and Honavar, V. (1999). Automata Induction, Grammar Inference, and Language Acquisition. Invited chapter.
In: Handbook of Natural Language Processing. Dale, Moisl
and Somers (Eds.). New York: Marcel Dekker. In press.
-
Parekh, R. and Honavar, V. (1999). Simple DFA are Polynomially Exactly Learnable from Simple Examples
-
Parekh, R., Nichitiu, C., and Honavar, V. (1998).
A Polynomial Time Incremental Algorithm for Learning DFA.
In: Proceedings of the Fourth International
Colloquium on Grammatical Inference (ICGI'98), Ames, IA.
Lecture Notes in Computer Science vol. 1433 pp. 37-49. Berlin: Springer-Verlag.
-
Incremental Regular Inference. In: Grammatical Inference: Learning Syntax from Sentences (ICGI'96),
Lecture Notes in Artificial Intelligence No. 1147, Springer-Verlag, pp. 222-237, 1996.
-
What is the search space of Regular Inference? In: Grammatical Inference and Applications (ICGI'94),
Lecture Notes in Artificial Intelligence No. 862, Springer-Verlag, pp. 25-37, 1994.
-
M. Li and P.M.B. Vitanyi.
Learning simple concepts under simple distributions.
SIAM J. Computing, 20(5):911-935, 1991.
-
Li, M. and Vitanyi, P. Kolmogorov Complexity and its Applications.
Additional Information
Week 13 (April 6)
Support Vector Machines.
Required readings
Recommended readings
-
Aha, D. (1995).
Machine Learning (tutorial).
-
Honavar, V. (1997). A Tutorial on Computational Learning Theory.
-
Natarajan, B. (1992). Machine Learning: A Theoretical Approach. Palo Alto, CA: Morgan Kaufmann.
-
Kearns, M. and Vazirani, U. (1994). An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
- Cortes, C.; and Vapnik, V. 1995. Support Vector Networks. Machine
Learning 20:273-297.
- Osuna, E.; Freund, R.; Girosi, F. 1997.
Support Vector Machines: Training and Applications. MIT AI Memo 1602,
March, 1997.
- Osuna, E.; Freund, R.; Girosi, F. 1997. An improved
Training Algorithm for Support Vector
Machines. NNSP'97
-
Osuna, E. and Girosi, F.
Reducing the run-time complexity of Support Vector Machines.
To appear in ICPR'98, Brisbane, Australia, 1998.
- Vladimir Cherkassky and Filip Mulier; 1998. "Support Vector
Machines" in Learning from Data: Concepts, Theory, and Methods.
Wiley Interscience, pp. 353-387.
-
C.J.C. Burges; 1998.
A tutorial on support vector machines for pattern recognition.
Data Mining and Knowledge Discovery, Vol. 2, No. 2 (43 pages).
-
P. S. Bradley, Usama M. Fayyad, and O. L. Mangasarian; 1998.
Data Mining: Overview and Optimization Opportunities,
Mathematical Programming Technical Report 98-01, University of
Wisconsin Madison.
-
P. S. Bradley and O. L. Mangasarian; 1998.
Massive Data Discrimination via Linear Support Vector Machines,
Mathematical Programming Technical Report 98-05, University of
Wisconsin Madison.
- Cristianini, N.; Campbell, C.; Shawe-Taylor, J.; 1998.
Multiplicative Updatings for Support-Vector Learning.
NeuroCOLT Technical Report TR-1998-016,
Royal Holloway College.
-
M.A. Hearst, B. Schölkopf, S. Dumais, E. Osuna, and
J. Platt. Trends and Controversies - Support Vector Machines. IEEE
Intelligent Systems, 13(4):18-28, 1998.
- Smola, A. J.; Schölkopf, B.; 1998.
A Tutorial on Support Vector Regression.
NeuroCOLT Technical Report TR-1998-030,
Royal Holloway College (73 pages).
- J. Platt, Probabilistic Outputs for
Support Vector Machines and Comparisons to Regularized Likelihood
Methods, submitted to Advances
in Large Margin Classifiers, A. Smola, P. Bartlett,
B. Schölkopf, D. Schuurmans, eds., MIT Press, (1999), to
appear.
- J. Platt, Using Sparseness and Analytic QP to Speed Training of
Support Vector Machines, in
Advances in Neural
Information Processing Systems 11, M. S. Kearns, S. A. Solla,
D. A. Cohn, eds., MIT
Press, (1999).
- J. Platt, Fast Training of Support Vector
Machines using Sequential Minimal Optimization, in
Advances in Kernel Methods - Support
Vector Learning, B. Schölkopf, C. Burges, and A. Smola, eds., MIT Press, (1999).
- J. Platt, How to Implement SVMs, IEEE Intelligent Systems Magazine,
Trends and Controversies, Marti Hearst, ed., vol 13, no 4, (1998).
Additional Information