ARTIFICIAL INTELLIGENCE RESEARCH LABORATORY
    Center for Computational Intelligence, Learning, and Discovery
    Department of Computer Science


Data Mining and Knowledge Discovery: Algorithms and Applications

Data mining is concerned with the development and application of algorithms for discovering a priori unknown relationships (associations, groupings, classifiers) from data. Honavar's current research on data mining is focused on:


Selected References

  1. Caragea, D., Zhang, J., Pathak, J., and Honavar, V. (2006). Learning Classifiers from Distributed, Ontology-Extended Data Sources. Proceedings of the 8th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2006), Krakow, Poland, Lecture Notes in Computer Science. Berlin: Springer. In press.

  2. Yan, C., Terribilini, M., Wu, F., Jernigan, R.L., Dobbs, D. and Honavar, V. (2006). Identifying amino acid residues involved in protein-DNA interactions from sequence. BMC Bioinformatics, 2006.

  3. Terribilini, M., Lee, J.-H., Yan, C., Jernigan, R. L., Honavar, V. and Dobbs, D. (2006). Predicting RNA-binding Sites from Amino Acid Sequence. RNA Journal. Accepted, in press, 2006.

  4. Kang, D-K., Silvescu, A. and Honavar, V. (2006). RNBL-MN: A Recursive Naive Bayes Learner for Sequence Classification. In: Proceedings of the Tenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2006). Lecture Notes in Computer Science. Berlin: Springer-Verlag. In press.

  5. Zhang, J., Kang, D-K., Silvescu, A. and Honavar, V. (2006) Learning Compact and Accurate Naive Bayes Classifiers from Attribute Value Taxonomies and Data. Knowledge and Information Systems. Vol. 9. No. 2. pp. 157-179, 2006.

  6. Pathak, J., Yong, J., Honavar, V., and McCalley, J. (2006). Condition Data Aggregation for Failure Mode Estimation of Power Transformers. In: Hawaii International Conference on Systems Sciences.

  7. Terribilini, M., Lee, J-H., Yan, C., Carpenter, S., Jernigan, R., Honavar, V. and Dobbs, D. (2006). Identifying interaction sites in recalcitrant proteins: predicted protein and RNA binding sites in HIV-1 and EIAV agree with experimental data. In: Pacific Symposium on Biocomputing. Hawaii. In press.

  8. Vasile, F., Silvescu, A., Kang, D-K., and Honavar, V. (2006). TRIPPER: An Attribute Value Taxonomy Guided Rule Learner. In: Proceedings of the Tenth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD). In press.

  9. Silvescu, A. and Honavar, V. (2005). Independence, Decomposability and functions which take values into an Abelian Group. In: Proceedings of the Ninth International Symposium on Artificial Intelligence and Mathematics. http://anytime.cs.umass.edu/aimath06/proceedings.html.

  10. Yakhnenko, O., Silvescu, A., and Honavar, V. (2005). Discriminatively Trained Markov Model for Sequence Classification. In: IEEE Conference on Data Mining (ICDM 2005). Houston, Texas. IEEE Press.

  11. Caragea, D., Zhang, J., Bao, J., Pathak, J., and Honavar, V. (2005). Algorithms and Software for Collaborative Discovery from Autonomous, Semantically Heterogeneous Information Sources (Invited paper). In: Proceedings of the 16th International Conference on Algorithmic Learning Theory. Lecture Notes in Computer Science. Singapore. Vol. 3734. pp. 13-44. Berlin: Springer-Verlag.

  12. Zhang, J., Caragea, D. and Honavar, V. (2005). Learning Ontology-Aware Classifiers. In: Proceedings of the 8th International Conference on Discovery Science. Springer-Verlag Lecture Notes in Computer Science. Singapore. Vol. 3735. pp. 308-321. Berlin: Springer-Verlag.

  13. Kang, D-K., Fuller, D., and Honavar, V. (2005). Learning Misuse and Anomaly Detectors from System Call Frequency Vector Representation. In: IEEE International Conference on Intelligence and Security Informatics. Springer-Verlag Lecture Notes in Computer Science. Vol. 3495. pp. 511-516. Springer-Verlag.

  14. Kang, D-K., Zhang, J., Silvescu, A., and Honavar, V. (2005). Multinomial Event Model Based Abstraction for Sequence and Text Classification. In: Proceedings of the Symposium on Abstraction, Reformulation, and Approximation (SARA 2005). Edinburgh, UK. Vol. 3607. pp. 134-148. Berlin: Springer-Verlag.

  15. Kang, D-K., Fuller, D., and Honavar, V. (2005). Learning Classifiers for Misuse and Anomaly Detection Using a Bag of System Calls Representation. In: Proceedings of the 6th IEEE Systems, Man, and Cybernetics Workshop (IAW 05). West Point, NY. pp. 118-125. IEEE.

  16. Andorf, C., Silvescu, A., Dobbs, D. and Honavar, V. (2004). Learning Classifiers for Assigning Protein Sequences to Gene Ontology Functional Families. In: Fifth International Conference on Knowledge Based Computer Systems (KBCS 2004). India. pp. 256-255. New Delhi, India: Allied Publishers.

  17. Caragea, D., Silvescu, A., and Honavar, V. (2004). A Framework for Learning from Distributed Data Using Sufficient Statistics and its Application to Learning Decision Trees. In: International Journal of Hybrid Intelligent Systems. Vol. 1. No. 2. pp. 80-89.

  18. Cook, D., Caragea, D., and Honavar, V. (2004). Visualization in Classification Problems. In: Proceedings in Computational Statistics (COMPSTAT 2004). pp. 799-806. Springer-Verlag.

  19. Kang, D-K., Silvescu, A., Zhang, J. and Honavar, V. (2004). Generation of Attribute Value Taxonomies from Data for Accurate and Compact Classifier Construction. In: IEEE International Conference on Data Mining. pp. 130-137. IEEE Press.

  20. Lonosky, P., Zhang, X., Honavar, V., Dobbs, D., Fu, A., and Rodermel, S. (2004). A Proteomic Analysis of Chloroplast Biogenesis in Maize. In: Plant Physiology. Vol. 134. pp. 560-574.

  21. Polikar, R., Udpa, L., Udpa, S., and Honavar, V. (2004). An Incremental Learning Algorithm with Confidence Estimation for Automated Identification of NDE Signals. In: IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control. Vol. 51. pp. 990-1001.

  22. Sen, T.Z., Kloczkowski, A., Jernigan, R.L., Yan, C., Honavar, V., Ho, K-M., Wang, C-Z., Ihm, Y., Cao, H., Gu, X., and Dobbs, D. (2004). Predicting Binding Sites of Protease-Inhibitor Complexes by Combining Multiple Methods. In: BMC Bioinformatics. Vol. 5. pp. 205.

  23. Yan, C., Dobbs, D., and Honavar, V. (2004). A Two-Stage Classifier for Identification of Protein-Protein Interface Residues. In: Bioinformatics. Vol. 20. pp. i371-378.

  24. Yan, C., Dobbs, D., and Honavar, V. (2004). Identifying Protein-Protein Interaction Sites from Surface Residues. A Support Vector Machine Approach. In: Neural Computing and Applications. Vol. 13. pp. 123-129.

  25. Zhang, J. and Honavar, V. (2004). Learning Compact and Accurate Classifiers from Attribute Value Taxonomies and Partially Specified Data. In: IEEE International Conference on Data Mining. pp. 289-298. IEEE Press.

  26. Atramentov, A., Leiva, H., and Honavar, V. (2004). A Multi-Relational Decision Tree Learning Algorithm - Implementation and Experiments. In: Proceedings of the Thirteenth International Conference on Inductive Logic Programming. Berlin: Springer-Verlag. In press.

  27. Caragea, D., Silvescu, A., and Honavar, V. (2003). Decision Tree Induction from Distributed, Heterogeneous, Autonomous Data Sources. In: Proceedings of the Conference on Intelligent Systems Design and Applications (ISDA 03). In press.

  28. Wang, X., Schroeder, D., Dobbs, D., and Honavar, V. (2003). Automated Data-Driven Discovery of Motif-Based Protein Function Classifiers. Information Sciences. In press.

  29. Yan, C. and Dobbs, D. (2003). Identification of Surface Residues Involved in Protein-Protein Interaction -- A Support Vector Machine Approach. In: Proceedings of the Conference on Intelligent Systems Design and Applications (ISDA-03). Tulsa, Oklahoma. 2003.

  30. Zhang, J. and Honavar, V. (2003). Learning Decision Tree Classifiers from Attribute Value Taxonomies and Partially Specified Data. In: Proceedings of the International Conference on Machine Learning (ICML-03). Washington, DC.

  31. Helmer, G., Wong, J., Honavar, V., and Miller, L. (2003). Lightweight Agents for Intrusion Detection. Journal of Systems and Software. Vol. 67. pp. 109-122.

  32. Helmer, G., Wong, J., Honavar, V., and Miller, L. (2002). Automated Discovery of Concise Predictive Rules for Intrusion Detection. Journal of Systems and Software. Vol. 60. No. 3. pp. 165-175.

  33. Caragea, D., Silvescu, A., and Honavar, V. (2001). Invited Chapter. Towards a Theoretical Framework for Analysis and Synthesis of Agents That Learn from Distributed Dynamic Data Sources. In: Emerging Neural Architectures Based on Neuroscience. Berlin: Springer-Verlag.

  34. Caragea, D., Cook, D., and Honavar, V. (2001). Gaining Insights into Support Vector Machine Classifiers Using Projection-Based Tour Methods. In: Proceedings of the Conference on Knowledge Discovery and Data Mining.

  35. Parekh, R. and Honavar, V. (2001). DFA Learning from Simple Examples. Machine Learning. Vol. 44. pp. 9-35.

  36. Polikar, R., Shinar, R., Honavar, V., Udpa, L., and Porter, M. (2001). Detection and Identification of Odorants Using an Electronic Nose. In: Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing.

  37. Polikar, R., Udpa, L., Udpa, S., and Honavar, V. (2001). Learn++: An Incremental Learning Algorithm for Multi-Layer Perceptron Networks. IEEE Transactions on Systems, Man, and Cybernetics. Vol. 31, No. 4. pp. 497-508.

  38. Silvescu, A., and Honavar, V. (2001). Temporal Boolean Network Models of Genetic Networks and Their Inference from Gene Expression Time Series. Complex Systems. Vol. 13. No. 1. pp. 54-.

  39. Parekh, R., Yang, J., and Honavar, V. (2000). Constructive Neural Network Learning Algorithms for Multi-Category Pattern Classification. IEEE Transactions on Neural Networks. Vol. 11. No. 2. pp. 436-451.

  40. Yang, J. and Honavar, V. (1999). DistAl: An Inter-Pattern Distance Based Constructive Neural Network Learning Algorithm. Intelligent Data Analysis. Vol. 3. pp. 55-73.

  41. Yang, J. and Honavar, V. (1998). Feature Subset Selection Using a Genetic Algorithm. In: Feature Extraction, Construction, and Subset Selection: A Data Mining Perspective. Motoda, H. and Liu, H. (Eds.) New York: Kluwer. 1998. A shorter version of this paper appears in IEEE Intelligent Systems (Special Issue on Feature Transformation and Subset Selection).

  42. Balakrishnan, K. and Honavar, V. (1998). Intelligent Diagnosis Systems. Journal of Intelligent Systems. Vol. 8. No. 3/4. pp. 239-290.