Iowa State University

Carson M. Andorf
Artificial Intelligence Research Laboratory

Department of Computer Science

Curriculum Vitae


Research areas of interest:

Bioinformatics, Computational Biology, Artificial Intelligence, Machine Learning, Data Mining and Knowledge Discovery, Ontology Assisted Data Mining, Bayesian Learning, Agents and Multi-agents Systems, Protein function prediction, Protein structure prediction, Protein subcellular localization prediction, Protein-protein interaction prediction

Education:

(2000 - 2007): Iowa State University, Ames, Iowa (USA)
Ph.D. (December 2006) in Computer Science, Minor in Bioinformatics
Advisor: Dr. Vasant Honavar; Co-advisor: Drena Dobbs.

(1996 - 2000): Wartburg College, Waverly, Iowa (USA)
Bachelor of Arts degree in Mathematics and Computer Science
Overall GPA 3.83 / 4.00; CS GPA 4.00 / 4.00; Math GPA 3.92 / 4.00

Research Experience

(2002 - 2006): Supported by Iowa State Baker Center / Pioneer Hi-Bred Fellowship.
I developed a generalized version of the Naive Bayes algorithm and a Support Vector Machine algorithm that use alternative representations of protein sequences to classify novel proteins. The prediction problems addressed include protein function, structure, subcellular localization, and protein-protein interactions. The method is computationally faster than previous methods that require multiple sequence alignment of the protein sequences, with the same accuracy, and it performs well on a wide variety of datasets. The application was developed in Java and has a well-designed graphical user interface.

(2002 - 2002): I participated in protein function prediction research in the Bioinformatics department at Pioneer Hi-Bred. I successfully rewrote the Protein Family Database (Pfam) by rebuilding its hidden Markov models using alternative representations of the protein sequences. Measured by the database's ability to classify previously unknown proteins, this work increased selectivity without sacrificing sensitivity.

(2000 - 2002): Supported by an IGERT Fellowship funded by the National Science Foundation. I developed a Decision Tree algorithm that uses alternative representations of proteins to predict the function of novel proteins. This method has the same accuracy as previous methods based on protein motifs, but offers more in-depth and unique insight into the protein function-structure relationship. I also conducted an in-depth systematic study of different types of protein representations, including representations based on amino acid properties, substitution matrices, structure, and random representations. I modified various decision-tree algorithms to perform exact learning on distributed, autonomous, and heterogeneous datasets.

Industrial Experience:

(2004 - 2007): Bioinformatician and Computational Biologist. NewLink Genetics, Ames, IA.
(2002 - 2002): Bioinformatician and Computational Biologist. Pioneer Hi-Bred, Johnston, IA.
(1998 - 1999): Computer programmer. John Deere Waterloo Works, Waterloo, IA.

Teaching Experience:

(2006) Teaching Assistant for Computational Models of Learning. Duties included guest lecturing and assisting students with projects.

(2000 - 2005) AI Seminar Lecturer: Gave many presentations in the AI Seminar coordinated by Prof. Vasant Honavar.
(2000 - 2005) Bioinformatics Seminar Lecturer: Gave many presentations in the Bioinformatics Seminar coordinated by Prof. Vasant Honavar and Prof. Drena Dobbs.
(1996 - 1998) Teaching Assistant: Taught laboratory sections of “Calculus I, Calculus II, Calculus III, and Linear Algebra” courses. Duties included lecturing and grading of assignments and projects.

Research and Training Grants:

(2002 - 2004): Graduate Research Fellowships in Bioinformatics and Computational Biology, Pioneer Hi-Bred, Inc., $40,000.
(2000 - 2002): IGERT: Computational Molecular Biology Training Group, National Science Foundation, $50,000.

Computer skills

- Languages: C, C++, Java, Perl, Prolog, Lisp, Scheme, COBOL, SAS, Assembly, Python, JCL
- Operating Systems: Unix, Linux, Windows (all versions), DOS, Mac OS X
- Databases: Oracle, DB2, SQL, IMS
- Web Design: HTML, JavaScript, ASP

Bioinformatics skills

- Software: BLAST, ClustalW, FASTA, MEME, MAST, HMMER, eMOTIF, RasMol, GeneSeqer
- Databases: PDB, Pfam, GO, InterPro, Swiss-Prot, MIPS, SCOP, Prosite, KinBase, Merops

List of publications

Papers in preparation:

1. Andorf, C., Silvescu, A., Dobbs, D. and Honavar, V. (2005) Learning Classifiers for Assigning Protein Sequences to Gene Ontology Functional Families. To be submitted: BMC Bioinformatics Journal.

2. Andorf, C., Silvescu, A., Dobbs, D. and Honavar, V. (2005) Learning Classifiers for Assigning Protein Sequences to Subcellular Localization Families. To be submitted: Bioinformatics Journal.

3. Andorf, C., Zhang, J., Silvescu, A., Dobbs, D. and Honavar, V. (2005) Learning Classifiers for Assigning Protein Sequences to the SCOP Hierarchical Families. To be submitted: BMC Bioinformatics Journal.

4. Andorf, C., A., Dobbs, D. and Honavar, V. (2005) The effects of information gain in random alphabets in classification accuracy of protein sequences.

5. Silvescu, A., Andorf, C., Dobbs, D., Honavar, V. (2005) Inter-element dependency models for sequence classification.

Refereed Journal Papers:

1. Andorf, C., Dobbs, D., and Honavar, V. (2005) Reduced Alphabet Representations of Amino Acid Sequences for Protein Function Classification. Information Sciences. In press.

Invited or Refereed Book Chapters:

1. Honavar, V., Andorf, C., Caragea, D., Dobbs, D., Reinoso-Castillo, J., Silvescu, A., Wang, X. (2002). Invited Chapter. Algorithmic and Systems Solutions for Computer Assisted Knowledge Acquisition in Bioinformatics and Computational Biology. In: Computational Biology and Genome Informatics. Wu, C., Wang, P., and Wang, J. (Eds.). World Scientific.

2. Honavar, V., Andorf, C., Caragea, D., Silvescu, A., and Sharma, T. (2001). Invited Chapter. Agent-Based Systems for Data-Driven Knowledge Discovery from Distributed Data Sources: From Specification to Implementation. In: Intelligent Agent Software Engineering. Plekhanova, V. and Wermter, S. (Eds.). Idea Group Publisher. In press.

Recent Refereed Conference Papers

1. Andorf, C., Silvescu, A., Dobbs, D. and Honavar, V. (2004) Learning Classifiers for Assigning Protein Sequences to Gene Ontology Functional Families. In: Proceedings of the Fifth International Conference on Knowledge Based Computer Systems (KBCS 2004), India.

2. Andorf, C., Dobbs, D., and Honavar, V. (2002). Data-Driven Generation of Protein Function Classification Rules Based on Sequence Motifs Discovered Using Multiple Alignments From Reduced Alphabet Representations of Protein Sequences. Conference on Computational Biology and Genome Informatics.

3. Silvescu, A., Reinoso-Castillo, J., Andorf, C., Honavar, V., and Dobbs, D. (2001). Ontology-Driven Information Extraction and Knowledge Acquisition from Heterogeneous, Distributed Biological Data Sources. In: Proceedings of the IJCAI-2001 Workshop on Knowledge Discovery from Heterogeneous, Distributed, Autonomous, Dynamic Data and Knowledge Sources.

Awards and Honors

- Baker Center / Pioneer Hi-Bred Bioinformatics Fellowship (2002-2004)
- Arthur A. Collins Computer Science Scholarship, ISU (2002)
- NSF IGERT Fellowship, Bioinformatics and Computational Biology, ISU (2000 - 2002)
- Iowa State University Computer Science Honor Society, (2002 - Current)
- Kappa Mu Epsilon, Mathematics Honor Society, Wartburg College, (1997 - 2000).
- Wartburg College Regents Scholar, (1996 - 2000).
- Dean's Top 40, Dean's List, Wartburg College, (1997 - 2000).
- Chellevold Mathematics Scholarship, Wartburg College, (1998 - 1999).
- Mathematics Field Day Scholarship, Wartburg College, (1996 - 1997).
- State of Iowa Scholar, Wartburg College, (1996).
- Academic All-American in Wrestling, Wartburg College (1997 - 2000).

References:

Available upon request.