Research areas of interest:
Bioinformatics, Computational Biology, Artificial Intelligence,
Machine Learning, Data Mining and Knowledge Discovery, Ontology
Assisted Data Mining, Bayesian Learning, Agents and Multi-agents
Systems, Protein function prediction, Protein structure prediction,
Protein subcellular localization prediction, Protein-protein
interaction prediction
Education:
(2000 - 2007): Iowa State University Ames, Iowa (USA)
Ph.D. (December 2006) in Computer Science, Minor in Bioinformatics
Advisor: Dr. Vasant Honavar; Co-advisor: Dr. Drena Dobbs.
(1996 - 2000): Wartburg College, Waverly, Iowa (USA)
Bachelor of Arts degree in Mathematics and Computer Science
Overall GPA 3.83 / 4.00; CS GPA 4.00 / 4.00; Math GPA 3.92
/ 4.00
Research Experience
(2002 - 2006): Supported by Iowa State Baker
Center / Pioneer Hi-Bred Fellowship.
I developed a new generalized version of the Naive Bayes algorithm
and a Support Vector Machine algorithm that use alternative
representations of proteins to make predictions about novel proteins.
The prediction problems include function, structure, subcellular
localization, and protein-protein interactions. The method
is computationally faster than, and as accurate as, previous
methods that require multiple sequence alignment of the protein
sequences, and it performs well on a wide variety
of datasets. The application was developed in Java
and has a well-designed graphical user interface.
(2002): I participated in protein
function prediction research in the Bioinformatics department
at Pioneer Hi-Bred. I successfully rewrote the Protein Family Database
(Pfam) by rebuilding the hidden Markov models using alternative
representations of the protein sequences. In terms of the
database's ability to predict previously unknown proteins, this work
increased selectivity without sacrificing sensitivity.
(2000 - 2002): Supported by an IGERT Fellowship
funded by the National Science Foundation. I developed a Decision
Tree algorithm that uses alternative representations of proteins
to predict the function of novel proteins. This method has the same
accuracy as previous methods based on protein motifs,
but offers a more in-depth and unique insight into the protein
function-structure relationship. I also conducted an in-depth systematic
study of different types of protein representations, including
representations based on amino acid properties, substitution
matrices, structure, and random representations. I modified
various decision-tree algorithms to perform exact learning on distributed,
autonomous, and heterogeneous datasets.
Industrial Experience:
(2004 - 2007): Bioinformatician and Computational
Biologist. NewLink Genetics, Ames, IA.
(2002): Bioinformatician and Computational
Biologist. Pioneer Hi-Bred, Johnston, IA.
(1998 - 1999): Computer programmer. John
Deere Waterloo Works, Waterloo, IA.
Teaching Experience:
(2006) Teaching Assistant for Computational
Models of Learning. Duties included guest lecturing and assisting
students with projects.
(2000 - 2005) AI Seminar Lecturer: Gave
many presentations in the AI Seminar coordinated by Prof.
Vasant Honavar.
(2000 - 2005) Bioinformatics Seminar Lecturer:
Gave many presentations in the Bioinformatics Seminar coordinated
by Prof. Vasant Honavar and Prof. Drena Dobbs.
(1996 - 1998) Teaching Assistant: Taught
laboratory sections of the Calculus I, Calculus II, Calculus
III, and Linear Algebra courses. Duties included lecturing
and grading assignments and projects.
Research and Training Grants:
(2002 - 2004): Graduate Research Fellowships
in Bioinformatics and Computational Biology, Pioneer Hi-Bred,
Inc., $40,000.
(2000 - 2002): IGERT: Computational Molecular
Biology Training Group, National Science Foundation, $50,000.
Computer skills
- Languages: C, C++, Java, Perl, Prolog, Lisp, Scheme, COBOL,
SAS, Assembly, Python, JCL
- Operating Systems: Unix, Linux, Windows (all versions),
DOS, Mac OS X
- Databases: Oracle, DB2, SQL, IMS
- Web Design: HTML, JavaScript, ASP
Bioinformatics skills
- Software: BLAST, ClustalW, FASTA, MEME, MAST, HMMER, emotif,
RasMol, GeneSeqer
- Databases: PDB, Pfam, GO, InterPro, Swiss-Prot, MIPS, SCOP,
Prosite, KinBase, Merops
List of publications
Papers in preparation:
1. Andorf, C., Silvescu, A., Dobbs, D. and Honavar, V. (2005)
Learning Classifiers for Assigning Protein Sequences to Gene
Ontology Functional Families. To be submitted: BMC Bioinformatics
Journal.
2. Andorf, C., Silvescu, A., Dobbs, D. and Honavar, V. (2005)
Learning Classifiers for Assigning Protein Sequences to Subcellular
Localization Families. To be submitted: Bioinformatics Journal.
3. Andorf, C., Zhang, J., Silvescu, A., Dobbs, D. and Honavar,
V. (2005) Learning Classifiers for Assigning Protein Sequences
to the SCOP Hierarchical Families. To be submitted: BMC Bioinformatics
Journal.
4. Andorf, C., A., Dobbs, D. and Honavar, V. (2005) The effects
of information gain in random alphabets in classification
accuracy of protein sequences.
5. Silvescu, A., Andorf, C., Dobbs, D., Honavar, V. (2005)
Inter-element dependency models for sequence classification.
Refereed Journal Papers:
1. Andorf, C., Dobbs, D., and Honavar, V. (2005) Reduced Alphabet
Representations of Amino Acid Sequences for Protein Function
Classification. Information Sciences. In press.
Invited or Refereed Book Chapters
1. Honavar, V., Andorf, C., Caragea, D., Dobbs, D., Reinoso-Castillo,
J., Silvescu, A., and Wang, X. (2002). Invited Chapter. Algorithmic
and Systems Solutions for Computer Assisted Knowledge Acquisition
in Bioinformatics and Computational Biology. In: Computational
Biology and Genome Informatics. Wu, C., Wang, P., and Wang,
J. (Eds.) World Scientific.
2. Honavar, V., Andorf, C., Caragea, D., Silvescu, A., and
Sharma, T. (2001). Invited Chapter. Agent-Based Systems for
Data-Driven Knowledge Discovery from Distributed Data Sources:
From Specification to Implementation. In: Intelligent Agent
Software Engineering. Plekhanova, V. and Wermter, S. (Eds.).
Idea Group Publisher. In press.
Recent Refereed Conference Papers
1. Andorf, C., Silvescu, A., Dobbs, D. and Honavar, V. (2004)
Learning Classifiers for Assigning Protein Sequences to Gene
Ontology Functional Families. In: Proceedings of the Fifth
International Conference on Knowledge Based Computer Systems
(KBCS 2004), India.
2. Andorf, C., Dobbs, D., and Honavar, V. (2002). Data-Driven
Generation of Protein Function Classification Rules Based
on Sequence Motifs Discovered Using Multiple Alignments From
Reduced Alphabet Representations of Protein Sequences. Conference
on Computational Biology and Genome Informatics.
3. Silvescu, A., Reinoso-Castillo, J., Andorf, C., Honavar,
V., and Dobbs, D. (2001). Ontology-Driven Information Extraction
and Knowledge Acquisition from Heterogeneous, Distributed
Biological Data Sources. In: Proceedings of the IJCAI-2001
Workshop on Knowledge Discovery from Heterogeneous, Distributed,
Autonomous, Dynamic Data and Knowledge Sources.
Awards and Honors
- Baker Center / Pioneer Hi-Bred Bioinformatics Fellowship
(2002-2004)
- Arthur A. Collins Computer Science Scholarship, ISU (2002)
- NSF IGERT Fellowship, Bioinformatics and Computational Biology,
ISU, (2000 - 2002)
- Iowa State University Computer Science Honor Society, (2002
- Current)
- Kappa Mu Epsilon, Mathematics Honor Society, Wartburg College,
(1997 - 2000).
- Wartburg College Regents Scholar, (1996 - 2000).
- Dean's Top 40, Dean's List, Wartburg College, (1997 - 2000).
- Chellevold Mathematics Scholarship, Wartburg College, (1998
- 1999).
- Mathematics Field Day Scholarship, Wartburg College, (1996
- 1997).
- State of Iowa Scholar, Wartburg College, (1996).
- Academic All-American in Wrestling, Wartburg College (1997
- 2000).
References:
Available upon request.