Cornelia Caragea
Research:
- Machine Learning: My main research interest is
in machine learning, especially learning probabilistic graphical models,
relational learning, and learning from horizontally and
vertically distributed data sources.
In my work, I have applied the general strategy proposed in our lab to
design an efficient algorithm for learning Support Vector Machine (SVM)
classifiers from large, horizontally fragmented, distributed data sets. To do
that, I identified sufficient statistics for the SVM algorithm and
showed how to compute them. I compared the resulting algorithm with its
batch counterpart on standard criteria: efficiency, quality, memory, and
communication required. I also proved
the convergence of this algorithm. I intend to generalize the work in our
lab on learning classifiers from horizontally and vertically distributed
data sources to the more general case in which the data are relationally
fragmented. I will focus on learning probabilistic models from relational data
sources and on applying the resulting algorithms to problems that arise in
computational biology or to Web data that is, or can be, structured in
relational tables.
- Bioinformatics and Computational Biology: I am
also interested in bioinformatics and computational biology.
In my work, I have applied various machine learning algorithms to
knowledge acquisition tasks that arise in computational biology (e.g.,
protein function prediction and prediction of post-translational modifications).
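
As a rough illustration of the sufficient-statistics strategy mentioned under Machine Learning, the sketch below shows the idea for horizontally fragmented data. It is not the SVM algorithm described above: the text does not spell out the SVM statistics, so the example assumes a much simpler learner (a naive Bayes classifier over categorical features), whose sufficient statistics are plain counts. The function names and the two toy sites are hypothetical. Each site summarizes its own rows, only the summaries are combined, and the combined result matches what a batch learner would compute on the pooled data.

```python
from collections import Counter

def local_counts(rows):
    """Summarize one site's rows as counts (assumed naive Bayes statistics)."""
    class_counts = Counter()
    feature_counts = Counter()
    for features, label in rows:
        class_counts[label] += 1
        for i, value in enumerate(features):
            # Count how often feature i takes this value within this class.
            feature_counts[(i, value, label)] += 1
    return class_counts, feature_counts

def merge_counts(per_site_stats):
    """Combine per-site summaries by addition; no raw rows are exchanged."""
    total_class, total_feature = Counter(), Counter()
    for class_counts, feature_counts in per_site_stats:
        total_class.update(class_counts)      # Counter.update adds counts
        total_feature.update(feature_counts)
    return total_class, total_feature

# Hypothetical horizontal fragments: same attributes, different rows.
site_a = [(("sunny", "hot"), "no"), (("rainy", "mild"), "yes")]
site_b = [(("sunny", "mild"), "yes"), (("rainy", "hot"), "no")]

distributed = merge_counts([local_counts(site_a), local_counts(site_b)])
centralized = local_counts(site_a + site_b)

# The merged statistics are identical to those of the pooled (batch) data.
exact = distributed == centralized
```

Because the counts simply add across sites, the communication cost depends on the size of the summaries rather than on the number of rows held at each site, which is the kind of trade-off the comparison criteria above (memory, communication) are meant to capture.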