Qi Li Receives NSF CAREER Award

Qi Li, an assistant professor of Computer Science at Iowa State University, has received a National Science Foundation Faculty Early Career Development (CAREER) Award in recognition of her research on information extraction from scientific documents. The award goes to promising early-career faculty members who demonstrate the potential to become academic role models in both research and education.

The award provides $499,948 over five years for Li's project, "Achieving Quality Information Extraction from Scientific Documents with Heterogeneous Weak Supervisions."

“The research in this project is built on the foundations of data mining and machine learning,” says Li. “We seek to integrate heterogeneous supervision and incorporate the domain experts' feedback. This view will enable the use of our contribution outside information extraction tasks such as many other tasks in weakly supervised learning settings, such as cyber-attack detection, material characteristic prediction, and gene function prediction.”

Forward-thinking research in information extraction

Automated information extraction systems use natural language processing to extract structured data from machine-readable documents and other electronic sources. The growth of the World Wide Web has increased the demand for such systems, which help people navigate vast amounts of online data. However, the cost and time required to develop these systems have led to a digital divide.

Li's research aims to close this divide for researchers. With the CAREER grant, Li will develop a flexible and adaptable information extraction framework that learns from existing resources, eliminating the need for expensive and time-consuming expert annotations. The framework also aims to bridge the performance gap in real-world applications and to address concerns about extraction quality and the distinctive requirements of information extraction tasks in scientific literature.

“The [research] tackles a variety of problems drawn from different information extraction settings, which will lead to new principles, methods, and technologies for machine learning, data mining, and natural language processing,” says Li. “The information extraction results will benefit many domains, specifically life science domains such as biomedicine, animal science, and agronomy, all of which involve the processing of massive unlabeled textual data. The project will speed up literature understanding and the curation process and promote new scientific discoveries.”

Promoting inclusivity in computing

According to the National Science Foundation (NSF), the effort to promote equity in information extraction systems aligns with crucial areas of national research priority. The research seeks to make computing more inclusive through outreach initiatives and new systems that augment human performance while advancing data infrastructure.

As part of this project, Li will support multiple graduate and undergraduate research assistants. She is particularly dedicated to increasing the participation of individuals from underrepresented groups in science, technology, engineering, and mathematics (STEM).

In collaboration with Science Bound at Iowa State, Li will lead outreach programs focused on information extraction systems and deliver educational materials to students from K-12 through the undergraduate and graduate levels. She intends to apply her information extraction framework to STEM textbooks, enabling the creation of educational materials for local K-12 schools and community outreach programs. She will also organize an annual workshop at the Midwest Big Data Summer School hosted by Iowa State University.