Large language models such as ChatGPT are becoming more relevant in the workforce as a tool to increase work efficiency. In the several months since becoming publicly accessible, ChatGPT has become the hot tool in corporate America. It can be used to automate various writing tasks such as social media posts, emails, and even code. By helping users quickly get the information they need, ChatGPT has so far revolutionized how work gets done in the office.
However, researchers have found that several accessibility issues make it a less-than-ideal tool for the average academic researcher or small-business owner. Commercial language models like ChatGPT cannot be locally deployed or used for sensitive, private data. Additionally, training a language model that can be locally deployed or used for confidential information costs millions of dollars. This creates serious accessibility issues, akin to the digital divide, which Iowa State researchers are working to solve.
Qi Li, Assistant Professor of Computer Science, is working on an approach to automatically find high-quality prompts using moderately sized pre-trained language models, such as BERT, that can be locally deployed and used for sensitive, private data. Li wants to use her research to tackle the accessibility gap, allowing both academic researchers and practitioners in small businesses to obtain high-quality annotations at minimal cost and without additional privacy concerns, thus democratizing language models.
A recent paper, “Open-Domain Aspect-Opinion Co-Mining with Double-Layer Span Extraction,” written alongside Iowa State University Computer Science Ph.D. students Mohna Chakraborty and Adithya Kulkarni, was published at the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. This paper focused on open-domain aspect-opinion co-mining tasks in review analysis. Instead of requiring large domain-specific annotations, it proposes simple universal templates to initiate the annotation. The research team also designed a novel double-layer architecture to tackle the noise in weak supervision.
A follow-up paper, “Zero-shot Approach to Overcome Perturbation Sensitivity of Prompts,” was recently accepted to the 61st Annual Meeting of the Association for Computational Linguistics, the top-tier international conference on this topic. This paper expands the research to focus on sentiment classification tasks for review analysis. The study aims to generate high-quality prompts using locally deployable, moderately sized language models in a zero-shot setting, where no annotation is needed.
With the Internet allowing access to large volumes of unlabeled data, Li hopes to use her research to help label the data at minimal cost. By reducing training costs, she aims to provide accessibility to all, proposing methods that can be used day to day without expensive hardware. In particular, Li and her team want to tackle challenges in resource-scarce domains that are currently being neglected by language model research.