PhD Final Oral Exam: Ying Wei

Jan 13, 2026 - 3:00 PM

Consistency-Aware and LLM-Assisted Methods for Named Entity Recognition

Named Entity Recognition (NER) is a fundamental task in natural language processing and underpins many downstream applications, including biomedical text mining and clinical information extraction. Despite advances in neural models, existing NER systems often suffer from two key limitations: inconsistent predictions across documents and heavy reliance on large amounts of high-quality labeled data, which are costly and difficult to obtain in specialized domains. This thesis addresses these challenges through consistency-aware modeling and the structured integration of large language models (LLMs). First, we propose ScdNER, a span-based, document-level NER framework that improves prediction consistency by performing contextual feature fusion at the entity span level rather than the token level. A two-stage prediction strategy enables selective sharing of global contextual information while reducing noise from non-entity spans, leading to more accurate and consistent document-level predictions. Building on this foundation, we investigate how LLMs can be used to enhance BERT-based NER models in data-scarce biomedical and clinical settings. We introduce structured LLM-based augmentation and auxiliary annotation strategies that generate diverse and semantically faithful training data to complement human-labeled datasets. Experimental results show that BERT models can effectively learn from LLM-generated supervision, even when annotations are noisy. Finally, we explore fine-tuning LLMs with supervised learning and reinforcement learning to generate fully synthetic annotated datasets, further reducing dependence on expert annotation. Together, this work demonstrates that combining consistency-aware modeling with LLM-enhanced supervision provides an effective and practical approach for improving robustness, data efficiency, and scalability in biomedical and clinical NER systems.

Committee: Qi Li (major professor), Mengdi Huai, Forrest Bao, Wei Le, and Ali Jannesari