PhD Final Oral Exam: Qiao Qiao

Nov 20, 2025 - 1:00 PM

Integrating Structural and Semantic Understanding for Robust Knowledge Graph Construction: From Knowledge Graph Completion to Zero-Shot Entity Linking

Knowledge Graphs (KGs) have become an essential foundation for representing structured knowledge and supporting reasoning in artificial intelligence. However, real-world KGs are inevitably incomplete and semantically inconsistent, limiting their ability to provide reliable knowledge for downstream applications. This thesis addresses these challenges by advancing both Knowledge Graph Completion (KGC) and Entity Linking (EL)—two fundamental yet interdependent tasks that underpin the construction of accurate and self-evolving KGs.

The first part of this research focuses on the Few-shot Knowledge Graph Completion (FKGC) problem, where models must predict missing facts for relations with only a handful of reference examples. Existing methods often suffer from suboptimal negative sampling and static entity representations. To overcome these limitations, we propose RANA (Relation-Aware Network with Attention-Based Loss), which strategically selects relevant negative samples and introduces an attention-based loss to emphasize more informative contrasts. A dynamic relation-aware entity encoder is further designed to generate context-dependent entity representations. Extensive experiments demonstrate that RANA significantly outperforms state-of-the-art FKGC models on multiple benchmark datasets.
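
To make the idea of an attention-based loss over negative samples concrete, the sketch below weights each negative triple's hinge term by a softmax over the negatives' scores, so harder (higher-scoring) negatives dominate the loss. This is an illustrative assumption for exposition, not RANA's exact formulation; the function names and margin form are invented here.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_weighted_margin_loss(pos_score, neg_scores, margin=1.0):
    """Margin loss where each negative's hinge term is weighted by a
    softmax attention over the negative scores, emphasizing harder
    (higher-scoring) negatives. Illustrative only, not RANA's loss."""
    neg_scores = np.asarray(neg_scores, dtype=float)
    weights = softmax(neg_scores)                   # harder negatives -> larger weight
    hinge = np.maximum(0.0, margin - pos_score + neg_scores)  # per-negative margin violation
    return float(np.dot(weights, hinge))
```

With a confidently scored positive (e.g. `pos_score=10.0`) every hinge term is zero and the loss vanishes; with a weaker positive, the loss is dominated by the most competitive negative rather than averaged uniformly.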

Building on this foundation, the second part of the thesis investigates how to integrate structural and semantic knowledge for general Knowledge Graph Completion. Most prior methods rely solely on either KG embeddings or pre-trained language models (PLMs), resulting in incomplete representations. To bridge this gap, we develop Bridge, a unified framework that jointly encodes entities and relations through PLMs and structural representation learning. Bridge introduces a self-supervised fine-tuning strategy inspired by BYOL, constructing semantically consistent “views” of triples without altering their meaning. This alignment enables effective fusion of textual semantics and graph structure, yielding state-of-the-art results across multiple KGC benchmarks.
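
The BYOL-inspired idea of aligning semantically consistent "views" of a triple can be sketched as follows: build two textual renderings of the same triple, embed each, and penalize their disagreement with a cosine-based loss that is zero when the two embeddings agree. The templates and function names here are assumptions for illustration, not Bridge's actual prompts or training objective.

```python
import numpy as np

def triple_views(head, relation, tail):
    """Two textual 'views' of the same triple that preserve its meaning.
    The templates are illustrative, not Bridge's actual constructions."""
    v1 = f"{head} {relation} {tail}"
    v2 = f"The {relation} of {head} is {tail}."
    return v1, v2

def consistency_loss(z1, z2):
    """BYOL-style alignment loss: 2 - 2*cos(z1, z2).
    Zero when the two view embeddings point the same way."""
    z1 = z1 / np.linalg.norm(z1)
    z2 = z2 / np.linalg.norm(z2)
    return float(2.0 - 2.0 * np.dot(z1, z2))
```

In a full system the two views would be encoded by the PLM and the structural encoder; the key property shown here is that the loss vanishes exactly when the representations of the two meaning-preserving views coincide.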

The final part of the thesis extends from KGC to Entity Linking, focusing on zero-shot EL in the biomedical domain—a setting characterized by lexical divergence and annotation scarcity. We propose a cost-aware hybrid framework that leverages large language models (LLMs) to synthesize semantically faithful, entity-centric variants, which are then used to fine-tune compact retriever–reranker models. This design reduces reliance on expert annotation while maintaining computational efficiency. Experiments show that the proposed framework achieves robust zero-shot generalization and improved performance under lexical variation, surpassing existing biomedical EL systems.
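
The retriever–reranker pattern can be sketched in miniature: a cheap first stage recalls a shortlist of candidate entities, and a finer second stage scores only that shortlist. The token-overlap scoring below is a deliberately simple stand-in (an assumption for illustration) for the dense retriever and cross-encoder reranker the thesis fine-tunes.

```python
def retrieve(mention, candidates, k=2):
    """Stage 1 (recall): rank candidate entity names by token overlap
    with the mention and keep the top-k. A toy stand-in for a
    bi-encoder dense retriever."""
    m = set(mention.lower().split())
    ranked = sorted(candidates, key=lambda c: -len(m & set(c.lower().split())))
    return ranked[:k]

def rerank(mention, shortlist):
    """Stage 2 (precision): a finer score over the small shortlist.
    Here a Jaccard ratio; in practice a cross-encoder."""
    m = set(mention.lower().split())
    def score(c):
        t = set(c.lower().split())
        return len(m & t) / max(len(m | t), 1)
    return max(shortlist, key=score)
```

The design point is cost: the expensive scorer only ever sees the shortlist, which is what keeps the hybrid framework computationally efficient at inference time.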

Together, these three studies advance the goal of constructing self-improving knowledge graphs, where completion enriches relational knowledge and linking ensures semantic accuracy. By tackling the challenges of few-shot learning, cross-modal representation, and zero-shot generalization, this thesis contributes to the development of scalable, data-efficient, and semantically grounded methods for intelligent knowledge graph construction and evolution.

Committee: Qi Li (major professor), Hongyang Gao, Mengdi Huai, Ying Cai, and Wensheng Zhang