Poster in Workshop: AI for New Drug Modalities
Leveraging Disease-Specific Topologies and Counterfactual Relationships in Knowledge Graphs for Inductive Reasoning in Drug Repurposing
Cerag Oguztuzun · Zhenxiang Gao · Hui Li · Rong Xu
Drug repurposing offers a cost-effective strategy to accelerate drug development by identifying new therapeutic uses for approved medications. However, it poses significant challenges for complex diseases with poorly understood mechanisms of action. Addressing these diseases requires the efficient integration of new data while minimizing retraining time, prompting us to develop \textbf{domain-specific graph augmentation techniques that support semi-inductive reasoning.} We found that leveraging counterfactual relationships derived from disease-specific topological structures significantly enhances model performance. Based on this insight, we integrated counterfactual relationships both as an augmentation method and as an initialization step in our knowledge graph (KG) link prediction training process. We introduce \textbf{KGïA}, an \textbf{i}nductive \textbf{KG} \textbf{a}ugmentation method that utilizes counterfactual relationships based on disease-specific topologies. By aligning augmentation with the intrinsic topological features of disease entities, KGïA enhances the KG in a domain-specific manner, facilitating the discovery of a broader range of novel drug candidates tailored to specific diseases. Our biomedical KG comprises 1,614,801 triples and 100,563 biomedical entities, including 30,006 diseases, constructed from \textbf{6 biomedical datasets} and enriched through natural language processing (NLP) relation extraction. Extensive experiments on this comprehensive KG using \textbf{5 augmented architectures} demonstrate that semi-inductive reasoning significantly improves generalizability (up to a \textbf{24× increase in Mean Reciprocal Rank (MRR)}) and that augmented models outperform state-of-the-art KG-based drug repurposing methods (up to a \textbf{32\% improvement in MRR}).
Additionally, in an Alzheimer's disease (AD) case study, our model identified drug candidates spanning up to \textbf{5 mechanism categories}, compared to \textbf{2 for the baseline}, highlighting its enhanced capability to uncover diverse drug candidates.
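For readers unfamiliar with the metric reported above, the following is a minimal sketch of how Mean Reciprocal Rank is commonly computed for KG link prediction. It uses the standard textbook definition and is not the authors' evaluation code; the function names, the tie-handling rule (ties resolved in favor of the true entity), and the example scores are assumptions made for illustration only.

```python
from typing import Sequence


def rank_of_true_entity(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-indexed rank of the true candidate among all candidates.

    Assumption: higher score means more plausible, and ties are resolved
    optimistically (only strictly higher scores push the true entity down).
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test queries (ranks are 1-indexed)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-indexed and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical scores for three candidate drugs for one disease query;
# the true drug is at index 2 and one candidate scores higher than it.
example_rank = rank_of_true_entity([0.1, 0.9, 0.5], true_index=2)

# Hypothetical ranks of the true drug across three test queries.
example_mrr = mean_reciprocal_rank([1, 2, 4])
```

MRR lies in (0, 1], with 1 meaning the true entity is always ranked first. A multiplicative gain such as the reported 24× increase therefore implies a low starting MRR, since the improved value cannot exceed 1.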