Ontology matching with Large Language Models and prioritized depth-first search

Taboada Iglesias, María Jesús; Martínez Hernández, Diego; Arideh, Mohammed; Mosquera Losada, María Rosa

doi:10.1016/j.inffus.2025.103254

dc.contributor.affiliation	Universidade de Santiago de Compostela. Departamento de Electrónica e Computación
dc.contributor.affiliation	Universidade de Santiago de Compostela. Departamento de Física Aplicada
dc.contributor.affiliation	Universidade de Santiago de Compostela. Departamento de Bioloxía Molecular
dc.contributor.author	Taboada Iglesias, María Jesús
dc.contributor.author	Martínez Hernández, Diego
dc.contributor.author	Arideh, Mohammed
dc.contributor.author	Mosquera Losada, María Rosa
dc.date.accessioned	2025-11-04T11:58:33Z
dc.date.available	2025-11-04T11:58:33Z
dc.date.issued	2025-05-07
dc.description.abstract	Ontology matching (OM) plays a key role in enabling data interoperability and knowledge sharing. Recently, methods based on Large Language Models (LLMs) have shown great promise in OM, particularly through the use of a retrieve-then-prompt pipeline. In this approach, relevant target entities are first retrieved and then used to prompt the LLM to predict the final matches. Despite their potential, these systems still present limited performance and high computational overhead. To address these issues, we introduce MILA, a novel approach that embeds a retrieve-identify-prompt pipeline within a prioritized depth-first search (PDFS) strategy. This approach efficiently identifies a large number of semantic correspondences with high accuracy, limiting LLM requests to only the most borderline cases. We evaluated MILA using three challenges from the 2024 edition of the Ontology Alignment Evaluation Initiative. Our method achieved the highest F-Measure in five of seven unsupervised tasks, outperforming state-of-the-art OM systems by up to 17%. It also performed better than or comparable to the leading supervised OM systems. MILA further exhibited task-agnostic performance, remaining stable across all tasks and settings, while significantly reducing runtime. These findings highlight that high-performance LLM-based OM can be achieved through a combination of programmed (PDFS), learned (embedding vectors), and prompting-based heuristics, without the need for domain-specific heuristics or fine-tuning.
dc.description.peerreviewed	Yes
dc.description.sponsorship	The authors appreciate the support and training from the University of Santiago de Compostela, as well as the support from the projects AF4EU (101086563) and SUS-SOIL (101157560), funded by the European Union’s Horizon Europe program, and the project Camelia (TSI-100932-2023-3), funded by the Ministerio de Transformación Digital Función Pública (Secretaría de Estado de Digitalización e Inteligencia Artificial), and the European Union (NextGeneration EU-fund). The authors also express their gratitude to Diego Martinez-Taboada for insightful conversations.
dc.identifier.citation	Taboada, M., Martinez, D., Arideh, M., & Mosquera, R. (2025). Ontology matching with Large Language Models and prioritized depth-first search. Information Fusion, 123, 103254. 10.1016/j.inffus.2025.103254
dc.identifier.doi	10.1016/j.inffus.2025.103254
dc.identifier.essn	1872-6305
dc.identifier.issn	1566-2535
dc.identifier.uri	https://hdl.handle.net/10347/43542
dc.issue.number	103254
dc.journal.title	Information Fusion
dc.language.iso	eng
dc.page.final	15
dc.page.initial	1
dc.publisher	Elsevier
dc.relation.projectID	info:eu-repo/grantAgreement/EC/HE/101086563
dc.relation.publisherversion	https://doi.org/10.1016/j.inffus.2025.103254
dc.rights	© 2025 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license
dc.rights.accessRights	open access
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.subject	Ontology matching
dc.subject	Retrieval augmented generation
dc.subject	Greedy search
dc.subject	Large Language Models
dc.subject	Zero-shot setting
dc.title	Ontology matching with Large Language Models and prioritized depth-first search
dc.type	journal article
dc.type.hasVersion	VoR
dc.volume.number	123
dspace.entity.type	Publication
relation.isAuthorOfPublication	371f5af9-f195-4e8e-86f8-56aa72d46c77
relation.isAuthorOfPublication	c0bf8f71-b820-4d0c-a7ec-aafaa23611a2
relation.isAuthorOfPublication	04d61864-9f32-4825-ac8d-7e9e8ef2aa96
relation.isAuthorOfPublication.latestForDiscovery	371f5af9-f195-4e8e-86f8-56aa72d46c77

Files

Original bundle

Name: 2025_inffus_taboada_ontology.pdf
Size: 2.8 MB
Format: Adobe Portable Document Format

Collections

Electrónica e Computación
Bioloxía Funcional
Física Aplicada