Efficient access methods for very large distributed graph databases

Luaces Cachaza, David; Ríos Viqueira, José Ramón; Cotos Yáñez, José Manuel; Flores González, Julián Carlos

doi:10.1016/j.ins.2021.05.047

dc.contributor.affiliation	Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías da Información	gl
dc.contributor.affiliation	Universidade de Santiago de Compostela. Departamento de Electrónica e Computación	gl
dc.contributor.area	Área de Enxeñaría e Arquitectura
dc.contributor.author	Luaces Cachaza, David
dc.contributor.author	Ríos Viqueira, José Ramón
dc.contributor.author	Cotos Yáñez, José Manuel
dc.contributor.author	Flores González, Julián Carlos
dc.date.accessioned	2021-06-24T08:42:02Z
dc.date.available	2021-06-24T08:42:02Z
dc.date.issued	2021
dc.description.abstract	Subgraph searching is an essential problem in graph databases, but it is also challenging because it involves subgraph isomorphism, an NP-complete sub-problem. Filter-Then-Verify (FTV) methods mitigate the performance overhead by using an index to prune graphs that cannot match the query in a filtering stage, which reduces the number of subgraph isomorphism evaluations in the subsequent verification stage. In real applications such as molecular substructure searching, subgraph searching has to be applied to very large databases (tens of millions of graphs). Previous surveys have identified the FTV solutions GraphGrepSX (GGSX) and CT-Index as the best ones for large databases (thousands of graphs); however, they cannot reach reasonable performance on very large ones (tens of millions of graphs). This paper proposes a generic approach for the distributed implementation of FTV solutions. In addition, three previous methods that improve the performance of GGSX and CT-Index are adapted to be executed in clusters. The evaluation shows that the resulting solutions provide a large performance improvement (a 70% to 90% reduction in filtering time) in a centralized configuration and that they may be used to achieve efficient subgraph searching over very large databases in cluster configurations.	gl
dc.description.peerreviewed	SI	gl
dc.description.sponsorship	This work has been co-funded by the Ministerio de Economía y Competitividad of the Spanish government and by Mestrelab Research S.L. through the project NEXTCHROM (RTC-2015-3812-2), under the Retos-Colaboración call of the Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad. The authors wish to thank Xunta de Galicia for the financial support provided under the Project ED431B 2018/28.	gl
dc.identifier.citation	Information Sciences, 573 (2021), 65-81. https://doi.org/10.1016/j.ins.2021.05.047	gl
dc.identifier.doi	10.1016/j.ins.2021.05.047
dc.identifier.issn	0020-0255
dc.identifier.uri	http://hdl.handle.net/10347/26514
dc.language.iso	eng	gl
dc.publisher	Elsevier	gl
dc.relation.publisherversion	https://doi.org/10.1016/j.ins.2021.05.047	gl
dc.rights	© 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)	gl
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 Internacional
dc.rights.accessRights	open access	gl
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Graph databases	gl
dc.subject	Subgraph search	gl
dc.subject	Graph query processing	gl
dc.subject	Graph indexing	gl
dc.subject	Subgraph isomorphism	gl
dc.subject	Large scale processing	gl
dc.title	Efficient access methods for very large distributed graph databases	gl
dc.type	journal article	gl
dc.type.hasVersion	VoR	gl
dspace.entity.type	Publication
relation.isAuthorOfPublication	61678fc8-bbf4-4466-8736-0d433fbaba1e
relation.isAuthorOfPublication	df8d5480-a8c8-43ec-8e3b-cf5a939ad831
relation.isAuthorOfPublication	12b92448-8b3c-499c-9e5c-6b8e82695771
relation.isAuthorOfPublication.latestForDiscovery	61678fc8-bbf4-4466-8736-0d433fbaba1e

Files

Original bundle

Name: 2021_is_luaces_efficient.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format

Collections

Electrónica e Computación
Centro de Investigación en Tecnoloxías Intelixentes da USC (CiTIUS)