Efficient access methods for very large distributed graph databases

dc.contributor.affiliation: Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías da Información
dc.contributor.affiliation: Universidade de Santiago de Compostela. Departamento de Electrónica e Computación
dc.contributor.area: Área de Enxeñaría e Arquitectura
dc.contributor.author: Luaces Cachaza, David
dc.contributor.author: Ríos Viqueira, José Ramón
dc.contributor.author: Cotos Yáñez, José Manuel
dc.contributor.author: Flores González, Julián Carlos
dc.date.accessioned: 2021-06-24T08:42:02Z
dc.date.available: 2021-06-24T08:42:02Z
dc.date.issued: 2021
dc.description.abstract: Subgraph searching is an essential problem in graph databases, but it is also challenging because it involves the NP-complete subgraph isomorphism sub-problem. Filter-Then-Verify (FTV) methods mitigate the performance overhead by using an index to prune graphs that cannot match the query in a filtering stage, which reduces the number of subgraph isomorphism evaluations in the subsequent verification stage. In real applications such as molecular substructure searching, subgraph searching has to be applied to very large databases (tens of millions of graphs). Previous surveys have identified the FTV solutions GraphGrepSX (GGSX) and CT-Index as the best ones for large databases (thousands of graphs); however, they cannot reach reasonable performance on very large ones (tens of millions of graphs). This paper proposes a generic approach for the distributed implementation of FTV solutions. In addition, three previous methods that improve the performance of GGSX and CT-Index are adapted to run in clusters. The evaluation shows that the resulting solutions provide a large performance improvement (a 70% to 90% reduction in filtering time) in a centralized configuration, and that they can be used to achieve efficient subgraph searching over very large databases in cluster configurations.
dc.description.peerreviewed: Yes
dc.description.sponsorship: This work has been co-funded by the Ministerio de Economía y Competitividad of the Spanish government, and by Mestrelab Research S.L. through the project NEXTCHROM (RTC-2015-3812-2) of the call Retos-Colaboración of the program Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad. The authors wish to thank the financial support provided by Xunta de Galicia under the Project ED431B 2018/28
dc.identifier.citation: Information Sciences, 573 (2021), 65-81. https://doi.org/10.1016/j.ins.2021.05.047
dc.identifier.doi: 10.1016/j.ins.2021.05.047
dc.identifier.issn: 0020-0255
dc.identifier.uri: http://hdl.handle.net/10347/26514
dc.language.iso: eng
dc.publisher: Elsevier
dc.relation.publisherversion: https://doi.org/10.1016/j.ins.2021.05.047
dc.rights: © 2021 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Graph databases
dc.subject: Subgraph search
dc.subject: Graph query processing
dc.subject: Graph indexing
dc.subject: Subgraph isomorphism
dc.subject: Large scale processing
dc.title: Efficient access methods for very large distributed graph databases
dc.type: journal article
dc.type.hasVersion: VoR
dspace.entity.type: Publication
relation.isAuthorOfPublication: 61678fc8-bbf4-4466-8736-0d433fbaba1e
relation.isAuthorOfPublication: df8d5480-a8c8-43ec-8e3b-cf5a939ad831
relation.isAuthorOfPublication: 12b92448-8b3c-499c-9e5c-6b8e82695771
relation.isAuthorOfPublication.latestForDiscovery: 61678fc8-bbf4-4466-8736-0d433fbaba1e
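
The Filter-Then-Verify scheme summarized in the abstract can be illustrated with a small sketch. This is not the GGSX or CT-Index implementation, nor the distributed approach proposed in the paper. It is a toy, single-machine example that assumes undirected, vertex-labelled graphs without self-loops, uses sets of label paths as index features, and verifies candidates with naive backtracking. All names (make_graph, path_features, is_subgraph, build_index, search) are hypothetical and chosen only for illustration.

```python
def make_graph(labels, edge_list):
    """labels: dict vertex -> label; edges stored as a set of frozensets (no self-loops assumed)."""
    return labels, {frozenset(e) for e in edge_list}

def path_features(graph, max_len=2):
    """Label sequences of all simple paths with at most max_len edges (assumed feature type)."""
    labels, edges = graph
    adj = {v: set() for v in labels}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    feats = set()

    def walk(path):
        feats.add(tuple(labels[v] for v in path))
        if len(path) - 1 < max_len:
            for n in adj[path[-1]]:
                if n not in path:
                    walk(path + [n])

    for v in labels:
        walk([v])
    return feats

def is_subgraph(query, target):
    """Verification stage: naive backtracking test for a label-preserving, injective edge embedding."""
    q_labels, q_edges = query
    t_labels, t_edges = target
    q_nodes = list(q_labels)

    def extend(mapping):
        if len(mapping) == len(q_nodes):
            return True
        u = q_nodes[len(mapping)]
        for v in t_labels:
            if v in mapping.values() or t_labels[v] != q_labels[u]:
                continue
            # Every query edge from u to an already-mapped vertex must exist in the target.
            if all(frozenset((v, mapping[w])) in t_edges
                   for w in mapping if frozenset((u, w)) in q_edges):
                mapping[u] = v
                if extend(mapping):
                    return True
                del mapping[u]
        return False

    return extend({})

def build_index(database, max_len=2):
    """Index: graph id -> set of path features, computed once per database graph."""
    return {gid: path_features(g, max_len) for gid, g in database.items()}

def search(query, database, index, max_len=2):
    q_feats = path_features(query, max_len)
    # Filtering stage: keep only graphs that contain every feature of the query.
    candidates = [gid for gid, feats in index.items() if q_feats <= feats]
    # Verification stage: run the expensive test on the surviving candidates only.
    answers = [gid for gid in candidates if is_subgraph(query, database[gid])]
    return candidates, answers

database = {
    "chain": make_graph({0: "C", 1: "C", 2: "C"}, [(0, 1), (1, 2)]),
    "ring": make_graph({0: "C", 1: "C", 2: "C", 3: "O"}, [(0, 1), (1, 2), (0, 2), (2, 3)]),
    "amine": make_graph({0: "C", 1: "N"}, [(0, 1)]),
}
index = build_index(database)
query = make_graph({0: "C", 1: "C", 2: "C"}, [(0, 1), (1, 2), (0, 2)])  # a three-carbon ring
candidates, answers = search(query, database, index)
print(candidates, answers)
```

In this toy database the filter discards "amine" without any isomorphism test, while "chain" survives filtering and is rejected only during verification. The filter never drops a true answer, because every simple path of the query maps to a simple path with the same labels in any graph that contains the query.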

Files

Original bundle

Name: 2021_is_luaces_efficient.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format