Efficient edge filtering of directly-follows graphs for process mining

dc.contributor.affiliationUniversidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías da Informacióngl
dc.contributor.affiliationUniversidade de Santiago de Compostela. Departamento de Electrónica e Computacióngl
dc.contributor.authorChapela Campa, David
dc.contributor.authorDumas, Marlon
dc.contributor.authorMucientes Molina, Manuel
dc.contributor.authorLama Penín, Manuel
dc.date.accessioned2022-11-25T10:14:53Z
dc.date.available2022-11-25T10:14:53Z
dc.date.issued2022
dc.description.abstractAutomated process discovery is a process mining operation that takes as input an event log of a business process and generates a diagrammatic representation of the process. In this setting, a common diagrammatic representation generated by commercial tools is the directly-follows graph (DFG). In some real-life scenarios, the DFG of an event log contains hundreds of edges, hindering its understandability. To overcome this shortcoming, process mining tools generally offer the possibility of filtering the edges in the DFG. We study the problem of efficiently filtering the DFG extracted from an event log while retaining the most frequent relations. We formalize this problem as an optimization problem, specifically, the problem of finding a sound spanning subgraph of a DFG with a minimal number of edges and a maximal sum of edge frequencies. We show that this problem is an instance of an NP-hard problem and outline several polynomial-time heuristics to compute approximate solutions. Finally, we report on an evaluation of the efficiency and optimality of the proposed heuristics using 13 real-life event logsgl
dc.description.peerreviewedSIgl
dc.description.sponsorshipWe thank Luciano García-Baíuelos for proposing the idea of combining the results of Chu-Liu-Edmonds’ algorithm to filter a DFG. We also thank Adriano Augusto for providing us with the implementation of the Split Miner filtering technique. This research was funded by the Spanish Ministry of Economy and Competitiveness (TIN2017-84796-C2-1-R) and the Galician Ministry of Education, Culture and Universities (ED431G/08). These grants are co-funded by the European Regional Development Fund (ERDF/FEDER program). D. Chapela-Campa is supported by the Spanish Ministry of Education, under the FPU national plan (FPU16/04428 and EST19/00135). This research is also funded by the Estonian Research Council (grant PRG1226)gl
dc.identifier.citationInformation Sciences 610 (2022). https://doi.org/10.1016/j.ins.2022.07.170gl
dc.identifier.doi10.1016/j.ins.2022.07.170
dc.identifier.essn0020-0255
dc.identifier.urihttp://hdl.handle.net/10347/29469
dc.language.isoenggl
dc.publisherElseviergl
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/TIN2017-84796-C2-1-R/ES/APORTANDO INTELIGENCIA A LOS PROCESOS DE NEGOCIO MEDIANTE SOFT COMPUTING EN ESCENARIOS DE DATOS MASIVOSgl
dc.relation.publisherversionhttps://doi.org/10.1016/j.ins.2022.07.170gl
dc.rights©2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)gl
dc.rightsAtribución 4.0 Internacional
dc.rights.accessRightsopen accessgl
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/
dc.subjectProcess mininggl
dc.subjectAutomated process discoverygl
dc.subjectDirectly-follows graphgl
dc.subjectEdge filteringgl
dc.titleEfficient edge filtering of directly-follows graphs for process mininggl
dc.typejournal articlegl
dc.type.hasVersionVoRgl
dspace.entity.typePublication
relation.isAuthorOfPublication21112b72-72a3-4a96-bda4-065e7e2bb262
relation.isAuthorOfPublication208dae76-e3a1-4dee-8254-35177f75e17c
relation.isAuthorOfPublication.latestForDiscovery21112b72-72a3-4a96-bda4-065e7e2bb262

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
2022_infsci_chapela_efficient.pdf
Size:
2.01 MB
Format:
Adobe Portable Document Format
Description:
Artigo de investigación