Title: Efficient edge filtering of directly-follows graphs for process mining
Authors: Chapela Campa, David; Dumas, Marlon; Mucientes Molina, Manuel; Lama Penín, Manuel
Dates: 2022-11-25; 2022-11-25; 2022
Citation: Information Sciences 610 (2022). https://doi.org/10.1016/j.ins.2022.07.170
URI: http://hdl.handle.net/10347/29469
DOI: 10.1016/j.ins.2022.07.170
ISSN: 0020-0255
Type: journal article
Language: eng
Keywords: Process mining; Automated process discovery; Directly-follows graph; Edge filtering
Rights: ©2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
License: Attribution 4.0 International (http://creativecommons.org/licenses/by/4.0/)
Access: open access

Abstract: Automated process discovery is a process mining operation that takes as input an event log of a business process and generates a diagrammatic representation of the process. A common diagrammatic representation generated by commercial tools is the directly-follows graph (DFG). In some real-life scenarios, the DFG of an event log contains hundreds of edges, which hinders its understandability. To overcome this shortcoming, process mining tools generally offer the possibility of filtering the edges in the DFG. We study the problem of efficiently filtering the DFG extracted from an event log while retaining the most frequent relations. We formalize this as an optimization problem: finding a sound spanning subgraph of a DFG with a minimal number of edges and a maximal sum of edge frequencies. We show that this problem is an instance of an NP-hard problem and outline several polynomial-time heuristics to compute approximate solutions. Finally, we report on an evaluation of the efficiency and optimality of the proposed heuristics using 13 real-life event logs.
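
To make the problem stated in the abstract concrete, the following is a minimal Python sketch. It is not one of the heuristics proposed in the article. It assumes an event log given as a list of traces (lists of activity names), adds hypothetical artificial source and sink nodes named `<start>` and `<end>`, and treats a subgraph as sound when every node is reachable from the source and can reach the sink. The function names `build_dfg`, `is_sound`, and `greedy_filter` are illustrative only. The filter is a simple greedy baseline: it tries to drop edges from least to most frequent and keeps an edge only if removing it would break soundness.

```python
from collections import Counter

# Hypothetical labels for the artificial source and sink nodes.
START, END = "<start>", "<end>"


def build_dfg(log):
    """Count directly-follows pairs over a log given as a list of traces."""
    dfg = Counter()
    for trace in log:
        path = [START, *trace, END]
        dfg.update(zip(path, path[1:]))
    return dfg


def _reachable(edges, root, forward=True):
    """Return the set of nodes reachable from root (or reaching it if not forward)."""
    adj = {}
    for a, b in edges:
        src, dst = (a, b) if forward else (b, a)
        adj.setdefault(src, []).append(dst)
    seen, stack = {root}, [root]
    while stack:
        for nxt in adj.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_sound(edges, nodes):
    """Assumed soundness check: every node lies on some path from START to END."""
    return (_reachable(edges, START) >= nodes
            and _reachable(edges, END, forward=False) >= nodes)


def greedy_filter(dfg):
    """Drop edges from least to most frequent while the subgraph stays sound."""
    nodes = {n for edge in dfg for n in edge}
    kept = set(dfg)
    for edge in sorted(dfg, key=lambda e: (dfg[e], e)):
        if is_sound(kept - {edge}, nodes):
            kept.discard(edge)
    return {e: dfg[e] for e in kept}


if __name__ == "__main__":
    log = [["a", "b", "c"], ["a", "b", "c"], ["a", "c"], ["a", "b", "b", "c"]]
    for edge, freq in sorted(greedy_filter(build_dfg(log)).items()):
        print(edge, freq)
```

On the small log in the `__main__` block, the infrequent edges `a -> c` and `b -> b` are removed and the chain from `<start>` through `a`, `b`, `c` to `<end>` is kept. The result is spanning and sound under the assumed definition, and no further edge can be removed, but it carries no guarantee of optimality with respect to the objective described in the abstract.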