RT Journal Article
T1 A new thread-level speculative automatic parallelization model and library based on duplicate code execution
A1 Martínez, Millán A.
A1 Fraguela, Basilio B.
A1 Cabaleiro Domínguez, José Carlos
A1 Fernández Rivera, Francisco
K1 Speculative parallelism
K1 Automatic parallelization
K1 Thread-level speculation
K1 Template metaprogramming
AB Loop-efficient automatic parallelization has become increasingly relevant due to the growing number of cores in current processors and the programming effort needed to parallelize codes in these systems efficiently. However, automatic tools fail to extract all the available parallelism in irregular loops with indirections, race conditions or potential data dependency violations, among many other possible causes. One of the successful ways to automatically parallelize these loops is the use of speculative parallelization techniques. This paper presents a new model and the corresponding C++ library that supports the speculative automatic parallelization of loops in shared memory systems, seeking competitive performance and scalability while keeping user effort to a minimum. The primary speculative strategy consists of redundantly executing chunks of loop iterations in a duplicate fashion. Namely, each chunk is executed speculatively in parallel to obtain results as soon as possible and sequentially in a different thread to validate the speculative results. The implementation uses C++11 threads and it makes intensive use of templates and advanced multithreading techniques. An evaluation based on various benchmarks confirms that our proposal provides a competitive level of performance and scalability.
PB Springer
SN 0920-8542
YR 2024
FD 2024-03-11
LK https://hdl.handle.net/10347/40975
UL https://hdl.handle.net/10347/40975
LA eng
NO Martínez, M. A., Fraguela, B. B., Cabaleiro, J. C., & Rivera, F. F. (2024). A new thread-level speculative automatic parallelization model and library based on duplicate code execution. The Journal of Supercomputing, 80(10), 13714–13737. doi:10.1007/s11227-024-05987-0
NO Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This research was supported by Grants PID2019-104184RB-I00, PID2019-104834GB-I00, PID2022-141623NB-I00, and PID2022-136435NB-I00, funded by MCIN/AEI/10.13039/501100011033, PID2022 also funded by "ERDF A way of making Europe", EU, and the predoctoral Grant of Millán Álvarez Ref. BES-2017-081320, and by the Xunta de Galicia co-funded by the European Regional Development Fund (ERDF) under the Consolidation Programme of Competitive Reference Groups (ED431C 2021/30 and ED431C 2022/16). Funding for open access charge: Universidade da Coruña/CISUG. We also acknowledge the support from the Centro Singular de Investigación de Galicia "CITIC" and the Centro Singular de Investigación en Tecnoloxías Intelixentes "CiTIUS", funded by Xunta de Galicia and the European Union (European Regional Development Fund - Galicia 2014-2020 Program), by grants ED431G 2019/01 and ED431G 2019/04. We also acknowledge the Centro de Supercomputación de Galicia (CESGA).
DS Minerva
RD 24 Apr 2026