TextFocus: Assessing the Faithfulness of Feature Attribution Methods Explanations in Natural Language Processing

dc.contributor.affiliation: Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías Intelixentes da USC (CiTIUS)
dc.contributor.author: Mariotti, Ettore
dc.contributor.author: Arias-Duart, Anna
dc.contributor.author: Cafagna, Michele
dc.contributor.author: Gatt, Albert
dc.contributor.author: García-Gasulla, Darío
dc.contributor.author: Alonso Moral, José María
dc.date.accessioned: 2025-05-05T07:50:47Z
dc.date.available: 2025-05-05T07:50:47Z
dc.date.issued: 2024-05-31
dc.description.abstract: Among the existing eXplainable AI (XAI) approaches, Feature Attribution methods are a popular option due to their interpretable nature. However, each method leads to a different solution, thus introducing uncertainty regarding their reliability and coherence with respect to the underlying model. This work introduces TextFocus, a metric for evaluating the faithfulness of Feature Attribution methods for Natural Language Processing (NLP) classification tasks. To address the absence of ground truth explanations for such methods, we introduce the concept of textual mosaics. A mosaic is composed of a combination of sentences belonging to different classes, which provides an implicit ground truth for attribution. The accuracy of explanations can then be evaluated by comparing feature attribution scores with the known class labels in the mosaic. The performance of six feature attribution methods is systematically compared on three sentence classification tasks by using TextFocus, with Integrated Gradients being the best overall method in terms of faithfulness and computational requirements. The proposed methodology fills a gap in NLP evaluation by providing an objective way to assess Feature Attribution methods while finding their optimal parameters.
dc.description.peerreviewed: SI
dc.description.sponsorship: This work was supported in part by the NL4XAI Project funded by the European Union's Horizon 2020 Research and Innovation Program under Marie Skłodowska-Curie Grant Agreement 860621; in part by MCIN/AEI/10.13039/501100011033 and ESF Investing in Your Future under Grant PID2021-123152OB-C21; in part by MCIN/AEI/10.13039/501100011033 and the European Union NextGenerationEU/PRTR under Grant TED2021-130295B-C33; in part by the Galician Ministry of Culture, Education, Professional Training, and University (co-funded by the European Regional Development Fund, ERDF/FEDER Program) under Grant ED431G2019/04 and Grant ED431C2022/19; and in part by the European Union Horizon 2020 Program under the Scheme INFRAIA-01-2018-2019 (Integrating Activities for Advanced Communities), SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics (http://www.sobigdata.eu) under Grant 871042.
dc.identifier.citation: E. Mariotti, A. Arias-Duart, M. Cafagna, A. Gatt, D. Garcia-Gasulla and J. M. Alonso-Moral. (2024). TextFocus: Assessing the Faithfulness of Feature Attribution Methods Explanations in Natural Language Processing. "IEEE Access", vol. 12, pp. 138870-138880
dc.identifier.doi: 10.1109/ACCESS.2024.3408062
dc.identifier.issn: 2169-3536
dc.identifier.uri: https://hdl.handle.net/10347/41177
dc.journal.title: IEEE Access
dc.language.iso: eng
dc.page.final: 138880
dc.page.initial: 138870
dc.publisher: IEEE
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica, Técnica y de Innovación 2021-2023/PID2021-123152OB-C21
dc.relation.projectID: info:eu-repo/grantAgreement/EC/H2020/871042/EU
dc.relation.publisherversion: https://doi.org/10.1109/ACCESS.2024.3408062
dc.rights: © 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 International License. For more information, see https://creativecommons.org/licenses/by/4.0/
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Natural language processing
dc.subject: Predictive models
dc.subject: Measurement
dc.subject: Explainable AI
dc.subject: Data models
dc.subject: Computational modeling
dc.subject: Artificial intelligence
dc.subject: Feature detection
dc.subject: Modelos predictivos
dc.subject: Modelos preditivos
dc.subject: Modelos de datos
dc.subject: Intelixencia Artificial
dc.subject: Inteligencia Artificial
dc.subject: Trustworthy AI
dc.subject: Explanation faithfulness
dc.subject.classification: 120304 Inteligencia artificial
dc.title: TextFocus: Assessing the Faithfulness of Feature Attribution Methods Explanations in Natural Language Processing
dc.type: journal article
dc.type.hasVersion: VoR
dc.volume.number: 12
dspace.entity.type: Publication
relation.isAuthorOfPublication: 47f74ee4-a6d5-49cd-8a38-bf9fdeef8f69
relation.isAuthorOfPublication.latestForDiscovery: 47f74ee4-a6d5-49cd-8a38-bf9fdeef8f69

Files

Original bundle

Name: 2025_access_alonso_textfocus.pdf
Size: 1.1 MB
Format: Adobe Portable Document Format