RT Journal Article
T1 Semi-Supervised Learning in the Field of Conversational Agents and Motivational Interviewing
T2 Aprendizaje Semisupervisado en el Ámbito de los Agentes Conversacionales y la Entrevista Motivacional
A1 Rosenova Tsakova, Gergana
A1 Fernández Pichel, Marcos
A1 Meyer, Selina
A1 Losada Carril, David Enrique
K1 Semi-supervised learning
K1 Motivational Interviewing
K1 Conversational Agents
AB The exploitation of Motivational Interviewing concepts for text analysis contributes to gaining valuable insights into individuals’ perspectives and attitudes towards behaviour change. The scarcity of labelled user data poses a persistent challenge and impedes technical advances in research under non-English language scenarios. To address the limitations of manual data labelling, we propose a semi-supervised learning method as a means to augment an existing training corpus. Our approach leverages machine-translated user-generated data sourced from social media communities and employs self-training techniques for annotation. To that end, we consider various source contexts and conduct an evaluation of multiple classifiers trained on various augmented datasets. The results indicate that this weak labelling approach does not yield improvements in the overall classification capabilities of the models. However, notable enhancements were observed for the minority classes. We conclude that several factors, including the quality of machine translation, can potentially bias the pseudo-labelling models and that the imbalanced nature of the data and the impact of a strict pre-filtering threshold need to be taken into account as inhibiting factors.
PB Sociedad Española para el Procesamiento del Lenguaje Natural
SN 1135-5948
YR 2024
FD 2024-09-01
LK https://hdl.handle.net/10347/44523
UL https://hdl.handle.net/10347/44523
LA eng
NO This work was supported by project PLEC2021-007662 (MCIN/AEI/10.13039/501100011033, Plan de Recuperación, Transformación y Resiliencia, Next Generation EU). The authors also thank the financial support supplied by the Xunta de Galicia-Consellería de Cultura, Educación, Formación Profesional e Universidade (ED431G 2023/04, ED431C 2022/19) and the ERDF, which acknowledges the CiTIUS-Research Center in Intelligent Technologies of the USC as a Research Center of the Galician University System. David E. Losada thanks the financial support obtained from project SUBV23/00002 (Ministerio de Consumo, Subdirección General de Regulación del Juego) and project PID2022-137061OB-C22 (Ministerio de Ciencia e Innovación, AEI, Proyectos de Generación de Conocimiento; supported by the ERDF).
DS Minerva
RD 25 Apr 2026