Focused Crawling and Model Evaluation in the field of Conversational Agents and Motivational Interviewing
| dc.contributor.affiliation | Universidade de Santiago de Compostela. Departamento de Electrónica e Computación | es_ES |
| dc.contributor.author | Rosenova Tsakova, Gergana | |
| dc.contributor.tutor | Losada Carril, David Enrique | |
| dc.contributor.tutor | Fernández Pichel, Marcos | |
| dc.date.accessioned | 2023-10-17T13:10:23Z | |
| dc.date.available | 2023-10-17T13:10:23Z | |
| dc.date.issued | 2023-07 | |
| dc.description.abstract | Applying Motivational Interviewing concepts to the analysis of individuals’ speech yields valuable insights into their perspectives and attitudes towards behaviour change. However, the scarcity of labelled user data is a persistent challenge that impedes technical progress in non-English language scenarios. To address the limitations of manual data labelling, we propose a semi-supervised learning method to augment an existing training corpus. Our approach leverages machine-translated user-generated data sourced from social media communities and employs self-training techniques for annotation. We evaluate multiple classifiers trained on various augmented datasets, considering diverse source contexts and different effectiveness metrics. The results indicate that this weak labelling approach does not yield significant improvements in the overall classification capabilities of the models, although notable enhancements were observed for the minority classes. As future work, we propose enlarging the datasets only with new examples from the minority classes. We conclude that several factors, including the quality of machine translation, can potentially bias the pseudo-labelling models. The imbalanced nature of the data and the impact of a strict pre-filtering threshold are other important aspects that need to be taken into account. | es_ES |
| dc.description.sponsorship | Universidade de Santiago de Compostela. Escola Técnica Superior de Enxeñaría | es_ES |
| dc.identifier.uri | http://hdl.handle.net/10347/31037 | |
| dc.language.iso | eng | es_ES |
| dc.rights | Attribution-NonCommercial-ShareAlike 4.0 International | |
| dc.rights.accessRights | open access | es_ES |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | |
| dc.subject | Focused Crawling | es_ES |
| dc.subject | Conversational Agents | es_ES |
| dc.subject | Motivational Interviewing | es_ES |
| dc.subject.classification | 3304 Computer Technology | es_ES |
| dc.title | Focused Crawling and Model Evaluation in the field of Conversational Agents and Motivational Interviewing | es_ES |
| dc.type | master thesis | es_ES |
| dspace.entity.type | Publication | |
| relation.isTutorOfPublication | 7ddb36fe-bf39-4c79-85bc-540ce4d9a23b | |
| relation.isTutorOfPublication | ad1c87f4-64b2-44aa-ab80-4709cef31dfe | |
| relation.isTutorOfPublication.latestForDiscovery | 7ddb36fe-bf39-4c79-85bc-540ce4d9a23b |
Files
Original bundle
- Name: Rosenova Tsakova Gergana_big data (2).pdf
- Size: 196.82 KB
- Format: Adobe Portable Document Format