Bridging human and AI perspectives: semantic annotation of generic nouns in German

dc.contributor.affiliation: Universidade de Santiago de Compostela. Departamento de Filoloxía Inglesa e Alemá
dc.contributor.author: Arias Arias, Iván
dc.contributor.author: Martín-Cancela, Elena
dc.date.accessioned: 2025-11-18T13:15:15Z
dc.date.available: 2025-11-18T13:15:15Z
dc.date.issued: 2025-11-17
dc.description.abstract: Generic nouns such as Sache and Ding pose a challenge for semantic annotation due to their referential underspecification and context-dependent meaning. Although frequently classified under categories like {artefact} or {object}, their actual referents often belong to abstract or cognitive domains, as in Der Placeboeffekt ist eines der faszinierendsten Dinge in der Welt der Medizin ('The placebo effect is one of the most fascinating things in the world of medicine'). Drawing on valency grammar, this study shows that these nouns activate different argument structures depending on their syntagmatic environment, reflecting semantic flexibility and combinatorial variability. Lexical databases such as GalNet or GermaNet frequently assign multiple synsets to these nouns, illustrating their ontological ambiguity. This paper examines whether large language models (LLMs) can replicate this nuanced classification. Using a gold-standard corpus annotated by linguists, we implement a two-step prompting strategy, supplying LLMs with predefined semantic tags and contextual windows, to test their performance. The results underscore the limitations of current LLMs in dealing with the lexical underspecification of generic nouns, even when provided with an extended context window. These findings contribute to ongoing discussions on the automation of semantic tagging and point to meaningful ways in which AI systems can complement human expertise in natural language processing tasks.
dc.description.peerreviewed: Yes
dc.description.sponsorship: This paper presents results from the ESMAS-ES+ project (grant PID2022-137170OB-I00) funded by MICIU/AEI/10.13039/501100011033 and ERDF/EU. Iván Arias-Arias acknowledges support from grant FPU21/00188 of the Formación de Profesorado Universitario programme, Spanish Ministry of Science, Innovation and Universities.
dc.identifier.issn: 2533-5626
dc.identifier.uri: https://hdl.handle.net/10347/43898
dc.journal.title: Electronic lexicography in the 21st century (eLex 2025): Intelligent lexicography. Proceedings of the eLex 2025 conference.
dc.language.iso: eng
dc.page.final: 18
dc.page.initial: 1
dc.publisher: Lexical Computing CZ s.r.o.
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-137170OB-I00/ES/ETIQUETADOR SEMANTICO MULTILINGUE AUTOMATICO Y SOSTENIBLE
dc.rights: Attribution-ShareAlike 4.0 International
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by-sa/4.0/
dc.subject: automatic semantic annotation
dc.subject: generic nouns
dc.subject: large language models
dc.subject: lexicological information systems
dc.subject: valency grammar
dc.title: Bridging human and AI perspectives: semantic annotation of generic nouns in German
dc.type: journal article
dc.type.hasVersion: VoR
dspace.entity.type: Publication

Files

Original bundle

Name: eLex2025-01-Arias-Arias_Martin-Cancela.pdf
Size: 363.23 KB
Format: Adobe Portable Document Format