An Architecture for Document Routing in Spanish: Two Language Components, PreProcessor and Parser

dc.contributor.affiliation: Universidade de Santiago de Compostela. Departamento de Lingua e Literatura Españolas, Teoría da Literatura e Lingüística Xeral
dc.contributor.author: Rojo Sánchez, Guillermo
dc.contributor.author: Álvarez, Concepción
dc.contributor.author: Alvariño, Pilar
dc.contributor.author: Gil, Adelaida
dc.contributor.author: Santalla del Río, María Paula
dc.contributor.author: Sotelo, Susana
dc.contributor.editor: Gavrilidou, Maria
dc.contributor.editor: Carayannis, George
dc.contributor.editor: Markantonatou, Stella
dc.contributor.editor: Piperidis, Stelios
dc.contributor.editor: Stainhauer, Gregory
dc.date.accessioned: 2025-01-02T13:04:17Z
dc.date.available: 2025-01-02T13:04:17Z
dc.date.issued: 2000
dc.description.abstract: This paper describes the language components of a system for document routing in Spanish. The system uses natural language processing techniques to identify the terms in each document that are relevant for classification. These techniques isolate and normalize the syntactic units considered relevant for classification: chiefly noun phrases, but also other constituents built around verbs, adverbs, pronouns, or adjectives. After a general introduction to the research project, the second section relates our approach to earlier and current approaches, and the third describes the corpora used to evaluate the system. The fourth and fifth sections describe the linguistic analysis architecture, which comprises pre-processing and two levels of syntactic analysis. The last section presents a comparative analysis of the results obtained by processing the corpora introduced in the third section and outlines some future developments of the system.
dc.description.sponsorship: European Commission Directorate General III (DGIII), within the Fourth Framework Programme (1994-1998) of the European Union
dc.identifier.citation: Rojo, Guillermo, Concepción Álvarez, Pilar Alvariño, Adelaida Gil, María Paula Santalla, Susana Sotelo. (2000). An Architecture for Document Routing in Spanish: Two Language Components, PreProcessor and Parser. In: M. Gavrilidou, G. Carayannis, S. Markantonatou, S. Piperidis, G. Stainhauer, "Proceedings of the Second International Conference on Language Resources and Evaluation (LREC-2000, 31 May to 2 June 2000, Athens)", (Volume 2: pp. 675-682). European Language Resources Association (ELRA).
dc.identifier.uri: https://hdl.handle.net/10347/38320
dc.language.iso: eng
dc.publisher: European Language Resources Association (ELRA)
dc.relation.projectID: 22716
dc.relation.publisherversion: https://aclanthology.org/L00-1068/
dc.rights.accessRights: open access
dc.subject: Document routing
dc.subject: Natural language processing
dc.subject: Syntactic analysis
dc.subject: Normalization
dc.subject.classification: 570104 Lingüística informatizada
dc.title: An Architecture for Document Routing in Spanish: Two Language Components, PreProcessor and Parser
dc.type: book part
dc.type.hasVersion: VoR
dspace.entity.type: Publication
relation.isAuthorOfPublication: a8b33ed7-a607-40fb-be4e-33ec153d97f2
relation.isAuthorOfPublication: 806db122-49f0-4d28-a4e8-669c95b8739f
relation.isAuthorOfPublication: 538fea22-743b-43f1-8ae2-691bb313ce17
relation.isAuthorOfPublication.latestForDiscovery: a8b33ed7-a607-40fb-be4e-33ec153d97f2

Files

Original bundle

Name: 2000_Santalla_Architecture.pdf
Size: 80.59 KB
Format: Adobe Portable Document Format