An Architecture for Document Routing in Spanish: Two Language Components, PreProcessor and Parser

Rojo Sánchez, Guillermo; Álvarez, Concepción; Alvariño, Pilar; Gil, Adelaida; Santalla del Río, María Paula; Sotelo, Susana

Files

2000_Santalla_Architecture.pdf (80.59 KB)

Identifiers

URI: https://hdl.handle.net/10347/38320

Publication date

2000

Authors

Rojo Sánchez, Guillermo

Álvarez, Concepción

Alvariño, Pilar

Gil, Adelaida

Santalla del Río, María Paula

Sotelo, Susana

Editors

Gavrilidou, Maria

Carayannis, George

Markantonatou, Stella

Piperidis, Stelios

Stainhauer, Gregory

Publisher

European Language Resources Association (ELRA)

Abstract

This paper describes the language components of a system for document routing in Spanish. The system uses natural language processing techniques to identify the terms in each document that are relevant for classification. These techniques are based on the isolation and normalization of syntactic units considered relevant for classification, especially noun phrases, but also other constituents built around verbs, adverbs, pronouns, or adjectives. After a general introduction to the research project, the second section relates our approach to previous and current approaches to the problem, and the third describes the corpora used to evaluate the system. The fourth and fifth sections describe the linguistic analysis architecture, which comprises pre-processing and two different levels of syntactic analysis. The last section presents a comparative analysis of the results obtained from processing the corpora introduced in the third section, and also outlines some future developments of the system.

Keywords

Document routing | Natural language processing | Syntactic analysis | Normalization

Bibliographic citation

Rojo, Guillermo, Concepción Álvarez, Pilar Alvariño, Adelaida Gil, María Paula Santalla, Susana Sotelo. (2000). An Architecture for Document Routing in Spanish: Two Language Components, PreProcessor and Parser. In: M. Gavrilidou, G. Carayannis, S. Markantonatou, S. Piperidis, G. Stainhauer, "Proceedings of the Second International Conference on Language Resources and Evaluation (LREC-2000, 31 May to 2 June 2000, Athens)", (Volume 2: pp. 675-682). European Language Resources Association (ELRA).

Publisher version

https://aclanthology.org/L00-1068/

Collections

Lingua e Literatura Españolas, Teoría da Literatura e Lingüística Xeral
