Doval Reixa, Irene
2025-01-29
2025-01-29
2023
Doval, Irene. 2023. "The English–Spanish Parallel Corpus PaEnS." In Current Trends on Digital Technologies and Gaming for Language Teaching and Linguistics, edited by I. Santos Díaz et al., 145–164. Berlin: Peter Lang
9783631889008
https://hdl.handle.net/10347/39211

This chapter presents the PaEnS English-Spanish Parallel Corpus, a sentence-level aligned parallel corpus, which at the time of writing comprises some 130 million words. This corpus is part of a larger ongoing project, PaCorES, an acronym for Spanish Parallel Corpora, which aims to build a series of parallel corpora between Spanish and several major languages. This paper presents the main features of the PaEnS corpus, starting with a brief description of the drawbacks that other similar resources pose for the intended applications. The design and composition of the corpus are described, and the data selection criteria are explained. Next, the different phases of the workflow are discussed: text preprocessing, segmentation, automatic alignment and manual review. The web presentation of the corpus and the search possibilities are then described. Finally, the future development of the corpus is outlined, and a brief recapitulation of its distinctive features is given.

eng
570104 Lingüística informatizada (computational linguistics)
570111 Enseñanza de lenguas (language teaching)
The English–Spanish parallel corpus PaEnS
Current Trends on Digital Technologies and Gaming for Teaching and Linguistics
book part
10.3726/b20963
restricted access