LinguaKit: a Big Data-based multilingual tool for linguistic analysis and information extraction

dc.contributor.affiliation: Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías Intelixentes da USC (CiTIUS)
dc.contributor.author: Gamallo Otero, Pablo
dc.contributor.author: García González, Marcos
dc.contributor.author: Piñeiro Pomar, César Alfredo
dc.contributor.author: Martínez-Castaño, Rodrigo
dc.contributor.author: Pichel Campos, Juan Carlos
dc.date.accessioned: 2025-01-22T13:12:36Z
dc.date.available: 2025-01-22T13:12:36Z
dc.date.issued: 2018-12-02
dc.description.abstract: This paper presents LinguaKit, a multilingual suite of tools for analysis, extraction, annotation and linguistic correction, as well as its integration into a Big Data infrastructure. LinguaKit allows the user to perform different tasks such as PoS-tagging, syntactic parsing, coreference resolution (among others), including applications for relation extraction, sentiment analysis, summarization, extraction of multiword expressions, or entity linking to DBpedia. Most modules work in four languages: Portuguese, Spanish, English, and Galician. The system is programmed in Perl and is freely available under a GPLv3 license.
dc.description.peerreviewed: SI
dc.description.sponsorship: This work has been supported by MINECO (TIN2014-54565-JIN, FFI2014-51978-C2-1-R), MICINN (IJCI-2016-29598), Xunta de Galicia (ED431G/08), European Regional Development Fund (ERDF), and by two BBVA Foundation Grants for Researchers and Cultural Creators (2016 and 2017).
dc.identifier.citation: P. Gamallo, M. Garcia, C. Piñeiro, R. Martinez-Castaño and J. C. Pichel, "LinguaKit: A Big Data-Based Multilingual Tool for Linguistic Analysis and Information Extraction," 2018 Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS), Valencia, Spain, 2018, pp. 239-244, doi: 10.1109/SNAMS.2018.8554689.
dc.identifier.doi: 10.1109/SNAMS.2018.8554689
dc.identifier.uri: https://hdl.handle.net/10347/38902
dc.issue.number: 2018
dc.journal.title: 2018 Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS)
dc.language.iso: eng
dc.page.final: 244
dc.page.initial: 239
dc.publisher: IEEE
dc.relation.projectID: info:eu-repo/grantAgreement/MINECO//TIN2014-54565-JIN/ES/APROXIMANDO LA COMPUTACION DE ALTAS PRESTACIONES A LAS TECNOLOGIAS BIG DATA: APLICACION AL PROCESAMIENTO DEL LENGUAJE NATURAL/
dc.relation.projectID: info:eu-repo/grantAgreement/MINECO//FFI2014-51978-C2-1-R/ES/TECNOLOGIAS DE LA LENGUA PARA ANALISIS DE OPINIONES EN REDES SOCIALES/
dc.rights: © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.rights.accessRights: open access
dc.subject: Bilingual
dc.subject: Information Extraction
dc.subject: Big Data
dc.subject: Sentiment Analysis
dc.subject: Postage
dc.subject: Relation Extraction
dc.subject: Syntactic Analysis
dc.subject: Multi-word
dc.subject: Basis Of Analysis
dc.subject: Fault-tolerant
dc.subject: Analysis Module
dc.subject: Disambiguation
dc.subject: State Machine
dc.subject: Tokenized
dc.subject: Related Entities
dc.subject: Input Text
dc.subject: List Of Pairs
dc.subject: Basic Module
dc.subject: Big Data Technology
dc.subject: Proper Nouns
dc.subject: Phonetic Transcription
dc.subject: Keyword Extraction
dc.subject: Semantic Annotation
dc.subject: Lemmatization
dc.subject: Apache Spark
dc.subject: Language Identification
dc.title: LinguaKit: a Big Data-based multilingual tool for linguistic analysis and information extraction
dc.type: journal article
dc.type.hasVersion: AM
dspace.entity.type: Publication
relation.isAuthorOfPublication: 898ee1bb-f9e8-4a75-9858-a6c9142bc99e
relation.isAuthorOfPublication: ae090fc6-2387-4087-ba21-7271835b4b35
relation.isAuthorOfPublication: 665c60c6-1b37-4499-8c35-aa52bd7ffcf5
relation.isAuthorOfPublication: db334853-753e-4afc-9f4f-ad847d0353a7
relation.isAuthorOfPublication.latestForDiscovery: 665c60c6-1b37-4499-8c35-aa52bd7ffcf5

Files

Original bundle

Name: GamGarPinMarPic2018a.pdf
Size: 529.76 KB
Format: Adobe Portable Document Format