Taboada Iglesias, María Jesús
Rodríguez Castiñeira, Hadriana
Martínez Hernández, Diego
Pardo Parrado, María
Sobrido Gómez, María Jesús
2020-04-27
2020-04-27
2014
Maria Taboada, Hadriana Rodríguez, Diego Martínez, María Pardo, María Jesús Sobrido, Automated semantic annotation of rare disease cases: a case study, Database, Volume 2014, 2014, bau045, https://doi.org/10.1093/database/bau045
http://hdl.handle.net/10347/21797
Motivation: As the number of clinical reports in the peer-reviewed medical literature keeps growing, there is an increasing need for online search tools to find and analyze publications on patients with similar clinical characteristics. This problem is especially critical and challenging for rare diseases, where publications of large series are scarce. Through an applied example, we illustrate how to automatically identify new relevant cases and semantically annotate the relevant literature about patient case reports to capture the phenotype of a rare disease named cerebrotendinous xanthomatosis.
Results: Our results confirm that it is possible to automatically identify new relevant case reports with a high precision and to annotate them with a satisfactory quality (74% F-measure). Automated annotation with an emphasis to entirely describe all phenotypic abnormalities found in a disease may facilitate curation efforts by supplying phenotype retrieval and assessment of their frequency.
eng
© The Author(s) 2014. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
http://creativecommons.org/licenses/by/3.0/
Cerebrotendinous xanthomatosis
Rare disease
Genetic sequencing technologies
Automated semantic annotation of rare disease cases: a case study
journal article
10.1093/database/bau045
10.1093/database/bav107
1758-0463
open access