SparkBWA: Speeding Up the Alignment of High-Throughput DNA Sequencing Data

dc.contributor.affiliation: Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías da Información
dc.contributor.author: Abuín Mosquera, José Manuel
dc.contributor.author: Pichel Campos, Juan Carlos
dc.contributor.author: Fernández Pena, Anselmo Tomás
dc.contributor.author: Amigo Lechuga, Jorge
dc.date.accessioned: 2017-10-21T12:44:12Z
dc.date.available: 2017-10-21T12:44:12Z
dc.date.issued: 2016-05-16
dc.description.abstract: Next-generation sequencing (NGS) technologies have led to a huge amount of genomic data that need to be analyzed and interpreted. This has a major impact on the DNA sequence alignment process, which nowadays requires the mapping of billions of small DNA sequences onto a reference genome. As a result, sequence alignment remains the most time-consuming stage in the sequence analysis workflow. To deal with this issue, state-of-the-art aligners take advantage of parallelization strategies. However, the existing solutions show limited scalability and have a complex implementation. In this work we introduce SparkBWA, a new tool that exploits the capabilities of a big data technology such as Spark to boost the performance of one of the most widely adopted aligners, the Burrows-Wheeler Aligner (BWA). The design of SparkBWA uses two independent software layers in such a way that no modifications to the original BWA source code are required, which assures its compatibility with any BWA version (future or legacy). SparkBWA is evaluated in different scenarios showing noticeable results in terms of performance and scalability. A comparison to other parallel BWA-based aligners validates the benefits of our approach. Finally, an intuitive and flexible API is provided to NGS professionals in order to facilitate the acceptance and adoption of the new tool. The source code of the software described in this paper is publicly available at https://github.com/citiususc/SparkBWA under a GPL3 license.
dc.description.peerreviewed: Yes
dc.description.sponsorship: This work was supported by Ministerio de Economía y Competitividad (Spain) (http://www.mineco.gob.es) grants TIN2013-41129-P and TIN2014-54565-JIN. There was no additional external funding received for this study.
dc.identifier.citation: Abuín JM, Pichel JC, Pena TF, Amigo J (2016) SparkBWA: Speeding Up the Alignment of High-Throughput DNA Sequencing Data. PLOS ONE 11(5): e0155461
dc.identifier.doi: 10.1371/journal.pone.0155461
dc.identifier.issn: 1932-6203
dc.identifier.uri: http://hdl.handle.net/10347/15959
dc.language.iso: eng
dc.publisher: PLOS
dc.relation.publisherversion: https://doi.org/10.1371/journal.pone.0155461
dc.rights: © 2016 Abuín et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
dc.rights: Attribution 3.0 Spain
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/es/
dc.title: SparkBWA: Speeding Up the Alignment of High-Throughput DNA Sequencing Data
dc.type: journal article
dc.type.hasVersion: VoR

Files

Original bundle

Name: abuin_pichel_pena_amigo_sparkbwa_plosone.PDF
Size: 3.38 MB
Format: Adobe Portable Document Format