RabbitMash: accelerating hash-based genome analysis on modern multi-core architectures
- Publication type:
- Journal article
- Metadata:
- Abstract
- Motivation: Mash is a popular hash-based genome analysis toolkit with applications to important downstream analyses tasks such as clustering and assembly. However, Mash is currently not able to fully exploit the capabilities of modern multi-core architectures, which in turn leads to high runtimes for large-scale genomic datasets. Results: We present RabbitMash, an efficient highly optimized implementation of Mash which can take full advantage of modern hardware including multi-threading, vectorization and fast I/O. We show that our approach achieves speedups of at least 1.3, 9.8, 8.5 and 4.4 compared to Mash for the operations sketch, dist, triangle and screen, respectively. Furthermore, RabbitMash is able to compute the all-versus-all distances of 100 321 genomes in <5 min on a 40-core workstation while Mash requires over 40 min. Availability and implementation: RabbitMash is available at https://github.com/ZekunYin/RabbitMash. Supplementary information: Supplementary data are available at Bioinformatics online.
- Authors
- Zekun Yin
- Xiaoming Xu
- Jinxiao Zhang
- Yanjie Wei
- Bertil Schmidt
- Weiguo Liu
- DOI
- 10.1093/bioinformatics/btaa754
- Editors
- Arne Elofsson
- eISSN
- 1367-4811
- ISSN
- 1367-4803
- Issue
- 6
- Journal
- Bioinformatics
- Language
- en
- Online publication date
- 2020
- Pagination
- 873 - 875
- Publication date
- 2021
- Status
- Published
- Publisher
- Oxford University Press (OUP)
- Publisher URL
- http://dx.doi.org/10.1093/bioinformatics/btaa754
- Date of data entry
- 2023
- Title
- RabbitMash: accelerating hash-based genome analysis on modern multi-core architectures
- Volume
- 37
Data source: Crossref
- Other metadata sources:
- Abstract
- Motivation: Mash is a popular hash-based genome analysis toolkit with applications to important downstream analyses tasks such as clustering and assembly. However, Mash is currently not able to fully exploit the capabilities of modern multi-core architectures, which in turn leads to high runtimes for large-scale genomic datasets. Results: We present RabbitMash, an efficient highly optimized implementation of Mash which can take full advantage of modern hardware including multi-threading, vectorization and fast I/O. We show that our approach achieves speedups of at least 1.3, 9.8, 8.5 and 4.4 compared to Mash for the operations sketch, dist, triangle and screen, respectively. Furthermore, RabbitMash is able to compute the all-versus-all distances of 100 321 genomes in <5 min on a 40-core workstation while Mash requires over 40 min. Availability and implementation: RabbitMash is available at https://github.com/ZekunYin/RabbitMash. Supplementary information: Supplementary data are available at Bioinformatics online.
- Addresses
- School of Software, Shandong University, Jinan 250101, China.
- Authors
- Zekun Yin
- Xiaoming Xu
- Jinxiao Zhang
- Yanjie Wei
- Bertil Schmidt
- Weiguo Liu
- DOI
- 10.1093/bioinformatics/btaa754
- eISSN
- 1367-4811
- External identifiers
- PubMed Identifier: 32845281
- Funding acknowledgements
- NSFC: U1806205
- NSFC: 61972231
- Key Project of Joint Fund of Shandong Province: ZR2019LZH007
- Shenzhen Basic Research Fund: JCYJ20180507182818013
- Pilot National Laboratory for Marine Science and Technology:
- Open access
- false
- ISSN
- 1367-4803
- Issue
- 6
- Journal
- Bioinformatics (Oxford, England)
- Keywords
- Genomics
- Genome
- Algorithms
- Computers
- Software
- Language
- eng
- Medium
- Pagination
- 873 - 875
- Publication date
- 2021
- Status
- Published
- Date of data entry
- 2020
- Title
- RabbitMash: accelerating hash-based genome analysis on modern multi-core architectures.
- Sub types
- Research Support, Non-U.S. Gov't
- Journal Article
- Volume
- 37
Data source: Europe PubMed Central
- Abstract
- MOTIVATION: Mash is a popular hash-based genome analysis toolkit with applications to important downstream analyses tasks such as clustering and assembly. However, Mash is currently not able to fully exploit the capabilities of modern multi-core architectures, which in turn leads to high runtimes for large-scale genomic datasets. RESULTS: We present RabbitMash, an efficient highly optimized implementation of Mash which can take full advantage of modern hardware including multi-threading, vectorization and fast I/O. We show that our approach achieves speedups of at least 1.3, 9.8, 8.5 and 4.4 compared to Mash for the operations sketch, dist, triangle and screen, respectively. Furthermore, RabbitMash is able to compute the all-versus-all distances of 100 321 genomes in <5 min on a 40-core workstation while Mash requires over 40 min. AVAILABILITY AND IMPLEMENTATION: RabbitMash is available at https://github.com/ZekunYin/RabbitMash. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- Date of acceptance
- 2020
- Authors
- Zekun Yin
- Xiaoming Xu
- Jinxiao Zhang
- Yanjie Wei
- Bertil Schmidt
- Weiguo Liu
- Author URL
- https://www.ncbi.nlm.nih.gov/pubmed/32845281
- DOI
- 10.1093/bioinformatics/btaa754
- eISSN
- 1367-4811
- Issue
- 6
- Journal
- Bioinformatics
- Keywords
- Algorithms
- Computers
- Genome
- Genomics
- Software
- Language
- eng
- Country
- England
- Pagination
- 873 - 875
- PII
- 5897409
- Publication date
- 2021
- Status
- Published
- Date the record was made public
- 2021
- Title
- RabbitMash: accelerating hash-based genome analysis on modern multi-core architectures.
- Sub types
- Journal Article
- Research Support, Non-U.S. Gov't
- Volume
- 37
Data source: PubMed
- Authors
- Zekun Yin
- Xiaoming Xu
- Jinxiao Zhang
- Yanjie Wei
- Bertil Schmidt
- Weiguo Liu
- Journal
- Bioinform.
- Article number
- 6
- Pagination
- 873 - 875
- Publication date
- 2021
- Title
- RabbitMash: accelerating hash-based genome analysis on modern multi-core architectures.
- Volume
- 37
Data source: DBLP
- Relationships:
- Owned by