MetaTransformer: deep metagenomic sequencing read classification using self-attention models
- Publication type:
- Journal article
- Metadata:
- Authors
- Alexander Wichmann
- Etienne Buschong
- Andre Mueller
- Daniel Juenger
- Andreas Hildebrandt
- Thomas Hankeln
- Bertil Schmidt
- Author URL
- https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=fis-test-1&SrcAuth=WosAPI&KeyUT=WOS:001064348000003&DestLinkType=FullRecord&DestApp=WOS_CPL
- DOI
- 10.1093/nargab/lqad082
- eISSN
- 2631-9268
- External identifiers
- Clarivate Analytics Document Solution ID: R4VX0
- PubMed Identifier: 37705831
- Issue
- 3
- Journal
- NAR GENOMICS AND BIOINFORMATICS
- Article number
- ARTN lqad082
- Publication date
- 2023
- Status
- Published
- Title
- MetaTransformer: deep metagenomic sequencing read classification using self-attention models
- Sub types
- Article
- Volume
- 5
Datenquelle: Web of Science (Lite) (Data source: Web of Science (Lite))
- Other metadata sources:
- Abstract
- Deep learning has emerged as a paradigm that revolutionizes numerous domains of scientific research. Transformers have been utilized in language modeling outperforming previous approaches. Therefore, the utilization of deep learning as a tool for analyzing the genomic sequences is promising, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine learning-based classifier, has recently been introduced for taxonomic prediction at species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells resulting in slow runtimes and excessive memory requirements, hampering its effective usability. We present MetaTransformer, a self-attention-based deep learning metagenomic analysis tool. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in terms of species and genus classification abilities. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we are able to achieve 2× to 5× speedup for inference compared to DeepMicrobes while keeping a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus and 16 hours for species prediction. Our results demonstrate performance improvements due to self-attention models and the impact of embedding schemes in deep learning on metagenomic sequencing data.
- Authors
- Alexander Wichmann
- Etienne Buschong
- André Müller
- Daniel Jünger
- Andreas Hildebrandt
- Thomas Hankeln
- Bertil Schmidt
- DOI
- 10.1093/nargab/lqad082
- eISSN
- 2631-9268
- Issue
- 3
- Journal
- NAR Genomics and Bioinformatics
- Language
- en
- Online publication date
- 2023
- Publication date
- 2023
- Status
- Published
- Publisher
- Oxford University Press (OUP)
- Publisher URL
- http://dx.doi.org/10.1093/nargab/lqad082
- Date of data capture
- 2024
- Title
- MetaTransformer: deep metagenomic sequencing read classification using self-attention models
- Volume
- 5
Datenquelle: Crossref (Data source: Crossref)
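The abstract above describes classifying individual sequencing reads with a transformer encoder, i.e. self-attention over embedded k-mer tokens followed by a taxon prediction head. As a purely illustrative aid, the following is a minimal PyTorch sketch of that general idea; the k-mer length, tokenizer, layer sizes and pooling are assumptions made here and do not reproduce the actual MetaTransformer architecture or training setup.

```python
# Hypothetical sketch: self-attention read classifier over k-mer tokens.
# All sizes and the naive tokenizer are illustrative assumptions, NOT the
# published MetaTransformer implementation.
import torch
import torch.nn as nn

K = 6                       # assumed k-mer length
VOCAB = 4 ** K              # one embedding row per possible k-mer
BASE = {"A": 0, "C": 1, "G": 2, "T": 3}

def kmer_tokens(read: str) -> torch.Tensor:
    """Encode a DNA read as a sequence of integer k-mer IDs (no ambiguity handling)."""
    ids = []
    for i in range(len(read) - K + 1):
        idx = 0
        for base in read[i:i + K]:
            idx = idx * 4 + BASE[base]
        ids.append(idx)
    return torch.tensor(ids, dtype=torch.long)

class ReadClassifier(nn.Module):
    """k-mer embedding -> transformer encoder -> mean pooling -> taxon logits."""
    def __init__(self, num_taxa: int, d_model: int = 128, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_model)
        # NOTE: positional encodings are omitted here for brevity.
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=256,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, num_taxa)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(token_ids)         # (batch, seq_len, d_model)
        x = self.encoder(x)               # self-attention over k-mer positions
        return self.head(x.mean(dim=1))   # pooled per-read logits

# Toy usage: one 100 bp read, 100 assumed genus classes.
model = ReadClassifier(num_taxa=100)
logits = model(kmer_tokens("ACGT" * 25).unsqueeze(0))   # shape: (1, 100)
```

Because every k-mer position attends to every other in parallel, such an encoder avoids the sequential dependency of the bidirectional LSTM cells used by DeepMicrobes, which is the source of the parallelization advantage the abstract refers to.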
- Abstract
- Deep learning has emerged as a paradigm that revolutionizes numerous domains of scientific research. Transformers have been utilized in language modeling outperforming previous approaches. Therefore, the utilization of deep learning as a tool for analyzing the genomic sequences is promising, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine learning-based classifier, has recently been introduced for taxonomic prediction at species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells resulting in slow runtimes and excessive memory requirements, hampering its effective usability. We present MetaTransformer, a self-attention-based deep learning metagenomic analysis tool. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in terms of species and genus classification abilities. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we are able to achieve 2× to 5× speedup for inference compared to DeepMicrobes while keeping a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus and 16 hours for species prediction. Our results demonstrate performance improvements due to self-attention models and the impact of embedding schemes in deep learning on metagenomic sequencing data.
- Addresses
- Institute of Computer Science, Johannes Gutenberg University, Staudingerweg 9, 55128 Mainz, Rhineland-Palatinate, Germany.
- Authors
- Alexander Wichmann
- Etienne Buschong
- André Müller
- Daniel Jünger
- Andreas Hildebrandt
- Thomas Hankeln
- Bertil Schmidt
- DOI
- 10.1093/nargab/lqad082
- eISSN
- 2631-9268
- External identifiers
- PubMed Identifier: 37705831
- PubMed Central ID: PMC10495543
- Funding acknowledgements
- Carl-Zeiss-Stiftung:
- German Federal Ministry of Education and Research: 01IS19071C
- Open access
- true
- ISSN
- 2631-9268
- Issue
- 3
- Journal
- NAR genomics and bioinformatics
- Language
- eng
- Medium
- Electronic-eCollection
- Online publication date
- 2023
- Open access status
- Open Access
- Pagination
- lqad082
- Publication date
- 2023
- Status
- Published
- Publisher licence
- CC BY
- Date of data capture
- 2023
- Title
- MetaTransformer: deep metagenomic sequencing read classification using self-attention models.
- Sub types
- research-article
- Journal Article
- Volume
- 5
Files
https://europepmc.org/articles/PMC10495543?pdf=render
Datenquelle: Europe PubMed Central (Data source: Europe PubMed Central)
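The abstract also mentions reducing memory consumption through different k-mer embedding schemes. One generic way to shrink an embedding table, shown below as an assumed example rather than the specific schemes evaluated in the paper, is to map each k-mer and its reverse complement to a single canonical ID, roughly halving the number of distinct embedding rows.

```python
# Hypothetical example of a vocabulary-reducing k-mer embedding scheme:
# a k-mer and its reverse complement share one canonical ID. This is an
# illustrative assumption, not the specific schemes studied in the paper.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
BASE = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode(kmer: str) -> int:
    """Plain base-4 integer encoding of a k-mer."""
    idx = 0
    for base in kmer:
        idx = idx * 4 + BASE[base]
    return idx

def canonical_id(kmer: str) -> int:
    """Return the smaller encoding of the k-mer and its reverse complement.

    Collapsing a k-mer with its reverse complement roughly halves the effective
    vocabulary; in practice the surviving IDs would be re-indexed densely
    before sizing the embedding table.
    """
    rc = "".join(COMPLEMENT[b] for b in reversed(kmer))
    return min(encode(kmer), encode(rc))

# A 6-mer and its reverse complement map to the same embedding row.
assert canonical_id("ACGTTT") == canonical_id("AAACGT")
```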
- Abstract
- Deep learning has emerged as a paradigm that revolutionizes numerous domains of scientific research. Transformers have been utilized in language modeling outperforming previous approaches. Therefore, the utilization of deep learning as a tool for analyzing the genomic sequences is promising, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine learning-based classifier, has recently been introduced for taxonomic prediction at species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells resulting in slow runtimes and excessive memory requirements, hampering its effective usability. We present MetaTransformer, a self-attention-based deep learning metagenomic analysis tool. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in terms of species and genus classification abilities. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we are able to achieve 2× to 5× speedup for inference compared to DeepMicrobes while keeping a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus and 16 hours for species prediction. Our results demonstrate performance improvements due to self-attention models and the impact of embedding schemes in deep learning on metagenomic sequencing data.
- Date of acceptance
- 2023
- Authors
- Alexander Wichmann
- Etienne Buschong
- André Müller
- Daniel Jünger
- Andreas Hildebrandt
- Thomas Hankeln
- Bertil Schmidt
- Author URL
- https://www.ncbi.nlm.nih.gov/pubmed/37705831
- DOI
- 10.1093/nargab/lqad082
- eISSN
- 2631-9268
- External identifiers
- PubMed Central ID: PMC10495543
- Issue
- 3
- Journal
- NAR Genom Bioinform
- Language
- eng
- Country
- England
- Pagination
- lqad082
- PII
- lqad082
- Publication date
- 2023
- Status
- Published online
- Title
- MetaTransformer: deep metagenomic sequencing read classification using self-attention models.
- Sub types
- Journal Article
- Volume
- 5
Datenquelle: PubMed (Data source: PubMed)
- Relations:
- Property of