Improving Metagenomic Classification Using Discriminative k-mers from Sequencing Data

Abstract

A central problem in analyzing a metagenomic sample is the taxonomic annotation of its reads, in order to identify the species the sample contains. Most currently available methods classify reads using a set of reference genomes and their k-mers. While these methods achieve near-perfect precision, their recall (the proportion of reads that are actually classified) falls to around 50%. One reason is that the sequences in a sample can differ substantially from the corresponding reference genome; viral genomes, for example, are highly mutated. To address this issue, we study metagenomic read classification by augmenting the reference k-mer library with novel discriminative k-mers drawn from the input sequencing reads. We evaluated performance under different conditions against several other tools, and the results show an improved F-measure, especially when close reference genomes are not available. Availability: https://github.com/davide92/K2Mem.git
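The idea of enriching a reference k-mer library with k-mers taken from the reads can be illustrated with a toy Python sketch. This is not the K2Mem implementation: the function names (`kmers`, `build_library`, `classify`, `enrich`), the majority-vote rule, and the tiny sequences are all assumptions made for illustration, and the sketch ignores taxonomy trees, reverse complements, and memory-efficient indexing.

```python
from collections import Counter

def kmers(seq, k):
    """Return all overlapping k-mers of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_library(references, k):
    """Map each reference k-mer to its species, or to None if it is shared."""
    owners = {}
    for species, genome in references.items():
        for km in set(kmers(genome, k)):
            owners.setdefault(km, set()).add(species)
    return {km: (next(iter(s)) if len(s) == 1 else None)
            for km, s in owners.items()}

def classify(read, library, k):
    """Majority vote over discriminative k-mer hits; None if there are none."""
    hits = Counter(library[km] for km in kmers(read, k)
                   if library.get(km) is not None)
    return hits.most_common(1)[0][0] if hits else None

def enrich(reads, library, k):
    """Return a copy of the library extended with novel k-mers from classified
    reads, keeping only k-mers that were seen with a single species."""
    candidates = {}
    for read in reads:
        species = classify(read, library, k)
        if species is None:
            continue
        for km in kmers(read, k):
            if km not in library:  # novel: absent from every reference
                candidates.setdefault(km, set()).add(species)
    enriched = dict(library)
    for km, owners in candidates.items():
        if len(owners) == 1:  # discriminative among the classified reads
            enriched[km] = next(iter(owners))
    return enriched

# Hypothetical toy data: two "genomes" and two reads from a mutated strain of A.
references = {"A": "ACGTACGTGG", "B": "TTGACCATGA"}
k = 4
library = build_library(references, k)
read1 = "ACGTACTT"   # partly matches reference A
read2 = "TACTTCCC"   # shares k-mers only with read1, not with any reference

before = classify(read2, library, k)    # None: no reference k-mer matches
enriched = enrich([read1], library, k)
after = classify(read2, enriched, k)    # "A": matched via k-mers learned from read1
```

In this toy case `read2` is unclassified against the reference-only library but is assigned to species A once the k-mers learned from `read1` are added, which is the kind of recall gain the abstract describes.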

Cite

Storato, D., & Comin, M. (2020). Improving Metagenomic Classification Using Discriminative k-mers from Sequencing Data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12304 LNBI, pp. 68–81). Springer. https://doi.org/10.1007/978-3-030-57821-3_7
