Finding motifs in biological sequences is one of the most intriguing problems for designers of string algorithms, because it requires dealing with approximation, which complicates the problem. Existing algorithms run in time linear in the input size. Nevertheless, the output can be very large because of the approximation, which often makes it unreadable and also slows down the inference itself. Since a subset of the motifs, namely the maximal motifs, could be enough to convey the information of all of them, in this paper we aim at removing such redundancy. We define notions of maximality and characterize them in the suffix tree data structure. Since the suffix tree is used by a whole class of motif extraction tools, we show how these tools can be modified to enforce the maximality requirement on the fly without changing the asymptotic complexity. © Springer-Verlag Berlin Heidelberg 2008.
Federico, M., & Pisanti, N. (2008). Suffix tree characterization of maximal motifs in biological sequences. Communications in Computer and Information Science, 13, 456–465. https://doi.org/10.1007/978-3-540-70600-7_35