Matching a mass spectrum against a text (a key computational task in proteomics) is slow because existing text indexing algorithms, whose search time is independent of the text size, are not applicable in the domain of mass spectrometry. As a result, many important applications (e.g., searches for mutated peptides) are prohibitively time-consuming, and even the standard search for non-mutated peptides is becoming too slow given recent advances in high-throughput genomics and proteomics technologies. We introduce a new paradigm, the Blocked Pattern Matching (BPM) Problem, that models peptide identification. BPM corresponds to matching a pattern against a text (over the alphabet of integers) under the assumption that each symbol a in the pattern can match a block of consecutive symbols in the text whose sum is a. BPM opens a new, still unexplored direction in combinatorial pattern matching and leads to the Mutated BPM Problem (modeling identification of mutated peptides) and the Fused BPM Problem (modeling identification of fused peptides in tumor genomes). We illustrate how BPM algorithms solve problems that are beyond the reach of existing proteomics tools.
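To make the matching rule concrete, the following is a minimal sketch of the BPM definition as a naive scan. It is not the indexing algorithm of the paper: it checks every start position, so its running time grows with the text size. The function name `blocked_match_positions` is hypothetical, and the sketch assumes that every symbol in both the pattern and the text is a positive integer, so that the block matching each pattern symbol is uniquely determined once the start position is fixed.

```python
def blocked_match_positions(pattern, text):
    """Return every start index at which `pattern` block-matches `text`.

    Assumption (not from the source): all symbols are positive integers.
    Naive scan for illustration only, not an indexed search.
    """
    hits = []
    for start in range(len(text)):
        pos = start
        matched = True
        for target in pattern:
            # Grow the current block until its sum reaches or passes the target.
            total = 0
            while pos < len(text) and total < target:
                total += text[pos]
                pos += 1
            if total != target:
                # Overshot the target or ran out of text: no match at this start.
                matched = False
                break
        if matched:
            hits.append(start)
    return hits


# Pattern symbol 3 matches the block (1, 2); symbol 7 matches the block (3, 4).
print(blocked_match_positions([3, 7], [1, 2, 3, 4, 5]))  # [0]
# Two occurrences: (2)(1, 1) at index 0 and (1, 1)(2) at index 1.
print(blocked_match_positions([2, 2], [2, 1, 1, 2]))     # [0, 1]
```

Because the symbols are assumed positive, block sums strictly increase as a block grows, which is why a single greedy pass per start position suffices in this sketch.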
Citation
Ng, J., Amir, A., & Pevzner, P. A. (2011). Blocked Pattern Matching Problem and Its Applications in Proteomics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6577 LNBI, pp. 298–319). Springer Verlag. https://doi.org/10.1007/978-3-642-20036-6_27