A simple and general homology-based method for gene finding was applied to the 2.9-Mb Drosophila melanogaster Adh region, the target sequence of the Genome Annotation Assessment Project (GASP). Each strand of the entire sequence was used as a query against the BLOCKS+ database of conserved regions of proteins. This led to functional assignments for more than one-third of the genes and two-thirds of the transposons. Considering the enormous size of the query, the fact that only two false-positive matches were reported emphasizes the high selectivity of protein family-based methods for gene finding. We used the search results to improve BLOCKS+ by identifying compositionally biased blocks. Our results confirm that protein family databases can be used effectively in automated sequence annotation efforts.
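As a rough illustration of what querying a protein block database with "each strand" of a DNA sequence involves, the sketch below derives the reverse-complement strand and translates both strands in three reading frames each using the standard genetic code. It is not the authors' software: the function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical, and the assumption that the query is translated in all six frames before comparison with protein blocks is ours, not a detail stated in the abstract.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna, frame):
    """Translate one strand starting at offset `frame` (0, 1, or 2).

    Codons containing ambiguous bases (e.g. N) become 'X';
    stop codons become '*'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Map (strand, frame) to the conceptual translation of that frame."""
    strands = {"+": dna, "-": reverse_complement(dna)}
    return {
        (strand, frame): translate(seq, frame)
        for strand, seq in strands.items()
        for frame in range(3)
    }
```

Each of the six translated frames could then be compared against protein blocks by whatever search tool is in use; that comparison step is deliberately left out here.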
Henikoff, J. G., & Henikoff, S. (2000). Drosophila genomic sequence annotation using the BLOCKS+ database. Genome Research, 10(4), 543–546. https://doi.org/10.1101/gr.10.4.543