Abstract
A simple and general homology-based method for gene finding was applied to the 2.9-Mb Drosophila melanogaster Adh region, the target sequence of the Genome Annotation Assessment Project (GASP). Each strand of the entire sequence was used as a query against the BLOCKS+ database of conserved protein regions. This led to functional assignments for more than one-third of the genes and two-thirds of the transposons. Given the enormous size of the query, the fact that only two false-positive matches were reported underscores the high selectivity of protein family-based methods for gene finding. We used the search results to improve BLOCKS+ by identifying compositionally biased blocks. Our results confirm that protein family databases can be used effectively in automated sequence annotation efforts.
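To illustrate what querying "each strand" of a DNA sequence against a protein-level database can involve, the sketch below translates a sequence in three reading frames on each strand. It is a minimal illustration under the assumption that the comparison is made at the protein level after translation; it is not the authors' search software. The function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str, offset: int = 0) -> str:
    """Translate one reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for i in range(offset, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def six_frame_translations(dna: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to translated sequences."""
    frames = {}
    strands = (("+", dna), ("-", reverse_complement(dna)))
    for sign, strand in strands:
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand, offset)
    return frames
```

For example, `six_frame_translations("ATGGCCTAA")` yields `"MA*"` for frame `+1` and `"LGH"` for frame `-1`. Each translated frame could then be compared against conserved protein blocks by whatever search tool is in use.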
Year: 2000 PMID: 10779495 PMCID: PMC310867 DOI: 10.1101/gr.10.4.543
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043