Monica-Andreea Drăgan, Ismail Moghul, Anurag Priyam, Claudio Bustos, Yannick Wurm.
Abstract
Genomes of emerging model organisms are now being sequenced at very low cost. However, obtaining accurate gene predictions remains challenging: even the best gene prediction algorithms make substantial errors that can jeopardize subsequent analyses. As a result, many predicted genes must be visually inspected and manually curated, which is time-consuming. We developed GeneValidator (GV) to automatically identify problematic gene predictions and to aid manual curation. For each gene, GV performs multiple analyses based on comparisons to gene sequences from large databases. The resulting report identifies problematic gene predictions and includes extensive statistics and graphs for each prediction to guide manual curation efforts. GV thus accelerates and enhances the work of biocurators and researchers who need accurate gene predictions from newly sequenced genomes.
Year: 2016 PMID: 26787666 PMCID: PMC4866521 DOI: 10.1093/bioinformatics/btw015
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Contrasting GV graphs: (a), (e) sequence lengths; (b), (f) HSP offsets; (c), (g) overviews of hit regions; (d), (h) conserved regions. Graphs (a–d) were produced with a sequence for which GV detected no problems. The other graphs show typical problems: (e) query is short; (f), (g) query sequence is a fusion of unrelated genes; (h) query includes sequence absent from the first 10 hits.