Sin-Gi Park1, DongSung Ryu1, Hyunsung Lee1, Hojin Ryu2, Yong Ju Ahn1, Seung Il Yoo1, Junsu Ko3, Chang Pyo Hong4. 1. TheragenEtex Bio Institute, Suwon, 16229, Republic of Korea. 2. Department of Biology, Chungbuk National University, Cheongju, 28644, Republic of Korea. 3. TheragenEtex Bio Institute, Suwon, 16229, Republic of Korea. junsuko@gmail.com. 4. TheragenEtex Bio Institute, Suwon, 16229, Republic of Korea. changpyo.hong@theragenetex.com.
Abstract
INTRODUCTION: The accurate prediction and annotation of gene structures from the genome sequence of an organism enable genome-wide functional analyses that give insight into its biological properties. OBJECTIVES: We recently developed a highly accurate filamentous fungal gene prediction pipeline and web platform called TaF. TaF is a homology-based gene predictor that uses large-scale taxonomic profiling to search for close relatives of the query genome. METHODS: The TaF pipeline consists of four processing steps: (1) taxonomic profiling to search for close relatives of the query, (2) generation of hints for determining exon-intron boundaries from orthologous protein sequence data of the profiled species, (3) gene prediction by combining ab initio and evidence-based prediction methods, and (4) homology search for gene models. RESULTS: TaF generates extrinsic evidence that suggests possible exon-intron boundaries based on orthologous protein sequence data from close relatives, thus reducing the false-positive gene structure predictions that arise when data from distantly related orthologs are used. In particular, the gene prediction method using taxonomic profiling shows very high accuracy, including high sensitivity and specificity for gene models. This suggests a new approach to homology-based gene prediction for newly sequenced or uncharacterized fungal genomes, with the potential to improve the quality of gene prediction. CONCLUSION: TaF will be a useful tool for fungal genome-wide analyses, including the identification of targeted genes associated with a trait, transcriptome profiling, comparative genomics, and evolutionary analysis.
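The abstract reports sensitivity and specificity for gene models without defining them. The following is a minimal sketch, assuming the convention commonly used in gene prediction benchmarks: sensitivity is the fraction of reference exons that are predicted exactly, and specificity is the fraction of predicted exons that are correct. Exon-level exact matching, the function name `exon_level_accuracy`, and the toy coordinates are illustrative assumptions and are not taken from TaF.

```python
from typing import Set, Tuple

# An exon is identified by (contig, start, end, strand).
# This representation is an assumption made for illustration.
Exon = Tuple[str, int, int, str]


def exon_level_accuracy(predicted: Set[Exon], reference: Set[Exon]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the exon level.

    An exon counts as a true positive only if contig, strand, and both
    boundaries match a reference exon exactly.
    """
    true_positives = len(predicted & reference)
    # Sensitivity: share of reference exons recovered by the prediction.
    sensitivity = true_positives / len(reference) if reference else 0.0
    # Specificity (gene prediction convention): share of predicted exons that are correct.
    specificity = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Hypothetical toy data: four reference exons, three predicted exons.
reference = {
    ("chr1", 100, 200, "+"),
    ("chr1", 300, 450, "+"),
    ("chr1", 600, 700, "+"),
    ("chr2", 50, 150, "-"),
}
predicted = {
    ("chr1", 100, 200, "+"),
    ("chr1", 300, 460, "+"),  # wrong 3' boundary, so not a true positive
    ("chr2", 50, 150, "-"),
}

sn, sp = exon_level_accuracy(predicted, reference)
print(f"sensitivity={sn:.3f} specificity={sp:.3f}")
```

In this toy case two of the four reference exons are recovered exactly, and two of the three predictions are correct. A misplaced exon-intron boundary lowers both measures, which is why boundary hints taken from close relatives matter for accuracy.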
Keywords:
Ab initio; Exon–intron boundary; Filamentous fungal genome; Homology-based gene prediction; Taxonomic profile; Web platform