Direct mapping and alignment of protein sequences onto genomic sequence.

Osamu Gotoh.

Abstract

MOTIVATION: Finding protein-coding genes in a newly determined genomic sequence is the first step toward understanding the content written in the genome. When available, transcript sequences of homologous genes can considerably improve the accuracy with which genes and their structures are predicted, compared with prediction made without such evidence. Because protein sequences are generally better conserved than nucleotide sequences, remote homologs can be used as templates, which extends the applicability of evidence-based gene recognition methods. However, no tool seems to have been developed so far to simultaneously map and align a number of protein sequences onto a mammalian-sized genomic sequence.
RESULTS: We have extended our computer program Spaln to accept protein sequences, as well as cDNA sequences, as queries. When the query and the target sequences are reasonably similar, e.g. between mammalian orthologs, Spaln runs one to two orders of magnitude faster than conventional approaches that rely on Blast search followed by dynamic-programming-based spliced alignment. Exon-level and gene-level accuracies of Spaln are significantly higher than those obtained by the best available methods of the same type, particularly when the query and the target are distantly related.
AVAILABILITY: Spaln is accessible online for a few species at http://www.genome.ist.i.kyoto-u.ac.jp/~aln_user. The source code is freely available to academic users from the same site.

Year: 2008    PMID: 18728043    DOI: 10.1093/bioinformatics/btn460

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937

