Improved techniques for the identification of pseudogenes.

L Coin, R Durbin.

Abstract

MOTIVATION: Pseudogenes are the remnants of genomic sequences of genes which are no longer functional. They are frequent in most eukaryotic genomes, and an important resource for comparative genomics. However, pseudogenes are often mis-annotated as functional genes in sequence databases. Current methods for identifying pseudogenes include methods which rely on the presence of stop codons and frameshifts, as well as methods based on the ratio of non-silent to silent nucleotide substitution rates (dN/dS). A recent survey concluded that 50% of human pseudogenes have no detectable truncation in their pseudo-coding regions, indicating that the former methods lack sensitivity. The latter methods have been used to find sets of genes enriched for pseudogenes, but are not specific enough to accurately separate pseudogenes from expressed genes.
RESULTS: We introduce a program called pseudogene inference from loss of constraint (PSILC), which incorporates novel methods for separating pseudogenes from functional genes. The methods calculate a log-odds score comparing two models of evolution along the final branch of the gene tree leading to the query gene: (i) a neutral nucleotide model compared to a Pfam domain encoding model (PSILC(nuc/dom)); (ii) a protein coding model compared to a Pfam domain encoding model (PSILC(prot/dom)). Using the manual annotation of human chromosome 6, we show that both these methods result in a more accurate classification of pseudogenes than dN/dS when a Pfam domain alignment is available.
AVAILABILITY: PSILC is available from http://www.sanger.ac.uk/Software/PSILC
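The log-odds comparison described in RESULTS can be illustrated with a minimal sketch. This is not PSILC's implementation: the function names, the example log-likelihood values, and the zero threshold are all hypothetical, and the sketch assumes that the log-likelihood of the final branch under each model has already been computed elsewhere. It also assumes that a higher score in favour of the neutral model points towards a pseudogene.

```python
def log_odds_score(loglik_model_a: float, loglik_model_b: float) -> float:
    """Return log(P(data | model A) / P(data | model B)).

    Both arguments are natural-log likelihoods, so the log of the
    ratio is simply their difference.
    """
    return loglik_model_a - loglik_model_b


def classify(score: float, threshold: float = 0.0) -> str:
    """Label a query from its log-odds score.

    The threshold is a hypothetical cut-off chosen for illustration,
    not a value taken from the abstract.
    """
    return "pseudogene-like" if score > threshold else "functional-like"


# Hypothetical log-likelihoods of the final branch under each model.
loglik_neutral = -120.0   # neutral nucleotide model (assumed value)
loglik_domain = -125.5    # Pfam domain encoding model (assumed value)

score = log_odds_score(loglik_neutral, loglik_domain)
label = classify(score)
```

Working with log-likelihoods rather than raw likelihoods avoids numerical underflow when the per-site probabilities are multiplied over a long alignment.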

Year: 2004; PMID: 15262786; DOI: 10.1093/bioinformatics/bth942

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937

