
Analysis and recognition of 5' UTR intron splice sites in human pre-mRNA.

E Eden, S Brunak.

Abstract

Prediction of splice sites in non-coding regions of genes is one of the most challenging aspects of gene structure recognition. We perform a rigorous analysis of such splice sites embedded in human 5' untranslated regions (UTRs), and investigate correlations between this class of splice sites and other features found in the adjacent exons and introns. By restricting the training of neural network algorithms to 'pure' UTRs (not extending partially into protein coding regions), we investigate, for the first time, the predictive power of the splicing signal proper, in contrast to conventional splice site prediction, which typically relies on the change in sequence at the transition from protein coding to non-coding. By doing so, the algorithms were able to pick up subtler splicing signals that were otherwise masked by 'coding' noise, thus significantly enhancing the prediction of 5' UTR splice sites. For example, the non-coding splice site predicting networks pick up compositional and positional bias in the 3' ends of non-coding exons and 5' non-coding intron ends, where cytosine and guanine are over-represented. This compositional bias at the true UTR donor sites is also visible in the synaptic weights of the neural networks trained to identify UTR donor sites. Conventional splice site prediction methods perform poorly in UTRs because the reading frame pattern is absent. The NetUTR method presented here performs 2- to 3-fold better than NetGene2 and GenScan in 5' UTRs. We also tested the 5' UTR trained method on protein coding regions, and discovered, surprisingly, that it works quite well (although it cannot compete with NetGene2). This indicates that the local splicing pattern in UTRs and coding regions is largely the same. The NetUTR method is made publicly available at www.cbs.dtu.dk/services/NetUTR.
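
The compositional and positional bias mentioned in the abstract (over-representation of cytosine and guanine around true UTR donor sites) is the kind of signal that can be inspected with a simple per-position base count. The sketch below is only an illustration of that idea and is not the NetUTR method, which is based on neural networks. The function names, the window sizes, and the two toy sequences are assumptions made up for this example; they are not taken from the paper or from any real data set.

```python
from collections import Counter

BASES = "ACGT"

def donor_windows(seq, donor_positions, upstream=3, downstream=4):
    """Cut a window around each annotated donor site (hypothetical helper).

    donor_positions are 0-based indices of the first intron base
    (the G of the GT dinucleotide). Sites too close to either end
    of the sequence are skipped.
    """
    windows = []
    for pos in donor_positions:
        start, end = pos - upstream, pos + downstream
        if start >= 0 and end <= len(seq):
            windows.append(seq[start:end].upper())
    return windows

def position_frequencies(windows):
    """Per-position base frequencies across equal-length windows."""
    if not windows:
        return []
    freqs = []
    for i in range(len(windows[0])):
        counts = Counter(w[i] for w in windows)
        total = sum(counts[b] for b in BASES)
        freqs.append({b: (counts[b] / total if total else 0.0) for b in BASES})
    return freqs

def gc_profile(freqs):
    """Combined C+G frequency at each window position."""
    return [f["C"] + f["G"] for f in freqs]

# Toy input (invented for illustration): two sequences, donor GT at index 5.
toy = [("ACGCGGTGAGCC", [5]), ("TTCCGGTAAGTA", [5])]
windows = [w for seq, sites in toy for w in donor_windows(seq, sites)]
profile = gc_profile(position_frequencies(windows))
```

With real annotated 5' UTR donor sites in place of the toy input, a profile like this would show whether C and G are enriched in the last exon positions and first intron positions, which is the bias the abstract describes.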

Year: 2004   PMID: 14960723   PMCID: PMC373407   DOI: 10.1093/nar/gkh273

Source DB: PubMed   Journal: Nucleic Acids Res   ISSN: 0305-1048   Impact factor: 16.971


