
Neural network detects errors in the assignment of mRNA splice sites.

S Brunak, J Engelbrecht, S Knudsen.

Abstract

The use of databanks in genetic research assumes that the information they contain is reliable. Currently, error detection in the manually or electronically entered data held in the nucleotide sequence databanks at EMBL, Heidelberg, and GenBank at Los Alamos is limited. We have used a subset of sequences from these databanks to train neural networks to recognize pre-mRNA splicing signals in human genes. During training on 33 human genes from the EMBL databank, seven genes appeared to disturb the learning process. Subsequent investigation revealed discrepancies from the original published papers for three genes. In four genes, we found wrongly assigned splicing frames of introns. We believe this reflects the fact that splicing frames cannot always be unambiguously assigned on the basis of experimental data. Thus, incorrect assignments arise both from mere typographical misprints and from erroneous interpretation of experiments. Training on 241 human sequences from GenBank revealed nine new errors. We propose that such errors could be detected by computer algorithms designed to check the consistency of data prior to their incorporation in databanks.
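The kind of pre-submission consistency check proposed in the abstract can be illustrated with a much simpler rule than the authors' neural networks. The sketch below is not their method: it assumes the well-known convention that most introns begin with GT and end with AG, and it flags annotated introns that do not. The function name `check_introns`, the 0-based end-exclusive coordinates, and the toy sequence are all hypothetical choices made for this illustration.

```python
from typing import List, Tuple


def check_introns(
    sequence: str, introns: List[Tuple[int, int]]
) -> List[Tuple[int, int, str]]:
    """Return annotated introns that violate the canonical GT...AG rule.

    `introns` holds hypothetical 0-based, end-exclusive (start, end) pairs.
    """
    seq = sequence.upper()
    flagged = []
    for start, end in introns:
        # Reject coordinates that fall outside the sequence or are too short
        # to hold both a donor and an acceptor dinucleotide.
        if not (0 <= start < end <= len(seq)) or end - start < 4:
            flagged.append((start, end, "coordinates out of range or too short"))
            continue
        donor = seq[start:start + 2]      # first two bases of the intron
        acceptor = seq[end - 2:end]       # last two bases of the intron
        if donor != "GT" or acceptor != "AG":
            flagged.append(
                (start, end, f"found {donor}...{acceptor}, expected GT...AG")
            )
    return flagged


# Toy record (made up for illustration): exon, intron, exon.
toy = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"

correct = check_introns(toy, [(6, 18)])   # boundaries as intended
shifted = check_introns(toy, [(7, 19)])   # same intron shifted by one base
```

A one-base shift in the annotated boundaries, similar in spirit to the wrongly assigned splicing frames described above, is enough for the check to flag the entry. Because a minority of genuine introns use other boundary dinucleotides, a flag from a rule like this is a prompt for manual review against the original publication, not proof of an error.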

Year:  1990        PMID: 2395643      PMCID: PMC331948          DOI: 10.1093/nar/18.16.4797

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


1.  Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach.

Authors:  E C Uberbacher; R J Mural
Journal:  Proc Natl Acad Sci U S A       Date:  1991-12-15       Impact factor: 11.205

2.  O-GLYCBASE version 2.0: a revised database of O-glycosylated proteins.

Authors:  J E Hansen; O Lund; K Rapacki; S Brunak
Journal:  Nucleic Acids Res       Date:  1997-01-01       Impact factor: 16.971

3.  Analysis of missense variants in the PKHD1-gene in patients with autosomal recessive polycystic kidney disease (ARPKD).

Authors:  Monique Losekoot; Cathleen Haarloo; Claudia Ruivenkamp; Stefan J White; Martijn H Breuning; Dorien J M Peters
Journal:  Hum Genet       Date:  2005-11-15       Impact factor: 4.132

4.  Analysis of donor splice sites in different eukaryotic organisms.

Authors:  I B Rogozin; L Milanesi
Journal:  J Mol Evol       Date:  1997-07       Impact factor: 2.395

5.  Quantitative sequence-activity models (QSAM)--tools for sequence design.

Authors:  J Jonsson; T Norberg; L Carlsson; C Gustafsson; S Wold
Journal:  Nucleic Acids Res       Date:  1993-02-11       Impact factor: 16.971

6.  O-GLYCBASE: a revised database of O-glycosylated proteins.

Authors:  J E Hansen; O Lund; J O Nielsen; J E Hansen; S Brunak
Journal:  Nucleic Acids Res       Date:  1996-01-01       Impact factor: 16.971

7.  Self-organized neural maps of human protein sequences.

Authors:  E A Ferrán; B Pflugfelder; P Ferrara
Journal:  Protein Sci       Date:  1994-03       Impact factor: 6.725

8.  Neural network optimization for E. coli promoter prediction.

Authors:  B Demeler; G W Zhou
Journal:  Nucleic Acids Res       Date:  1991-04-11       Impact factor: 16.971

9.  Cleaning the GenBank Arabidopsis thaliana data set.

Authors:  P G Korning; S M Hebsgaard; P Rouze; S Brunak
Journal:  Nucleic Acids Res       Date:  1996-01-15       Impact factor: 16.971

10.  Method of predicting splice sites based on signal interactions.

Authors:  Alexander Churbanov; Igor B Rogozin; Jitender S Deogun; Hesham Ali
Journal:  Biol Direct       Date:  2006-04-03       Impact factor: 4.540

