A large number of novel coding small open reading frames in the intergenic regions of the Arabidopsis thaliana genome are transcribed and/or under purifying selection.

Kousuke Hanada, Xu Zhang, Justin O Borevitz, Wen-Hsiung Li, Shin-Han Shiu.

Abstract

Large-scale cDNA sequencing projects and tiling array studies have revealed the presence of many unannotated genes. For protein coding genes, small coding sequences may not be identified by gene finders because of the conservative nature of prediction algorithms. In this study, we identified small open reading frames (sORFs) with high coding potential by a simple gene finding method (Coding Index, CI) based on the nucleotide composition bias found in most coding sequences. Applying this method to 18 Arabidopsis thaliana and 84 yeast sORF genes with evidence of expression at the protein level gives 100% accurate prediction. In the A. thaliana genome, we identified 7159 sORFs that are likely coding sequences (coding sORFs) with the CI measure at the 1% false-positive rate. To determine if these coding sORFs are parts of functional genes, we evaluated each coding sORF for evidence of transcription or evolutionary conservation. At the 5% false-positive rate, we found that 2996 coding sORFs are likely expressed in at least one experimental condition of the A. thaliana tiling array data. In addition, the evolutionary conservation of each A. thaliana sORF was examined within A. thaliana or between A. thaliana and five plants with complete or partial genome sequences. In 3997 coding sORFs with readily identifiable homologous sequences, 2376 are subject to purifying selection at the 1% false-positive rate. After eliminating coding sORFs with similarity to known transposable elements and those that are likely missing exons of known genes, the remaining 3241 coding sORFs with either evidence of transcription or purifying selection likely belong to novel coding genes in the A. thaliana genome.
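The abstract describes scoring small ORFs by the nucleotide composition bias typical of coding sequences but does not give the Coding Index formula. The following is a minimal sketch of that general idea only, not the authors' CI: it scans the forward strand for ATG-to-stop ORFs in a length window and scores each one with a per-codon-position log-odds ratio between a coding and a non-coding training set. All function names, the length limits, and the pseudocount are hypothetical choices made for illustration.

```python
import math
from collections import Counter

STOPS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def find_sorfs(seq, min_codons=30, max_codons=100):
    """Return (start, end, orf) for forward-strand ATG..stop ORFs.

    Length is counted in codons excluding the stop codon. The length
    window is an illustrative assumption, not a value from the paper.
    """
    seq = seq.upper()
    found = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Extend in frame until the first stop codon.
            j = i
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop remains in this frame
            n_codons = (j - i) // 3
            if min_codons <= n_codons <= max_codons:
                found.append((i, j + 3, seq[i:j + 3]))
            i = j + 3  # resume scanning after the stop codon
    return found


def position_freqs(seqs, pseudo=1.0):
    """Nucleotide frequencies at each of the three codon positions."""
    counts = [Counter() for _ in range(3)]
    for s in seqs:
        for k, base in enumerate(s.upper()):
            if base in BASES:
                counts[k % 3][base] += 1
    freqs = []
    for c in counts:
        total = sum(c[b] + pseudo for b in BASES)
        freqs.append({b: (c[b] + pseudo) / total for b in BASES})
    return freqs


def composition_score(orf, coding, noncoding):
    """Mean log-odds (coding vs. non-coding) per nucleotide of the ORF."""
    score = 0.0
    for k, base in enumerate(orf.upper()):
        if base in BASES:
            score += math.log(coding[k % 3][base] / noncoding[k % 3][base])
    return score / max(len(orf), 1)
```

In use, `position_freqs` would be trained once on annotated coding sequences and once on non-coding sequence, and a score cutoff would be chosen from the non-coding score distribution to hit a target false-positive rate, analogous to the 1% rate quoted in the abstract.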

Year: 2007; PMID: 17395691; PMCID: PMC1855179; DOI: 10.1101/gr.5836207

Source DB: PubMed; Journal: Genome Res; ISSN: 1088-9051; Impact factor: 9.043


