
De novo search for non-coding RNA genes in the AT-rich genome of Dictyostelium discoideum: performance of Markov-dependent genome feature scoring.

Pontus Larsson, Andrea Hinas, David H Ardell, Leif A Kirsebom, Anders Virtanen, Fredrik Söderbom.

Abstract

Genome data are increasingly important in the computational identification of novel regulatory non-coding RNAs (ncRNAs). However, most ncRNA gene-finders are either specialized to well-characterized ncRNA gene families or require comparisons of closely related genomes. We developed a method for de novo screening for ncRNA genes with a nucleotide composition that stands out against the background genome based on a partial sum process. We compared the performance when assuming independent and first-order Markov-dependent nucleotides, respectively, and used Karlin-Altschul and Karlin-Dembo statistics to evaluate the significance of hits. We hypothesized that a first-order Markov-dependent process might have better power to detect ncRNA genes since nearest-neighbor models have been shown to be successful in predicting RNA structures. A model based on a first-order partial sum process (analyzing overlapping dinucleotides) had better sensitivity and specificity than a zeroth-order model when applied to the AT-rich genome of the amoeba Dictyostelium discoideum. In this genome, we detected 94% of previously known ncRNA genes (at this sensitivity, the false positive rate was estimated to be 25% in a simulated background). The predictions were further refined by clustering candidate genes according to sequence similarity and/or searching for an ncRNA-associated upstream element. We experimentally verified six out of 10 tested ncRNA gene predictions. We conclude that higher-order models, in combination with other information, are useful for identification of novel ncRNA gene families in single-genome analysis of D. discoideum. Our generalizable approach extends the range of genomic data that can be searched for novel ncRNA genes using well-grounded statistical methods.
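The first-order partial sum scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`transition_probs`, `log_odds`, `best_segment`), the pseudocount, and the toy sequences are assumptions made for illustration. Each overlapping dinucleotide is scored by the log-ratio of its transition probability under an ncRNA model and under a background model, and the running sum is reset to zero whenever it becomes non-positive, so that the highest-scoring segment stands out. The Karlin-Altschul and Karlin-Dembo significance calculation mentioned in the abstract is deliberately left out.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def transition_probs(seqs, pseudocount=1.0):
    """Estimate first-order Markov probabilities P(b | a) from sequences.

    Returns a dict keyed by dinucleotide "ab". The pseudocount (an
    assumption of this sketch) keeps unseen transitions from getting
    zero probability.
    """
    counts = Counter()
    for s in seqs:
        for a, b in zip(s, s[1:]):  # overlapping dinucleotides
            counts[a + b] += 1
    probs = {}
    for a in ALPHABET:
        row_total = sum(counts[a + b] for b in ALPHABET)
        row_total += pseudocount * len(ALPHABET)
        for b in ALPHABET:
            probs[a + b] = (counts[a + b] + pseudocount) / row_total
    return probs


def log_odds(foreground, background):
    """Per-dinucleotide score: log of foreground over background probability."""
    return {k: math.log(foreground[k] / background[k]) for k in foreground}


def best_segment(seq, scores):
    """Return (score, start, end) of the highest-scoring segment seq[start:end].

    The partial sum is reset to zero whenever it drops to zero or below.
    Dinucleotides missing from `scores` (for example those containing N)
    are treated as neutral and contribute 0.
    """
    best = (0.0, 0, 0)
    running, start = 0.0, 0
    for i in range(1, len(seq)):
        running += scores.get(seq[i - 1:i + 1], 0.0)
        if running <= 0:
            running, start = 0.0, i  # next segment starts at position i
        elif running > best[0]:
            best = (running, start, i + 1)
    return best


if __name__ == "__main__":
    # Toy data, purely hypothetical: a GC-rich "ncRNA" training sequence
    # and an AT-rich background, loosely mimicking an AT-rich genome.
    fg = transition_probs(["GCGCGCGCGCGC"])
    bg = transition_probs(["ATATATATATATTTAAATAT"])
    genome = "ATATATAT" + "GCGCGCGC" + "ATATATAT"
    score, start, end = best_segment(genome, log_odds(fg, bg))
    print(round(score, 2), genome[start:end])
```

A zeroth-order variant would score single nucleotides instead of dinucleotides; the abstract reports that the first-order model had better sensitivity and specificity on D. discoideum.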

Year:  2008        PMID: 18347326      PMCID: PMC2413156          DOI: 10.1101/gr.069104.107

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043