Alban Ott, Anouar Idali, Antonin Marchais, Daniel Gautheret.
Abstract
Nucleic acid phylogenetic profiling (NAPP) classifies coding and non-coding sequences in a genome according to their pattern of conservation across other genomes. This procedure efficiently distinguishes clusters of functional non-coding elements in bacteria, particularly small RNAs and cis-regulatory RNAs, from other conserved sequences. In contrast to other non-coding RNA detection pipelines, NAPP does not require the presence of conserved RNA secondary structure and is therefore likely to identify previously undetected RNA genes or elements. Furthermore, as NAPP clusters contain both coding and non-coding sequences with similar occurrence profiles, they can be analyzed from a functional perspective. We recently improved the NAPP pipeline and applied it to a collection of 949 bacterial and 68 archaeal species. The database and web interface available at http://napp.u-psud.fr/ enable detailed analysis of NAPP clusters enriched in non-coding RNAs, graphical display of phylogenetic profiles, visualization of predicted RNAs in their genome context and extraction of predicted RNAs for use with genome browsers or other software.
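To make the idea of "similar occurrence profiles" concrete, the following is a minimal sketch, not the NAPP pipeline itself. It assumes each sequence is summarized as a binary presence/absence vector over a handful of genomes and compares profiles by Hamming distance; the element names, the five-genome profiles, the `max_distance` threshold and the choice of distance measure are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Count the genomes in which two presence/absence profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical profiles: 1 = a homolog is found in that genome, 0 = not found.
# Names and values are illustrative only, not taken from the NAPP database.
profiles = {
    "geneA": (1, 1, 0, 1, 0),
    "tile_17": (1, 1, 0, 1, 0),
    "geneB": (0, 0, 1, 0, 1),
    "tile_42": (1, 1, 0, 0, 0),
}


def similar_pairs(profiles, max_distance=1):
    """Return pairs of elements whose profiles differ in at most max_distance genomes."""
    return sorted(
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if hamming(profiles[a], profiles[b]) <= max_distance
    )


if __name__ == "__main__":
    for a, b in similar_pairs(profiles):
        print(a, b, hamming(profiles[a], profiles[b]))
```

In this toy example a coding element (`geneA`) and non-coding tiles end up paired because they occur in the same genomes, which is the property that lets mixed clusters be interpreted functionally.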
Year: 2011 PMID: 21984475 PMCID: PMC3245103 DOI: 10.1093/nar/gkr807
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Main displays of the NAPP database interface. (A) ‘RNA-rich Clusters’ table: for each cluster, the table provides the cluster number, total number of elements, number of annotated genes, number of tiles, number of known ncRNAs (RFAM) and P-value of ncRNA enrichment. (B) Profile page: each column represents the average phylogenetic profile of all members of an RNA-rich cluster. There are as many lines as species in the database (at present 1017). Shaded/white squares indicate that members of this cluster have/do not have homologs in this species. Darkness indicates the average Blast score of the homologs (darker: higher scores). The popup window displays GO term biases and other information on cluster #1. (C) Contig page: contigs are produced by aggregating all overlapping or adjacent tiles from RNA-rich clusters. A single contig may contain tiles from different clusters. Columns indicate the contig number, GenBank ID of the chromosome, tiles forming the contig (the cluster number of each tile is indicated in parentheses as C.1, C.2, etc.) and RFAM annotation. (D) Context view: from the contig or cluster views, users can visualize tiles or contigs in their genomic context through links to the NCBI genome server.
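The contig construction described in panel (C) can be sketched as a standard interval merge. This is an illustrative sketch only, assuming tiles are given as inclusive `(start, end, cluster)` coordinates on a single chromosome and that "adjacent" means the next tile starts immediately after the previous one ends; the function name `build_contigs` and the example coordinates are hypothetical and do not come from the NAPP database.

```python
def build_contigs(tiles):
    """Merge overlapping or adjacent tiles into contigs.

    tiles: iterable of (start, end, cluster) tuples with inclusive
    coordinates on one chromosome (an assumption of this sketch).
    Returns a list of dicts with the contig span and its member tiles.
    """
    contigs = []
    for start, end, cluster in sorted(tiles):
        # A tile joins the current contig if it overlaps it or starts
        # at the position immediately after the contig's end.
        if contigs and start <= contigs[-1]["end"] + 1:
            last = contigs[-1]
            last["end"] = max(last["end"], end)
            last["tiles"].append((start, end, cluster))
        else:
            contigs.append(
                {"start": start, "end": end, "tiles": [(start, end, cluster)]}
            )
    return contigs


# Hypothetical tiles from clusters 1, 2 and 3.
tiles = [(1, 50, 1), (51, 100, 2), (90, 140, 1), (300, 350, 3)]
contigs = build_contigs(tiles)

if __name__ == "__main__":
    for number, contig in enumerate(contigs, start=1):
        labels = ", ".join(f"C.{cluster}" for _, _, cluster in contig["tiles"])
        print(number, contig["start"], contig["end"], labels)
```

Keeping the cluster number with each tile mirrors the caption's note that a single contig may contain tiles from different clusters.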