Zasha Weinberg, Jeffrey E Barrick, Zizhen Yao, Adam Roth, Jane N Kim, Jeremy Gore, Joy Xin Wang, Elaine R Lee, Kirsten F Block, Narasimhan Sudarsan, Shane Neph, Martin Tompa, Walter L Ruzzo, Ronald R Breaker.
Abstract
We applied a computational pipeline based on comparative genomics to bacteria, and identified 22 novel candidate RNA motifs. We predicted six to be riboswitches, which are mRNA elements that regulate gene expression on binding a specific metabolite. In separate studies, we confirmed that two of these are novel riboswitches. Three other riboswitch candidates are upstream of either a putative transporter gene in the order Lactobacillales, citric acid cycle genes in Burkholderiales or molybdenum cofactor biosynthesis genes in several phyla. The remaining riboswitch candidate, the widespread Genes for the Environment, for Membranes and for Motility (GEMM) motif, is associated with genes important for natural competence in Vibrio cholerae and the use of metal ions as electron acceptors in Geobacter sulfurreducens. Among the other motifs, one has a genetic distribution similar to a previously published candidate riboswitch, ykkC/yxkD, but has a different structure. We identified possible non-coding RNAs in five phyla, and several additional cis-regulatory RNAs, including one in epsilon-proteobacteria (upstream of purD, involved in purine biosynthesis), and one in Cyanobacteria (within an ATP synthase operon). These candidate RNAs add to the growing list of RNA motifs involved in multiple cellular processes, and suggest that many additional RNAs remain to be discovered.
Year: 2007 PMID: 17621584 PMCID: PMC1950547 DOI: 10.1093/nar/gkm487
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Consensus sequences and structures are depicted for seven of the 22 motifs identified. Other motifs are presented as Supplementary Data, as are the alignments on which these diagrams are based. Calculations for conservation of nucleotide identity/presence and evidence of covariation are described in the ‘Materials and methods’ section. Proposed base pairs with more than 5% non-canonical or missing nucleotides are not classified as covarying. Note that the levels of nucleotide conservation are affected both by biochemical constraints on the motif and by phylogenetic diversity; motifs with limited range (e.g. the COG4708 motif) will appear more conserved. Some covarying positions in variable-length stems are not shown.
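The 5% rule in the caption can be illustrated with a minimal Python sketch. It is not the procedure from the ‘Materials and methods’ section, which is not reproduced here. The function name `classify_pair`, the treatment of G-U wobble pairs as canonical, and the working definition of covariation (two canonical pairs that differ at both positions) are assumptions made for illustration only.

```python
# Canonical pairs; counting G-U wobble pairs as canonical is an assumption.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pair(pairs, max_bad_fraction=0.05):
    """Classify one proposed base pair from its alignment columns.

    pairs: one (5' nucleotide, 3' nucleotide) tuple per sequence,
           with '-' marking a missing nucleotide.
    """
    if not pairs:
        raise ValueError("at least one sequence is required")

    # Non-canonical and missing nucleotides both count against the pair.
    bad = sum(1 for pair in pairs if pair not in CANONICAL)
    if bad / len(pairs) > max_bad_fraction:
        return "not covarying"

    # Assumed definition: covariation needs two canonical pairs that
    # differ at BOTH positions (e.g. G-C in one sequence, A-U in another).
    observed = {pair for pair in pairs if pair in CANONICAL}
    for first in observed:
        for second in observed:
            if first[0] != second[0] and first[1] != second[1]:
                return "covarying"
    return "compatible only"


if __name__ == "__main__":
    print(classify_pair([("G", "C")] * 10 + [("A", "U")] * 10))
    print(classify_pair([("G", "C")] * 9 + [("A", "U")] * 9 + [("G", "A")] * 2))
```

In the second call, 2 of 20 sequences (10%) carry a non-canonical G-A pair, so the pair is excluded even though G-C and A-U both occur.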
Summary of putative structured RNA motifs
| Motif | RNA? | Cis? | Switch? | Phylum/class | M,V | Cov. | # | Non-cis |
|---|---|---|---|---|---|---|---|---|
| GEMM | Y | Y | y | Widespread | V | 21 | 322 | 12/309 |
| Moco | Y | Y | Y | Widespread | M,V | 15 | 105 | 3/81 |
| SAH | Y | Y | Y | Proteobacteria | M,V | 22 | 42 | 0/41 |
| SAM-IV | Y | Y | Y | Actinobacteria | V | 28 | 54 | 2/54 |
| COG4708 | Y | Y | y | Firmicutes | M,V | 8 | 23 | 0/23 |
| | Y | Y | y | β-proteobacteria | | 9 | 40 | 0/40 |
| 23S-methyl | Y | y | n | Firmicutes | | 12 | 38 | 1/37 |
| | Y | ? | ? | β-proteobacteria | V | 12 | 50 | 2/50 |
| (anti- | (n) | (n) | | | | | (37) | (31/37) |
| MAEB | ? | Y | n | β-proteobacteria | | 3 | 662 | 15/646 |
| mini- | Y | Y | ? | Widespread | V | 17 | 208 | 1/205 |
| | y | y | ? | ε-proteobacteria | M | 16 | 21 | 0/20 |
| 6C | y | ? | n | Actinobacteria | | 21 | 27 | 1/27 |
| alpha-transposases | ? | N | N | α-proteobacteria | | 16 | 102 | 39/99 |
| excisionase | ? | ? | n | Actinobacteria | | 7 | 27 | 0/27 |
| ATPC | y | ? | ? | Cyanobacteria | | 11 | 29 | 0/23 |
| Cyano-30S | Y | Y | n | Cyanobacteria | | 7 | 26 | 0/23 |
| lacto-1 | ? | ? | n | Firmicutes | | 10 | 97 | 18/95 |
| lacto-2 | y | N | n | Firmicutes | | 14 | 357 | 67/355 |
| TD-1 | y | ? | n | Spirochaetes | M,V | 25 | 29 | 2/29 |
| TD-2 | y | N | n | Spirochaetes | V | 11 | 36 | 17/36 |
| coccus-1 | ? | N | N | Firmicutes | | 6 | 246 | 112/189 |
| gamma-150 | ? | N | N | γ-proteobacteria | | 9 | 27 | 6/27 |
‘RNA’ = functions as RNA (as opposed to dsDNA), ‘Cis’ = cis-regulatory, ‘Switch’ = riboswitch. Evaluation: ‘Y’ = certainly true, ‘y’ = probably true, ‘?’ = possible, ‘n’ = probably not, ‘N’ = certainly not. These evaluations were conducted prior to experimental examinations. Criteria for classification as an RNA include evidence of covariation and variable-length or modular stems. Evidence of covariation is strongest with covarying nucleotide positions for which surrounding sequence conservation permits high confidence that the covarying positions are correctly aligned, and was assessed manually based on alignments. Probable cis-regulatory motifs were consistently located upstream of homologous genes, or a set of genes with related functions, and often had features typical of known gene-control mechanisms (a transcription terminator or stem sequestering the Shine–Dalgarno sequence). Likely riboswitches were motifs that were classified as an RNA and a cis-regulatory element, showed evidence of high conservation of nucleotides at some positions, exhibited a complex secondary structure (not just a hairpin) and were associated with genes that were judged likely to be controlled by a small molecule. Motifs are characterized in detail according to these criteria in Supplementary Data. Remaining columns are ‘Phylum/class’ (phylum containing the motif, or class for Proteobacteria), ‘M,V’ (‘M’ = has modular stems, which are stems that are only sometimes present, ‘V’ = variable-length stems), ‘Cov.’ = number of covarying paired positions (see ‘Methods’ section; note that it is not advisable to rank motifs solely by this number, but rather the alignment as a whole should be evaluated), ‘#’ = number of representatives, ‘Non-cis’ = X/Y where X is number of representatives that are not in a 5′ regulatory configuration to a gene and Y is the number of representatives within sequences that have annotated genes (some RefSeq sequences lack annotations). 
Moco and SAM-IV riboswitch data will be presented in future reports. Gamma-150 and coccus-1 are only in the supplement.
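The ‘Non-cis’ column is easier to compare across motifs when each X/Y entry is expressed as a fraction. The following Python sketch does this for five entries copied from the table; the helper name `non_cis_fraction` is hypothetical, and the calculation is only an illustration of how to read the column, not an analysis taken from the paper.

```python
def non_cis_fraction(entry: str) -> float:
    """Convert a 'Non-cis' entry such as '12/309' into a fraction.

    X = representatives not in a 5' regulatory configuration to a gene,
    Y = representatives within sequences that have annotated genes.
    """
    not_upstream, annotated = (int(part) for part in entry.split("/"))
    if annotated == 0:
        raise ValueError("no representatives in annotated sequences")
    return not_upstream / annotated


# Values copied from the 'Non-cis' column of the summary table.
NON_CIS = {
    "GEMM": "12/309",
    "SAH": "0/41",
    "alpha-transposases": "39/99",
    "TD-2": "17/36",
    "coccus-1": "112/189",
}

# Motif names ordered from the lowest to the highest non-cis fraction.
RANKED = sorted(NON_CIS, key=lambda motif: non_cis_fraction(NON_CIS[motif]))

if __name__ == "__main__":
    for motif in RANKED:
        entry = NON_CIS[motif]
        print(f"{motif:20s} {entry:>8s} {non_cis_fraction(entry):6.1%}")
```

For these five entries, the two motifs rated ‘Y’ in the ‘Cis?’ column (SAH and GEMM) have fractions under 5%, while the three rated ‘N’ exceed 39%.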
Figure 2. Common features of GEMM motifs. Two GEMM instances were selected to illustrate common features, although these two examples do not represent the full 322 GEMMs (see Supplementary Data). (A) This putative RNA contains a canonical GNRA tetraloop and receptor (gray regions). Almost 50% of GEMM instances contain a likely tetraloop receptor. Only the first gene in the downstream operon is shown. (B) Some GEMM RNAs lack the tetraloop receptor, but there are two extra bulged A residues (gray shading) that are found in roughly half of the sequences lacking a receptor. Gray overlined nucleotides can fold to form a stem of a rho-independent transcription terminator (followed by 3′-trailing Us). This terminator appears to compete with the 3′ part of the P2 stem (right-most hairpin). 78 of 322 GEMM instances have predicted transcription terminators overlapping P2.
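For readers unfamiliar with the notation, GNRA denotes the four-nucleotide pattern G, any nucleotide (N), a purine (R = A or G), then A. The Python sketch below scans a sequence for that pattern. The function name `find_gnra` and the example sequences are hypothetical, and a sequence match alone does not establish a tetraloop, since that also requires a closing stem around the four nucleotides.

```python
import re

# G, any nucleotide, purine (A or G), A. The lookahead lets overlapping
# candidates be reported as well.
GNRA_PATTERN = re.compile(r"(?=(G[ACGU][AG]A))")


def find_gnra(sequence: str):
    """Return (0-based start, matched tetranucleotide) for each GNRA hit.

    DNA input is accepted: T is converted to U before matching.
    """
    rna = sequence.upper().replace("T", "U")
    return [(match.start(), match.group(1)) for match in GNRA_PATTERN.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical toy sequence: a GAAA loop flanked by a short stem.
    print(find_gnra("CCGAAAGG"))
```

A fuller check would confirm that the nucleotides flanking each hit can pair, which this sketch deliberately leaves out.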
Gene families that appear to be regulated by GEMM in more than one instance
| Functional role | Gene families |
|---|---|
| Pili and flagella | |
| Secretion (related to pili/flagella) | |
| Chemotaxis regulator | |
| Signal transduction | PAS domain, histidine kinase, HAMP (Histidine kinases, Adenylyl cyclases, Methyl binding proteins, Phosphatases), HD-GYP domain, GGDEF domain |
| Chitin | chitin/cellulose binding domain, chitinase, carbohydrate-binding protein |
| Membranes | lysin domain (involved in cell wall remodeling, but might have general peptidoglycan binding function), uncharacterized outer membrane proteins and lipoproteins, putative collagen binding protein |
| Peptides | non-ribosomal peptide synthase, condensation domain (synthesis of peptide antibiotics), transglutaminase-like cysteine protease, subtilase (superfamily of extracellular peptidases) |
| Other | |