Yanglong Zhu, Dileep K Pulukkunat, Yong Li.
Abstract
Metagenomics has been employed to systematically sequence, classify, analyze and manipulate the entire genetic material isolated from environmental samples. Finding genes within metagenomic sequences remains a formidable challenge, and noncoding RNA genes other than those encoding rRNA and tRNA are not well annotated in metagenomic projects. In this work, we identify, validate and analyze the genes coding for RNase P RNA (P RNA) from all published metagenomic projects. P RNA is the RNA subunit of the ubiquitous endoribonuclease RNase P, which consists of one RNA subunit and one or more protein subunits. Bacterial P RNAs are classified into two types, Type A and Type B, based on the constituents of the structure involved in precursor tRNA binding. Archaeal P RNAs are classified into Type A and Type M, where Type A is ancestral and close to Type A bacterial P RNA. Bacterial and some archaeal P RNAs are catalytically active without protein subunits, capable of cleaving precursor tRNA transcripts to produce their mature 5'-termini. We have found 328 distinctive P RNAs (320 bacterial and 8 archaeal) in all published metagenomic sequences, which expands the total number of known prokaryotic P RNAs by 60%. Surprisingly, all newly identified P RNAs from metagenomic sequences are Type A, i.e. neither Type B bacterial nor Type M archaeal P RNAs were found. We experimentally validate the authenticity of an archaeal P RNA from the Sargasso Sea. One distinctive feature of some new P RNAs is that the P2 stem has kinked nucleotides in its 5' strand. We find that the single-nucleotide J2/3 joint region linking the P2 and P3 stems, previously used to distinguish a bacterial P RNA from an archaeal one, is no longer a reliable criterion, i.e. some archaeal P RNAs have only one nucleotide in the J2/3 joint. We also discuss phylogenetic analysis based on the covariance model of P RNA, which offers a few advantages over analysis based on 16S rRNA.
Year: 2007 PMID: 17389640 PMCID: PMC1874661 DOI: 10.1093/nar/gkm057
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The P RNA structural model for bacteria and archaea. All P RNAs have five conserved regions (CR I–CR V). Bacterial and archaeal sequences are distinguished by the P11 stem. Type A and Type B bacterial P RNAs are distinguished by the P10.1 and P15.1 stems (Type B; shown in dashed green lines). See text for details about the J2/3 region of archaeal P RNAs. Type A and Type M archaeal P RNAs differ in the P8 stem (Type M lacks P8, shown as a dashed line). The catalytic domain (bottom right) and the specificity domain (top left) are separated by dashed lines.
P RNAs identified in metagenomics projects
| Metagenomic project | Archaeal | Bacterial Type A | Bacterial Type B | Total P RNAs | Total nucleotides sequenced (Mbp) | Reference |
|---|---|---|---|---|---|---|
| Acid Mine Biofilm (AM) | 4 (3) | 1 (1) | 0 | 5 | 75 | ( |
| Deep Sea Sediment (DS) | 1 | 0 | 0 | 1 | 111 | ( |
| Minnesota Soil (MS) | 0 | 15 | 0 | 15 | 100 | ( |
| Whale Falls (W1, W2 and W3) | 0 | 17 | 0 | 17 | 25 | ( |
| Sargasso Sea (SS) | 6 | 289 (2) | 0 | 295 | 1045 | ( |
| Uncultured Others (UO) | 1 | 0 | 0 | 1 | 0.6 | ( |
| Total | 12 | 322 | 0 | 334 | 1911 | |
| Known P RNA | 50 | 391 | 79 | 520 | | ( |
a. 4 (3) denotes that 4 P RNAs were found, 3 of which were reported in a previous study (14).
b. There are 99, 455 and 145 archaeal, bacterial Type A and bacterial Type B P RNA entries, respectively, in the Rfam and RNase P databases (26,27), but some of them are duplicate sequences. The numbers in this row count only unique P RNAs (not identical to any other entry).
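The column totals in the table, and the roughly 60% expansion quoted in the abstract, can be checked with a few lines of arithmetic. The sketch below simply transcribes the per-project counts; the variable names are illustrative, and the ratio uses the abstract's figure of 328 distinctive P RNAs rather than the table total of 334, which also counts sequences reported previously (footnote a).

```python
# Per-project counts transcribed from the table above:
# (archaeal, bacterial Type A, bacterial Type B, total P RNAs)
projects = {
    "Acid Mine Biofilm": (4, 1, 0, 5),
    "Deep Sea Sediment": (1, 0, 0, 1),
    "Minnesota Soil": (0, 15, 0, 15),
    "Whale Falls": (0, 17, 0, 17),
    "Sargasso Sea": (6, 289, 0, 295),
    "Uncultured Others": (1, 0, 0, 1),
}
KNOWN_UNIQUE = 520  # "Known P RNA" row of the table
NEW_DISTINCT = 328  # distinctive P RNAs quoted in the abstract

# Each row total should equal the sum of its three type columns.
for name, (archaeal, type_a, type_b, total) in projects.items():
    assert archaeal + type_a + type_b == total, name

# Sum each column across all projects.
column_sums = tuple(sum(column) for column in zip(*projects.values()))
expansion = NEW_DISTINCT / KNOWN_UNIQUE

print(column_sums)         # (12, 322, 0, 334)
print(f"{expansion:.0%}")  # 63%
```

The computed ratio of about 63% is consistent with the rounded "60%" stated in the abstract.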
Figure 2. The bacterial and archaeal P RNAs are distinguished by their P11 stem. Only the bacterial P11 stem is disrupted by AA dinucleotides. (A) The structural model of the bacterial P11 helix. (B) The RNAMotif descriptor used to identify the bacterial P11 and its flanking region (an A:G pair is frequently observed in the 3-bp segment).
Figure 3. An archaeal P RNA identified from the Sargasso Sea with a J2/3 of a single G. (A) The folding of this P RNA (AACY01084936) is drawn according to the INFERNAL alignment. The inset box shows an alternative folding of the P2 stem and its flanking regions. (B) The folding of AB201308. Nucleotides in blue are identical in the two sequences. (C) Functional reconstitution of the metagenomic P RNA (AACY01084936) with Mja RNase P proteins (Rpps). RNase P activity was reconstituted by mixing 500 nM RNA with 5 µM protein (E. coli RNase P protein C5, Eco Rpp, or Mja Rpps; lanes 5 and 6) and assayed at 55°C for 2 h using 2 µM E. coli ptRNATyr as substrate. Lane 2 is a control reaction with E. coli RNase P. Lanes 3 and 4 are control reactions in which the substrate was incubated with either the RNA alone or Mja Rpps alone, respectively.
P2 stem with kinked nucleotides from bacterial P RNAs
| Feature | Accession number | P2 stem | Note |
|---|---|---|---|
| One kinked nucleotide | AACY01001706 | | Alternatively U5 kinked |
| | AACY01023864 | | |
| | AACY01085511 | | |
| | AACY01100012 | | |
| | AACY01171914 | | |
| | AACY01177717 | | |
| | AACY01313347 | 5′ | |
| | AACY01418924 | 3′ | |
| | AACY01445082 | | |
| | AACY01568993 | | |
| | AACY01599112 | | |
| | AACY01634211 | | |
| | AACY01642849 | | |
| | AACY01756413 | | |
| | AACY01248068 | 5′ | Alternatively U5 kinked |
| | AACY01254266 | | |
| | AACY01099407 | 5′ | Alternatively 5 kinked, but less likely |
| | AACY01044687 | 5′ | Alternatively U7 kinked |
| | AACY01108785 | 5′ | |
| | AACY01074654 | 5′AUAACUU3′ 3′UGUU-AG5′ | |
| | AACY01189100 | 5′ | |
| | AACY01207302 | 5′ | |
| | AACY01283453 | 5′ | Alternatively U4 kinked |
| | AACY01300066 | 5′ | |
| | AACY01313726 | 5′ | Alternatively A5 kinked |
| | AACY01357080 | 5′ | |
| | AACY01518059 | 5′ | Alternatively U5 kinked |
| | AACY-1667100 | 5′ | |
| | AAFY01022437 | 5′AC | Alternatively G3 kinked |
| Two kinked nucleotides | AACY01733681 | 5′ | |
| | AACY01749490 | 5′ | Alternatively U4 kinked |
a. Nucleotides in bold are conserved. Italicized nucleotides can pair alternatively.
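To make the notion of a kinked nucleotide concrete, the sketch below reads the one P2 stem that survives intact in the table (AACY01074654, `5′AUAACUU3′` over `3′UGUU-AG5′`) and reports which 5′-strand positions lack a pairing partner. It assumes the `-` in the 3′ strand marks the position opposite the kinked nucleotide and that G:U wobble pairs count as paired; the function name `find_kinks` is hypothetical and is not taken from the paper.

```python
# Canonical Watson-Crick pairs plus the G:U wobble pair (an assumption here).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_kinks(strand5, strand3_aligned):
    """Return (kinked, mismatched) positions for a gapped two-strand stem.

    strand5 is read 5'->3'; strand3_aligned is the partner strand written
    3'->5' underneath it, with '-' wherever the 5' strand has no partner.
    Positions are 1-based along the 5' strand.
    """
    if len(strand5) != len(strand3_aligned):
        raise ValueError("strands must be aligned to equal length")
    kinked, mismatched = [], []
    for pos, (top, bottom) in enumerate(zip(strand5, strand3_aligned), start=1):
        if bottom == "-":
            kinked.append((pos, top))  # no partner: nucleotide is kinked out
        elif (top, bottom) not in PAIRS:
            mismatched.append((pos, top, bottom))
    return kinked, mismatched


kinked, mismatched = find_kinks("AUAACUU", "UGUU-AG")
print(kinked)      # [(5, 'C')]
print(mismatched)  # []
```

Under these assumptions the stem has a single kinked nucleotide, C5, in its 5′ strand, and every other position forms a Watson-Crick or wobble pair. The sketch only detects unpaired nucleotides in the 5′ strand, which is the case described in the abstract.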
Metagenomics P RNAs with over 97% identity with known ones
| Contig/scaffold | Sequence similarity (%) | Known P RNA (accession number/annotation) |
|---|---|---|
| AACY01065403/CH009720 | 98 | BX569689.1/ |
| AACY01183524/CH034265 | 97 | BX569689.1/ |
| AACY01070379/CH011043 | 97 | AJ272225.1/ |
| AACY01043807/CH022746 | 97 | AJ272225.1/ |
| AACY01628203/CH172592 | 99 | AJ272223.1 ( |
| AACY01010320/CH010342 | 97 | AJ272224.1 ( |
| AACY01117529 | 97 | AJ272219.1 ( |
| AACY01637978 | 97 | AJ272225.1 ( |
a. Sequence similarity between the new metagenomics P RNA (identified by contig/scaffold number) and the known P RNA.
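The record does not say how the similarity values in this table were computed. As a hedged illustration only, one common definition is percent identity over the gap-free columns of a pairwise alignment, sketched below. The function name `percent_identity` and the two short aligned sequences are hypothetical toy inputs, not P RNA data from the study.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free alignment columns in which the two residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 10 gap-free columns, 9 of them identical.
identity = percent_identity("GAAG-CUGACC", "GAAGUCUGAUC")
print(identity)  # 90.0
```

Other definitions (for example, counting gap columns in the denominator) give different values for the same alignment, so any threshold such as 97% depends on which convention is used.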
Taxonomical distribution of metagenomics and known P RNA
| | | AM | DS | MS | W1 | W2 | W3 | SS | UO | Total new per taxon | Total known from Rfam database |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Archaea | Crenarchaeota | 4 | 1 | 8 | |||||||
| Euryarchaeota | 4 | 1 | 2 | 75 | |||||||
| Actinobacteria | 3 | 41 | |||||||||
| Bacteroidetes | 4 | 1 | 11 | 9 | |||||||
| Chlamydiae | 1 | 30 | |||||||||
| Chloroflexi | 1 | 2 | 2 + 1 | ||||||||
| Cyanobacteria | 18 | 27 | |||||||||
| Firmicutes | 1 + 1 | 115 + 6 | |||||||||
| Fusobacteria | 3 | 2 | |||||||||
| Group 1 | Deinococcus-Thermus | 1 | 6 | ||||||||
| Thermodesulfobacteria | 1 | 1 | |||||||||
| Group 2 | Chlorobi | 1 | 3 | ||||||||
| Thermotogae | 4 | ||||||||||
| Verrucomicrobia | 4 | 1 | |||||||||
| Nitrospirae | 1 | 2 | |||||||||
| Planctomycetes | 1 | 1 | 10 | ||||||||
| Proteobacteria | α | 2 | 3 | 1 | 166 | 29 | |||||
| β | 1 | 11 | |||||||||
| γ | 1 | 1 + 1 | 2 | 10 + 54 | 47 + 5 | ||||||
| δ | 1 | 1 | 1 | 4 | |||||||
| ε | 2 | 8 | |||||||||
| Spirochaetes | 2 | 13 + 4 | 3 + 6 | ||||||||
| Thermodesulfobacteria | 1 | ||||||||||
| Total per metagenome | 334 | ||||||||||
a. n1 + n2 indicates that the group is split into two branches on the phylogenetic tree (Figure S2).
b. These three sequences are categorically assigned to Fusobacteria (Figure S2).
c. These four sequences are categorically assigned to Group 2, yet it is not clear how closely they are related to the Verrucomicrobia (Figure S2).