Abstract
BACKGROUND: The rnpB gene encodes an essential catalytic RNA (RNase P). Like other essential RNAs, RNase P's sequence is highly variable. However, unlike other essential RNAs (e.g. tRNA, 16S, 6S, ...), its structure is also variable, with at least 5 distinct structure types observed in prokaryotes. This structural variability makes it labor-intensive and challenging to create and maintain covariance models for the detection of RNase P RNA in genomic and metagenomic sequences. The lack of a facile and rapid annotation algorithm has led to the rnpB gene being the most grossly under-annotated essential gene in completed prokaryotic genomes, with only a 24% annotation rate. Here we describe the coupling of the largest RNase P RNA database with the local alignment scoring algorithm to create the most sensitive and rapid prokaryote rnpB gene identification and annotation algorithm to date.
Keywords: Gene annotation; Genome annotation; Genomic; Metagenomic; RNase P RNA; Ribonuclease P; rnpB
Year: 2020 PMID: 32349659 PMCID: PMC7191817 DOI: 10.1186/s12864-020-6615-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
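The abstract refers to a local alignment scoring algorithm but does not say which one is used. As a minimal sketch only, the block below uses Smith-Waterman scoring as an illustrative stand-in: it scores a short query against a longer target and reports where the best local match ends. The function name `smith_waterman_score`, the scoring parameters (`match`, `mismatch`, `gap`), and the toy sequences are assumptions made for illustration and are not taken from the paper or from P Finder.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, 1-based end position in target).

    Illustrative scoring values only; a linear gap penalty is assumed.
    """
    prev = [0] * (len(target) + 1)  # previous row of the scoring matrix
    best, best_end = 0, 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            # Score for aligning query[i-1] with target[j-1]
            diag = prev[j - 1] + (match if query[i - 1] == target[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best, best_end = curr[j], j
        prev = curr
    return best, best_end


# Toy example: the query occurs exactly once inside the target.
score, end = smith_waterman_score("ACGT", "TTACGTTT")
print(score, end)
```

A production tool would more likely rely on a heuristic local aligner and report a bitscore rather than a raw score; the sketch is only meant to show what "local" scoring means.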
Fig. 1 Structural variability of RNase P RNA (RNase P RNA structures were adapted with permission from the RNase P Database [14]). RNase P RNA varies in both structure and length. a) Escherichia coli A-type; b) Methanococcus jannaschii M-type; c) Pyrobaculum aerophilum T-type; d) Bacillus subtilis B-type; e) Thermomicrobium roseum C-type RNase P RNA
Fig. 2 Distribution of RNase P RNA structural classes by type and length. RNase P RNA has a broad diversity of sequence length and structure. A-type RNase P RNAs are the most common structural class. The minimal T-type, C-type, and M-type are uncommon, with only 28 organisms identified to date with one of these structural classes
Examples of rnpB predictions by P Finder compared with BCheck
| Organism | Accession | P Finder start | P Finder end | BCheck start | BCheck end | Strand |
|---|---|---|---|---|---|---|
| trachomatis | NC_022118.1 | 457146 | 457551 | 457145 | 457548 | plus |
| Escherichia coli LF82 | NC_011993.1 | 3317722 | 3318098 | 3317726 | 3318098 | minus |
| Klebsiella pneumoniae MGH 78578 | NC_009648.1 | 3896359 | 3896741 | 3896363 | 3896741 | minus |
| Halobacteroides halobius DSM 5150 | NC_019978.1 | 1887825 | 1888178 | 1887824 | 1888177 | plus |
| Cyanobacterium aponinum PCC 10605 | NC_019776.1 | 1149026 | 1149401 | 1149025 | 1149400 | plus |
| Candidatus Korarchaeum cryptofilum | NC_010482.1 | 603218 | 603511 | 603217 | 603510 | plus |
| Methanobacterium sp. SWAN-1 | NC_015574.1 | 2345793 | 2346086 | 2345792 | 2346085 | plus |
| Picrophilus torridus DSM 9790 | NC_005877.1 | 818122 | 818422 | 818121 | 818421 | plus |
| Methanobacterium sp. AL-21 | NC_015216.1 | 221739 | 222031 | 221738 | 222030 | plus |
| Mycoplasma conjunctivae HRC/581 | NC_012806.1 | 816571 | 816858 | 816570 | 816857 | plus |
| Candidatus Phytoplasma asteris | NC_007716.1 | 652124 | 652495 | 652117 | 652499 | minus |
| Bacillus subtilis QB928 | NC_018520.1 | 2310826 | 2311206 | 2310818 | 2311211 | minus |
| Streptococcus pyogenes MGAS6180 | NC_007296.1 | 1389794 | 1390162 | 1389786 | 1390167 | minus |
| Staphylococcus aureus T0131 | NC_017347.1 | 1533810 | 1534189 | 1533802 | 1534194 | minus |
| Sphaerobacter thermophilus DSM 20745 | NC_013523.1 | 1710922 | 1711286 | - | - | plus |
| Thermomicrobium roseum DSM 5159 | NC_011959.1 | 714587 | 714936 | - | - | minus |
| Methanothermococcus okinawensis IH1 | NC_015636.1 | 713221 | 713458 | 713220 | 713457 | plus |
| Methanocaldococcus vulcanius M7 | NC_013407.1 | 1569639 | 1569896 | 1569638 | 1569895 | plus |
| Archaeoglobus fulgidus DSM 4304 | NC_000917.1 | 86045 | 86279 | 86044 | 86278 | plus |
| Methanotorris igneus Kol 5 | NC_015562.1 | 1095229 | 1095484 | 1095228 | 1095483 | plus |
| Methanocaldococcus jannaschii | NC_000909.1 | 643505 | 643762 | 643504 | 643761 | plus |
| Methanococcus maripaludis C7 | NC_009637.1 | 992687 | 992925 | 992686 | 992924 | plus |
| Methanococcus vannielii SB | NC_009634.1 | 1218466 | 1218705 | 1218465 | 1218704 | plus |
| Methanocaldococcus fervens AG86 | NC_013156.1 | 1009728 | 1009985 | 1009727 | 1009984 | plus |
| Caldivirga maquilingensis IC-167 | NC_009954.1 | 1690026 | 1690220 | - | - | plus |
| Pyrobaculum aerophilum str. IM2 | NC_003364.1 | 542975 | 543185 | - | - | plus |
| Pyrobaculum arsenaticum DSM 13514 | NC_009376.1 | 124150 | 124362 | - | - | plus |
| Pyrobaculum calidifontis JCM 11548 | NC_009073.1 | 104104 | 104313 | - | - | minus |
| Pyrobaculum islandicum DSM 4184 | NC_008701.1 | 1063572 | 1063783 | - | - | minus |
| Pyrobaculum neutrophilum V24Sta | NC_010525.1 | 114806 | 115020 | - | - | minus |
| Pyrobaculum oguniense TE7 | NC_016885.1 | 2039219 | 2039431 | - | - | minus |
| Thermoproteus tenax Kra 1 | NC_016070.1 | 1226650 | 1226847 | - | - | minus |
| Thermoproteus uzoniensis 768-20 | NC_015315.1 | 1810769 | 1810975 | - | - | minus |
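To make the agreement between the two tools easier to read off, the following sketch computes the difference between the P Finder and BCheck coordinates for a few rows copied from the table above. The helper name `coordinate_offsets` and the tuple layout are hypothetical; rows where BCheck reports no prediction (shown as `-` in the table) are represented as `None`.

```python
# (organism, P Finder start, P Finder end, BCheck start, BCheck end)
# Values copied from the table; None marks a missing BCheck prediction.
rows = [
    ("Escherichia coli LF82", 3317722, 3318098, 3317726, 3318098),
    ("Halobacteroides halobius DSM 5150", 1887825, 1888178, 1887824, 1888177),
    ("Bacillus subtilis QB928", 2310826, 2311206, 2310818, 2311211),
    ("Thermomicrobium roseum DSM 5159", 714587, 714936, None, None),
]


def coordinate_offsets(row):
    """Return (start offset, end offset) as P Finder minus BCheck, or None."""
    _, pf_start, pf_end, bc_start, bc_end = row
    if bc_start is None or bc_end is None:
        return None  # BCheck made no prediction for this genome
    return pf_start - bc_start, pf_end - bc_end


offsets = {row[0]: coordinate_offsets(row) for row in rows}
for organism, offset in offsets.items():
    print(organism, offset)
```

Applied to the full table, the same calculation shows that many plus-strand predictions differ by exactly one position at both ends, while the listed C-type and T-type organisms have no BCheck coordinates to compare against.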
Fig. 3 The distribution of different structural classes of RNase P RNA in genomic sequences. a) A plot of the different structural classes by length and quality (bitscore) identified by P Finder; b) A plot of the distribution by strand location of rnpB; c) A plot of the RNase P RNA length found in archaea and bacteria
Fig. 4 Speed comparison of P Finder with existing rnpB gene annotation software (BCheck). a) P Finder is 100X faster than currently available rnpB gene identification software. In addition, P Finder can differentiate between archaeal RNase P RNA and bacterial RNase P RNA, eliminating the need for multiple analysis steps. P Finder demonstrates good agreement with BCheck in the prediction of start and stop locations of the rnpB gene in the genome; b) Comparison of P Finder and BCheck's predicted start location of the rnpB gene; c) Comparison of P Finder and BCheck's predicted end location of the rnpB gene
Life without P? A list of microorganisms in which no RNase P RNA can be identified.
| No RNase P RNA? | Genome size (nt) | endosymbiont of… |
|---|---|---|
| Candidatus Carsonella ruddii uid58773 | 159,662 | |
| Candidatus Hodgkinia cicadicola Dsem uid59311 | 144,000 | Cicadas |
| Candidatus Nasuia deltocephalinicola NAS ALF uid214084 | 112,000 | mealy bugs |
| Candidatus Portiera aleyrodidarum BT B uid173859 | 357,000 | |
| Candidatus Tremblaya phenacola PAVE uid209173 | 170,000 | mealy bugs |
| Candidatus Tremblaya princeps PCIT uid68741 | 138,000 | mealy bugs |
| Candidatus Uzinura diaspidicola ASNER uid186740 | 263,000 | armoured scale insects |
| Candidatus Zinderia insecticola CARI uid52459 | 208,000 | |