S N Shpynov, P-E Fournier, N N Pozdnichenko, A S Gumenuk, A A Skiba.
Abstract
The development of formal order analysis (FOA) allowed the construction of a classification of 49 genomes of representatives of the family Rickettsiaceae. FOA has recently been extended with new tools ('Map of genes', 'Matrix of similarity' and 'Locality-sensitive hashing') for a more in-depth study of the structure of rickettsial genomes. The new classification confirmed and supplemented the previously constructed one by determining the position of Rickettsia africae str. ESF-5, R. heilongjiangensis 054, R. monacensis str. IrR/Munich, R. montanensis str. OSU 85-930, R. raoultii str. Khabarovsk, R. rhipicephali str. 3-7-female6-CWPP and Rickettsiales bacterium str. Ac37b. The 'Map of genes' displays complete genomes and their components in graphical form. The 'Matrix of similarity' was applied for an in-depth classification down to the subtaxonomic category of strain within the species R. rickettsii (11 strains) and R. prowazekii (ten strains). It determines the degree of homology of complete genomes by pairwise comparison of their components, identifying those that are identical or similar in the arrangement of nucleotides. A new approach, genomosystematics, is proposed for the study of complete genomes and their components through the development and application of FOA tools. Its applications include the development of principles for the classification of microorganisms based on the analysis of complete genomes and their annotations. This approach may help in the taxonomic classification and characterization of some Candidatus Rickettsia spp. that are found in large numbers in arthropods worldwide.
Keywords: Arthropods; ecology; epidemiology; formal order analysis; genome; genomosystematics; rickettsiae; rickettsioses; systematics; virulence
Year: 2018 PMID: 29692912 PMCID: PMC5913362 DOI: 10.1016/j.nmni.2018.02.012
Source DB: PubMed Journal: New Microbes New Infect ISSN: 2052-2975
Genome features and average remoteness (g) of the sequenced genomes of Rickettsia spp. and Orientia tsutsugamushi.
| No. | No. on Fig. 1 | Species and strains | GenBank accession number | Genome size (bp) | G+C (%) | g |
|---|---|---|---|---|---|---|
| 1 | 1. | | | 1 111 454 | 29 | 1.41823242027653 |
| 2 | | | | 1 111 445 | 29 | 1.41824128821358 |
| 3 | | | | 1 111 523 | 29 | 1.41824702490549 |
| 4 | | | | 1 111 612 | 29 | 1.41826432159974 |
| 5 | | | | 1 111 969 | 29 | 1.41831400875350 |
| 6 | | | | 1 112 101 | 29 | 1.41832503153582 |
| 7 | | | | 1 109 804 | 29 | 1.41838368933900 |
| 8 | | | | 1 109 301 | 29 | 1.41849488956929 |
| 9 | | | NZ_CP014865 | 1 111 769 | 29 | 1.41827647360422 |
| 10 | | | | 1 111 520 | 29 | 1.41848807421873 |
| 11 | 2. | | | 1 112 957 | 28.9 | 1.41989725883639 |
| 12 | | | | 1 112 372 | 28.9 | 1.41989961796300 |
| 13 | | | | 1 111 496 | 28.9 | 1.41990899257420 |
| 14 | 3. | | NZ_CP009217 | 1 851 238 | 30.8 | 1.42369690175481 |
| 15 | 4. | | | 1 528 980 | 31.6 | 1.42461059098798 |
| 16 | | | | 1 522 076 | 31.6 | 1.42478460772370 |
| 17 | 5. | | | 1 150 228 | 31.0 | 1.42505825951416 |
| 18 | | | | 1 159 772 | 31.0 | 1.42576172562109 |
| 19 | 6. | | NZ_LN794217 | 1 353 450 | 32.4 | 1.42639157617608 |
| 20 | 7. | | | 1 485 148 | 32.6 | 1.42911847695814 |
| 21 | 8. | | | 1 278 468 | 32.3 | 1.4307561483309 |
| 22 | 9. | | | 1 290 368 | 32.4 | 1.43088037051608 |
| 23 | 10. | | | 1 283 087 | 32.4 | 1.43117915307354 |
| 24 | 11. | | | 1 296 670 | 32.3 | 1.43123685110919 |
| 25 | 12. | | | 1 279 798 | 32.6 | 1.43172145972565 |
| 26 | 13. | | | 1 278 530 | 32.4 | 1.43188388215182 |
| 27 | 14. | | | 1 275 720 | 32.5 | 1.43199033266176 |
| 28 | | | | 1 275 089 | 32.5 | 1.43200176043628 |
| 29 | 15. | | | 1 300 386 | 32.4 | 1.43209421751330 |
| 30 | 16. | | | 1 268 755 | 32.4 | 1.43213975511604 |
| 31 | 17. | | NZ_CP010969 | 1 344 517 | 32.5 | 1.43249957974534 |
| 32 | 18. | | | 1 267 197 | 32.4 | 1.43264659062342 |
| 33 | | | | 1 268 188 | 32.4 | 1.43268083393081 |
| 34 | | | | 1 255 681 | 32.5 | 1.43272749268022 |
| 35 | | | | 1 269 837 | 32.5 | 1.43279406482271 |
| 36 | | | | 1 270 083 | 32.5 | 1.43286684945432 |
| 37 | | | | 1 270 751 | 32.5 | 1.43286996863738 |
| 38 | | | | 1 257 710 | 32.5 | 1.43291630589912 |
| 39 | | | NZ_CP006009 | 1 257 005 | 32.5 | 1.43288001638348 |
| 40 | | | NZ_CP000766 | 1 268 201 | 32.4 | 1.43268083393081 |
| 41 | | | NZ_CP018913 | 1 268 220 | 32.4 | 1.4326837678714 |
| 42 | | | NZ_CP018914 | 1 268 242 | 32.4 | 1.43268429687864 |
| 43 | 19. | | | 1 360 898 | 32.5 | 1.43314583756584 |
| 44 | 20. | | | 1 287 740 | 32.5 | 1.43325070488125 |
| 45 | 21. | Candidatus | | 1 407 796 | 32.45 | 1.43334598270974 |
| 46 | 22. | | | 1 288 492 | 32.6 | 1.43514665884247 |
| 47 | 23. | | | 1 231 060 | 32.3 | 1.43747339509228 |
| 48 | 24. | | | 2 127 051 | 30.5 | 1.44599460730303 |
| 49 | | | | 2 008 987 | 30.5 | 1.44642319193296 |
All genomes were imported from GenBank NCBI (USA): http://www.ncbi.nlm.nih.gov/genome/.
Number on Fig. 1.
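As a rough illustration of how the average remoteness (g) separates groups in the table above, the following Python sketch compares the spread of g within the rows numbered 1 to 10 (Fig. 1 group 1) and 11 to 13 (Fig. 1 group 2) with the gap between the two groups. The g values are copied from the table; the helper names `spread` and `gap` are hypothetical and are not part of FOA, and this is not the authors' classification procedure.

```python
# g values copied from the table above.
# Rows 1-10 carry Fig. 1 number 1; rows 11-13 carry Fig. 1 number 2.
group_1 = [
    1.41823242027653, 1.41824128821358, 1.41824702490549,
    1.41826432159974, 1.41831400875350, 1.41832503153582,
    1.41838368933900, 1.41849488956929, 1.41827647360422,
    1.41848807421873,
]
group_2 = [1.41989725883639, 1.41989961796300, 1.41990899257420]


def spread(values):
    """Range of g inside one group (hypothetical helper)."""
    return max(values) - min(values)


def gap(lower, upper):
    """Distance between the top of one group and the bottom of the next."""
    return min(upper) - max(lower)


print(f"spread within group 1: {spread(group_1):.6f}")
print(f"spread within group 2: {spread(group_2):.6f}")
print(f"gap between groups:    {gap(group_1, group_2):.6f}")
```

For these two groups the gap (about 0.0014) is several times larger than the spread inside either group (about 0.00026 and 0.00001), which is consistent with g being usable as a grouping characteristic.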
Fig. 1. Systematics of Rickettsia spp. and Orientia tsutsugamushi using characteristics of average remoteness (g) of their genomes, as well as ecological, epidemiologic and nosologic (aetiologic) features (genomosystematics of rickettsiae).
Degree of homology between genome components of Rickettsia prowazekii strains, determined with the ‘Matrix of similarity’
| No. | Strain | Katsinyian | BuV67-CWPP | Madrid E | Rp22 | Naples-1 | GvV257 | RpGvF24 | Chernikova | NMRC Madrid E | Breinl |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Katsinyian | 100.00% | 91.44% | 79.40% | 86.36% | 85.67% | 60.11% | 61.65% | 84.71% | 77.01% | 73.90% |
| 2 | BuV67-CWPP | 91.44% | 100.00% | 74.90% | 86.78% | 85.87% | 60.07% | 62.38% | 85.68% | 73.04% | 74.42% |
| 3 | Madrid E | 79.40% | 74.90% | 100.00% | 69.75% | 70.52% | 47.79% | 48.91% | 68.41% | 66.63% | 60.08% |
| 4 | Rp22 | 86.36% | 86.78% | 69.75% | 100.00% | 96.09% | 58.65% | 60.19% | 92.44% | 69.21% | 80.11% |
| 5 | Naples-1 | 85.67% | 85.87% | 70.52% | 96.09% | 100.00% | 58.12% | 59.67% | 91.24% | 69.30% | 80.37% |
| 6 | GvV257 | 60.11% | 60.07% | 47.79% | 58.65% | 58.12% | 100.00% | 86.56% | 58.32% | 49.59% | 50.88% |
| 7 | RpGvF24 | 61.65% | 62.38% | 48.91% | 60.19% | 59.67% | 86.56% | 100.00% | 60.30% | 50.68% | 52.31% |
| 8 | Chernikova | 84.71% | 85.68% | 68.41% | 92.44% | 91.24% | 58.32% | 60.30% | 100.00% | 67.57% | 85.70% |
| 9 | NMRC Madrid E | 77.001% | 73.04% | 66.63% | 69.21% | 69.30% | 49.59% | 50.68% | 67.58% | 100.00% | 64.55% |
| 10 | Breinl | 73.90% | 74.42% | 60.08% | 80.11% | 80.37% | 50.88% | 52.30% | 85.70% | 64.55% | 100.00% |
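One way to read a similarity matrix like the one above is to look up, for each strain, the other strain with the highest similarity. The Python sketch below does this for the first five strains only, with values copied from the table; the function name `nearest_neighbours` is hypothetical and the sketch only post-processes the published percentages, it does not reproduce how the ‘Matrix of similarity’ tool computes them.

```python
# Sub-matrix for the first five R. prowazekii strains, copied from the table.
strains = ["Katsinyian", "BuV67-CWPP", "Madrid E", "Rp22", "Naples-1"]
similarity = [
    [100.00, 91.44, 79.40, 86.36, 85.67],
    [91.44, 100.00, 74.90, 86.78, 85.87],
    [79.40, 74.90, 100.00, 69.75, 70.52],
    [86.36, 86.78, 69.75, 100.00, 96.09],
    [85.67, 85.87, 70.52, 96.09, 100.00],
]


def nearest_neighbours(names, matrix):
    """Map each strain to (most similar other strain, similarity in %)."""
    result = {}
    for i, name in enumerate(names):
        # Skip the diagonal, which is always 100%.
        candidates = [j for j in range(len(names)) if j != i]
        best = max(candidates, key=lambda j: matrix[i][j])
        result[name] = (names[best], matrix[i][best])
    return result


for name, (other, pct) in nearest_neighbours(strains, similarity).items():
    print(f"{name}: closest is {other} ({pct:.2f}%)")
```

Within this subset, Rp22 and Naples-1 are each other's closest strain (96.09%), as are Katsinyian and BuV67-CWPP (91.44%).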
Degree of homology between genome components of Rickettsia rickettsii strains, determined with the ‘Matrix of similarity’
| No. | Strain | Arizona | Iowa | Iowa isolate, large clone | Iowa isolate, small clone | Brazil | Morgan | Hino | Colombia | Hlp#2 | R | Sheila Smith |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Arizona | 100.00% | 92.50% | 78.18% | 78.11% | 74.91% | 93.92% | 94.00% | 75.85% | 26.43% | 75.81% | 75.68% |
| 2 | Iowa | 92.49% | 100.00% | 82.66% | 82.60% | 74.20% | 94.73% | 96.63% | 74.19% | 25.80% | 74.44% | 74.31% |
| 3 | Iowa isolate, large clone | 78.18% | 82.66% | 100.00% | 99.87% | 63.44% | 80.10% | 81.24% | 63.03% | 21.52% | 63.64% | 63.32% |
| 4 | Iowa isolate, small clone | 78.11% | 82.59% | 99.87% | 100.00% | 63.44% | 80.03% | 81.24% | 63.03% | 21.52% | 63.64% | 63.32% |
| 5 | Brazil | 74.91% | 74.20% | 63.44% | 63.44% | 100.00% | 73.95% | 74.20% | 77.94% | 26.26% | 80.93% | 80.80% |
| 6 | Morgan | 93.92% | 94.73% | 80.10% | 80.03% | 73.95% | 100.00% | 96.14% | 74.60% | 26.26% | 75.28% | 75.01% |
| 7 | Hino | 94.00% | 96.63% | 81.24% | 81.24% | 74.20% | 96.14% | 100.00% | 75.07% | 26.22% | 75.17% | 75.05% |
| 8 | Colombia | 75.85% | 74.19% | 63.03% | 63.03% | 77.94% | 74.60% | 75.07% | 100.00% | 26.01% | 79.29% | 79.24% |
| 9 | Hlp#2 | 26.43% | 25.80% | 21.52% | 21.52% | 26.26% | 26.26% | 26.22% | 26.01% | 100.00% | 26.31% | 26.22% |
| 10 | R | 75.81% | 74.44% | 63.64% | 63.64% | 80.93% | 75.28% | 75.17% | 79.30% | 26.31% | 100.00% | 98.75% |
| 11 | Sheila Smith | 75.68% | 74.31% | 63.32% | 63.32% | 80.79% | 75.01% | 75.05% | 79.24% | 26.22% | 98.75% | 100.00% |
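A pairwise similarity table is usually expected to be symmetric, although the abstract does not state this explicitly for the ‘Matrix of similarity’. Under that assumption, the Python sketch below lists the cell pairs whose two mirrored values differ. It uses the first four strains of the table above, with values copied as printed; the function name `asymmetric_pairs` is hypothetical.

```python
# Sub-matrix for the first four R. rickettsii strains, copied from the table.
strains = [
    "Arizona",
    "Iowa",
    "Iowa isolate, large clone",
    "Iowa isolate, small clone",
]
similarity = [
    [100.00, 92.50, 78.18, 78.11],
    [92.49, 100.00, 82.66, 82.60],
    [78.18, 82.66, 100.00, 99.87],
    [78.11, 82.59, 99.87, 100.00],
]


def asymmetric_pairs(names, matrix, tol=1e-9):
    """Return (row strain, column strain, upper value, lower value) for
    every pair whose mirrored cells differ by more than tol."""
    mismatches = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                mismatches.append((names[i], names[j], matrix[i][j], matrix[j][i]))
    return mismatches


for a, b, upper, lower in asymmetric_pairs(strains, similarity):
    print(f"{a} / {b}: {upper:.2f}% vs {lower:.2f}%")
```

In this subset two pairs differ by 0.01 percentage points (Arizona / Iowa and Iowa / Iowa isolate, small clone), which looks like rounding rather than a substantive disagreement.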