Mitchell T Caudill, Kelly A Brayton.
Abstract
With the advent of cheaper, high-throughput sequencing technologies, the ability to survey biodiversity in previously unexplored niches and geographies has expanded massively. Within Anaplasma, a genus containing several intra-hematopoietic pathogens of medical and economic importance, at least 25 new species have been proposed since the last formal taxonomic organization. Given the obligate intracellular nature of these bacteria, none of these proposed species has been able to attain formal standing in the nomenclature under the rules of the International Code of Nomenclature of Prokaryotes. Many novel species proposals use sequence data obtained from targeted or metagenomic PCR studies of only a few genes, most commonly the 16S rRNA gene. We examined the utility of the 16S rRNA gene sequence for discriminating Anaplasma samples to the species level. We find that, while the genetic diversity of the genus Anaplasma appears greater than appreciated in the last organization of the genus, caution must be used when attempting to resolve to a species descriptor from the 16S rRNA gene alone. Specifically, genomically distinct species have similar 16S rRNA gene sequences, especially when only partial amplicons of the 16S rRNA gene are used. Furthermore, we provide key bases that allow classification of the formally named species of Anaplasma.
Keywords: 16S rRNA; Anaplasma; microbiome; species definition; taxonomy
Year: 2022 PMID: 35336180 PMCID: PMC8949108 DOI: 10.3390/microorganisms10030605
Source DB: PubMed Journal: Microorganisms ISSN: 2076-2607
ANI to 16S rRNA Gene Percent Identities for Anaplasma species.
| Strain | HZ | HZ2 | Norway V2 | JM | St. Maries | Florida | Israel | Haibei |
|---|---|---|---|---|---|---|---|---|
| HZ | | 99.98–100% | 96.51–99.8% | 99.69–100% | 67.89–96.3% | 68.10–96.3% | 68.17–96.3% | 68.32–96.2% |
| HZ2 | 99.98–100% | | 96.57–99.8% | 99.63–100% | 67.78–96.3% | 68.38–96.3% | 68.20–96.3% | 68.22–96.2% |
| Norway V2 | 96.51–99.8% | 96.57–99.8% | | 96.43–99.8% | 68.16–96.3% | 68.44–96.3% | 68.26–96.3% | 68.13–96.2% |
| JM | 99.69–100% | 99.30–100% | 96.43–99.8% | | 68.10–96.3% | 68.27–96.3% | 67.96–96.3% | 67.70–96.2% |
| St. Maries | 67.89–96.3% | 67.78–96.3% | 68.16–96.3% | 68.10–96.3% | | 99.02–99.9% | 87.56–99.3% | 84.87–99.3% |
| Florida | 68.10–96.3% | 68.38–96.3% | 68.44–96.3% | 68.27–96.3% | 99.02–99.9% | | 87.81–99.2% | 85.28–99.3% |
| Israel | 68.17–96.3% | 68.20–96.3% | 68.26–96.3% | 67.96–96.3% | 87.56–99.3% | 87.81–99.2% | | 81.46–99.5% |
| Haibei | 68.32–96.2% | 68.22–96.2% | 68.13–96.2% | 67.70–96.2% | 84.87–99.3% | 85.28–99.3% | 81.46–99.5% | |
White text on black background indicates species that conform to the ANI to 16S rRNA gene percent identity for grouping in a species. The black text on a gray background highlights strains that share high 16S rRNA gene identity with another species, but low ANI values. Black text on a white background shows values for which both ANI and 16S percent indicate different species classification.
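The three categories described in this footnote can be sketched as a small classifier. This is only an illustration: the 98.7% cutoff for 16S rRNA gene identity is the value used elsewhere in this article, whereas the 95% ANI cutoff is an assumed, commonly cited species boundary that is not stated in this section. The function name `classify_pair` is hypothetical, and each cell of the table is read as "ANI–16S rRNA identity".

```python
# Assumed thresholds: 98.7% (16S rRNA identity) appears in this article;
# 95% ANI is an assumption, not a value taken from this section.
ANI_CUTOFF = 95.0
RRNA_16S_CUTOFF = 98.7


def classify_pair(ani: float, rrna_16s: float) -> str:
    """Compare the species call implied by ANI with the call implied by 16S identity."""
    same_by_ani = ani >= ANI_CUTOFF
    same_by_16s = rrna_16s >= RRNA_16S_CUTOFF
    if same_by_ani and same_by_16s:
        return "same species (concordant)"
    if same_by_16s and not same_by_ani:
        return "discordant: high 16S identity, low ANI"
    if not same_by_ani and not same_by_16s:
        return "different species (concordant)"
    return "discordant: high ANI, low 16S identity"


# (ANI, 16S identity) pairs copied from the table above.
pairs = {
    ("HZ", "HZ2"): (99.98, 100.0),
    ("St. Maries", "Israel"): (87.56, 99.3),
    ("HZ", "St. Maries"): (67.89, 96.3),
}

for (strain_a, strain_b), (ani, rrna) in pairs.items():
    print(f"{strain_a} vs {strain_b}: {classify_pair(ani, rrna)}")
```

Under these assumed cutoffs, the St. Maries and Israel pair falls into the discordant category: the 16S rRNA identity alone would suggest one species, while the ANI value does not.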
Putative Anaplasma spp., host source, 16S rRNA sequence accession numbers and references.
| Putative Species | Host | Accession # | Ref |
|---|---|---|---|
| “ | Sheep | None | [ |
| “ | Tick ( | None | [ |
| “ | Sheep | None | [ |
| “ | Sheep, Cattle, Goats | MN317253–MN317255 * | [ |
| “ | Mosquitos, Cattle | KU585969, KU586025 | [ |
| | | KU586041, KU586162 | [ |
| | | KU586164, KU586169 | [ |
| | | KU586177, KU586180 | [ |
| | | KU586182 | [ |
| | | MH169152 * | [ |
| “ | Camels | KX765882 | [ |
| | | KF843823–KF843825 | [ |
| “ | Mosquitos | KU586127 *, KU586148 * | [ |
| | | KU586144–KU586146 * | [ |
| | | KU586134–KU586136 * | [ |
| | | KU586141 * | [ |
| “ | Penguin ( | MG748724 * | [ |
| “ | Pangolin ( | KU189193 | [ |
| | Tick ( | AF497580 * | [ |
| “ | Tortoise ( | MT62341-MT62345 | [ |
| “ | Sloths | None | [ |
| “ | Anteaters | None | [ |
| | | | |
| | Sheep | None | [ |
| | | | |
| | Human, | KR261618–KR261622 | [ |
| | domestic and wild ruminants, | KP314237–KP314238 | [ |
| | Dogs | KM206273 | [ |
| | | MG869526–MG869594 | [ |
| | | MG869482–MG869510 | [ |
| | | MH762071–MH762077 | [ |
| | | AB211164 | [ |
| | | AB454075 | [ |
| | | AB509223 | [ |
| | | AB588977 | [ |
| | | AF283007 | [ |
| | | EU709493 | [ |
| | | FJ389574, FJ389576 | [ |
| | | JN558820, JN558827 | [ |
| | | KP062964–KP062966 | [ |
| | | KP314241 | [ |
| | | KX817983 | [ |
| | | KX987331 | [ |
| | | LC432092–LC432126 | [ |
| | | MT798599–MT798604 | [ |
| | | MW721591 | [ |
| | Dogs | AY570538–AY570540 | [ |
| | Ticks ( | MF576175.1 | [ |
| | | MK815558-MK814449 | [ |
| | | | |
| | Deer ( | NR_118489, JX876644 | [ |
| | Sheep, Cattle, Goats | U54806 | [ |
| | | KC189853 | [ |
| | Sheep | MK575506 | [ |
| | Dogs | LC269823 | [ |
| | Izard ( | EU857675 * | [ |
| | Cattle | KY924884 | [ |
| | Cattle | KY924885 | [ |
| | Cattle | KY924886 | [ |
| | Tick ( | LC558313 | [ |
| | Tick ( | LC558314 | [ |
| | Goats | FJ389575 | [ |
* These are partial or fragmented sequences that were not included in the analyses. “Accession #” refers to the GenBank sequence accession number.
Sequence identity matrix for 16S rRNA gene “consensus” sequences for Anaplasma spp.
| | cent | marg | ovis | Mon | capra | bovis | phag | platys | Mym | Omat | cam | odoc | SA | ZAM | Saso | Hade | bole | Dede | pang | test | walk | moub | Shiz |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A. centrale | | | | | 98.1 | 94.6 | 95.5 | 95.7 | 96.0 | 95.9 | 95.9 | 95.7 | 95.8 | 95.9 | 94.3 | 94.3 | 96.2 | 96.5 | 96.3 | 96.4 | 96.8 | 96.1 | 97.2 |
| A. marginale | | | | | 98.1 | 94.5 | 95.5 | 95.8 | 96.0 | 95.9 | 95.9 | 95.6 | 95.7 | 95.8 | 94.0 | 94.0 | 96.1 | 96.4 | 96.1 | 96.4 | 96.8 | 96.1 | 97.0 |
| A. ovis | | | | | 97.8 | 94.5 | 95.6 | 95.6 | 95.9 | 95.9 | 95.9 | 95.9 | 95.7 | 95.8 | 94.3 | 94.3 | 96.0 | 96.3 | 96.0 | 96.5 | 96.8 | 96.2 | 96.9 |
| Anaplasma sp. Mongolia | | | | | 97.9 | 94.6 | 95.6 | 95.7 | 96.0 | 95.9 | 95.9 | 96.0 | 96.0 | 96.1 | 94.4 | 94.4 | 96.2 | 96.5 | 96.3 | 96.7 | 96.8 | 96.4 | 97.0 |
| A. capra | 98.1 | 98.1 | 97.8 | 97.9 | | 93.9 | 95.2 | 95.5 | 95.9 | 95.5 | 95.8 | 95.4 | 95.7 | 95.8 | 93.9 | 93.9 | 95.8 | 96.1 | 95.6 | 95.4 | 96.5 | 95.9 | 97.2 |
| A. bovis | 94.6 | 94.5 | 94.5 | 94.6 | 93.9 | | 94.9 | 94.9 | 95.4 | 95.1 | 95.5 | 95.3 | 95.4 | 95.5 | 92.9 | 92.9 | 94.6 | 94.7 | 96.4 | 93.2 | 94.2 | 94.5 | 96.3 |
| A. phagocytophilum | 95.5 | 95.5 | 95.6 | 95.6 | 95.2 | 94.9 | | 96.9 | 97.0 | 96.8 | 97.0 | 96.8 | 97.8 | 97.9 | 94.0 | 94.0 | 96.7 | 96.9 | 96.4 | 94.4 | 95.5 | 95.2 | 95.9 |
| A. platys | 95.7 | 95.8 | 95.6 | 95.7 | 95.5 | 94.9 | 96.9 | | | | | 98.2 | 97.2 | 97.2 | 93.9 | 93.9 | 96.6 | 96.8 | 96.4 | 94.5 | 95.6 | 95.2 | 95.9 |
| Anaplasma sp. Mymensingh | 96.0 | 96.0 | 95.9 | 96.0 | 95.9 | 95.4 | 97.0 | | | | | | 97.9 | 98.0 | 94.1 | 94.1 | 97.1 | 97.2 | 97.0 | 94.7 | 96.0 | 95.7 | 96.3 |
| Anaplasma sp. Omatjenne | 95.9 | 95.9 | 95.9 | 95.9 | 95.5 | 95.1 | 96.8 | | | | | | 97.5 | 97.6 | 94.0 | 94.0 | 96.9 | 97.1 | 96.7 | 94.7 | 95.9 | 95.4 | 96.0 |
| “Candidatus A. camelii” | 95.9 | 95.9 | 95.9 | 95.9 | 95.8 | 95.5 | 97.0 | | | | | | 97.8 | 97.9 | 94.0 | 94.0 | 97.0 | 97.2 | 96.9 | 94.8 | 95.9 | 95.8 | 96.3 |
| A. odocoilei | 95.7 | 95.6 | 95.9 | 96.0 | 95.4 | 95.3 | 96.8 | 98.2 | | | | | 97.5 | 97.6 | 94.1 | 94.1 | 96.8 | 97.0 | 96.8 | 94.9 | 95.9 | 95.7 | 96.1 |
| Anaplasma sp. SA dog | 95.8 | 95.7 | 95.7 | 96.0 | 95.7 | 95.4 | 97.8 | 97.2 | 97.9 | 97.5 | 97.8 | 97.5 | | | 94.6 | 94.6 | 97.4 | 97.6 | 97.2 | 94.8 | 95.8 | 95.8 | 96.5 |
| Anaplasma sp. ZAM dog | 95.9 | 95.8 | 95.8 | 96.1 | 95.8 | 95.5 | 97.9 | 97.2 | 98.0 | 97.6 | 97.9 | 97.6 | | | 94.7 | 94.7 | 97.5 | 97.7 | 97.2 | 94.9 | 95.8 | 95.9 | 96.6 |
| Anaplasma sp. Saso | 94.3 | 94.0 | 94.3 | 94.4 | 93.9 | 92.9 | 94.0 | 93.9 | 94.1 | 94.0 | 94.0 | 94.1 | 94.6 | 94.7 | | | 94.4 | 94.5 | 94.8 | 93.6 | 94.0 | 93.7 | 94.3 |
| Anaplasma sp. Hadesa | 94.3 | 94.0 | 94.3 | 94.4 | 93.9 | 92.9 | 94.0 | 93.9 | 94.1 | 94.0 | 94.0 | 94.1 | 94.6 | 94.7 | | | 94.4 | 94.5 | 94.8 | 93.6 | 94.0 | 93.7 | 94.3 |
| “Candidatus A. boleense” | 96.2 | 96.1 | 96.0 | 96.2 | 95.8 | 94.6 | 96.7 | 96.6 | 97.1 | 96.9 | 97.0 | 96.8 | 97.4 | 97.5 | 94.4 | 94.4 | | | 96.6 | 95.2 | 96.4 | 95.3 | 96.4 |
| Anaplasma sp. Dedessa | 96.5 | 96.4 | 96.3 | 96.5 | 96.1 | 94.7 | 96.9 | 96.8 | 97.2 | 97.1 | 97.2 | 97.0 | 97.6 | 97.7 | 94.5 | 94.5 | | | 96.7 | 95.4 | 96.5 | 95.4 | 96.7 |
| “Candidatus A. pangolinii” | 96.3 | 96.1 | 96.0 | 96.3 | 95.6 | 96.4 | 96.4 | 96.4 | 97.0 | 96.7 | 96.9 | 96.8 | 97.2 | 97.2 | 94.8 | 94.8 | 96.6 | 96.7 | | 94.7 | 95.6 | 96.6 | 96.5 |
| “Candidatus A. testudines” | 96.4 | 96.4 | 96.5 | 96.7 | 95.4 | 93.2 | 94.4 | 94.5 | 94.7 | 94.7 | 94.8 | 94.9 | 94.8 | 94.9 | 93.6 | 93.6 | 95.2 | 95.4 | 94.7 | | 96.2 | 95.3 | 94.9 |
| Anaplasma sp. Ar. walkerae | 96.8 | 96.8 | 96.8 | 96.8 | 96.5 | 94.2 | 95.5 | 95.6 | 96.0 | 95.9 | 95.9 | 95.9 | 95.8 | 95.8 | 94.0 | 94.0 | 96.4 | 96.5 | 95.6 | 96.2 | | 95.9 | 96.4 |
| Anaplasma sp. O. moubata | 96.1 | 96.1 | 96.2 | 96.4 | 95.9 | 94.5 | 95.2 | 95.2 | 95.7 | 95.4 | 95.8 | 95.7 | 95.8 | 95.9 | 93.7 | 93.7 | 95.3 | 95.4 | 96.6 | 95.3 | 95.9 | | 95.7 |
| Anaplasma sp. Shizhu | 97.2 | 97.0 | 96.9 | 97.0 | 97.2 | 96.3 | 95.9 | 95.9 | 96.3 | 96.0 | 96.3 | 96.1 | 96.5 | 96.6 | 94.3 | 94.3 | 96.4 | 96.7 | 96.5 | 94.9 | 96.4 | 95.7 | |
White text on black background indicates organisms with sequence identity of >98.7%. The species epithet is shown in full on the left side. The species/putative species are listed from left to right with abbreviated names along the top, and full names from top to bottom at the left side in the same order. Can = Candidatus.
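As a rough illustration of how an entry in such an identity matrix can be obtained, the sketch below computes percent identity between two pre-aligned sequences and checks it against the >98.7% cutoff named in this footnote. The alignment tool and gap treatment behind the published values are not given in this section, so the choices here (skip columns where both sequences have a gap, count a base opposite a gap as a mismatch) are assumptions, and the names `percent_identity` and `exceeds_cutoff` are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns.

    Assumptions: inputs are already aligned to equal length, '-' marks a gap,
    columns where both sequences have a gap are skipped, and a base opposite
    a gap counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def exceeds_cutoff(identity: float, cutoff: float = 98.7) -> bool:
    """True when the identity is strictly above the cutoff (>98.7% by default)."""
    return identity > cutoff


# Toy aligned fragments, not real Anaplasma sequences.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
print(identity, exceeds_cutoff(identity))
```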
Figure 1. Phylogenetic tree of the genus Anaplasma with putative species. Validly named species are represented by their consensus sequences and are highlighted in bold. Ehrlichia ruminantium serves as an outgroup for comparison of species distance. Sequences used to construct the tree were 16S rRNA gene regions V2–V7 for each species/putative species and were approximately 1200 bp in length. The phylogeny was constructed using a modification of the likelihood-ratio test via the PhyML algorithm with an HKY85 evolutionary model [21,22,23].
Figure 2. Comparisons of Anaplasma 16S rRNA sequences. (A) A map of the variable regions of the 16S rRNA sequence of the A. marginale St. Maries strain. The start of each variable region was determined by the beginning of the conserved portion of the given variable region. Panels (B–D) are colorized representations of sequence identity matrices of regions of the 16S rRNA gene for known and putative Anaplasma species. (B) Comparison of the near-full-length (V2–V9 regions) 16S rRNA sequence identity for 294 Anaplasma sequences. (C) Comparison of the V3–V4 regions of 16S rRNA sequence identity for 294 Anaplasma sequences. (D) Comparison of concatenated V2 and V6 regions of 16S rRNA sequence identity for 294 Anaplasma sequences. In (B–D), each box represents a single 16S rRNA sequence of the indicated species of Anaplasma. Dark blue shading represents shared identity above or equal to 98.7%, light blue shading represents identities between 95% and 98.7%, and white represents shared identity below 95%. Coding is as follows: Ac: A. centrale; Am: A. marginale; Ao: A. ovis; Mon: Anaplasma sp. Mongolia; Boo: “Candidatus A. boleense”; Ab: A. bovis; Aca: A. capra; Aph: A. phagocytophilum; Apl: A. platys; Omat: Anaplasma sp. Omatjenne; Apan: “Candidatus A. pangolinii”; Ded: Anaplasma sp. Dedessa; Saso: Anaplasma sp. Saso; Had: Anaplasma sp. Hadesa; Arw: Anaplasma sp. Ar. walkerae; Omo: Anaplasma sp. O. moubata; Aod: A. odocoilei; Ate: “Candidatus A. testudines”; SA: Anaplasma sp. SA dog; ZAM: Anaplasma sp. ZAM dog; Acam: “Candidatus A. camelii”; Shi: Anaplasma sp. Shizhu; My: Anaplasma sp. Mymensingh. For explicit sample coding (accession numbers), see Supplementary Table S3.
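The comparison made across panels (B–D), in which identity over a partial or concatenated region can differ from identity over the longer sequence, can be sketched as follows. The sequences and region coordinates below are toy values chosen for illustration, not the variable-region boundaries of the A. marginale St. Maries strain, and the function names are hypothetical. The shading bins follow the cutoffs stated in the caption.

```python
def identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def region_identity(seq_a: str, seq_b: str, regions) -> float:
    """Identity over concatenated regions given as 0-based, half-open (start, end) pairs."""
    sub_a = "".join(seq_a[start:end] for start, end in regions)
    sub_b = "".join(seq_b[start:end] for start, end in regions)
    return identity(sub_a, sub_b)


def shade(value: float) -> str:
    """Map an identity value to the shading bins described in the caption."""
    if value >= 98.7:
        return "dark blue"
    if value >= 95.0:
        return "light blue"
    return "white"


# Toy sequences that differ only at the first and last positions.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "TCGTACGTACGTACGTACGA"

full = identity(seq_a, seq_b)                         # whole sequence
partial = region_identity(seq_a, seq_b, [(4, 16)])    # hypothetical internal region
print(full, shade(full))
print(partial, shade(partial))
```

In this toy case the internal region is identical even though the full-length sequences are not, which is the kind of effect that makes partial amplicons less discriminating.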
Differentiating bases for the ruminant clade of Anaplasma species.
| Base Number * | 144 | 156 | 220 | 265 | 274 | 1250 |
|---|---|---|---|---|---|---|
| | A | A | T | T | G | T |
| | A | G | T | T | G | T |
| | G | R | Y | C | T | T |
| | G | A | C | C | G | C |
* Numbering based on Anaplasma marginale St. Maries strain sequence.
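A minimal sketch of how such differentiating bases might be applied: the bases observed at the six listed positions are compared with each row of the table, treating R, Y and N as IUPAC ambiguity codes. The species names for the rows are not reproduced in this copy of the table, so the labels `profile_1` to `profile_4` are placeholders in table order, and the function name is hypothetical.

```python
# IUPAC codes used in the tables: R = A or G, Y = C or T, N = any base.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "N": {"A", "C", "G", "T"},
}

# Positions follow the A. marginale St. Maries numbering given in the table.
POSITIONS = (144, 156, 220, 265, 274, 1250)

# Rows of the table in order; labels are placeholders, not species names.
PROFILES = {
    "profile_1": "AATTGT",
    "profile_2": "AGTTGT",
    "profile_3": "GRYCTT",
    "profile_4": "GACCGC",
}


def matching_profiles(observed: str) -> list:
    """Return the profiles compatible with the bases observed at POSITIONS."""
    observed = observed.upper()
    if len(observed) != len(POSITIONS):
        raise ValueError("expected one base per differentiating position")
    hits = []
    for label, profile in PROFILES.items():
        if all(base in IUPAC[code] for base, code in zip(observed, profile)):
            hits.append(label)
    return hits


print(matching_profiles("GACCTT"))
print(matching_profiles("AATTGT"))
```

A query that matches no profile, or more than one, would indicate that these six positions alone are not sufficient for that sample.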
Differentiating bases for Anaplasma platys and closely related species.
| Base Number * | 213 | 224 | 262 | 289 | 693 | 696 | 878 | 879 | 885 | 886 | 890 | 1052 | 1309 | 1358 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | A | T | T | T | N | T | R | C | G | T | T | R | Y | C |
| | A | T | T | T | C | T | A | C | G | T | T | A | C | C |
| | A | C | T | T | C | T | R | C | G | T | T | G | C | T |
| | A | T | T | T | C | T | A | C | G | T | T | A | T | C |
| | G | A | G | C | A | A | G | T | A | C | C | G | C | C |
* Numbering based on the Anaplasma platys S3 strain genome sequence. The consensus sequence of A. platys contains a “C” between positions 555 and 556 and a “T” between positions 1030 and 1031. These insertions are not present in all A. platys strains and are absent in the S3 strain.