Claudia M Hemsley, Paul A O'Neill, Angela Essex-Lopresti, Isobel H Norville, Tim P Atkins, Richard W Titball.
Abstract
BACKGROUND: Coxiella burnetii is a zoonotic pathogen that resides in wild and domesticated animals across the globe and causes a febrile illness, Q fever, in humans. An improved understanding of the genetic diversity of C. burnetii is essential for the development of diagnostics, vaccines and therapeutics, but genotyping data is lacking from many parts of the world. Sporadic outbreaks of Q fever have occurred in the United Kingdom, but the local genetic make-up of C. burnetii has not been studied in detail.
Keywords: Coxiella burnetii; Genotyping; Pan-Genome Analysis; Patho-adaptation; Whole Genome Sequencing
Year: 2019 PMID: 31164106 PMCID: PMC6549354 DOI: 10.1186/s12864-019-5833-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Statistics for sequencing, assembly, and annotation for the nine C. burnetii genomes sequenced in this study. The annotation data for strain Nine Mile RSA493 and corresponding QpH1 plasmid is included for comparison. Note that Cb_D1 was sequenced at 250-bp read length, whereas all other strains were sequenced as 150-bp reads.
| Name | Source | QC passed reads | Mapped reads (%)a | Coverage | # contigs | Genome size (bp) | % GC | Predicted # CDS |
|---|---|---|---|---|---|---|---|---|
| Cb_D1 | Cow placenta | 2,826,398 | 563,469 (19.94%) | 77.51 | 42 | 2,000,727 | 42.5 | 2,225/2,017 |
| Q532 | Cow placenta | 2,046,051 | 1,800,928 (88.02%) | 106.82 | 38 | 2,001,903 | 42.5 | 2,223/2,021 |
| Q545 | Cow placenta | 2,351,449 | 2,187,509 (93.03%) | 131.80 | 37 | 2,003,604 | 42.5 | 2,228/2,021 |
| Q556 | Cow placenta | 2,150,728 | 1,170,716 (54.43%) | 69.23 | 42 | 2,004,954 | 42.5 | 2,234/2,024 |
| Q559 | Sheep placenta | 2,260,768 | 1,441,913 (63.78%) | 88.30 | 39 | 2,004,244 | 42.5 | 2,230/2,023 |
| Q540 | Goat placenta | 2,823,227 | 2,778,111 (98.40%) | 165.78 | 111 | 2,010,957 | 42.5 | 2,306/2,036 |
| Cb_D2 | Goat placenta | 2,219,644 | 2,104,345 (94.81%) | 141.98 | 111 | 1,991,633 | 42.5 | 2,245/2,018 |
| Cb_D8 | Goat placenta | 2,170,264 | 2,098,022 (96.67%) | 140.28 | 113 | 1,993,660 | 42.5 | 2,257/2,019 |
| Cb_D10 | Goat placenta | 1,162,812 | 1,127,480 (96.96%) | 72.77 | 113 | 1,994,548 | 42.5 | 2,259/2,022 |
| RSA493 + QpH1 | Tick | n.a. | n.a. | n.a. | 2 | 2,032,674 | 42.6 | 2,217/2,056 |
a Mapped against the Nine Mile RSA493 genome (AE016828.2 and AE016829.1 concatenated).
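The "Mapped reads (%)" column appears to be the mapped read count divided by the QC-passed read count. The following is a minimal sketch that rechecks that arithmetic for four rows copied from the table; the dictionary name `READ_STATS` and the function `mapped_percentage` are illustrative names, not part of the study's pipeline.

```python
# Rows copied from the table above: (QC-passed reads, mapped reads, reported %).
READ_STATS = {
    "Cb_D1": (2_826_398, 563_469, 19.94),
    "Q532": (2_046_051, 1_800_928, 88.02),
    "Q545": (2_351_449, 2_187_509, 93.03),
    "Q540": (2_823_227, 2_778_111, 98.40),
}


def mapped_percentage(qc_passed: int, mapped: int) -> float:
    """Return mapped reads as a percentage of QC-passed reads (2 decimals)."""
    return round(100.0 * mapped / qc_passed, 2)


for name, (qc_passed, mapped, reported) in READ_STATS.items():
    computed = mapped_percentage(qc_passed, mapped)
    # Print the recomputed value next to the value reported in the table.
    print(f"{name}: computed {computed:.2f}%, reported {reported:.2f}%")
```

The low value for Cb_D1 follows directly from its counts: only about a fifth of its QC-passed reads map to the reference.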
Fig. 1 ParSNP tree of 76 C. burnetii isolates overlaid with associated metadata on source of isolation. The same SNP-based tree as seen in Additional file 5: Figure S1 is presented in a radial form; metadata is colour coded according to the legend shown next to the figure. The tree was rooted along the branch leading to GG IV (see Methods). The nine UK genomes are highlighted in bold in the figure.
Fig. 2 Analysis of MST genotype data of all C. burnetii isolates submitted to the MST database. a PhyML tree of all 55 known allele combinations. The suggested genomic groups highlighted are similar to Fig. 1 in Hornstra et al. [20]. The tree was rooted along the branch leading to GG IV (see Methods). b Number of isolates per genomic group with a described MST genotype ranked by their country of origin. Genotypes were assigned to a GG according to the tree shown in panel a.
Fig. 3 Heat maps of gene conservation levels across the available C. burnetii genomes compared to the NMI reference strain. Gene conservation data was obtained from SEED viewer (see Methods). Note that plasmid data is absent for strains Cb196_SaudiArabia, Cb175_Qlymphoma, Q321, Schperling, Z3055 and Cb185. The inset graph shows average sequence conservation levels for each genomic group with standard deviation. Genomic groups were assigned as seen in Fig. 1. Cb175_Guyana was here labelled as GG I-b.
Fig. 4 Gene frequency plots after BPGA pan-genome subset analysis using genomic group associations. Proteins annotated using PROKKA were used as input files. The protein similarity threshold for protein clustering was 90%. The bars furthest to the right in each graph represent conserved core-genes; the bars furthest to the left in each graph represent unique genes. The number of core genes is indicated within each figure.
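A gene frequency plot of the kind described for Fig. 4 counts, for each protein cluster, how many genomes carry it. Clusters found in every genome are the core and clusters found in exactly one genome are unique. The sketch below illustrates that counting on a toy presence table. It is not BPGA itself, and the genome and cluster names (`genome_A`, `c1`, ...) are invented for illustration.

```python
from collections import Counter
from typing import Dict, Set


def gene_frequency(presence: Dict[str, Set[str]]) -> Counter:
    """Map 'number of genomes carrying a cluster' to 'number of such clusters'."""
    genomes_per_cluster: Counter = Counter()
    for clusters in presence.values():
        # Each genome contributes one count to every cluster it carries.
        genomes_per_cluster.update(clusters)
    return Counter(genomes_per_cluster.values())


# Hypothetical toy input: genome name -> set of protein cluster IDs.
toy_presence = {
    "genome_A": {"c1", "c2", "c3"},
    "genome_B": {"c1", "c2", "c4"},
    "genome_C": {"c1", "c5"},
}

freq = gene_frequency(toy_presence)
n_genomes = len(toy_presence)
print("core clusters:", freq[n_genomes])  # present in all genomes
print("unique clusters:", freq[1])        # present in exactly one genome
```

In a bar chart of `freq`, the core count would sit furthest to the right and the unique count furthest to the left, matching the layout the caption describes.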
Genomic Group-specific genome content
| Absent from | ID in RSA493 | Function | Absent from | ID in RSA493 | Function |
|---|---|---|---|---|---|
| GGIIa | CBU_0584 | hypothetical protein | GGV | CBU_1158 | 7-dehydrocholesterol reductase |
| GGIIa | CBU_0945 | membrane-assoc. protein | GGV | CBU_1308 | phosphohydrolase; HD domain containing |
| GGIIa | CBU_0978 | membrane-assoc. protein, T4SS substrate | GGV | CBU_1460* | hypothetical protein; T4SS substrate |
| GGIIa | | membrane-spanning protein | GGV | | CBS domain protein |
| GGIIa | CBU_1213 | ankyrin repeat-containing protein; T4SS substrate | GGV | | hypothetical protein; T4SS substrate |
| GGIIa | CBU_1404 | hypothetical protein | GGV | CBU_1788 | DNA-binding protein, KilA-N |
| GGIIa | | toxin-antitoxin system antitoxin RelB | GGV | | membrane-spanning protein |
| GGIIa | | toxin-antitoxin system antitoxin RelE | GGV | | hypothetical protein |
| GGIIb | CBU_0880 | hypothetical protein | GGV | | hypothetical protein |
| GGIIb | CBU_1100 | hypothetical protein | GGV | | hypothetical protein |
| GGIIb | CBU_1103 | lytic transglycosylase | GGV | | LuxR family transcriptional regulator |
| GGIIb | CBU_1111 | membrane-bound lytic murein transglycosylase | GGV | | LuxR family transcriptional regulator |
| GGIIb | CBU_1112 | GIY-YIG catalytic domain protein; endonuclease | GGV | | hypothetical protein |
| GGIII | CBU_0590 | hypothetical protein; T4SS substrate | GGV | CBU_1895 | hypothetical protein |
| GGIII | | ADP compounds hydrolase NudE | GGV | | helix-turn-helix domain containing protein |
| GGIII | CBU_0686 | pyruvate dehydrogenase E1 subunit alpha | GGV | | cell filamentation protein |
| GGIII | CBU_1710 | hypothetical protein | GGV | | RelE/ParE family toxin |
| GGIII | CBU_1723 | protein-disulfide reductase DsbD | GGV | | 3',5'-cyclic-nucleotide phosphodiesterase |
| GGIV | CBU_0777 | hypothetical protein | GGV | | hypothetical protein |
| GGIV | CBU_0860 | hypothetical protein | GGV | | chromosome partitioning protein |
| GGIV | CBU_1379a | hypothetical protein; T4SS substrate | GGV | | ParA protein |
| GGIV | CBU_1618 | hypothetical protein | GGV | | ParB protein |
| GGIV | CBU_2041 | PAS domain S-box protein | GGV | | RepA protein |
| GGV | CBU_0007a | BrnT family toxin | GGV | | hypothetical protein |
| GGV | CBU_0183 | hypothetical protein; T4SS substrate | GGVI | CBU_0793 | hypothetical protein |
| GGV | CBU_0196 | hypothetical protein | GGVI | CBU_1092 | lipoprotein |
| GGV | | hypothetical ATPase | GGVI | CBU_1466 | hypothetical protein |
| GGV | CBU_0705 | hypothetical protein | GGVI | CBU_1822 | SodC superoxide dismutase |
| GGV | CBU_0948 | hypothetical protein | GGVI | CBU_1932 | hypothetical protein |
| GGV | | amino acid permease | GGVI | | hypothetical protein |
Proteins classed as absent in one GG only by BPGA subset analysis were searched for homologues in the RSA493 reference genome. Genes that have also been shown to be group specific by Beare et al. [21] are highlighted in bold. The asterisk indicates an immunoreactive protein [65].
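The selection of proteins "absent in one GG only" can be pictured as a filter over a presence table grouped by genomic group. The sketch below is one possible reading, not BPGA's implementation: it assumes a cluster qualifies when no genome of exactly one group carries it and every genome outside that group does. The function name, strain names and group labels are hypothetical.

```python
from typing import Dict, Set


def absent_in_one_group(
    presence: Dict[str, Set[str]], groups: Dict[str, str]
) -> Dict[str, str]:
    """Return {cluster: group} for clusters missing from exactly one group.

    Assumption: the cluster must be present in every genome of all other groups.
    """
    all_clusters = set().union(*presence.values())
    group_names = set(groups.values())
    result: Dict[str, str] = {}
    for cluster in sorted(all_clusters):
        # Groups in which no genome carries this cluster.
        missing = {
            g
            for g in group_names
            if not any(cluster in presence[s] for s in presence if groups[s] == g)
        }
        if len(missing) != 1:
            continue
        (absent_group,) = missing
        # Require the cluster in every genome outside the missing group.
        if all(cluster in presence[s] for s in presence if groups[s] != absent_group):
            result[cluster] = absent_group
    return result


# Hypothetical toy data: strain -> clusters, and strain -> genomic group.
toy_presence = {"s1": {"a", "b"}, "s2": {"a", "b"}, "s3": {"a"}, "s4": {"a", "c"}}
toy_groups = {"s1": "GG_X", "s2": "GG_X", "s3": "GG_Y", "s4": "GG_Y"}
print(absent_in_one_group(toy_presence, toy_groups))
```

Here cluster `b` is reported as absent from `GG_Y`, while `c` is not reported because it is also missing from one `GG_Y` genome. A looser rule (present in at least one genome of every other group) would be an equally defensible choice.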
Summary of Pan-GWAS results
| Trait | Total # of associations | # of associations with 100% Sensitivity/Specificity | Comment |
|---|---|---|---|
| Europe | 168 | 0 | |
| Cow tissue | 13 | 0 | |
| GG I | 83 | 3 | Same results for MST16 |
| GG II_all | 148 | 0 | Includes MST33,32,18,25 |
| GG IIa only | 34 | 4 | Includes MST18 and MST25 |
| GG IIb only | 152 | 8 | Includes MST33 and MST32 |
| MST18 | 24 | 1 | |
| MST33 | 110 | 0 | |
| GG III | 215 | 4 | Same results for MST20 |
| GG IV | 300 | 8 | |
| GG V | 114 | 44 | Same results for MST21 |
| GG VI | 123 | 123 | Same results for Rodent source and MST-DG |
SNPs that were associated with a particular trait were obtained using the Scoary script on Roary output data. Traits analyzed were Genomic Group, MST genotype, Country of origin, Continent of origin, Host, and Human disease type. Only traits with significant associations (Benjamini-Hochberg-corrected p < 10⁻³) are reported.
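Two quantities in the table above lend themselves to a small worked example: the sensitivity and specificity of a gene as a marker for a trait, and the Benjamini-Hochberg adjustment applied to the association p-values. The sketch below implements the standard textbook definitions in plain Python. It is not Scoary's code, and the input lists are invented toy data.

```python
from typing import List, Tuple


def sensitivity_specificity(
    has_gene: List[bool], has_trait: List[bool]
) -> Tuple[float, float]:
    """Return (sensitivity %, specificity %) of gene presence as a trait marker."""
    pairs = list(zip(has_gene, has_trait))
    tp = sum(g and t for g, t in pairs)          # gene present, trait present
    fn = sum((not g) and t for g, t in pairs)    # gene absent, trait present
    tn = sum((not g) and (not t) for g, t in pairs)
    fp = sum(g and (not t) for g, t in pairs)
    sens = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    spec = 100.0 * tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data: five isolates, two of which carry the trait.
gene = [True, True, False, False, True]
trait = [True, True, False, False, False]
print(sensitivity_specificity(gene, trait))

# Toy p-values; keep associations whose adjusted p-value is below 1e-3.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
print([p < 1e-3 for p in adj_p])
```

An association with 100% sensitivity and 100% specificity, as counted in the third column, corresponds to a gene whose presence pattern matches the trait exactly across all isolates.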