Amber C A Hendriks1, Frans A G Reubsaet1, A M D Mirjam Kooistra-Smid2,3, John W A Rossen3, Bas E Dutilh4,5, Aldert L Zomer6, Maaike J C van den Beld7,8.
Abstract
BACKGROUND: We investigated the association of symptoms and disease severity of shigellosis patients with genetic determinants of infecting Shigella and entero-invasive Escherichia coli (EIEC), because determinants that predict disease outcome per individual patient could be used to prioritize control measures. For this purpose, genome-wide association studies (GWAS) were performed using the presence or absence of single genes, combinations of genes, and k-mers. All genetic variants were derived from draft genome sequences of isolates from a multicenter cross-sectional study conducted in the Netherlands during 2016 and 2017. Clinical data from this study were also available, consisting of a binary (dichotomous) representation of each patient's symptoms and the calculated severity scores. To verify the suitability of the methods used, the genetic differences between the genera Shigella and Escherichia were used as a control.
Keywords: Disease control guidelines; Disease severity; E. coli; EIEC; Escherichia coli; GWAS; Shigella; Shigellosis; Symptoms
Year: 2020 PMID: 32041522 PMCID: PMC7011524 DOI: 10.1186/s12864-020-6555-7
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Phylogenetic tree based on core genome SNPs with species indication, underlying diseases and severity scores. The main lineages or phylogroups are depicted within the salmon squares. wzx6 = S. flexneri serotype 6. PGx = phylogenetic group of S. flexneri. STxxx = Warwick sequence type of EIEC. II and III = S. sonnei lineage II and III
Fig. 2 Results of Scoary: the expected versus the observed log-transformed p-values. Lilac lines indicate the outcomes of the permutation dataset. a. Best comparison test for association of gene presence/absence with the de Wit severity score. b. Best comparison test for association of gene presence/absence with the Modified Vesikari score. c. Best comparison test for association of gene presence/absence with symptoms. d. Benjamini-Hochberg test for association of gene presence/absence with genus
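
The two ingredients named in the Fig. 2 caption, expected versus observed log-transformed p-values and the Benjamini-Hochberg correction, can be sketched in a few lines. This is an illustrative sketch, not the code behind the figure: the function names are hypothetical, and the expected quantiles use the (k - 0.5)/m convention, which is an assumption.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def qq_points(pvalues):
    """Pair expected and observed -log10(p) values for a QQ plot."""
    m = len(pvalues)
    observed = sorted(pvalues)
    return [(-math.log10((k - 0.5) / m), -math.log10(p))
            for k, p in enumerate(observed, start=1)]


# Hypothetical p-values, not results from the study.
example = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(example))
print(qq_points(example))
```

Points that follow the diagonal in such a plot are consistent with no association; points rising above it at the small-p end suggest a signal.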
Results of Random Forest classification and k-mer association
| Characteristic | Random Forest: OOB error rate | K-mer association with Pyseer: No. of k-mers | K-mer association with Pyseer: lowest LRT p-value |
|---|---|---|---|
| MVS severity scale | 70.1% | 0 | NA |
| De Wit severity scale | 65.1% | 17 | 0.015 |
| Abdominal cramps | 52.7% | 0 | NA |
| Abdominal pain | 40.8% | 0 | NA |
| Blood in stool | 41.2% | 0 | NA |
| Diarrhea | 51.6% | 156 | 0.313 |
| Fever | 47.7% | 0 | NA |
| Headache | 46.6% | 0 | NA |
| Mucus in stool | 43.3% | 0 | NA |
| Nausea | 53.1% | 0 | NA |
| Vomiting | 51.6% | 0 | NA |
| Genus | 15.9% | 3,036,507 | 1.94E-153 |
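
The out-of-bag (OOB) error rate in the table above is the share of samples misclassified by the trees that did not see them during training. The sketch below illustrates the idea with bagged decision stumps on binary presence/absence features. It is a deliberately simplified stand-in for Random Forest (depth-1 trees, no random feature subsets), and all names and data in it are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y, rows):
    """Pick the binary feature whose split misclassifies the fewest bootstrap rows."""
    fallback = Counter(y[i] for i in rows).most_common(1)[0][0]
    best = None
    for f in range(len(X[0])):
        rule, errors = {}, 0
        for value in (0, 1):
            labels = [y[i] for i in rows if X[i][f] == value]
            if labels:
                rule[value] = Counter(labels).most_common(1)[0][0]
                errors += sum(1 for lab in labels if lab != rule[value])
            else:
                # Feature value unseen in this bootstrap sample.
                rule[value] = fallback
        if best is None or errors < best[0]:
            best = (errors, f, rule)
    return best[1], best[2]


def oob_error(X, y, n_trees=25, seed=0):
    """Fraction of samples misclassified by the stumps that never trained on them."""
    rng = random.Random(seed)
    n = len(y)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feature, rule = fit_stump(X, y, rows)
        for i in set(range(n)) - set(rows):  # out-of-bag samples only
            votes[i][rule[X[i][feature]]] += 1
    scored = [i for i in range(n) if votes[i]]
    wrong = sum(1 for i in scored if votes[i].most_common(1)[0][0] != y[i])
    return wrong / len(scored)


# Toy data: feature 0 equals the label, the other features carry no signal.
y = [0] * 20 + [1] * 20
X = [[label, i % 2, 1] for i, label in enumerate(y)]
print(oob_error(X, y))
```

For a binary trait with roughly balanced classes, an OOB error near 50% is about what guessing would give. The class balance of the study data is not stated here, so reading the symptom rows that way is an assumption.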
Comparison of misclassified isolates with Random Forest to traditional laboratory testing
| Isolate | Phenotype^a | Random Forest (RF)^a | Votes^b | Location in SNP tree | Serotype | Properties against RF classification |
|---|---|---|---|---|---|---|
| IBESS811 | E | S | 0.99 | Within | | Motility |
| IBESS97 | E | S | 0.80 | Within | | Inconclusive Shigella serotype |
| IBESS1163 | E | S | 0.76 | Within | | Inconclusive Shigella serotype |
| IBESS911 | E | S | 0.68 | Within | | Inconclusive Shigella serotype |
| IBESS996 | S | E | 0.53 | Within EIEC / | | None, hybrid isolate^d |
| IBESS988 | S | E | 0.56 | Within EIEC / | | None, hybrid isolate^d |
| IBESS419 | S | E | 0.57 | Within | Provisional/O-negative | None, hybrid isolate, provisional |
| IBESS232 | S | E | 0.60 | Within | Provisional/O-negative | None, hybrid isolate, provisional |
| IBESS470 | S | E | 0.82 | Within EIEC | Provisional/O-negative | None, hybrid isolate, provisional |
| IBESS810 | S | E | 0.89 | Within EIEC | Auto-agglutinable^c | None, hybrid isolate, provisional |
^a RF: Random Forest; E: Escherichia; S: Shigella. ^b Fraction of votes for classification in Random Forest. ^c In-silico serotype, using E. coli SerotypeFinder 2.0 of the Center for Genomic Epidemiology [23]: provisional/O-negative. ^d Hybrid isolates: isolates that possess characteristics of both Shigella spp. and E. coli.
Fig. 3 BLAST results of the consensus sequence derived from the k-mers against the isolates used. a. BLAST results versus severity score. b. Histogram of the relative frequency of the severity scores in the dataset versus the de Wit severity score, displayed for three bit-score categories