Stefan Taudien, Ludwig Lausser, Evangelos J Giamarellos-Bourboulis, Christoph Sponholz, Franziska Schöneweck, Marius Felder, Lyn-Rouven Schirra, Florian Schmid, Charalambos Gogos, Susann Groth, Britt-Sabina Petersen, Andre Franke, Wolfgang Lieb, Klaus Huse, Peter F Zipfel, Oliver Kurzai, Barbara Moepps, Peter Gierschik, Michael Bauer, André Scherag, Hans A Kestler, Matthias Platzer.
Abstract
Sepsis is a life-threatening organ dysfunction caused by a dysregulated host response to infection. Host genetic factors are important for its clinical course, and rare genomic variants are suspected to contribute. We sequenced the exomes of 59 Greek and 15 German patients with bacterial sepsis, divided into two groups with extremely different disease courses. Variant analysis focused on rare deleterious single nucleotide variants (SNVs). We identified significant differences in the number of rare deleterious SNVs per patient between the ethnic groups. Classification experiments based on the data of the Greek patients allowed discrimination between the disease courses with estimated sensitivity and specificity > 75%. Applying the trained model to the German patients, we observed comparable discriminatory properties despite a lower population-specific rare SNV load. Furthermore, rare SNVs in genes of cell signaling and innate immunity related pathways were identified as classifiers discriminating between the sepsis courses. Sepsis patients with a favorable disease course, even in the case of unfavorable preconditions, seem to be affected more often by rare deleterious SNVs in cell signaling and innate immunity related pathways, suggesting a protective role of impairments in these processes against a poor disease course.
Keywords: Classification; Exome; Population stratification; Rare single nucleotide variation; Semantic set covering machine; Sepsis
Year: 2016 PMID: 27639823 PMCID: PMC5078585 DOI: 10.1016/j.ebiom.2016.08.037
Source DB: PubMed Journal: EBioMedicine ISSN: 2352-3964 Impact factor: 8.143
Fig. 1. Workflow of variant filtering in three steps.
SNV = Single Nucleotide Variant; GATK = Genome Analysis Toolkit; ExAC = Exome Aggregation Consortium; NFE = Non-Finnish Europeans, ~ 30,000 exomes; ESP = Exome Sequencing Project (NHLBI-ESP); EA = Americans of European Ancestry, ~ 4200 exomes.
Fig. 2. Structure and function of the developed Semantic Set Covering Machine (Sem-SCM).
(a) Simplified structure of a trained Sem-SCM. The classifier system derives its prediction by inspecting the SNV status of a set of genes (g1, …, g13). Genes are assigned to base classifiers by semantic terms (t1, …, t4) that induce a functional or structural grouping, such as molecular signaling pathways or cellular components. In general, the same gene can be associated with more than one base classifier.
(b) Example of training the Sem-SCM on the genes assigned to base classifier b1. Four patients (p1, …, p4) with known categorization (yellow: class1, blue: class2) are shown. The base classifier uses a logical disjunction (OR) as its decision rule. The left decision rule predicts class1 if g2 or g10 is affected by a rare deleterious SNV (x) and class2 otherwise. The right rule is its negated form (NOT): class1 is predicted if SNVs are detected in neither g2 nor g10, and class2 is assigned otherwise. Because the left rule yields three correct predictions and the right rule only one, the left rule is used.
(c) Example of prediction by decision fusion of the base classifiers (logical conjunction AND). It directly operates on the decision rules of the base classifiers (b1, …,b4). The fusion classifier predicts class1 if all base classifiers predict class1. Otherwise class2 will be assigned. Predictions are shown for patients (q1, …,qn) not utilized in training.
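The rule selection in panel (b) and the AND fusion in panel (c) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy patients, and the gene `g5` are hypothetical, and the greedy selection of base classifiers and the weighting parameter of the actual Sem-SCM are left out. Each patient is represented as the set of genes carrying a rare deleterious SNV.

```python
from typing import List, Sequence, Set, Tuple

def or_rule(genes: Sequence[str], patient: Set[str]) -> bool:
    """Logical disjunction: True if any gene of the term is affected in the patient."""
    return any(g in patient for g in genes)

def train_base_classifier(genes: Sequence[str],
                          patients: List[Set[str]],
                          labels: List[str]) -> Tuple[bool, int]:
    """Choose between the plain OR rule and its negation (NOT).

    Returns (negated, n_correct) for the variant that classifies more
    training patients correctly; ties keep the plain rule.
    """
    correct_plain = sum(or_rule(genes, p) == (y == "class1")
                        for p, y in zip(patients, labels))
    correct_negated = len(patients) - correct_plain
    negated = correct_negated > correct_plain
    return negated, max(correct_plain, correct_negated)

def base_predicts_class1(genes: Sequence[str], negated: bool, patient: Set[str]) -> bool:
    """Apply a trained base classifier; True means a vote for class1."""
    return or_rule(genes, patient) != negated

def fuse(base_classifiers: List[Tuple[Sequence[str], bool]], patient: Set[str]) -> str:
    """Logical conjunction (AND): class1 only if every base classifier votes class1."""
    votes = (base_predicts_class1(g, n, patient) for g, n in base_classifiers)
    return "class1" if all(votes) else "class2"

# Hypothetical toy data in the spirit of panel (b): the plain rule gets 3 of 4 right.
patients = [{"g2"}, {"g10"}, set(), set()]
labels = ["class1", "class1", "class2", "class1"]
b1 = (["g2", "g10"], train_base_classifier(["g2", "g10"], patients, labels)[0])
b2 = (["g5"], True)  # hypothetical negated rule: votes class1 if g5 is NOT affected
```

With these toy inputs, `fuse([b1, b2], {"g2"})` assigns class1, whereas a patient who additionally carries an SNV in `g5` is assigned class2 because one base classifier no longer votes for class1.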
Characteristics of sepsis patients (for individual data see Table S1).
| | Greek (GR), N = 59 | | German (DE), N = 15 | |
|---|---|---|---|---|
| Group | A | B | A | B |
| Number | 32 | 27 | 5 | 10 |
| Deaths within 28 days | 0 | 9 (33%) | 0 | 3 (30%) |
| Men | 22 (69%) | 21 (78%) | 4 (80%) | 4 (40%) |
| Women | 10 (31%) | 6 (22%) | 1 (20%) | 6 (60%) |
| Age [median (Q1; Q3)] | 78.0 (65.0; 82.0) | 47.0 (33.0; 53.0) | 69.0 (53.0; 70.5) | 64.5 (51.2; 72.7) |
| Sepsis focus | ||||
| – Bacteremia | 9 | 16 | 0 | 0 |
| – Acute pyelonephritis | 14 | 2 | 0 | 0 |
| – Pneumonia | 5 | 4 | 0 | 0 |
| – Cholangitis | 2 | 0 | 0 | 0 |
| – Soft tissue infection | 1 | 0 | 0 | 0 |
| – Abdominal infections | 1 | 2 | 5 | 10 |
| – Peritonitis | 0 | 2 | 0 | 0 |
| – Unknown | 0 | 1 | 0 | 0 |
| APACHE II [median (Q1;Q3)] | 17.0 (13.0; 20.5) | 18.0 (14.7; 26.0) | 27.0 (15.0; 30.0) | 22.0 (18.8; 26.3) |
| SOFA [median (Q1; Q3)] | 5.0 (4.0; 7.5) | 9.0 (6.0; 14.0) | 11.0 (7.0; 20) | 10.0 (6.0; 12.3) |
| Failing organs [median (range)] | 1 (1–4) | 2 (1–5) | 4 (2–5) | 4 (2–6) |
| Patients with ALI | 3 | 0 | 1 | 2 |
| Patients with ARDS | 11 | 16 | 3 | 6 |
| Pathogen identified | 32 (100%) | 27 (100%) | 3 (40%) | 7 (70%) |
| – Gram-positive infection only | 4 | 3 | 0 | 1 |
| – Gram-negative infection only | 26 | 22 | 1 | 3 |
| – Two gram-negative pathogens | 1 | 1 | 1 | 0 |
| – Gram-positive and -negative | 1 | 1 | 0 | 2 |
| – Fungi | 0 | 0 | 1 | 1 |
Medical patients.
Surgical patients.
Q: quantile.
Score at sepsis onset.
ALI: acute lung injury.
ARDS: acute respiratory distress syndrome.
Variants identified from sepsis patients and controls.
| Filter step | SNVs | Greek sepsis, GR (N = 59) | Avg | German sepsis, DE (N = 15) | Avg | German controls, DE (N = 93) | Avg |
|---|---|---|---|---|---|---|---|
| | All | 289,521 | 67,199.8 | 190,671 | 67,499.9 | 278,893 | 67,831.5 |
| 1 | Protein affecting | 45,261 | 8581.2 | 25,729 | 8513.3 | 48,094 | 8508.6 |
| 2 | Missense | 17,236 | 294.1 | 4303 | 244.3 | 17,627 | 230.0 |
| 2 | Stop and splice | 490 | 8.7 | 100 | 6.9 | 591 | 7.1 |
| 3 | Missense, Damaging | 1721 | 31.6 | 377 | 26.0 | 2024 | 26.1 |
| 3 | Stop-gain (nonsense) | 322 | 5.8 | 67 | 4.5 | 392 | 4.6 |
| 3 | Stop-loss | 17 | 0.3 | 5 | 0.3 | 9 | 0.1 |
| 3 | Splice-acceptor | 75 | 1.3 | 13 | 1.0 | 87 | 1.1 |
| 3 | Splice-donor | 76 | 1.3 | 15 | 1.0 | 103 | 1.3 |
Average per sample.
MAF < 0.005 in ExAC-NFE and ESP-EA.
Concordantly predicted to be damaging by PolyPhen, Grantham score, and SIFT.
Significantly higher for GR vs. DE patients (Wilcoxon rank sum test, p < 0.001).
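The three filter steps summarized in the table and its footnotes can be sketched as a simple cascade. This is an illustrative sketch only: the `Variant` fields, the consequence labels, and the assumption that "protein affecting" covers exactly the five listed consequence types are hypothetical, and the predictor calls are reduced to precomputed booleans. The MAF cutoff of 0.005 and the requirement that all three predictors agree come from the footnotes. Stop and splice variants pass step 3 unchanged, which matches the table, where the step 3 stop and splice counts sum to the step 2 total (322 + 17 + 75 + 76 = 490 for GR).

```python
from dataclasses import dataclass
from typing import List

MAF_CUTOFF = 0.005  # rare: MAF < 0.005 in both ExAC-NFE and ESP-EA (table footnote)

# Assumed consequence labels; the real annotation vocabulary may differ.
PROTEIN_AFFECTING = {"missense", "stop_gain", "stop_loss",
                     "splice_acceptor", "splice_donor"}

@dataclass
class Variant:
    consequence: str
    maf_exac_nfe: float  # 0.0 if absent from the database (assumption)
    maf_esp_ea: float
    polyphen_damaging: bool = False
    grantham_damaging: bool = False
    sift_damaging: bool = False

def is_protein_affecting(v: Variant) -> bool:
    """Step 1: keep variants that alter the protein."""
    return v.consequence in PROTEIN_AFFECTING

def is_rare(v: Variant) -> bool:
    """Step 2: keep variants that are rare in both reference populations."""
    return v.maf_exac_nfe < MAF_CUTOFF and v.maf_esp_ea < MAF_CUTOFF

def is_deleterious(v: Variant) -> bool:
    """Step 3: missense variants need all three predictors to agree;
    stop and splice variants are kept as they are."""
    if v.consequence == "missense":
        return v.polyphen_damaging and v.grantham_damaging and v.sift_damaging
    return True

def filter_variants(variants: List[Variant]) -> List[Variant]:
    """Apply the three steps in order and return the surviving variants."""
    return [v for v in variants
            if is_protein_affecting(v) and is_rare(v) and is_deleterious(v)]
```

A missense variant flagged by only two of the three predictors, or a stop-gain variant with MAF 0.01 in one of the two reference sets, would be dropped by this cascade.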
Leave-one-out cross-validation (LOOCV) models with accuracies > 75% for the classification of 59 Greek sepsis patients (top) and application of the two best models to 15 German patients (bottom).
| Parameters | Model | Acc | Sens | Spec | Decision | Decision rule |
|---|---|---|---|---|---|---|
| Meta = all, inv = Y, s = 10, p = 2 | 1 | 0.763 | 0.778 | 0.750 | Group B | IF NOT reactome G alpha Q signaling events AND NOT detection of stimulus AND NOT PID CDC42 pathway AND NOT reactome fatty acyl CoA biosynthesis AND NOT biocarta toll pathway AND NOT chr15q26 AND NOT biocarta HER2 pathway |
| Meta = all, inv = Y, s = 2, p = 2 | 2 | 0.763 | 0.963 | 0.594 | Group B | IF NOT reactome G alpha Q signaling events AND NOT detection of stimulus |
| Meta = all, inv = Y, s = 7, p = 2 | 3 | 0.763 | 0.778 | 0.750 | Group B | IF NOT reactome G alpha Q signaling events AND NOT detection of stimulus AND NOT PID CDC42 pathway AND NOT reactome fatty acyl CoA biosynthesis AND NOT biocarta toll pathway AND NOT chr15q26 AND NOT biocarta HER2 pathway |
| Meta = all, inv = Y, s = 8, p = 2 | 4 | 0.763 | 0.778 | 0.750 | Group B | IF NOT reactome G alpha Q signaling events AND NOT detection of stimulus AND NOT PID CDC42 pathway AND NOT reactome fatty acyl CoA biosynthesis AND NOT biocarta toll pathway AND NOT chr15q26 AND NOT biocarta HER2 pathway |
| Meta = all, inv = Y, s = 9, p = 2 | 5 | 0.763 | 0.778 | 0.750 | Group B | IF NOT reactome G alpha Q signaling events AND NOT detection of stimulus AND NOT PID CDC42 pathway AND NOT reactome fatty acyl CoA biosynthesis AND NOT biocarta toll pathway AND NOT chr15q26 AND NOT biocarta HER2 pathway |
| Meta = all, inv = Y, s = 6, p = 2 | 6 | 0.746 | 0.778 | 0.719 | Group B | IF NOT reactome G alpha Q signaling events AND NOT detection of stimulus AND NOT PID CDC42 pathway AND NOT reactome fatty acyl CoA biosynthesis AND NOT biocarta toll pathway AND NOT chr15q26 |
| Meta = react, inv = Y, s = 2, p = 2 | 7 | 0.729 | 0.963 | 0.531 | Group B | IF NOT reactome G alpha Q signaling events AND NOT reactome triglyceride biosynthesis |
| Meta = react, inv = Y, s = 3, p = 2 | 8 | 0.729 | 0.963 | 0.531 | Group B | IF NOT reactome G alpha Q signaling events AND NOT reactome triglyceride biosynthesis AND NOT reactome amine compound SLC transporters |
| Meta = kegg, inv = N, s = 4, p = Inf | 9 | 0.712 | 0.906 | 0.481 | Group A | IF NOT kegg inositol phosphate metabolism AND NOT kegg amyotrophic lateral sclerosis ALS AND NOT kegg long term potentiation AND NOT kegg butanoate metabolism |
| Meta = all, inv = Y, s = 3, p = 2 | 10 | 0.712 | 0.852 | 0.594 | Group B | IF NOT reactome G alpha Q signaling events AND NOT detection of stimulus AND NOT PID CDC42 pathway |
| Meta = kegg, inv = Y, s = 3, p = 1 | 11 | 0.712 | 0.519 | 0.875 | Group B | IF kegg MAPK signaling pathway AND NOT kegg cysteine and methionine metabolism AND NOT kegg acute myeloid leukemia |
Acc: accuracy, Sens: sensitivity, Spec: specificity, meta: source of meta-information, inv: inversion of class labels (Y/N), s: maximal number of base classifiers (1–10), p: weighting parameter (0.5, 1, 2, ∞).
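The accuracy column can be cross-checked against the sensitivity and specificity columns using the Greek group sizes from the patient table (32 in group A, 27 in group B). The sketch below assumes that sensitivity refers to the class named in the Decision column and specificity to the other class; this reading is an assumption, and the function name is illustrative.

```python
# Group sizes of the Greek cohort, taken from the patient characteristics table.
N_GROUP_A = 32
N_GROUP_B = 27

def accuracy_from_rates(sens: float, spec: float, n_pos: int, n_neg: int) -> float:
    """Overall accuracy implied by sensitivity, specificity, and class sizes."""
    return (sens * n_pos + spec * n_neg) / (n_pos + n_neg)

# Model 1: the decision class is group B, so group B is treated as positive (assumption).
model_1 = accuracy_from_rates(0.778, 0.750, N_GROUP_B, N_GROUP_A)

# Model 9: the decision class is group A (inv = N), so the roles are swapped (assumption).
model_9 = accuracy_from_rates(0.906, 0.481, N_GROUP_A, N_GROUP_B)

print(round(model_1, 3), round(model_9, 3))
```

Under this assumption the rounded values agree with the reported accuracies of 0.763 for model 1 and 0.712 for model 9.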
Fig. 3. Prediction of the disease course after sepsis onset based on rare deleterious, protein affecting SNVs.
Application of the classification model following the rules listed in (a) to 59 Greek (b) and 15 German sepsis patients (c). The samples are shown in columns and sorted according to their class label (32 × GR-A vs. 27 × GR-B and 5 × DE-A vs. 10 × DE-B). The first and last rows depict each sample's categorization and model prediction, respectively. The middle rows show the genes that are affected by SNVs, grouped according to terms. For the color code, see the legend in (a).
Fig. 4. Structural model of the PAR1-Gα14 complex indicating a functional impact of amino acid exchange R33C in Gα14.
(a) The known structure of the human protease-activated receptor 1 (PAR1) (Zhang et al., 2012) was aligned with that of the β2-adrenoceptor (β2AR) contained in the quaternary complex between the agonist-bound form of β2AR and heterotrimeric Gs (α, βγ) (Chung et al., 2011) using the PyMOL Molecular Graphics System. The structure of the N-terminus of Gα14 was predicted with Swiss-Model using the structure of human Gαq as a model (Nishimura et al., 2010) and aligned with the N-terminus of Gαs in the β2AR-Gs complex. The structures of Gβ1 and Gγ2 are those of the β2AR-Gs complex. (b) Detailed view of the predicted contact site between PAR1 and the amino terminus of Gα14. The junction between transmembrane helices III and IV is missing in the structure of PAR1, presumably due to flexibility of the loop. The C- and N-terminal ends of helices III and IV, respectively, in the structure of PAR1 are marked by circles. In this region, the structure of β2AR is shown in light purple. R33 of Gα14 is likely to come into very close proximity to the second intracellular loop of PAR1. For example, its distance to Gln142 of β2AR, corresponding to L211 of PAR1, previously shown to be important for PAR1-Gq coupling (Zhang et al., 2012), would be < 3 Å in this model.