Blaine G Fritz, Julius B Kirkegaard, Claus Henrik Nielsen, Klaus Kirketerp-Møller, Matthew Malone, Thomas Bjarnsholt.
Abstract
Clinicians and researchers utilize subjective, clinical classification systems to stratify lower extremity ulcer infections for treatment and research. The purpose of this study was to examine whether these clinical classifications are reflected in the ulcer's transcriptome. RNA sequencing (RNA-seq) was performed on biopsies from clinically infected lower extremity ulcers (n = 44). Resulting sequences were aligned to the host reference genome to create a transcriptome profile. Differential gene expression analysis and gene ontology (GO) enrichment analysis were performed between ulcer severities as well as between sample groups identified by k-means clustering. Lastly, a support vector classifier was trained to estimate clinical infection score or k-means cluster based on a subset of genes. Clinical infection severity did not explain the major sources of variability among the samples, and samples with the same clinical classification demonstrated high inter-sample variability. High proportions of bacterial RNA were identified in some samples, which resulted in a strong effect on transcription and increased expression of genes associated with immune response and inflammation. K-means clustering identified two clusters of samples, one of which contained all of the samples with high levels of bacterial RNA. A support vector classifier identified a fingerprint of 20 genes, including immune-associated genes such as CXCL8, GADD45B, and HILPDA, which accurately identified samples with signs of infection via cross-validation. This study identified a unique, host-transcriptome signature in the presence of infecting bacteria, often incongruent with clinical infection-severity classifications. This suggests that stratification of infection status based on a transcriptomic fingerprint may be useful as an objective method for classifying infection severity, as well as a tool for studying host-pathogen interactions.
Keywords: Diabetic foot; RNA sequencing; biofilm; chronic wounds; infection; machine learning; transcriptomics; ulcer
Year: 2022 PMID: 35567538 PMCID: PMC9545044 DOI: 10.1111/apm.13234
Source DB: PubMed Journal: APMIS ISSN: 0903-4641 Impact factor: 3.428
Fig. 1 (A–C) Characterization of host gene expression ulcer transcriptomes (n = 12,378 genes) by principal component analysis. Points are colored by: (A) IDSA/PEDIS infection severity score [2: Mild, 3: Moderate, and 4: Severe], (B) ulcer duration [0: Less than 2 weeks, 1: 2 to 6 weeks, and 2: Greater than 6 weeks], and (C) percentage of all RNA‐seq reads classified to bacteria. (D) Component loadings for the top 5% of positively and negatively weighted genes for PC1 to PC3. Points are shaded by component loading value. (E) Spearman correlation coefficients of metadata variables with the positioning of samples along PC1 to PC6. Significance tests were performed with Benjamini–Hochberg p‐value correction for multiple comparisons [**: p < 0.01, *: p < 0.05]. (F) Scree plot demonstrating the percent of explained variance for PC1 to PC20. The red line represents the cumulative percentage of explained variance across these PCs.
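The Benjamini–Hochberg correction named in panel E can be illustrated with a minimal sketch. This is not the authors' code: the function name `benjamini_hochberg` and the example p-values are assumptions made for illustration only, and the function simply implements the standard step-up adjustment.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values (NOT taken from the study), e.g. one per
# metadata-variable/principal-component correlation test.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

With the adjusted values in hand, the asterisk thresholds in the caption (p < 0.05, p < 0.01) would be applied to `adj_p` rather than to the raw p-values.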
Fig. 2 (A) Two clusters (C1 and C2) were identified by k‐means analysis. Results are displayed projected over the principal component analysis plot of all genes (n = 12,378) with points colored by k‐means cluster. Samples are labeled by ID, where samples prefixed with “P” and “HH” are from this study and Heravi et al. [22], respectively. (B) Gene ontology (GO) terms for GO biological processes showing a significant overrepresentation (Fisher's exact test) of genes identified as differentially expressed between C1 and C2. The top 5 overrepresented pathways for cluster 1 (green) and cluster 2 (violet) are shown. (C) Percentage of RNA‐seq reads classified to bacteria relative to the total number of reads classified as either bacterial or host for C1 (green) and C2 (purple). The mean percentage of bacterial:human reads was significantly higher in C2 (t = 4.04, p = 0.004, Welch t‐test). (D) Relative activity (percentage of RNA reads for a specific species relative to all bacterial reads, %) for bacterial species with relative activity >5%. Only samples with greater than 10% bacterial reads are shown.
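The Welch t‐test reported for panel C compares two group means without assuming equal variances. The sketch below computes only the test statistic and the Welch–Satterthwaite degrees of freedom; the function name `welch_t` and the toy percentages are assumptions for illustration and do not reproduce the study's data. In practice the p-value would be obtained from a t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for mean(a) - mean(b)."""
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = statistics.variance(group_a) / len(group_a)
    se2_b = statistics.variance(group_b) / len(group_b)
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t_stat, df


# Made-up bacterial read percentages per sample (NOT the study's values).
c2_pct = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
c1_pct = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, df = welch_t(c2_pct, c1_pct)
```

A positive `t_stat` here means the first group (C2 in this toy example) has the higher mean, matching the direction described in the caption.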
Fig. 3 (A) Coefficient values for the top 20 genes extracted from the support vector classifier. (B) Gene symbols and products for identified features. (C) Receiver operating characteristic curve from cross‐validation (sixfold, stratified) analysis. Given the 20‐gene fingerprint, the classifier performed with 100% accuracy for classifying test samples in each fold. (D) Plot of classifier accuracy vs number of features included in the classifier, tested via stratified, sixfold cross‐validation for each number of features. (E) Normalized expression values for genes selected as positive predictive genes for cluster 2. (F) Normalized expression values for genes selected as positive predictive genes for cluster 1.
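The stratified, sixfold cross‐validation mentioned in panels C and D splits samples so that each fold keeps roughly the same class proportions as the full data set. The following is a minimal sketch of that fold assignment only, not the authors' pipeline: the function name `stratified_folds` and the cluster sizes (30 and 14) are hypothetical, since the source does not state how the 44 samples divide between C1 and C2. A library routine such as scikit-learn's `StratifiedKFold` would normally be used instead.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds=6):
    """Assign sample indices to folds, balancing each class across folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    position = 0  # running counter keeps total fold sizes balanced too
    for label in sorted(by_class):
        for idx in by_class[label]:
            folds[position % n_folds].append(idx)
            position += 1
    return folds


# Hypothetical cluster labels for 44 samples (the split is an assumption).
labels = ["C1"] * 30 + ["C2"] * 14
folds = stratified_folds(labels, n_folds=6)

for held_out in folds:
    held_out_set = set(held_out)
    train_idx = [i for i in range(len(labels)) if i not in held_out_set]
    # A classifier would be fitted on train_idx and scored on held_out here.
```

Each fold serves once as the held-out test set, and accuracy is then averaged over the six folds.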