Pei Chen, Shuo Li, Wenyuan Li, Jie Ren, Fengzhu Sun, Rui Liu, Xianghong Jasmine Zhou.
Abstract
BACKGROUND: Sepsis remains a major challenge in intensive care units, causing unacceptably high mortality rates due to the lack of rapid diagnostic tools with sufficient sensitivity. Therefore, there is an urgent need to replace time-consuming blood cultures with a new method. Ideally, such a method would also provide comprehensive profiling of pathogenic bacteria to facilitate treatment decisions.
Keywords: Bacteremia; Bacteria profiling; Bacterial co-occurrence network; Cell-free DNA sequence; Rapid diagnosis; Sepsis
Year: 2020 PMID: 31906978 PMCID: PMC6943891 DOI: 10.1186/s12967-019-02186-x
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Fig. 1 An illustration of our approach to sepsis diagnosis and bacteria inference based on cell-free DNA (cfDNA). a We used two public cfDNA datasets to obtain 38 sepsis and 118 healthy samples. All human reads were removed from the datasets using Bowtie2. Through alignment and classification, the normalized abundances of bacteria were estimated from the remaining non-human reads using Centrifuge [27]. b Our diagnosis strategy is a two-step procedure based solely on cfDNA from blood. First, we selected candidate pathogenic bacterial species through statistical analysis (see “Methods”). Second, a Random Forest is used to calculate a diagnosis score for each sample. c Due to the limited volume of a blood sample, not all bacterial species will be identified in cfDNA sequencing data. Using the bacterial co-occurrence network, we developed a method to infer unobserved bacterial species.
Fig. 2 Differential abundances of some candidate pathogenic bacterial species in healthy and sepsis samples. The distributions of bacterial abundances for 12 candidate pathogens are visualized as violin plots.
Fig. 3 The performance of a Random Forest classifier with balanced subsampling for distinguishing sepsis samples from healthy samples. a The out-of-bag error converges to 0.16 once the number of decision trees exceeds 100. b The average ROC curves for our diagnosis strategy (red) and a logistic regression scheme (blue), based on the one-third of the samples reserved for testing the model. c The ROC curves of our diagnosis strategy (red) and a logistic regression scheme (blue), based on an independent dataset used to validate the proposed algorithm.
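The balanced subsampling named in the Fig. 3 caption can be illustrated with a minimal sketch. The caption does not describe the implementation, so this assumes the common variant in which each tree's bootstrap sample draws the same number of samples, with replacement, from every class. Samples never drawn are "out of bag" for that tree, and out-of-bag error is measured on them. The function name `balanced_bootstrap` is hypothetical, and the full 38/118 class counts from Fig. 1 are used only for illustration.

```python
import random

def balanced_bootstrap(labels, rng):
    """Draw one class-balanced bootstrap sample.

    Returns (in_bag, out_of_bag) lists of sample indices.
    Assumption: every class contributes as many draws as the smallest class has members.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    n_per_class = min(len(members) for members in by_class.values())

    in_bag = []
    for members in by_class.values():
        # Sampling with replacement, as in an ordinary bootstrap.
        in_bag.extend(rng.choices(members, k=n_per_class))

    in_bag_set = set(in_bag)
    out_of_bag = [i for i in range(len(labels)) if i not in in_bag_set]
    return in_bag, out_of_bag

# Illustrative class counts taken from the Fig. 1 caption.
labels = ["sepsis"] * 38 + ["healthy"] * 118
in_bag, out_of_bag = balanced_bootstrap(labels, random.Random(0))
n_sepsis_in_bag = sum(1 for i in in_bag if labels[i] == "sepsis")
```

In this sketch each tree would be trained on `in_bag` (half sepsis, half healthy draws) and scored on `out_of_bag`, so the majority class does not dominate any single tree.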
Fig. 4 Bacteria co-occurrence networks constructed on the basis of cfDNA data from normal and sepsis samples. a The differential co-occurrence network describing associations between species that are only observed in the sepsis samples. b A partial list of clusters (connected components) from the differential network. For each cluster, the representative bacteria are listed.
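The two ideas in the Fig. 4 caption, a differential network and its connected components, can be sketched as follows. This is not the authors' construction: how co-occurrence edges are called is not stated here, so the sketch simply assumes each network is already given as a list of undirected species pairs without self-loops. The species labels (`sp_A`, ...) and function names are hypothetical placeholders.

```python
from collections import defaultdict

def differential_edges(sepsis_edges, healthy_edges):
    """Keep the undirected edges that appear in the sepsis network only."""
    def as_set(edges):
        return {frozenset(edge) for edge in edges}
    return as_set(sepsis_edges) - as_set(healthy_edges)

def connected_components(edges):
    """Return the clusters (connected components) of an undirected edge set."""
    adjacency = defaultdict(set)
    for edge in edges:
        u, v = tuple(edge)  # assumes no self-loops
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Toy input with placeholder species labels.
sepsis_edges = [("sp_A", "sp_B"), ("sp_B", "sp_C"), ("sp_D", "sp_E"), ("sp_F", "sp_G")]
healthy_edges = [("sp_G", "sp_F")]  # same association, so it is not sepsis-specific

clusters = connected_components(differential_edges(sepsis_edges, healthy_edges))
```

In the toy input, the `sp_F`/`sp_G` association appears in both networks and is therefore dropped, leaving two clusters.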
Fig. 5 The performance of species inference based on the bacteria co-occurrence network. The curve shows the average recovery rate. For each testing sepsis sample, we performed 1000 trials. In each trial, we randomly removed 10–80% of the observed bacterial species and then inferred the presence of the missing species from the co-occurrence network. The x-axis represents the removal percentage. a The y-axis represents the percentage of inferred species that were removed in the cross-validation. b The y-axis represents the total percentage of identified species in the cross-validation, including both inferred species and those that were never removed. c The y-axis represents the percentage of inferred species that were removed in the validation on an independent dataset. d The y-axis represents the total percentage of identified species in the validation on an independent dataset.
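The removal-and-recovery trials described in the Fig. 5 caption can be sketched as below. Two assumptions are made loudly. First, the inference rule `infer_missing` is a hypothetical stand-in (any unobserved neighbor of an observed species is inferred), not the authors' method, which this caption does not specify. Second, the recovery rate (panels a, c) is read here as the fraction of removed species that were re-inferred, and the identified rate (panels b, d) as kept plus recovered species over all species.

```python
import random

def infer_missing(observed, adjacency):
    """Hypothetical rule: infer every unobserved neighbor of an observed species."""
    inferred = set()
    for species in observed:
        inferred |= adjacency.get(species, set())
    return inferred - observed

def recovery_trial(true_species, adjacency, removal_fraction, rng):
    """Remove a random fraction of species, infer, and score one trial."""
    species = sorted(true_species)
    n_remove = max(1, round(removal_fraction * len(species)))
    removed = set(rng.sample(species, n_remove))
    kept = set(species) - removed

    recovered = infer_missing(kept, adjacency) & removed
    recovery_rate = len(recovered) / len(removed)            # panels a, c
    identified_rate = len(kept | recovered) / len(species)   # panels b, d
    return recovery_rate, identified_rate

def average_rates(true_species, adjacency, removal_fraction, n_trials=1000, seed=0):
    """Average both rates over repeated random removals."""
    rng = random.Random(seed)
    results = [recovery_trial(true_species, adjacency, removal_fraction, rng)
               for _ in range(n_trials)]
    mean_recovery = sum(r for r, _ in results) / n_trials
    mean_identified = sum(i for _, i in results) / n_trials
    return mean_recovery, mean_identified

# Toy network: five placeholder species that all co-occur with one another.
toy_species = {"sp_A", "sp_B", "sp_C", "sp_D", "sp_E"}
toy_adjacency = {s: toy_species - {s} for s in toy_species}
mean_recovery, mean_identified = average_rates(toy_species, toy_adjacency, 0.4)
```

Because the toy network is fully connected, any kept species is adjacent to every removed one, so both averages equal 1.0 here. Sparser networks would give lower values as the removal fraction grows.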