Marija Dmitrijeva, Christian R Kahlert, Rounak Feigelman, Rebekka L Kleiner, Oliver Nolte, Werner C Albrich, Florent Baty, Christian von Mering.
Abstract
In cystic fibrosis, dynamic and complex communities of microbial pathogens and commensals can colonize the lung. Cultured isolates from lung sputum reveal high inter- and intraindividual variability in pathogen strains, sequence variants, and phenotypes; disease progression likely depends on the precise combination of infecting lineages. Routine clinical protocols, however, provide a limited overview of the colonizer populations. Therefore, a more comprehensive and precise identification and characterization of infecting lineages could assist in making corresponding decisions on treatment. Here, we describe longitudinal tracking for four cystic fibrosis patients who exhibited extreme clinical phenotypes and, thus, were selected from a pilot cohort of 11 patients with repeated sampling for more than a year. Following metagenomics sequencing of lung sputum, we find that the taxonomic identity of individual colonizer lineages can be easily established. Crucially, even superficially clonal pathogens can be subdivided into multiple sublineages at the sequence level. By tracking individual allelic differences over time, an assembly-free clustering approach allows us to reconstruct multiple lineage-specific genomes with clear structural differences. Our study showcases a culture-independent shotgun metagenomics approach for longitudinal tracking of sublineage pathogen dynamics, opening up the possibility of using such methods to assist in monitoring disease progression through providing high-resolution routine characterization of the cystic fibrosis lung microbiome.
IMPORTANCE Cystic fibrosis patients frequently suffer from recurring respiratory infections caused by colonizing pathogenic and commensal bacteria. Although modern therapies can sometimes alleviate respiratory symptoms by ameliorating residual function of the protein responsible for the disorder, management of chronic respiratory infections remains an issue.
Here, we propose a minimally invasive and culture-independent method to monitor microbial lung content in patients with cystic fibrosis at minimal additional effort on the patient's part. Through repeated sampling and metagenomics sequencing of our selected cystic fibrosis patients, we successfully classify infecting bacterial lineages and deconvolute multiple lineage variants of the same species within a given patient. This study explores the application of modern computational methods for deconvoluting lineages in the cystic fibrosis lung microbiome, an environment known to be inhabited by a heterogeneous pathogen population that complicates management of the disorder.
Keywords: cystic fibrosis; longitudinal study; lung sputum; metagenomics; strain typing
Year: 2021 PMID: 33688005 PMCID: PMC8092271 DOI: 10.1128/mBio.02863-20
Source DB: PubMed Journal: mBio Impact factor: 7.867
FIG 1. Study report of patient CFR11 displaying the dynamics of multiple parameters over time. (A) Percent forced expiratory volume (FEV) (black) and concentration of C-reactive protein (CRP) (blue), with actual measurements shown as dots. (B) Medication assigned to the patient during the course of the study and recorded exacerbation events (in red). (C) Bacteria identified in the clinical microbiology laboratory. (D) Relative abundance profiles generated by mOTUs, with actual measurements shown as bars. Selected species and their corresponding genera with more than 5% relative abundance at one or more time points are shown color-coded. Less abundant species are grouped into “Others” (gray). Taxa that could not be identified by mOTUs on the genus level are grouped into “Unknown Genus” (white). (E) Shannon’s diversity index (entropy) calculated based on the relative abundance profiles generated by mOTUs, with actual measurements shown as dots. (F) Number of reads per sample. Human reads are indicated in black and plotted on the left axis. Reads that did not concordantly map to the human genome are depicted in red and plotted on the right axis. (G) Concentration of total DNA isolated from patient sputum, with actual measurements shown as dots.
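Panel E reports Shannon's diversity index computed from relative abundance profiles. As a minimal sketch of that calculation, the function below evaluates H = -Σ p_i ln p_i over the nonzero abundances of one sample. The use of the natural logarithm, the function name `shannon_index`, and the example profiles are assumptions for illustration; they are not taken from the study.

```python
import math


def shannon_index(abundances):
    """Shannon entropy H = -sum(p_i * ln(p_i)) over nonzero abundances.

    The input is rescaled to sum to 1, so raw counts work as well as fractions.
    The natural logarithm is an assumption; the source does not state the base.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    h = 0.0
    for a in abundances:
        if a > 0:  # taxa with zero abundance contribute nothing
            p = a / total
            h -= p * math.log(p)
    return h


# Hypothetical profiles, not data from the study.
even_profile = [0.25, 0.25, 0.25, 0.25]        # four equally abundant taxa
dominated_profile = [0.97, 0.01, 0.01, 0.01]   # one taxon dominates

print(shannon_index(even_profile))
print(shannon_index(dominated_profile))
```

A sample dominated by a single pathogen yields a lower index than an evenly mixed community, which is why the index is a convenient one-number summary of each time point.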
FIG 2. Strain-typing of patient-specific Achromobacter in the context of a sequence-based phylogeny of the genus. (A) Maximum-likelihood tree based on sequences of 10 single-copy genes used by mOTUs. Colors on the right of the tree depict species assignments according to NCBI and GTDB taxonomies and apply to panels B and C as well. Blue circles indicate branch confidence values (≥90) based on 100 bootstraps of the tree. (B) Maximum-likelihood tree based on sequences of seven housekeeping genes from the Achromobacter MLST database. (C) UPGMA clustering of pairwise average nucleotide identities of the comprising Achromobacter genomes.
FIG 3. Identification of lineage variants through assessment of temporal changes in SNV allele frequencies in the metagenomics data of patient CFR11. (Step 1) Selection of a reference genome based on generated gene family presence/absence profiles. (Step 2) Read mapping of CFR11 samples to the reference genome (CP013113.1). A pile-up of a selected region containing an SNV (1,959,777 to 1,959,791 bp) is shown for every time point. The reference sequence is displayed on the bottom. Gray indicates read base pairs that are identical to the reference sequence. Orange indicates that a substitution to guanine has occurred. (Step 3) The change in allele frequency over time for the selected SNV. (Step 4) A group of SNVs that show a similar pattern of temporal changes in allele frequencies. The selected SNV is depicted in orange. The explicit steps performed and tools used in this approach can be found in a flow chart in Fig. S6.
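Step 3 of this figure turns a pile-up at one position into an allele frequency per time point. The sketch below illustrates that step under simple assumptions: base counts at the SNV position are already available per sampling day, and the allele frequency is the fraction of reads carrying the alternative base. The counts, the days other than 0 and 496, and the helper name `allele_frequency` are hypothetical and are not values or tools from the study.

```python
def allele_frequency(base_counts, alt_base):
    """Fraction of reads at one position that carry the alternative base.

    Returns None when no reads cover the position, so that uncovered
    time points are not mistaken for a frequency of zero.
    """
    depth = sum(base_counts.values())
    if depth == 0:
        return None
    return base_counts.get(alt_base, 0) / depth


# Hypothetical pile-up counts at a single SNV position, keyed by sampling day.
pileups = {
    0: {"A": 40, "G": 0},
    120: {"A": 30, "G": 10},
    496: {"A": 5, "G": 45},
}

# Allele frequency trajectory of the substitution to guanine over time.
trajectory = {day: allele_frequency(counts, "G")
              for day, counts in sorted(pileups.items())}

for day, freq in trajectory.items():
    print(day, freq)
```

Repeating this for every called SNV gives one frequency trajectory per SNV, which is the input that Step 4 groups by similarity.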
FIG 4. Clustering of SNVs detected in patient CFR11 based on their temporal changes in allele frequencies. (A) A t-SNE plot depicting the clustering pattern of 3,079 SNVs called in patient CFR11. Most SNVs occur at low allele frequencies (gray). The remaining SNVs form seven distinctly visible genotypes that are labeled and colored accordingly. (B) Changes in the allele frequencies (p) of SNVs belonging to each distinct genotype over time. The colored line indicates mean allele frequency of the genotype. Dark gray ribbons indicate the 95% confidence intervals.
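The idea behind this figure is that SNVs belonging to the same genotype rise and fall together. The sketch below illustrates that idea with a deliberately simple stand-in: a greedy grouping by Euclidean distance between trajectories, followed by the per-group mean trajectory (the colored line in panel B). It is not t-SNE and not the authors' pipeline; the threshold `max_dist`, the function names, and the example trajectories are all assumptions.

```python
import math


def distance(a, b):
    """Euclidean distance between two equally long frequency trajectories."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_trajectories(trajectories, max_dist=0.2):
    """Greedy grouping: an SNV joins the first group whose seed is within max_dist.

    This is a simplified stand-in for the clustering shown in the figure.
    """
    groups = []  # each entry: (seed trajectory, list of SNV ids)
    for snv_id, traj in trajectories.items():
        for seed, members in groups:
            if distance(seed, traj) <= max_dist:
                members.append(snv_id)
                break
        else:
            groups.append((traj, [snv_id]))  # no close group: start a new one
    return [members for _, members in groups]


def mean_trajectory(trajectories, members):
    """Mean allele frequency of a group at each time point."""
    n_points = len(trajectories[members[0]])
    return [sum(trajectories[m][t] for m in members) / len(members)
            for t in range(n_points)]


# Hypothetical allele frequencies at three time points, not data from the study.
snvs = {
    "snv_a": [0.10, 0.50, 0.90],
    "snv_b": [0.12, 0.48, 0.88],
    "snv_c": [0.90, 0.50, 0.10],
    "snv_d": [0.88, 0.55, 0.12],
}

groups = group_trajectories(snvs)
for members in groups:
    print(members, mean_trajectory(snvs, members))
```

A greedy pass like this depends on input order and on the chosen threshold, which is one reason to compare several clustering approaches before trusting a genotype assignment.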
Number of SNVs consistently clustered by three different approaches (t-SNE, PCA, and DESMAN)
| SNV type | No. of SNVs |
|---|---|
| Lineage variant 1 specific | 204 |
| Lineage variant 2 specific | 106 |
| Lineage variant 3 specific | 186 |
| Shared between 2 and 3 | 67 |
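The table counts SNVs that three approaches cluster consistently. A minimal sketch of such a consensus step is shown below, assuming each approach has already been reduced to a mapping from SNV identifier to a harmonized label (matching cluster labels across methods is a separate problem that is not covered here). The dictionaries `tsne`, `pca`, and `desman` are hypothetical hand-written examples, not outputs of those tools.

```python
from collections import Counter


def consistent_assignments(*labelings):
    """Return {snv_id: label} for SNVs given the same label by every labeling.

    SNVs missing from any labeling are treated as inconsistent and dropped.
    """
    if not labelings:
        return {}
    first, rest = labelings[0], labelings[1:]
    return {
        snv: label
        for snv, label in first.items()
        if all(other.get(snv) == label for other in rest)
    }


# Hypothetical, already harmonized labels from three approaches.
tsne = {"snv_1": "variant_1", "snv_2": "variant_2",
        "snv_3": "variant_3", "snv_4": "variant_1"}
pca = {"snv_1": "variant_1", "snv_2": "variant_2",
       "snv_3": "variant_2", "snv_4": "variant_1"}
desman = {"snv_1": "variant_1", "snv_2": "variant_2", "snv_3": "variant_3"}

consensus = consistent_assignments(tsne, pca, desman)
counts = Counter(consensus.values())  # number of consistent SNVs per label
print(consensus)
print(counts)
```

Here `snv_3` is dropped because one approach disagrees and `snv_4` because one approach did not assign it, so only agreed-upon SNVs contribute to the per-variant counts.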
FIG 5. Assessment of temporal variation in the coverage of specific regions in the genome of P. aeruginosa in patient CFR11. The circos diagram provides an overview of the genome coverage profiles with chromosomal coordinates in kb. The dark gray outer circle depicts regions of the reference genome that have a coverage of at least 5% from the average coverage at one or more time points. The second outermost circle depicts detected phage regions in green. The remaining circles depict normalized coverage profiles for each of the 10 time points sampled for patient CFR11 (innermost, day 0; outermost, day 496). Orange regions indicate lower than average coverage, and purple-blue regions indicate higher than average coverage. Insets highlight three regions that display variant-specific coverage profiles. Each variant is depicted in a distinct color, and the average coverage of the selected region is depicted in gray.
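The caption describes normalized coverage profiles in which regions fall below or above the average. One simple way to express this, shown as a sketch only, is to divide the coverage of each genomic window by the mean coverage of its sample and then label windows that deviate from 1. The normalization by the sample mean, the `tolerance` value, and the window coverages below are assumptions; the source does not specify how the profiles were normalized.

```python
def normalize_coverage(window_coverage):
    """Divide each window's coverage by the mean coverage of the sample."""
    mean_cov = sum(window_coverage) / len(window_coverage)
    if mean_cov == 0:
        raise ValueError("sample has no coverage")
    return [c / mean_cov for c in window_coverage]


def classify(ratio, tolerance=0.25):
    """Label a normalized coverage value relative to the sample average."""
    if ratio < 1 - tolerance:
        return "below"
    if ratio > 1 + tolerance:
        return "above"
    return "average"


# Hypothetical per-window read depth for one time point, not study data.
coverage = [20, 20, 5, 35]

ratios = normalize_coverage(coverage)
labels = [classify(r) for r in ratios]
print(ratios)
print(labels)
```

Comparing these labels across time points for the same window is what makes variant-specific regions stand out: a region carried by only one lineage variant tracks that variant's abundance rather than the genome-wide average.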