Aida Gonzalez-Diaz, Anna Carrera-Salinas, Miguel Pinto, Meritxell Cubero, Arie van der Ende, Jeroen D Langereis, M Ángeles Domínguez, Carmen Ardanuy, Paula Bajanca-Lavado, Sara Marti.
Abstract
Haemophilus influenzae is an opportunistic pathogen adapted to the human respiratory tract. Non-typeable H. influenzae are highly heterogeneous, but few studies have analysed the genomic variability of capsulated strains. This study aims to examine the genetic diversity of 37 serotype f isolates from the Netherlands, Portugal, and Spain, and to compare all capsulated genomes available in public databases. Serotype f isolates belonged to CC124 and shared few single nucleotide polymorphisms (SNPs; n = 10,999) and a large core genome (> 80%). Three main clades were identified by the presence of 75, 60, and 41 exclusive genes, respectively. Multi-locus sequence typing of all capsulated genomes revealed a reduced number of clonal complexes associated with each serotype. Pangenome analysis showed a large pool of genes (n = 6360), many of which belonged to the accessory genome (n = 5323). Phylogenetic analysis revealed that serotypes a, b, and f had greater diversity. The total number of SNPs in serotype f was significantly lower than in serotypes a, b, and e (p < 0.0001), indicating low variability within the serotype f clonal complexes. Capsulated H. influenzae are genetically homogeneous, with few lineages in each serotype. Serotype f has high genetic stability regardless of time and country of isolation.
Year: 2022; PMID: 35210526; PMCID: PMC8873416; DOI: 10.1038/s41598-022-07185-5
Source DB: PubMed; Journal: Sci Rep; ISSN: 2045-2322; Impact factor: 4.996
Figure 1. Pangenomic analysis of the 37 H. influenzae serotype f genomes. (A) Core-SNP phylogenetic tree, demographic data (country and invasiveness), genes detected, and the assigned allele. Clades I, II and III are indicated by coloured dots. The percentage of strains carrying each gene is presented graphically. (B) Distribution of genes detected in H. influenzae serotype f: core genes (100% of genomes), soft-core genes (95–99%), shell genes (15–94%), and cloud genes (< 15%). Core genes were classified as monoallelic (same allele in all the isolates) or as clade-segregating alleles (an allelic variant exclusive to one clade). The pie charts show the identity and number of SNPs for alleles of each clade in relation to the alleles of the same gene in other clades.
Figure 2. Pangenomic analysis of capsulated H. influenzae. (A) Gene pool of capsulated H. influenzae genomes included in this study. The number of core, soft-core, shell, cloud, and total genes of each serotype was determined using Roary, with a minimum identity percentage of 70% for BLASTp and the -cd parameter adjusted to 100. (B) Relative pangenome composition represented as a percentage of genes per genome of each serotype. The gene pool was defined as the set of all genes in a population. Donut charts indicate the distribution of core (100% of genomes), soft-core (95–99%), shell (15–94%), and cloud genes (< 15%). (C) Correlation between total and core genes in all capsulated H. influenzae genomes from this study and from the NCBI and ENA databases by serotype.
Figure 3. Core genome SNP typing of capsulated H. influenzae genomes. Each dot reflects the number of SNPs found in serotype a, b, c, d, e, and f genomes compared to the reference genomes NML-Hia-1 [CC23] (NZ_CP017811.1), 10810 [CC6] (NC_016809.1), M12125 [CC7] (SRR9847495), PTHi-10983 [CC10] (ERR2560729), M15895 [CC18] (NZ_CP031249.1), and KR494 [CC124] (NC_022356.1), respectively. Split violin plots show the distribution of the genomes based on the number of SNPs by each serotype.
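The gene categories used in Figures 1 and 2 (core, soft-core, shell, cloud) are defined by the share of genomes that carry a gene. A minimal Python sketch of that binning follows. It is an illustration only, not the authors' pipeline (the captions state the counts came from Roary). The function names `classify_gene` and `summarise_pangenome` are hypothetical, and treating the printed ranges as continuous cut-offs (at least 95% for soft-core, at least 15% for shell) is an assumption about how non-integer percentages fall between the categories.

```python
from collections import Counter


def classify_gene(n_present: int, n_genomes: int) -> str:
    """Bin a gene by the number of genomes that carry it.

    Thresholds follow the figure captions: core = 100% of genomes,
    soft-core = 95-99%, shell = 15-94%, cloud = < 15%.
    Assumption: the ranges are applied as continuous cut-offs.
    """
    if n_genomes <= 0 or not 0 <= n_present <= n_genomes:
        raise ValueError("need 0 <= n_present <= n_genomes and n_genomes > 0")
    if n_present == n_genomes:
        return "core"
    # Integer arithmetic avoids floating-point edge cases at the cut-offs.
    if n_present * 100 >= 95 * n_genomes:
        return "soft-core"
    if n_present * 100 >= 15 * n_genomes:
        return "shell"
    return "cloud"


def summarise_pangenome(gene_counts: dict[str, int], n_genomes: int) -> dict[str, int]:
    """Count how many genes fall into each category.

    gene_counts maps a gene name to the number of genomes carrying it.
    """
    return dict(Counter(classify_gene(n, n_genomes) for n in gene_counts.values()))


if __name__ == "__main__":
    # Toy presence counts for a 37-genome collection (made-up values).
    toy = {"geneA": 37, "geneB": 36, "geneC": 20, "geneD": 3}
    print(summarise_pangenome(toy, 37))
```

With 37 genomes, as in the serotype f set, a gene missing from a single isolate (36 of 37, about 97%) is soft-core under these cut-offs, and a gene found in five or fewer isolates is cloud.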