Caelin C Potts1, Nadav Topaz2, Lorraine D Rodriguez-Rivera3, Fang Hu3, How-Yi Chang3, Melissa J Whaley1, Susanna Schmink1, Adam C Retchless1, Alexander Chen1, Edward Ramos4, Gregory H Doho4, Xin Wang5.
Abstract
BACKGROUND: Haemophilus influenzae (Hi) can cause invasive diseases such as meningitis, pneumonia, or sepsis. Typeable Hi includes six serotypes (a through f), each expressing a unique capsular polysaccharide. The capsule, encoded by the genes within the capsule locus, is a major virulence factor of typeable Hi. Non-typeable Hi (NTHi) does not express a capsule and is associated with both invasive and non-invasive diseases.
Keywords: Capsule locus; Genetic diversity; Haemophilus influenzae; Multilocus sequence typing; Serotype; Whole genome sequencing
Year: 2019 PMID: 31606037 PMCID: PMC6790013 DOI: 10.1186/s12864-019-6145-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Population structure of H. influenzae isolates. The genomic relatedness of 688 Hi isolates is depicted as a maximum likelihood phylogeny. The three main clades (I, II, and III) are labeled on the tree and each isolate is color coded by the serotype, as determined by slide agglutination. The NTHi isolates within the serotype-specific subclades are denoted by ψ, σ, or *. The ψ indicates the isolate that had discrepant serotyping results between WGS and SAST methods. The σ indicates the only two NTHi isolates that contained an internal stop codon within a capsule gene. The * indicates the remaining five NTHi isolates (capsule nulls) that were detected within serotype-specific subclades. The tree scale is 0.1 substitutions per site along the length specified. Bootstrap values were 100/100 for all major nodes.
MLST distribution within each serotype
| Sequence Type | Number of Isolates |
|---|---|
| Hia – clade I | |
| ST-62 | 13 |
| Other | 1 |
| Hia – clade II | |
| ST-23 | 7 |
| ST-56 | 30 |
| ST-576 | 10 |
| Other | 6 |
| Hib - small clade (most similar to Hid) | |
| ST-222 | 6 |
| Other | 5 |
| Hib - large clade | |
| ST-6 | 21 |
| Other | 17 |
| Hic | |
| ST-9 | 7 |
| Other | 2 |
| Hid | |
| ST-10 | 5 |
| Other | 3 |
| Hie | |
| ST-18 | 44 |
| ST-66 | 22 |
| ST-121 | 8 |
| ST-127 | 8 |
| ST-386 | 5 |
| ND (no | 33 |
| Other | 12 |
| Hif | |
| ST-124 | 122 |
| Other | 9 |
| NTHi | |
| ST-3 | 13 |
| ST-11 | 6 |
| ST-12 | 7 |
| ST-14 | 11 |
| ST-34 | 6 |
| ST-57 | 8 |
| ST-103 | 11 |
| ST-107 | 9 |
| ST-139 | 8 |
| ST-143 | 5 |
| ST-145 | 6 |
| ST-155 | 5 |
| ST-156 | 8 |
| ST-165 | 5 |
| ST-182 | 5 |
| ND (no | 7 |
| Other | 172 |
Fig. 2 Genetic organization of the H. influenzae capsule genes. All serotypes contained the Region I genes bexABCD (black) and the Region III genes hcsA and hcsB (gray). The Region II genes were divergent among serotypes. Each arrow represents a different gene and is labeled with the gene name. The Region II arrows are color coded by serotype.
Fig. 3 The allelic variation detected for each Region I, II and III gene. The capsule genes from Regions I and III (a) or Region II (b) are listed on the y-axis. The number of unique alleles detected per gene is given in parentheses. The number of genomes associated with that serotype is also provided. The x-axis depicts the percent identity shared among the nucleotide alleles for each capsule gene. The left end of each bar represents the minimum sequence identity detected and the right end represents the maximum. If only one allele was detected, no data are shown.
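The minimum and maximum identity bars described in this caption can be illustrated with a small sketch. This is not the authors' pipeline: it assumes the alleles of one gene have already been aligned to equal length, counts a gap as a mismatch, and uses made-up toy sequences. The function names `percent_identity` and `identity_range` are hypothetical.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of identical columns between two pre-aligned sequences.

    Assumption: both sequences come from the same alignment, so they have
    equal length; a gap character simply counts as a mismatch here.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identity_range(alleles):
    """Return (min, max) pairwise identity, or None for a single allele.

    Returning None mirrors the caption: with only one allele there is
    nothing to compare, so no bar is drawn.
    """
    if len(alleles) < 2:
        return None
    scores = [percent_identity(a, b) for a, b in combinations(alleles, 2)]
    return min(scores), max(scores)


# Toy alleles for illustration only (not real capsule gene sequences).
toy_alleles = ["ACGTACGTAC", "ACGTACGTAA", "ACGTTCGTAA"]
print(identity_range(toy_alleles))
```

Real allele comparisons would normally start from an alignment produced by a dedicated tool; the sketch only shows how a per-gene range is derived from pairwise scores.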
Inter-serotype Sequence Identity between Region II Gene Pairs
| Gene Names | Minimum Similarity (%) | Maximum Similarity (%) |
|---|---|---|
| | 96.13 | 98.39 |
| | 56.81 | 62.33 |
| | 85.11 | 85.39 |
| | 92.62 | 92.89 |
| | 94.08 | 94.23 |
| | 86.81 | 87.01 |
Fig. 4 WGS serotyping method for determining H. influenzae serotype from WGS data. The three main steps of this process include identifying the capsule genes, predicting the capsule gene expression, and assigning the predicted serotype.
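The three steps named in this caption can be sketched as a simple decision function. The caption does not give the decision rules, so everything below is an assumption made for illustration: a capsule is treated as expressed only when all Region I (bexABCD) and Region III (hcsA, hcsB) genes are found, exactly one serotype is represented among the Region II hits, and no gene is disrupted (for example by an internal stop codon). The `GeneHit` type, the `predict_serotype` function, and the Region II gene names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Region I and III gene names are taken from the Fig. 2 caption.
REGION_I = {"bexA", "bexB", "bexC", "bexD"}
REGION_III = {"hcsA", "hcsB"}


@dataclass
class GeneHit:
    name: str                # gene name reported by a (hypothetical) search step
    region: str              # "I", "II", or "III"
    serotype: Optional[str]  # serotype letter for Region II hits, else None
    intact: bool             # False if the gene is disrupted, e.g. internal stop


def predict_serotype(hits: List[GeneHit]) -> str:
    # Step 1: identify the capsule genes. No hits at all means no capsule locus.
    if not hits:
        return "NT"
    found = {h.name for h in hits}
    region_ii_types = {h.serotype for h in hits if h.region == "II" and h.serotype}

    # Step 2: predict expression (assumed rule: complete locus, all genes intact).
    complete = REGION_I <= found and REGION_III <= found and len(region_ii_types) == 1
    expressed = complete and all(h.intact for h in hits)

    # Step 3: assign the serotype, falling back to non-typeable (NT).
    return next(iter(region_ii_types)) if expressed else "NT"


def toy_locus(serotype: str, broken: Optional[str] = None) -> List[GeneHit]:
    """Build an illustrative locus; 'regionII_gene' is a placeholder name."""
    hits = [GeneHit(n, "I", None, n != broken) for n in sorted(REGION_I)]
    hits += [GeneHit(n, "III", None, n != broken) for n in sorted(REGION_III)]
    hits.append(GeneHit("regionII_gene", "II", serotype, broken != "regionII_gene"))
    return hits


print(predict_serotype(toy_locus("f")))                 # complete, intact locus
print(predict_serotype(toy_locus("f", broken="bexA")))  # disrupted Region I gene
print(predict_serotype([]))                             # capsule null
```

Under these assumed rules, the two NTHi categories mentioned in the Fig. 1 caption (isolates with an internal stop codon in a capsule gene, and capsule nulls) would both resolve to NT.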
Concordance of the WGS serotyping method and other serotyping methods
| Serotype* | SAST and WGS: Total Isolates | SAST and WGS: No. of Concordant Isolates | SAST and WGS: Percent | rt-PCR and WGS: Total Isolates | rt-PCR and WGS: No. of Concordant Isolates | rt-PCR and WGS: Percent |
|---|---|---|---|---|---|---|
| a | 67 | 67 | 100.0 | 64 | 64 | 100.0 |
| b | 49 | 49 | 100.0 | 46 | 46 | 100.0 |
| c | 9 | 9 | 100.0 | 9 | 9 | 100.0 |
| d | 8 | 8 | 100.0 | 8 | 8 | 100.0 |
| e | 131 | 131 | 100.0 | 122 | 122 | 100.0 |
| f | 131 | 131 | 100.0 | 131 | 131 | 100.0 |
| NT | 293 | 292 | 99.7 | 116 | 116 | 100.0 |
| Total | 688 | 687 | 99.9 | 496 | 496 | 100.0 |
*Serotype as defined by the SAST method. The concordance data are reported as the number and percent of isolates. Only one discordant isolate was identified.
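The percentages in the SAST and WGS columns follow directly from the counts. As a worked check, the sketch below recomputes them from the table values, assuming percent concordance is concordant isolates divided by total isolates, rounded to one decimal place. The variable and function names are illustrative.

```python
def percent_concordance(concordant: int, total: int) -> float:
    """Concordant isolates as a percentage of all isolates, to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * concordant / total, 1)


# (total isolates, concordant isolates) per SAST-defined serotype,
# copied from the SAST and WGS columns of the table above.
sast_vs_wgs = {
    "a": (67, 67),
    "b": (49, 49),
    "c": (9, 9),
    "d": (8, 8),
    "e": (131, 131),
    "f": (131, 131),
    "NT": (293, 292),
}

total_isolates = sum(total for total, _ in sast_vs_wgs.values())
total_concordant = sum(conc for _, conc in sast_vs_wgs.values())

for serotype, (total, conc) in sast_vs_wgs.items():
    print(serotype, percent_concordance(conc, total))
print("Total", total_isolates, total_concordant,
      percent_concordance(total_concordant, total_isolates))
```

The single discordant isolate falls in the NT row, which accounts for both the 99.7% figure for that row and the 99.9% overall figure.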