Uri Obolski, Andrea Gori, José Lourenço, Craig Thompson, Robin Thompson, Neil French, Robert S Heyderman, Sunetra Gupta.
Abstract
Streptococcus pneumoniae, a normal commensal of the upper respiratory tract, is a major public health concern, responsible for substantial global morbidity and mortality due to pneumonia, meningitis and sepsis. Why some pneumococci invade the bloodstream or CSF (so-called invasive pneumococcal disease; IPD) is uncertain. In this study we identify genes associated with IPD. We transform whole genome sequence (WGS) data into a sequence typing scheme, and avoid the caveat of using an arbitrary genome as a reference by substituting a constructed pangenome for it. We then apply a random forest machine-learning algorithm to the transformed data, and find 43 genes consistently associated with IPD across three geographically distinct WGS data sets of pneumococcal carriage isolates. Of the genes we identified as associated with IPD, 23 were previously shown to be directly relevant to IPD and 18 are uncharacterized. We suggest that these uncharacterized genes are also likely to be relevant to IPD.
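The transformation of WGS data into a sequence typing scheme can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `build_allele_table`, the toy gene names and sequences, and the convention of using 0 for an absent gene are all assumptions made for illustration. Each distinct sequence of a pangenome gene receives an integer allele ID, so every isolate becomes a row of integers.

```python
def build_allele_table(isolates, pangenome_genes):
    """Map each isolate's gene sequences to integer allele IDs.

    isolates: dict of isolate name -> dict of gene name -> DNA sequence
    pangenome_genes: ordered list of gene names (hypothetical pangenome)
    Returns (rows, alleles). rows maps isolate name -> list of allele IDs in
    the order of pangenome_genes; 0 marks a gene absent from that isolate
    (an assumed convention, not taken from the paper).
    """
    alleles = {gene: {} for gene in pangenome_genes}
    rows = {}
    for name in sorted(isolates):  # sorted for reproducible allele numbering
        genes = isolates[name]
        row = []
        for gene in pangenome_genes:
            seq = genes.get(gene)
            if seq is None:
                row.append(0)
                continue
            seq = seq.upper()
            seen = alleles[gene]
            if seq not in seen:
                seen[seq] = len(seen) + 1  # next unused allele ID for this gene
            row.append(seen[seq])
        rows[name] = row
    return rows, alleles


# Toy input: three hypothetical isolates and a two-gene pangenome.
toy_isolates = {
    "carriage_1": {"geneA": "ATGAAA", "geneB": "ATGCCC"},
    "carriage_2": {"geneA": "ATGAAA"},
    "ipd_1": {"geneA": "ATGAAG", "geneB": "ATGCCC"},
}
toy_rows, toy_alleles = build_allele_table(toy_isolates, ["geneA", "geneB"])
for isolate, row in toy_rows.items():
    print(isolate, row)
```

A table of this shape, paired with a carriage/IPD label per isolate, is the kind of input a random forest classifier (for example scikit-learn's `RandomForestClassifier`) could be trained on, with per-gene feature importances used to rank candidate genes.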
Year: 2019 PMID: 30858412 PMCID: PMC6411942 DOI: 10.1038/s41598-019-40346-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Table 1. Genes associated with IPD.
| Gene | Length (bp) | Best match (name, e-value, identity, database, accession number) | Information |
|---|---|---|---|
| | 452 | 1. phtB, 9E-132, 447/465 (96%), NCBI, AF318954.1 | 1. PHT proteins (aka BHV) are thought to be involved in the invasion process of pneumococci. |
| | 330 | Hypothetical protein (CPS), 4E-168, 328/330 (99%), NCBI, JQ653094.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease. |
| | 249 | Hypothetical protein | |
| | 504 | phtD, 0, 504/504 (100%), NCBI, KP127799.1 | The phtD hit was part of a sequence shown to be highly conserved in invasive isolates. |
| | 954 | Hypothetical protein (CPS), 0, 954/954 (100%), NCBI, HE651314.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease. |
| | 996 | Hypothetical protein | |
| | 231 | pspC, 1E-67, 167/179 (93%), NCBI, AF154043.2 | pspC was shown to be involved in immune response to bacteremia in mice. |
| | 510 | Hypothetical protein | |
| | 324 | Hypothetical protein (CPS), 2E-161, 320/324 (99%), NCBI, ADM91299.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease. |
| | 504 | pspC, 0, 502/504 (99%), NCBI, AF154022.1 | pspC was shown to be involved in immune response to bacteremia in mice. |
| | 306 | Hypothetical protein | |
| | 399 | Hypothetical protein | |
| | 327 | Hypothetical protein (CPS), 0, 511/528 (97%), NCBI, AF316639.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease. |
| ydcP_1 | 471 | Putative protease YdcP, 0, 470/471 (99%), NCBI, AFS43444.1 | YdcP is part of the U32 protease family. It is a collagenase that facilitates the breakdown of extracellular tissue structures, and is a known virulence factor in other bacterial species. |
| | 519 | Hypothetical protein (CPS), 0, 509/519 (98%), NCBI, AF154022.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease. |
| | 147 | L-lactate dehydrogenase (FMN-dependent)-like/alpha-hydroxy acid dehydrogenase, 4E-70, 147/147 (100%) | Lactate dehydrogenase was found to be an essential enzyme for pneumococcal survival in blood. |
| | 480 | Hypothetical protein | |
| | 1977 | Putative endo-beta-N-acetylglucosaminidase, 0, 1968/1977 (99%), NCBI, AJ870414.1 | |
| | 528 | Hypothetical protein (CPS), 511/528 (97%), NCBI, JF301964.1 | This is a putative capsular polysaccharide biosynthesis protein. Capsular differences are known to be associated with invasive disease. |
| | 516 | pspC, 508/516 (98%), NCBI, AF154043.2 | pspC was shown to be involved in immune response to bacteremia in mice. |
| | 489 | Hypothetical protein | |
| | 387 | Hypothetical protein (partial transposase), 0, 387/387 (100%), NCBI, ADM91518.1 | Part of the mobile genetic elements of the bacterium. |
| | 258 | Hypothetical protein | |
| | 288 | Hypothetical protein | |
| | 210 | Hypothetical protein | |
| | 387 | Hypothetical protein (partial transposase), 0, 386/387 (99%), NCBI, CP002176 (positions 1374937–1375323) | Part of the mobile genetic elements of the bacterium. |
| | 510 | Hypothetical protein | |
| | 168 | Hypothetical protein | |
| | 489 | pspC, 0, 463/490 (94%), NCBI, AF154022.1 | pspC was shown to be involved in immune response to bacteremia in mice. |
| lox | 1137 | Lactate oxidase (lox) gene, 0, 1001/1137 (88%), NCBI, DQ984140.3 | The |
| | 840 | Sortase (srtA), 0, 614/740 (83%), NCBI, KX147105.1 | In |
| | 189 | Hypothetical protein | |
| | 537 | Hypothetical protein | |
| | 504 | Hypothetical protein | |
| | 309 | Hypothetical protein | |
| | 309 | Hypothetical protein | |
| cpsA | 1446 | cpsA (aka wzg), 0, 1446/1446 (100%), NCBI, KC522490.1 | wzg (aka cpsA) is part of the capsular polysaccharide synthesis gene locus. High expression of cpsA is associated with bacteremia in humans. |
| bgaA | 6702 | bgaA (Beta-galactosidase BoGH2A), 6466/6704 (96%), NCBI, AF282987.1 | bgaA is hypothesized to be a pneumococcal virulence factor. |
| cpsA | 1446 | cpsA (aka wzg), 0, 1446/1446 (100%), NCBI, KC522492.1 | wzg (aka cpsA) is part of the capsular polysaccharide synthesis gene locus. High expression of cpsA is associated with bacteremia in humans. |
| | 207 | Hypothetical protein | |
| | 573 | pspC, 0, 566/573 (99%), NCBI, AF154043.2 | pspC was shown to be involved in immune response to bacteremia in mice. |
| | 684 | cpsD, 0, 682/684 (99%), NCBI, AFC94091.1 | cpsD mutations were shown to impair the ability to cause bacteremia in mice. |
| | 840 | Hypothetical protein | |
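
The identity values in the table are ratios of matching to aligned positions, so each percentage can be recomputed from its fraction. The sketch below parses a best-match cell and derives the percentage. The helper name `percent_identity` is hypothetical, and the display rule (round to the nearest integer, but show 99% for any imperfect match) is an assumption inferred from the values in this table, not documented tool behaviour.

```python
import re


def percent_identity(cell):
    """Extract 'matches/aligned' from a best-match cell and compute identity.

    Returns (matches, aligned, exact_percent, shown_percent), or None if the
    cell holds no 'N/M' fraction (e.g. an unannotated hypothetical protein).
    """
    m = re.search(r"(\d+)\s*/\s*(\d+)", cell)
    if m is None:
        return None
    matches, aligned = int(m.group(1)), int(m.group(2))
    exact = 100.0 * matches / aligned
    shown = int(exact + 0.5)  # round half up to the nearest integer
    # Assumed display rule: never show 100% unless the match is perfect.
    if shown == 100 and matches != aligned:
        shown = 99
    return matches, aligned, exact, shown


# Cells copied from the table above.
examples = [
    "1. phtB, 9E-132, 447/465 (96%), NCBI, AF318954.1",
    "putative protease YdcP, 0, 470/471(99%), NCBI, AFS43444.1",
    "Sortase (srtA), 0, 614/740 (83%), NCBI, KX147105.1",
    "Hypothetical protein",
]
for cell in examples:
    print(cell, "->", percent_identity(cell))
```

Under these assumptions the recomputed percentages agree with every identity value listed in the table.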
Figure 1. Location and length of genes associated with IPD. (A) Location of identified IPD-associated genes (see Table 1) on a 19A streptococcal genome (accession NC_010380.1). Orange rectangle marks the capsular synthesis locus (CPS). Similar plots using other serotype samples can be found in Figs S5–S6. (B) Boxplots and distributions of log10-transformed gene lengths from the IPD-associated genes, the entire pangenome and the soft-accessory genome used in our analysis (see Methods).
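
The log10 transformation used in panel (B) can be reproduced for the IPD-associated genes from the lengths listed in Table 1. This is a rough sketch only: pangenome and soft-accessory lengths are not given here, so no comparison is attempted, and the quartile method (Python's default `statistics.quantiles`) is an assumption that may differ from the one used for the published boxplots.

```python
import math
import statistics

# Gene lengths (bp) copied from Table 1, in table order.
ipd_gene_lengths = [
    452, 330, 249, 504, 954, 996, 231, 510, 324, 504, 306, 399, 327, 471,
    519, 147, 480, 1977, 528, 516, 489, 387, 258, 288, 210, 387, 510, 168,
    489, 1137, 840, 189, 537, 504, 309, 309, 1446, 6702, 1446, 207, 573,
    684, 840,
]

# log10-transform each length, as in the figure's y-axis.
log_lengths = [math.log10(n) for n in ipd_gene_lengths]

# Quartiles of the transformed values (the box of a boxplot).
q1, median, q3 = statistics.quantiles(log_lengths, n=4)

print(f"genes: {len(ipd_gene_lengths)}")
print(f"log10 length: min={min(log_lengths):.2f} q1={q1:.2f} "
      f"median={median:.2f} q3={q3:.2f} max={max(log_lengths):.2f}")
```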