Literature DB >> 29298677

The variome of pneumococcal virulence factors and regulators.

Gustavo Gámez1,2,3,4, Andrés Castro5,6, Alejandro Gómez-Mejia5,7, Mauricio Gallego5,6, Alejandro Bedoya5,6, Mauricio Camargo5, Sven Hammerschmidt7.   

Abstract

BACKGROUND: In recent years, the idea of a highly immunogenic protein-based vaccine to combat Streptococcus pneumoniae and its severe invasive infectious diseases has gained considerable interest. However, the target proteins to be included in a vaccine formulation have to accomplish several genetic and immunological characteristics, (such as conservation, distribution, immunogenicity and protective effect), in order to ensure its suitability and effectiveness. This study aimed to get comprehensive insights into the genomic organization, population distribution and genetic conservation of all pneumococcal surface-exposed proteins, genetic regulators and other virulence factors, whose important function and role in pathogenesis has been demonstrated or hypothesized.
RESULTS: After retrieving the complete set of DNA and protein sequences reported in the databases GenBank, KEGG, VFDB, P2CS and Uniprot for pneumococcal strains whose genomes have been fully sequenced and annotated, a comprehensive bioinformatic analysis and systematic comparison has been performed for each virulence factor, stand-alone regulator and two-component regulatory system (TCS) encoded in the pan-genome of S. pneumoniae. A total of 25 S. pneumoniae strains, representing different pneumococcal phylogenetic lineages and serotypes, were considered. A set of 92 different genes and proteins were identified, classified and studied to construct a pan-genomic variability map (variome) for S. pneumoniae. Both, pneumococcal virulence factors and regulatory genes, were well-distributed in the pneumococcal genome and exhibited a conserved feature of genome organization, where replication and transcription are co-oriented. The analysis of the population distribution for each gene and protein showed that 49 of them are part of the core genome in pneumococci, while 43 belong to the accessory-genome. Estimating the genetic variability revealed that pneumolysin, enolase and Usp45 (SP_2216 in S. p. TIGR4) are the pneumococcal virulence factors with the highest conservation, while TCS08, TCS05, and TCS02 represent the most conserved pneumococcal genetic regulators.
CONCLUSIONS: The results identified well-distributed and highly conserved pneumococcal virulence factors as well as regulators, representing promising candidates for a new generation of serotype-independent protein-based vaccine(s) to combat pneumococcal infections.

Entities:  

Keywords:  Streptococcus Pneumoniae; Two component systems; Variome; Virulence factors

Mesh:

Substances:

Year:  2018        PMID: 29298677      PMCID: PMC5753484          DOI: 10.1186/s12864-017-4376-0

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Streptococcus pneumoniae, also known as the pneumococcus, is a Gram-positive, α-hemolytic and facultative aerobic bacterium. This microorganism is normally found as a harmless commensal in the upper respiratory tract of humans. Pneumococi have a great epidemiological importance due to their high impact on public health, causing more than one and a half million of deaths per year around the world [1]. S. pneumoniae is the main etiologic agent of community-acquired pneumonia. However, this is not its only clinical manifestation, because other kind of diseases such as otitis media, sinusitis, septicemia and meningitis are also caused by this pathogen and associated with high mortality rates [2]. Given the particular biochemical and molecular features of Streptococcus pneumoniae (Gram-positive, catalase-negative, optochin-sensitive and bile-soluble bacteria), its identification process in the laboratory is relatively simple. Nevertheless, the great molecular, biochemical and immunological diversity of its capsule and other antigens such as choline-binding proteins make them one of the hardest bacterial pathogens to face because of its variability [3, 4]. The “Quellung Reaction”, developed over 100 years ago by Neufeld, allows the specifical and reliable identification of each one of the >94 serotypes that have been discovered up to date. The capsular polysaccharide is the sine qua non virulence factor, however the pathogenic potential of serotypes may vary and similarly, the frequencies or prevalence varies from one geographic region to the other [5]. Despite this, the capsule is not the only factor required to induce disease by S. pneumoniae. In fact, the surface of the pneumococcus is decorated by various proteins, which have been already associated with its high pathogenic potential. In addition, their interaction level with the host cellular receptors has been proved, exhibiting crucial pathogenic functions such as adhesion, colonization, breaching tissue barriers and immune evasion [6]. An important group of regulatory proteins of great interest are the histidine kinases (HK), located in the bacterial surface and functioning as the sensors of two-component regulatory systems (TCS). The sensing of environmental signals via TCS, regulates the genetic expression of cellular processes that are of great importance such as natural competence, antibiotic resistance, adaptation to different environmental situations, surface proteins expression, and others [7, 8]. In general, TCS are composed of a histidine kinase, a membrane protein sensing the extracellular signals and transmitting these signals to a cytoplasmatic regulator/effector protein refered to as response regulator (RR). This happens via the HK autophosphorylation and a subsequent trans-phosphorylation process. In Streptococcus pneumoniae, 13 TCS and one orphan RR have been identified [7]. The relevance of the cellular, physiological and pathogenic functions that these pneumococcal proteins fulfill, have aroused a great scientific and biotechnological interest, given their potential pharmaceutical applications as vaccine candidates [9]. Nowadays, the antibiotic treatment of the infections caused by the pneumococcus is often complicated due to the increase of antibiotic resistance [10]. Furthermore, prevention by the use of the pneumococcal polysaccharide vaccines and/or pneumococcal conjugated vaccines only helps to control the disease caused by some of the serotypes and has an indirect impact on colonization [9]. Thus, there is an urge to define more global and effective strategies for the treatment and/or prevention, and to fight the pneumococcus and its local and invasive diseases. Consequently, the idea of a protein-based vaccine has taken great importance in the last years. However, in order to be considered or included in a recombinant vaccine formulation, a bacterial protein has to fulfill specific criteria such as: (1) playing an important role in the bacterial fitness and/or pathogenesis of S. pneumoniae, (2) possessing a wide distribution among the circulating strains and clinical isolates, (3) exhibiting a major conservation at its genetic and protein sequence, (4) being inmunogenic, (5) demonstrating protectivity in experimental assays, and (6) having favorable physico-chemical properties for expression and purification of its recombinant products. Streptococcus pneumoniae is a pathogen exhibiting a fratricide behavior and an enormous capacity for natural competence, acquiring foreign genetic material and integrating it into its genome [11]. These processes, in addition to the mutation rates [12, 13], greatly stimulate the horizontal gene transfer with other microorganisms, and explains pneumococcal genetic variability and genome plasticity [14, 15]. This model of pneumococcal population evolution, where recombination highly outpasses mutation, is also caused by the relatively high numbers of repetitive sequences in the genome thereby facilitating the incorporation of foreign DNA in the chromosome [15-18]. In consequence, these events contribute to structural reorganizations, and influence the presence or absence of protein-encoding genes in differente subsets of the global pneumococcal population, making them highly heterogeneous from the core- and pan-genomic point of view [15]. Likewise, the generation and fixation of particular changes in the genome affect the mutation rates, which in turn influence the evolution and conservation of genes and contribute to adaptative changes that potentially lead to an increased virulence and a more complex interaction with the host [19]. Due to these molecular events and their importance, there is a need to fully and globally understand the genetic heterogeneity and variability among the different pneumococcal strains/serotypes (variome), and to get a deeper and detailed molecular undestanding of the different physiological and pathogenic mechanisms that this microorganim uses to cause severe and life-threatening diseases. Definitely, obtaining this knowledge will allow to identify potential pharmaceutical targets for new antimicriobial therapies. By the recognizition of their conservation and distribution degree among pneumococcal strains, this will confirm protein candidates for vaccines. However, despite the availability of a high number of completely sequenced genomes and the importance to analyse the genetic differences among pneumococci, only a few studies have focused on studying its variability from a global perspective, similarly as the Human variome databases do [20]. To date only the “Microbial Variome Database” [21], which possesses and organizes the available information of the variome of the two Gram-negative bacterial species Escherichia coli and Salmonella enterica, is providing such information for microorganisms. Remarkably, there are no open-source data of this nature for any Gram-positive bacterial genome. Hence, this study focused on the construction of the first S. pneumoniae Variome model, starting with the identification of all allellic and protein variants, a mutation and distribution analysis (presence and absence) of the virulence factors and regulators, among a set of pneumococcal strains that possess a fully sequenced and annotated genome.

Methods

Definition of the study population set and determination of the optimal representation of the entire population of pneumococci

The search and selection of the Streptococcus pneumoniae strains for the analysis in this study was done using the microbial database of the “National Center for Biotechnology Information” NCBI (http://www.ncbi.nlm.nih.gov/genome) [22]. Likewise, in order to ensure an optimal representation of the global pneumococcal population, a genomic BLAST of 8290 available S. pneumoniae genomes was carried out. In brief, DNA alignments, employing the tool “Microbial Nucleotide BLAST” [23], that can be found in the website http://blast.ncbi.nlm.nih.gov/Blast.cgi, were performed for all the currently reported draft or complete sequenced genomes. The comparative data was then employed to construct a DNA-based Phylogenetic Tree (dendrogram), by using the Genome Tree Report Tool of the NCBI (ncbi.nlm.nih.gov/genome/tree/176). Afterwards, the file containing the dendrogram, constructed for the 8290 strains, was downloaded from the NCBI database. Finally, the dendrogram file was viewed, analyzed and adapted in order to generate circular, slanted and/or rectangular cladograms, by using the online NCBI Tool “Tree Viewer 1.17.0”, which is available online at the website: ncbi.nlm.nih.gov/projects/treeview (Fig. 1).
Fig. 1

Phylogenetic tree (slanted cladogram) of the pneumococcal genome / strains. By using the online NCBI Tools Genome Tree Report (ncbi.nlm.nih.gov/genome/tree/176) and the Tree Viewer 1.17.0 (ncbi.nlm.nih.gov/projects/treeview), a phylogenetic tree was constructed from the analysis by genomic BLAST of 8290 sequencing projects of pneumococci reported in the NCBI database. The topology of this slanted cladogram showed different pneumococcal lineages, where the selected set of 25 pneumococcal strains can be identified in red as external nodes (the “well-distributed” key features also highlighted in red), evidencing an optimal representation of the pneumococcal population. The overall number of sequenced pneumococcal genomes is provided for each external node. The blue lines depicted those external nodes where fully sequenced and annotated genomes are located

Phylogenetic tree (slanted cladogram) of the pneumococcal genome / strains. By using the online NCBI Tools Genome Tree Report (ncbi.nlm.nih.gov/genome/tree/176) and the Tree Viewer 1.17.0 (ncbi.nlm.nih.gov/projects/treeview), a phylogenetic tree was constructed from the analysis by genomic BLAST of 8290 sequencing projects of pneumococci reported in the NCBI database. The topology of this slanted cladogram showed different pneumococcal lineages, where the selected set of 25 pneumococcal strains can be identified in red as external nodes (the “well-distributed” key features also highlighted in red), evidencing an optimal representation of the pneumococcal population. The overall number of sequenced pneumococcal genomes is provided for each external node. The blue lines depicted those external nodes where fully sequenced and annotated genomes are located

Definition of the virulence factors and two-component regulatory systems to be studied in S. pneumoniae

The search and selection of genes and proteins widely known as virulence factors or gene encoding factors possessing a proven interaction with the human host was done by an exhaustive bioinformatic screening in the database “Virulence Factors DataBase - VFDB” [24], available at the website http://www.mgc.ac.cn/VFs. Aditionally, the virulence factors and proteins involved in interactions with the host were confirmed and completed by a systematic review of the literature [14, 25]. The common names of each one of the selected virulence factors were then introduced in the database UNIPROT [26], available at http://www.uniprot.org/, with the aim of obtaining the locus tag for S. pneumoniae TIGR4 genome/strain. In addition, the genes encoding the HK or RR of the pneumococcal TCS were identified by using the database Prokaryotic Two-Component Systems - P2CS [27], available at the website http://www.p2cs.org/index.php. Likewise, the corresponding locus tag for S. pneumoniae TIGR4 genome / strain, of each one of the histidine kinases genes (hk) and response regulator genes (rr), were also recovered from the same database.

Chromosomal localization of the virulence factor and two-component regulatory systems genes in S. pneumoniae

The chromosomal location of all the genes in the genome of S. pneumoniae TIGR4 and the construction of the genomic maps, in linear or circular representation, was done by using the software SnapGene® (GSL Biotech), available at http://www.snapgene.com. In brief, the studied genomes of S. pneumoniae were imported through its corresponding access code in GenBank (ie: NC_003028.3 for TIGR4). Then, the chromosomal location of each virulence factor gene, and the factors involved in the interaction with the host and the genes encodying for proteins of simple or two-component regulatory systems were identified. Finally, the lineal maps for the scale genomic localization for the virulence factors and the circular maps for the genomic periphery of the genes that form the two-component regulatory systems were constructed.

Distribution of the virulence factors and two-component regulatory systems in the different strains of S. pneumoniae

The identification of the genetic and protein sequences of interest to perform the comparative analysis was done, having as reference the codes (Locus Tag) in the genomes of S. pneumoniae TIGR4 and/or R6 in the database Kyoto Encyclopedia of Genes and Genomes – KEGG [28], available at http://www.kegg.jp/kegg/. Once every gene of interest was established in the database, a series of comparisons (BLASTs) were performed using the GenomeNet [29], available at http://www.genome.jp/, using only the fully sequenced and annotated genomes of S. pneumoniae. For the nucleotide sequences the search was performed using the program BLASTN 2.2.29+, which uses nucleotide vs nucleotide alignments based on a punctuation matrix BLOSUM62 [23, 30]. In the same way, the search was done for the amino acid sequences using the program BLASTP 2.2.29+ [31, 32], that performs amino acids vs amino acids alignments based on a similar matrix. Once the BLAST was finalized for each virulence factor, the list was purged using as selection criteria genes with an expectancy value: e-Value = 0. The inclusion of genes with an e-value >0 was done by direct visual inspection of the alignments to check that it was indeed the same sequence. By having defined the list with the genes and proteins that fulfilled the selection criteria, it was defined to which strains of S. pneumoniae they belong. All the DNA and protein sequences were downloaded and stored in an organized way using the fasta format.

Genetic variability (variome) of the virulence factors and two-component regulatory systems among the different pneumocococal strains

The multiple comparative alignments of pneumococcal sequences were done using the web tool MultAlin [33], available at http://multalin.toulouse.inra.fr/multalin/, for which an identity matrix 1–0 was used to assign a penalty even for the slightest change in the nucleotides or amino acids sequences, covering substitution, deletions, insertions and variations in the length. From these analyses, the number of allelic and protein variants were determined for each gene according to the registry value assigned by the program to each sequence, where equal sequences have the same registry value, while different sequences possess different values. The results of the alignments were manually curated and stored for further analysis. Finally, the precise determination of the total mutations, synonymous and nonsynonymous was done using the software DnaSP V.5.1 [34, 35], available at http://www.ub.edu/dnasp/. There, all the sequences found for a determined gene were introduced and the calculations were perfomed for the corresponding type of mutation as mentioned before.

Results and discussion

“Hundreds to thousands” of S. pneumoniae strains and clinical isolates recovered from the nasopharynx, blood or cerebrospinal fluid (CSF) have been included up to date in genomic sequencing projects worldwide. However, pneumococcal strains, whose genomes are fully sequenced, annotated and publicly available, are the focus of this study. Therefore, a set of 25 pneumococcal strains were selected from the NCBI database, as population study, to perform the bioinformatic analysis needed to accomplish the construction of the variome of the virulence factors and two-component regulatory systems of Streptococcus pneumoniae (Table 1).
Table 1

The study population set of 25 S. pneumoniae strains included in this study and their serotypes

S. pneumoniae StrainSerotype# of GenesNCBI Annotation
D3922069NC_008533.1
R6No Capsule1967NC_003098.1
TIGR442228NC_003028.3
INV10412003NC_017591.1
AP20011A2284NC_014494.1
JJA142235NC_012466.1
ATCC 70066923F2224NC_011900.1
INV200142113NC_017593.1
CGSP14142276NC_010582.1
G5419F2186NC_011072.1
gamPNI037312226NC_018630.1
P103112254NC_012467.1
SPN03415631956NC_021006.1
SPN99403931974NC_021005.1
SPN99403831974NC_021026.1
SPN03418331985NC_021028.1
OXC14132037NC_017592.1
670-6B6B2430NC_014498.1
A02619F2153NC_022655.1
Taiwan19F-1419F2205NC_012469.1
ST55619F2219NC_017769.1
TCH8431/19A19A2355NC_014251.1
SPNA4531921NC_018594.1
70,58552323NC_012468.1
Hungary19A 619A2402NC_010380.1
The study population set of 25 S. pneumoniae strains included in this study and their serotypes A Variome model of the Pneumococcal Virulence Factors and Regulators is an intraspecific study, aiming to highlight variable genetic loci on the genome of Streptococcus pneumonie. A perfect and ultimate Variome model would be that constructed with the 100% of the genomic information correctly assessed from the entire pneumococcal population. However, the current state of the art is far away from this scenario and an optimal representation of the pneumococcal sets assessed up to date would be appropriate in order to validate these genomic analyzes. Currently, 8290 pneumococcal sequencing projects are reported as draft or complete genomes in the Genome Assembly and Annotation Report of the NCBI database. Therefore, a global genomic BLAST (DNA alignment) of those 8290 available S. pneumoniae genomes/strains was performed and a DNA-based Phylogenetic Tree was constructed by using the Genome Tree Report Tool of the NCBI. The topology of this phylogenetic tree (slanted cladogram) showed different pneumococcal lineages, where the selected set of 25 pneumococcal genomes/strains can be identified as external nodes (“well-distributed” key features highlighted in red), evidencing an optimal representation of the pneumococcal population (Fig. 1). In addition, it is important to highlight that the serotypes (1, 2, 3, 4, 5, 6B, 11A, 14, 19A, 19F and 23F), represented in this study population set, have been described as the pneumococcal types with the highest pathogenic potencial, due to the high burden of invasive pneumococcal diseases (IPDs) they cause worldwide. This is the reason why the majority of them (except serotypes 2 and 11A) have been included in the pneumococcal conjugate vaccines (PCVs) currently used for immunization [1]. An initial considerable number of pneumococcal virulence factor genes were identified, by employing the database VFDB [24]. This database provided further detailed information to establish their function, pathogenic role and type of interaction with a receptor in its human host. Aditionally, a systematic screening of the literature [14] did not only allow the confirmation of identified factors, but also ensured the posibility to complement the list with additional factors that have not been included in the databases. Likewise, the number of the tcs genes (27) was determined using the database Prokaryotic 2-Component Systems - P2CS [27]. In total, 92 different genes encoding 61 surface proteins, 4 stand alone transcriptional regulators, 13 HKs and 14 RRs have been selected and included in this work for the construction of the variome, after being classified by their function and grouped according to their molecular mechanisms of surface-exposure (Table 2).
Table 2

Function or pathogenic role of the virulence factors and two-component regulatory systems of S. pneumoniae

Virulence FactorsProtein NameFunction and/or Pathogenic Role
LPxTG - ProteinsBgaAβ-Galactosidaseβ-Galactosidase Enzyme
EndoDEndo-β-N-Acetylglucosaminidase DVirulence
PclAPneumococcal Collagen-Like ProteinAdherence and Invasion
SpGH101Endo-α-N-AcetylgalactosaminidaseVirulence
StrHβ-N-Acetylhexosaminidaseβ-N-Acetylhexosaminidase Enzyme
NanANeuraminidase AHydrolytic Enzyme, Adherence and Colonization
PfbAPlasmin- and Fibronectin-Binding Protein AAdherence, Immune Evasion and Antiphagocytosis
PrtASubtilysin-Like Serine ProteaseVirulence
PavBPneumococcal Adherence and Virulence Protein BAdherence and Colonization
KsgADimethyladenosine TransferaseVirulence
SpuAAlkaline AmyllopullullanasePullullanase Enzyme and Immune Evasion
HysAHyaluronate LyaseHyaluronidase Enzyme and Colonization
SP_1492Cell Wall Surface Anchor Protein Family, Mucin-Binding ProteinVirulence
ZmpAZinc Metalloprotease A, IgA1IgA1 Protease Enzyme and Colonization
ZmpBZinc Metalloprotease BImmune Evasion and Colonization
ZmpCZinc Metalloprotease CImmune Evasion and Colonization
ZmpDZinc Metalloprotease D, IgA1 Paralog ProteaseImmune Evasion and Colonization
PsrPPneumococcal Serine-Rich Repeats ProteinAdherence
RrgAPilus-1 Tip Protein (Adhesin)Adherence
RrgBPilus-1 Backbone ProteinAdherence
RrgCPilus-1 Anchore ProteinAdherence
PitAPilus-2 Subunit, Ancillary ProteinAdherence
PitBPilus-2 Subunit, Backbone ProteinAdherence
Choline-Binding Proteins (CBPs)LytAAutolysin (N-Acetyl-Muramoyl-L-Alanine Amidase)Autolytic Enzyme, Cell Wall Digestion and Autolysis
LytBEndo-β-N-AcetylglucosamidaseImmune Evasion and Colonization
LytCLysozyme (1,4-β-N-Acetylmuramidase)Adherence, Immune Evasion and Colonization
PceCholine-Binding Protein E, Phosphorylcholine EstearasePhosphorylcholine Estearase Enzyme, Adherence, Colonization and Cellular Metabolism
PcpAPneumococcal Choline-Binding Protein AProtection Against Lung Infection and Sepsis
PspAPneumococcal Surface Protein ACellular Metabolism and Immune Evasion
PspCPneumococcal Surface Protein C, Choline-Binding Protein AAdherence, Immune Evasion, Colonization and Invasion
CbpCCholine-Binding Protein CVirulence
CbpDCholine-Binding Protein DColonization
CbpFCholine-Binding Protein FVirulence
CbpGCholine-Binding Protein GAdherence and Colonization
CbpICholine-Binding Protein IVirulence
CbpJCholine-Binding Protein JVirulence
SP_0667Pneumococcal Surface Protein (Putative Lysozyme)Virulence
LipoproteinsGlnQGlutamine TransporterCellular Metabolism
PiaAIron-Compound ABC TransporterPeptidil-Prolil Isomerase (PPIase) Enzyme
PiuAIron-Compound ABC TransporterPeptidil-Prolil Isomerase (PPIase) Enzyme
PpiAStreptococcal Lipoprotein Rotamase A, Peptidil-Prolil Isomerase (PPIases) EnzymePeptidil-Prolil Isomerase (PPIase) Enzyme
PsaAPeptide Permease Enzyme, Manganese ABC Transporter, Manganese-Binding LipoproteinImmune Evasion
PpmAFoldase Protein PrsA, Proteinase Maturation AAdherence, Immune Evasion, Strain-Specific Colonization and Evasion of Phagocytosis
AliAOligopeptide ABC TrasporterAdherence
PhtAPneumococcal Histidine Triad AAdherence and Immune Evasion
PhtBPneumococcal Histidine Triad BAdherence and Immune Evasion
PhtDPneumococcal Histidine Triad DAdherence and Immune Evasion
PhtEPneumococcal Histidine Triad EAdherence and Immune Evasion
Non-Classical Surface-Exposed ProteinsEnoEnolase (2-Phosphoglycerate Dehydratase)Glycolytic Enzyme, Adherence and Colonization
GAPDHGlyceraldehyde-3-Phosphate DehydrogenaseGlycolytic Enzyme, Adherence and Colonization
HtrAHigh-Temperature Requirement A, Serine Protease (Heat Shock Protein)Serine Protease Enzyme
PavAPneumococcal Adherence and Virulence Protein AAdherence, Immune Evasion, Colonization and Translocation
Pbp1BPenicillin-Binding Protein 1BAntibiotic Resistance
StkPSerine/Threonine Protein KinaseCellular Metabolism and Fitness
Usp45PcsB, Secreted 45-KDa ProteinVirulence
NanBNeuraminidase BHydrolytic Enzyme, Adherence and Colonization
PppAPneumococcal Protective Protein A, Non-Heme Iron-Containing FerritineColonization
Fic-LikeFic-Like Cell Fillamentation ProteinPutative Cytotoxicity
6PGD6-Phosphogluconate DehydrogenaseVirulence
PlyPneumolysinCytolytic Toxin, Adherence, Immune Evasion, Invasion, Dissemination and Complement Activation
NanCNeuraminidase CVirulence
RegulatorsRlrAPathogenicity Island rlrA Transcriptional RegulatorVirulence
MgrAMgrA Family Transcriptional RegulatorVirulence
MerRMerR Family Transcriptional RegulatorVirulence
PsaRIron-Dependent Transcriptional RegulatorVirulence
Histidine Kinases (HKs)HK01Sensor Histidine KinaseVirulence
HK02Sensor Histidine Kinase, VicKAntibiotic Resistance, Virulence and Fitness
HK03Sensor Histidine Kinase, LiaSAntibiotic Resistance and Stress Protection
HK04Sensor Histidine Kinase, PnpSGenetic Competence, Fitness, Immune Evasion
HK05Sensor Histidine Kinase, CiaHAntibiotic Resistance, Genetic Competence and Pathogenesis
HK06Sensor Histidine KinaseColonization and Invasion
HK07Sensor Histidine Kinase, YesMFitness
HK08Sensor Histidine Kinase, SaeSPathogenesis and Fitness
HK09Sensor Histidine KinaseVirulence
HK10Sensor Histidine Kinase, VncSAntibiotic Resistance
HK11Sensor Histidine KinaseBiofilm Formation
HK12Sensor Histidine Kinase, ComDGenetic Competence
HK13Sensor Histidine Kinase, BlpHVirulence
Response Regulators (RRs)RR01Response RegulatorVirulence
RR02Response Regulator, VicRAntibiotic Resistance, Virulence and Fitness
RR03Response Regulator, LiaRAntibiotic Resistance and Stress Protection
RR04Response Regulator, PnpRGenetic Competence, Fitness, Immune Evasion
RR05Response Regulator, CiaRAntibiotic Resistance, Genetic Competence, Pathogenesis
RR06Response RegulatorColonization and Invasion
RR07Response Regulator, YesNFitness
RR08Response Regulator, SaeRPathogenesis and Fitness
RR09Response RegulatorVirulence
RR10Response Regulator, VncRAntibiotic Resistance
RR11Response RegulatorBiofilm Formation
RR12Response Regulator, ComEGenetic Competence
RR13Response Regulator, BlpRVirulence
RR14Response RegulatorVirulence

The proteins are grouped by classes, depending on their surface-exposure mechanism. The names, abreviations and function of the proteins were obtained from literature references

Function or pathogenic role of the virulence factors and two-component regulatory systems of S. pneumoniae The proteins are grouped by classes, depending on their surface-exposure mechanism. The names, abreviations and function of the proteins were obtained from literature references The genomes of 25 analyzed pneumococcal strains comprise genome sizes ranging from 2,024,476 bp in SPN034156 up to 2,245,615 bp in Hungary 19A-6. Likewise, the G + C content varies between 39.50% in CGSP14 and 39.90% in SPN034156. 670-6B is the strain with the highest number of genes (2430) and proteins (2352) and SPN034156 is the strain with the lowest number of genes (1956) and proteins (1799). Hence, the difference among genomes, regarding the number of genes and proteins can be up to 474 genes and 553 proteins, respectively. The overall number of genes for each pneumococcal genome evaluated here overmatches the overall number of proteins because the reported number of genes includes all the tRNA-, rRNA- and protein-encoding genes. Considering the chromosomal localization of pneumococcal virulence factors genes, they are all distributed along the pneumococcal genome (Fig. 2). Interestingly, these genes are located in a co-oriented manner in relation with the origin of replication (oriC: 2.160.822–196). During the bidirectional replication of the genome, gene transcription must be simultaneous [36]. Hence, for the genes oriented in opposite direction to the corresponding replication fork, both molecular machineries will run into a frontal collision that might affect at least one of the processes. For replication, this phenomenon implies a genomic instability, while the gene transcription is probably inefficient. Previous studies have proven that the essential and highly constitutively expressed genes are co-oriented [36]. For the pneumococcus, 30 of the 36 genes encoding virulence factors are localized in the first half of the genome, on the forward strand, and co-oriented with the replication fork clockwise. Similarly, 21 of the 27 virulence factor genes localized on the second half of the genome, are located on the reverse strand and co-oriented with the replication fork moving anti-clockwise (Fig. 2). A similar genome organization is observed for the 27 genes that encode the TCSs in S. pneumoniae, where only one operon, the tcs04 genes (TCS04), is not co-oriented with the replication fork (Fig. 3). These data reinforce the idea that the virulence factor genes and the genes of the tcs are highly important for the pneumococcal interaction with the human host, and its pathogenic potential in processes such as adherence, colonization, invasion, immune evasion, fitness, antibiotic resistance and natural competence (Table 2).
Fig. 2

Chromosomal localization and direction of the virulence factor genes of S. pneumoniae TIGR4. Lineal representation of the pneumococcus genome. The arrows, drawn at scale, localize 62 of the 65 virulence factors and simple regulation genes considered in this study (pitA, pitB and zmpD are not present in the genome of TIGR4). Each color represents a different class of codified protein: blue = sortase-anchored proteins with an LPxTG cleavage motif; violet = choline-binding proteins (CBPs); green = lipoproteins, yellow = non-classical surface proteins (NCSP), and red = stand-alone regulators. This map was constructed using the Software SnapGene® (GSL Biotech; Available at snapgene.com)

Fig. 3

Localization and direction of the two component systems genes in S. pneumoniae TIGR4. Circular representation of the pneumococcal genome. The arrows, not drawn at scale, localize the 27 genes which codifies for the proteins of the 13 two component systems +1 incomplete. Each color indicates a different class of codified protein: red = histidine kinase sensors and blue = response regulators Proteins. This map was constructed using the Software SnapGene® (GSL Biotech; Available at snapgene.com)

Chromosomal localization and direction of the virulence factor genes of S. pneumoniae TIGR4. Lineal representation of the pneumococcus genome. The arrows, drawn at scale, localize 62 of the 65 virulence factors and simple regulation genes considered in this study (pitA, pitB and zmpD are not present in the genome of TIGR4). Each color represents a different class of codified protein: blue = sortase-anchored proteins with an LPxTG cleavage motif; violet = choline-binding proteins (CBPs); green = lipoproteins, yellow = non-classical surface proteins (NCSP), and red = stand-alone regulators. This map was constructed using the Software SnapGene® (GSL Biotech; Available at snapgene.com) Localization and direction of the two component systems genes in S. pneumoniae TIGR4. Circular representation of the pneumococcal genome. The arrows, not drawn at scale, localize the 27 genes which codifies for the proteins of the 13 two component systems +1 incomplete. Each color indicates a different class of codified protein: red = histidine kinase sensors and blue = response regulators Proteins. This map was constructed using the Software SnapGene® (GSL Biotech; Available at snapgene.com) The analysis of the distribution of genes associated with virulence and host-pathogen interactions among the studied pneumococcal strains revealed that only 26 of the 65 genes considered here are present in the all 25 strains. These genes encode for products involved in different functions such as cell wall hydrolysis, ABC transporters and structural proteins implied in the adherence to host tissue, the so-called adhesins. Interestingly, after preliminar inspection (by locus tag, identifier names and/or product sizes) of the datasets and supplementary material reported by van Tonder and colleagues in 2017, only a few of the pneumococcal virulence factors (PspC, KsgA, and 4 hypothetical lipoproteins) and regulators (RR04, HK08, RR08, RR09, RR10) were found in the pneumococcal “supercore” genomic list of 303 genes, based on the analysis of 3121 pneumococci recovered from healthy individuals from four different subsets of the global pneumococcal population [15]. These findings, if confirmed after deeper analysis of the datasets based on sequence comparison, may indicate that pneumococcal pathogenesis is a much more complex process than thought before. While most of the genes have a single copy in the genome, the lytA gene, encoding the major pneumococcal autolysin, is found also in two and even three copies in 13, and 2 strains, respectively. This is most likely due to the multiple integration of prophages in the chromosomal DNA [37] (Table 3). In strain SPNA45, the gene gnd, encoding the enzyme 6-phophogluconate dehydrogenase, is duplicated and fused with a second copy of its downstream neighbor gene, which encodes the orphan response regulator (rr14). The remaining 39 of the 65 virulence factor genes were found to belong to the accesory genome, presenting different degrees of absence in the 25 strains. Thus, all these genes are not essential but are beneficial for fitness and pathogenesis. Striking examples are the genes encoding the Pilus-1 and Pilus-2 structures that have been identified to mediate adherence, contribute to virulence and promote invasion [38-42]. These genes are located on pathogenicity islands (PAI) and these islands contain also the genes required for cell surface anchoring and regulation [38-41]. Remarkably, strains like ST556, Taiwan19F-14 and TCH8431/19A, were detected here as positive for both types of pili (1 and 2). Among the other genes with restricted presence in some strains it is important to mention that they encode for sortase-anchored proteins or choline-binding proteins (CBPs), as well as histidine triad proteins (pht genes). These gene products are associated with different processes of bacterial fitness and pathogenesis (Tables 3 and 2) [6, 43, 44]. Regarding the distribution and data of the analyzed strains for the TCS most of them were found in the 25 pneumococcal strains. Exceptions are presented by the TCS07 and TCS12, which contribute to fitness and competence, respectively [7, 45]. These TCS are absent in a couple of strains (Table 4). In some other strains genes like hk01, hk12 and rr04, presented incomplete sequences, an artefact leading to truncated and hence non-functional proteins/regulators (Tables 3 and 2). Interestingly, only the genes encoding the hk08, rr08, rr09, rr06 and rr04 were found to belong to the “supercore” genomic set of genes reported by van Tonder et al., in 2017 [15], indicating the important role these highly conserved and well-distributed regulatory proteins play in the pneumococcus and in its interplay with the environment.
Table 3

Distribution of the virulence factor and regulation genes of S. pneumoniae

Virulence FactorsTIGR4D39R670,585Hungary19A 6SPN994038SPN994039SPN034183OXC141SPN034156ST556Taiwan19F-14TCH8431/19AA026INV200CGSP14INV104G54AP200P1031gamPNI0373SPNA45ATCC 700669JJA670-6B
LPxTG - Proteins bgaA 1111111111111111111111111
endoD 1111111111111111111111111
pclA 1111111111111111111111111
spGH101 1111111111111111111111111
strH 1111111111111111111111111
nanA 1 111111111111111111111111
pfbA 1111111111111111111110111
prtA 1111111111111111111110111
pavB 1111111111111111111101011
ksgA 1111111101111111111101011
spuA 1111111110111111111111111
hysA 1111100000111111111111111
SP_1492 1111100000111110111111011
zmpA 1111111111111100101000001
zmpB 1111111111111111111110101
zmpC 1000000000000000011000000
zmpD 0000000000000011010110110
psrP 1001100000000001100000000
rrgA 1000100000111100000000001
rrgB 1000100000111100000000001
rrgC 1000100000111100000000001
pitA 0000010000111000101000000
pitB 0000010000111000101000000
Choline-Binding Proteins (CBPs) lytA 111 2 2 2 2 2 2 2 2 1111111 2 2 2 2 2 3 3
lytB 1111111111111111111111111
lytC 1111110000111011111111111
pce 1111111111111111111111111
pcpA 1111100000111111111001111
pspA 1111111111111111111110 1 01
pspC 1111100001111101101111111
cbpC 1111111111111110111110011
cbpD 1111111111111111111111111
cbpF 1111100000000000000000111
cbpG 1000000011111111111111110
cbpI 1000000000000000011000000
cbpJ 1000011111111111000000111
SP_0667 1111111111111111111000000
Lipoproteins glnQ 1111111111111111111111111
piaA 1111111111111111111111111
piuA 1111111111111111111111111
ppiA 1111111111111111111111111
psaA 1111111111111111111111111
ppmA 1111111111111101100111111
aliA 1111100000111111111110111
phtA 1111000000000000101110001
phtB 1000000000110010111010000
phtD 1111111110111011111111111
phtE 1111111111111111111111111
Non-Classical Surface-Exposed Proteins eno 1111111111111111111111111
gap 1111111111111111111111111
htrA 1111111111111111111111111
pavA 1111111111111111111111111
pbp1B 1111111111111111111111111
stkP 1111111111111111111111111
usp45 1111111111111111111111111
nanB 1111011111111111111111111
pppA 1111111011111111111111111
fic-Like1111111111111111110111111
gnd 111111111111111111111 2 111
ply 111111111111111111111 1 111
nanC 1000100000000011111000111
Regulators mgrA 1111111111111111111111111
psaR 1111111111111111111111111
merR 1010011111101111011110111
rlrA 1000100000111100000000001

The present table shows the absence (0) or presence (1, 2 or 3) of each considered genes in the 25 strains selected for this study. The number (1, 2 or 3) indicates the amount of copies of each gene in the genome. lytA is the only factor with more than one copy per genome. In the strain SPNA45, the gene gnd was found duplicated (2) and fused with a duplication of its neighbor gene (rr14) downstream. In the gene nanA of TIGR4 (1) a shift in its ORF was found. However, it has also been reported that NanA is expressed in this pneumoccoccal strain. Gene defective copies (genes with any alteration in their primary DNA sequences) are depicted in bold and italics: In the SPNA45 strain ply is fused with a copy of lytA, and pspA is defective in the ATCC700669 pneumococcal strain

Table 4

Distribution of the genes that conform the two-component systems in S. pneumoniae

Two-Component Systems (TCSs)TIGR4D39R670,585Hungary19A 6SPN994038SPN994039SPN034183OXC141SPN034156ST556Taiwan19F-14TCH8431/19AA026INV200CGSP14INV104G54AP200P1031gamPNI0373SPNA45ATCC 700669JJA670-6B
Histidine Kinases (HKs) hk01 111111111111111111 1 111111
hk02 1111111111111111111111111
hk03 1111111111111111111111111
hk04 1111111111111111111111111
hk05 1111111111111111111111111
hk06 1111111111111111111111111
hk07 1111111111111110111111111
hk08 1111111111111111111111111
hk09 1111111111111111111111111
hk10 1111111111111111111111111
hk11 1111111111111111111111111
hk12 1111111111 1 11111011110111
hk13 1111111111111111111111111
Response Regulators (RRs) rr01 1111111111111111111111111
rr02 1111111111111111111111111
rr03 1111111111111111111111111
rr04 111111111111111111 1 111111
rr05 1111111111111111111111111
rr06 1111111111111111111111111
rr07 1111111111111110111111111
rr08 1111111111111111111111111
rr09 1111111111111111111111111
rr10 1111111111111111111111111
rr11 1111111111111111111111111
rr12 1111111111111111111101111
rr13 1111111111111111111111111
rr14 111111111111111111111 2 111

The table shows, the absence (0) or presence (1 or 2) of each gene considered in the 25 strains selected for this study. The number (1 or 2) indicates the amount of copies per gene in each genome. rr14 is the only gene with more than one copy per genome (2), which is actually fused with its neighbor gene (gnd) upstream. In bold and italics, three genes are observed (hk01, hk12 and rr04) in two different strains which might have some alteration (insertion or deletion) in their primary DNA and protein Sequence. The two-component system NisK-NisR of the pneumococcus is rare, of the 25 strains analyzed in this study only in the strain 70,585 was found

Distribution of the virulence factor and regulation genes of S. pneumoniae The present table shows the absence (0) or presence (1, 2 or 3) of each considered genes in the 25 strains selected for this study. The number (1, 2 or 3) indicates the amount of copies of each gene in the genome. lytA is the only factor with more than one copy per genome. In the strain SPNA45, the gene gnd was found duplicated (2) and fused with a duplication of its neighbor gene (rr14) downstream. In the gene nanA of TIGR4 (1) a shift in its ORF was found. However, it has also been reported that NanA is expressed in this pneumoccoccal strain. Gene defective copies (genes with any alteration in their primary DNA sequences) are depicted in bold and italics: In the SPNA45 strain ply is fused with a copy of lytA, and pspA is defective in the ATCC700669 pneumococcal strain Distribution of the genes that conform the two-component systems in S. pneumoniae The table shows, the absence (0) or presence (1 or 2) of each gene considered in the 25 strains selected for this study. The number (1 or 2) indicates the amount of copies per gene in each genome. rr14 is the only gene with more than one copy per genome (2), which is actually fused with its neighbor gene (gnd) upstream. In bold and italics, three genes are observed (hk01, hk12 and rr04) in two different strains which might have some alteration (insertion or deletion) in their primary DNA and protein Sequence. The two-component system NisK-NisR of the pneumococcus is rare, of the 25 strains analyzed in this study only in the strain 70,585 was found The estimation of the variability for each individual virulence factor and pneumococcal regulator (at the DNA and protein level) allowed the construction of a partial variome for the analysed 25 pneumococcal strains. Briefly, the variome takes into consideration the estimation of (1) the presence, absence or the number of copies of genes in the different strains, (2) the number of total synonymous and nonsynonymous mutations, and (3) the number of allelic and protein variants explaining the variability for each factor. The results summarized in Tables 5 and 6, contain the data for the genes and proteins associated to virulence and host-pathogen interaction, and also the data for the stand-alone and TCS regulators. Specifically there are some identified factors with the best distribution and highest evolutionary conservation, These were (1) the ply gene encoding the sole pneumococcal cytolysin and cytotoxin pneumolysin [46], (2) the enolase, which encodes the enzyme enolase (2-phosphoglycerate dehydratase) and has an essential function in the metabolism [47], but also interacts specifically with plasmin(ogen) and is therefore involved in fibrinolytic processes, adherence and virulence, and (3) the pcsB (Usp45) gene, which encodes for a 45-kDa secreted and immunogenic protein that is involved in cell division and stress response [48]. As for the mutations, these three proteins presented a minor number of changes, in comparison with others proteins that were also analyzed. The variome of the TCS (Table 6) allowed to conclude that the most conserved genes from the evolutionary point of view, are the genes hk05 and rr05 of ciaR/H (tcs05). The TCS CiaRH is involved in the resistance to cefotaxime, regulation of genetic competence and increase in pathogenicity in the respiratory tract in murine models [7, 49, 50]. Meanwhile, hk02 and rr02 (WalR/K, MicA/B or VicR/K), have been associated with resistance to erythromycin and are essential for the bacterial growth. Nevertheless, the latter was proven to be due to its regulon (pcsB), and was no longer essential upon ectopic expression of PcsB [7, 48]. Pneumococcal TCS08 is involved in the genetic regulation of pilus-1 [41]. The mutation analysis showed that the response regulators exhibited a lower rate of variations in comparison to the histidine kinases, being the response regulators rr05, rr02, rr06, and rr08 the most conserved. All the results obtained in this study support the global idea of a new generation of protein-based and serotype-independent vaccines for Streptococcus pneumoniae. The basis is the high degree of distribution and conservation of the virulence proteins in combination with the importance of their functions and immunogenic capacities. This probably makes them ideal pharmacological targets to treat the pneumococcus and its diseases. This might be an alternative to the immunization with the conjugated serotypes, or represent a strategy to combine immunogenic and highly conserved proteins with capsular polysaccharides to generate a serotype-independent immune response.
Table 5

Analysis of the Variome of the virulence factor genes of S. pneumoniae

Virulence Factor GenesLengthMutationsVariantsAnalyzed Sequences
NameLocus in TIGR4 / R6Gene (bp)Protein (a.a.)OverallSynonimousNon-SynonimousAllellesProtein
lytA sp_1937 / spr1754 95731820815454252042
gnd sp_0375 / spr0335 142547482757191126
strH sp_0057 / spr0057 39391312873255212125
lytB sp_0965 / spr0867 19776581328448222025
endoD sp_0498 / spr0040 498016591397069202025
nanA sp_1693 / spr1536 31081035460282178201925
bgaA sp_0648 / spr0565 67022233415247168191825
spGH101 sp_0368 / spr0328 53041767348238110181825
glnQ sp_0609 / spr0534 765254402218171725
phtE sp_1004 / spr0908 31201039562828201625
pbp1B sp_2099 / spr1909 2466821811665171625
pce sp_0930 / spr0831 18846271488068161625
pavA sp_0966 / spr0868 165655140182217 15 25
cbpD sp_2201 / spr2006 134744849282118 14 25
piuA sp_1872 / spr1687 9663212271514 12 25
htrA sp_2239 / spr2045 1182393128414 10 25
pclA sp_1546 / spr1402 6302091991013 10 25
mgrA sp_1800 / spr1622 14824932115613 9 25
ppiA sp_0771 / spr0679 804267168812 9 25
stkP sp_1732 / spr1577 19806595246618 7 25
gap sp_2012 / spr1825 10083351412213 7 25
psaA sp_1650 / spr1494 93030953411213 6 25
psaR sp_1638 / spr1480 65121610559 6 25
piaA sp_1032 / spr0934 10263419277 6 25
usp45 sp_2216 / spr2021 11793929638 4 25
eno sp_1128 / spr1036 13054341716112 2 25
spuA sp_0268 / spr0247 384312801217744181824
prtA sp_0641 / spr0561 64232140436272164181624
nanB sp_1687 / spr1531 2094697452025181624
pfbA sp_1833 / spr1652 212770820980129151524
pppA sp_1572 / spr1430 537178694323161224
ply sp_1923 / spr1739 14164712019114 2 24
pavB sp_0082 / spr0075 2574857(−)(−)(−)201723
phtD sp_1003 / spr0907 2520839604331273171723
zmpB sp_0664 / spr0581 56461881(−)(−)(−)151523
pspA sp_0117 / spr0121 2235744(−)(−)(−)181722
cbpC sp_0377 / spr0337 10233401768789151422
ksgA sp_1992 / spr1806 66622136828141422
ppmA sp_0981 / spr0884 9423139639622
hysA sp_0314 / spr0286 32011066914150181720
lytC sp_1573 / spr1431 1473490802060171620
pspC sp_2190 / spr1995 2082693(−)(−)(−)191919
aliA sp_0366 / spr0327 1986661675017171719
SP_0667 sp_0667 / spr0583 999332874938131319
merR sp_0739 / spr0649 741246179812919
pcpA sp_2136 / spr1945 1866621432518171318
SP_1492 sp_1492 / spr1345 60920218612111118
cbpG sp_0390 / spr0349 8582851064759151517
zmpA sp_1154 / spr1042 60152004(−)(−)(−)101017
cbpJ sp_0378 / Absent 99933287523510915
nanC sp_1326 / Absent 22237406439257710
phtA sp_1175 / spr1061 2409802483414879
phtB sp_1174 / Absent 2460819331202129778
cbpF sp_0391 / spr0351 1023340572928778
zmpD Absent / Absent 52381745(−)(−)(−)657
rrgA sp_0462 / Absent 2682893420225195557
rrgC sp_0464 / Absent 118239319811557
rrgB sp_0463 / Absent 1998665738290448447
rlrA sp_0461 / Absent 1530509000227
pitA Absent / Absent 1770589101556
pitB Absent / Absent 1233410000336
psrP sp_1772 / Absent 14,3314776(−)(−)(−)555
zmpC sp_0071 / Absent 55711856211223
cbpI sp_0069 / Absent 636211000113

Data of the punctual genetic variability (total mutations, synonymous and nonsynonymous + allelic and protein variants) estimated for each one of the virulence factors and simple regulators genes. The analyzed sequences depend on the presence, absence or number of copies of the genes in the different strains. The size of the sequences and loci are also shown in TIGR4 and R6, pneumococcal representative strains. Factors in bold were identified as the most conserved. (−) = mutations could not be estimated for different reasons, like repetitive sequences

Table 6

Analysis of the genetic variation (Variome) of the genes that conform the two-component systems in S. pneumoniae

Virulence Factor GenesLengthMutationsVariantsAnalyzed Sequences
NameLocus in TIGR4 / R6Gene (bp)Protein (a.a.)OverallSynonimousNon-SynonimousAllellesProtein
rr14 sp_0376 / spr0336 6902291715214526
hk11 sp_2001 / spr1815 1098365247142105181825
hk10 sp_0604 / spr0529 1329442331617171625
hk06 sp_2192 / spr1997 1332443332310171225
hk13 sp_0527 / spr0464 1341446272153119121125
rr13 sp_0526 / spr0463 738245786513111125
rr11 sp_2000 / spr1814 600199775621151025
hk03 sp_0386 / spr0343 996331362610131025
hk01 sp_1632 / spr1473 97532420119131025
hk09 sp_0662 / spr0579 16925632317615825
rr01 sp_1633 / spr1474 6782251310313825
rr09 sp_0661 / spr0578 7382451410412725
rr04 sp_2082 / spr1893 7082354917329625
rr03 sp_0387 / spr0344 6332101410412525
rr10 sp_0603 / spr0528 6572181510511525
rr06 sp_2193 / spr1998 6542179547525
rr08 sp_0083 / spr0076 699232106410 4 25
hk04 sp_2083 / spr1894 1332443129310 4 25
hk08 sp_0084 / spr0077 10533501412211 3 25
rr05 sp_0798 / spr0707 67522465111 3 25
hk02 sp_1226 / spr1106 13504491210210 3 25
rr02 sp_1227 / spr1107 70523477011 2 25
hk05 sp_0799 / spr0708 1335444111019 2 25
hk07 sp_0155 / spr0153 1647548684622191724
rr07 sp_0156 / spr0154 1287428503317171424
rr12 sp_2235 / spr 2041 753250118310324
hk12 sp_2236 / spr2042 132644142162611923

Data of the punctual genetic variability (total mutations, synonymous and non-synonymous + allelic and protein variants) estimated for each one of the two-component system genes. The analyzed sequences depend on the presence, absence or number of copies of the genes in the different strains. The size of the sequences and loci are also shown in TIGR4 and R6, pneumococcal representative strains. Factors in bold are the most conserved

Analysis of the Variome of the virulence factor genes of S. pneumoniae Data of the punctual genetic variability (total mutations, synonymous and nonsynonymous + allelic and protein variants) estimated for each one of the virulence factors and simple regulators genes. The analyzed sequences depend on the presence, absence or number of copies of the genes in the different strains. The size of the sequences and loci are also shown in TIGR4 and R6, pneumococcal representative strains. Factors in bold were identified as the most conserved. (−) = mutations could not be estimated for different reasons, like repetitive sequences Analysis of the genetic variation (Variome) of the genes that conform the two-component systems in S. pneumoniae Data of the punctual genetic variability (total mutations, synonymous and non-synonymous + allelic and protein variants) estimated for each one of the two-component system genes. The analyzed sequences depend on the presence, absence or number of copies of the genes in the different strains. The size of the sequences and loci are also shown in TIGR4 and R6, pneumococcal representative strains. Factors in bold are the most conserved

Conclusions

The construction of this “low-scale” Variome model for the virulence factors and regulators of Streptococcus pneumoniae was achieved from 25 pneumococcal strains with fully sequenced and annotated genomes. According to the Molecular Phylogenetic Analysis performed on the NCBI website, this selected set of pneumococcal genomes ensured an optimal representation of the pneumococcal population (8290 strains) reported in the NCBI database up to date. Similarly, this study population set also represented an important group of highgly pathogenic pneumococcal serotypes (1, 2, 3, 4, 5, 6B, 11A, 14, 19A, 19F and 23F), which have been also included in the current pneumococcal conjugate vaccine formulations (except serotypes 2 and 11A), used to prevent penumococal infections. A total of 92 different genes and proteins were identified, classified, and studied for the construction of the variome. The genes of the pneumococcal virulence factors and TCS, are distributed along the genome, and are located in such a manner that transcription is co-oriented with replication. The analysis of the gene distribution in this study population set showed that 26 of them were found in the 100% of the 25 pneumococcal genomes/strains (core genome), while 39 are part of the flexible genome. The estimation of the variability for each individual virulence factors, stand-alone regulator or TCS, indicated that the virulence factors with the lowest variability in the pneumococcus are pneumolysin, enolase and PcsB, while the regulators with the highest conservation are TCS05 (CiaR/H), TCS02 (VicR/K) and TCS08. Finally, all the results obtained here with the bioinformatic analysis performed, constitute the first model to compare, visualize and understand the future flood of new genomic data about the genetic variation (in terms of gene presence/absence or mutation) of pneumococcal virulence factors and regulators [51-53]. The applicability offered by this variome model, together with further population genomic analysis of pneumococci, will provide relevant information on potential targets for vaccines, supporting the idea of a new generation of protein-based formulations to combat Streptococcus pneumoniae and its disease burden.
  47 in total

1.  KEGG: kyoto encyclopedia of genes and genomes.

Authors:  M Kanehisa; S Goto
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Estimating recombinational parameters in Streptococcus pneumoniae from multilocus sequence typing data.

Authors:  E J Feil; J M Smith; M C Enright; B G Spratt
Journal:  Genetics       Date:  2000-04       Impact factor: 4.562

3.  Evidence of localized prophage-host recombination in the lytA gene, encoding the major pneumococcal autolysin.

Authors:  María Morales; Pedro García; Adela G de la Campa; Josefina Liñares; Carmen Ardanuy; Ernesto García
Journal:  J Bacteriol       Date:  2010-03-19       Impact factor: 3.490

Review 4.  Source-sink dynamics of virulence evolution.

Authors:  Evgeni V Sokurenko; Richard Gomulkiewicz; Daniel E Dykhuizen
Journal:  Nat Rev Microbiol       Date:  2006-07       Impact factor: 60.633

Review 5.  Linking databases and organisms: GenomeNet resources in Japan.

Authors:  M Kanehisa
Journal:  Trends Biochem Sci       Date:  1997-11       Impact factor: 13.807

6.  Constitutive expression of PcsB suppresses the requirement for the essential VicR (YycF) response regulator in Streptococcus pneumoniae R6.

Authors:  Wai-Leung Ng; Gregory T Robertson; Krystyna M Kazmierczak; Jingyong Zhao; Raymond Gilmour; Malcolm E Winkler
Journal:  Mol Microbiol       Date:  2003-12       Impact factor: 3.501

Review 7.  Systematic evaluation of serotypes causing invasive pneumococcal disease among children under five: the pneumococcal global serotype project.

Authors:  Hope L Johnson; Maria Deloria-Knoll; Orin S Levine; Sonia K Stoszek; Laura Freimanis Hance; Richard Reithinger; Larry R Muenz; Katherine L O'Brien
Journal:  PLoS Med       Date:  2010-10-05       Impact factor: 11.069

8.  Co-orientation of replication and transcription preserves genome integrity.

Authors:  Anjana Srivatsan; Ashley Tehranchi; David M MacAlpine; Jue D Wang
Journal:  PLoS Genet       Date:  2010-01-15       Impact factor: 5.917

Review 9.  Understanding the pneumococcus: transmission and evolution.

Authors:  Eric S Donkor
Journal:  Front Cell Infect Microbiol       Date:  2013-03-07       Impact factor: 5.293

10.  Variation of pneumococcal Pilus-1 expression results in vaccine escape during Experimental Otitis Media [EOM].

Authors:  Marisol Figueira; Monica Moschioni; Gabriella De Angelis; Michèle Barocchi; Vishakha Sabharwal; Vega Masignani; Stephen I Pelton
Journal:  PLoS One       Date:  2014-01-08       Impact factor: 3.240

View more
  13 in total

1.  Comparative genomics of invasive Streptococcus pneumoniae CC320/271 serotype 19F/19A before the introduction of pneumococcal vaccine in India.

Authors:  Rosemol Varghese; Ayyanraj Neeravi; Jobin John Jacob; Karthick Vasudevan; Jones Lionel Kumar; Nithya Subramanian; Balaji Veeraraghavan
Journal:  Mol Biol Rep       Date:  2021-04-19       Impact factor: 2.316

2.  The complexity of serotype replacement of pneumococci.

Authors:  Orsolya Dobay
Journal:  Hum Vaccin Immunother       Date:  2019-06-13       Impact factor: 3.452

3.  A Cross-Reactive Protein Vaccine Combined with PCV-13 Prevents Streptococcus pneumoniae- and Haemophilus influenzae-Mediated Acute Otitis Media.

Authors:  Hannah M Rowe; Beth Mann; Amy Iverson; Aaron Poole; Elaine Tuomanen; Jason W Rosch
Journal:  Infect Immun       Date:  2019-09-19       Impact factor: 3.441

4.  Molecular characterization and epidemiology of Streptococcus pneumoniae serotype 24F in Denmark.

Authors:  Ioanna Drakaki Kavalari; Kurt Fuursted; Karen A Krogfelt; H-C Slotved
Journal:  Sci Rep       Date:  2019-04-02       Impact factor: 4.379

5.  Genomic Characterization of the Emerging Pathogen Streptococcus pseudopneumoniae.

Authors:  Geneviève Garriss; Priyanka Nannapaneni; Alexandra S Simões; Sarah Browall; Karthik Subramanian; Raquel Sá-Leão; Herman Goossens; Herminia de Lencastre; Birgitta Henriques-Normark
Journal:  mBio       Date:  2019-06-25       Impact factor: 7.867

6.  The identification of co-expressed gene modules in Streptococcus pneumonia from colonization to infection to predict novel potential virulence genes.

Authors:  Sadegh Azimzadeh Jamalkandi; Morteza Kouhsar; Jafar Salimian; Ali Ahmadi
Journal:  BMC Microbiol       Date:  2020-12-17       Impact factor: 3.605

7.  Multipathogen Analysis of IgA and IgG Antigen Specificity for Selected Pathogens in Milk Produced by Women From Diverse Geographical Regions: The INSPIRE Study.

Authors:  Michelle K McGuire; Arlo Z Randall; Antti E Seppo; Kirsi M Järvinen; Courtney L Meehan; Debela Gindola; Janet E Williams; Daniel W Sellen; Elizabeth W Kamau-Mbuthia; Egidioh W Kamundia; Samwel Mbugua; Sophie E Moore; Andrew M Prentice; James A Foster; Gloria E Otoo; Juan M Rodríguez; Rossina G Pareja; Lars Bode; Mark A McGuire; Joseph J Campo
Journal:  Front Immunol       Date:  2021-02-11       Impact factor: 7.561

Review 8.  Multifaceted Role of Pneumolysin in the Pathogenesis of Myocardial Injury in Community-Acquired Pneumonia.

Authors:  Ronald Anderson; Jan G Nel; Charles Feldman
Journal:  Int J Mol Sci       Date:  2018-04-11       Impact factor: 5.923

9.  Designing self-assembled peptide nanovaccine against Streptococcus pneumoniae: An in silico strategy.

Authors:  Hesam Dorosti; Mahboobeh Eslami; Navid Nezafat; Fardin Fadaei; Younes Ghasemi
Journal:  Mol Cell Probes       Date:  2019-09-11       Impact factor: 2.365

Review 10.  Streptococcal Serine-Rich Repeat Proteins in Colonization and Disease.

Authors:  Jia Mun Chan; Andrea Gori; Angela H Nobbs; Robert S Heyderman
Journal:  Front Microbiol       Date:  2020-10-30       Impact factor: 5.640

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.