Identification of Streptococcus mitis321A vaccine antigens based on reverse vaccinology.

Qiao Zhang¹, Kexiong Lin¹, Changzheng Wang¹, Zhi Xu¹, Li Yang¹, Qianli Ma¹.

Abstract

Streptococcus mitis (S. mitis) may transform into highly pathogenic bacteria. The aim of the present study was to identify potential antigen targets for designing an effective vaccine against the pathogenic S. mitis321A. The genome of S. mitis321A was sequenced using an Illumina HiSeq 2000 instrument. Subsequently, Glimmer 3.02 and Tandem Repeat Finder (TRF) 4.04 were used to predict genes and tandem repeats, respectively, and DNA sequence functions were analyzed using the Basic Local Alignment Search Tool (BLAST) against the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups of proteins (COG) databases. Putative antigen candidate genes were screened with BLAST prior to phylogenetic tree analysis. The DNA sequence assembly size was 2,110,680 bp with 40.12% GC, 6 scaffolds and 9 contigs. Consequently, 1,944 genes were predicted, and 119 tandem repeats, 56 microsatellite DNA, 10 minisatellite DNA and 154 transposons were acquired. The predicted genes were associated with various pathways and functions concerning membrane transport and energy metabolism. Multiple putative genes encoding surface proteins, secreted proteins and virulence factors, as well as essential genes, were determined. The majority of essential genes belonged to the same phylogenetic lineage, while 321AGL000129 and 321AGL000299 were on the same branch. The current study provided useful information regarding the biological function of the S. mitis321A genome and recommended putative antigen candidates for developing a potent vaccine against S. mitis.

Year: 2018 PMID: 29620181 PMCID: PMC5983942 DOI: 10.3892/mmr.2018.8799

Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952

Introduction

α-hemolytic Streptococcus is the foremost cause of pneumonia in all age groups with the exception of newborns, and occasionally predisposes individuals to peritonitis, otitis media, sinusitis and meningitis (1,2). Streptococcus mitis is a Gram-positive, α-hemolytic species of Streptococcus. It is the closest relative of Streptococcus pneumoniae, whose high pathogenicity is due to a variety of virulence factors, including pneumolysin (Ply; a hemolytic cytolysin), the autolysin LytA and various surface proteins involved in host cell interaction, and it shares >900 core genes with S. pneumoniae (3). The majority of previous studies have described S. mitis as a normal commensal that colonizes the human oropharynx and is characterized by low pathogenicity (4,5). However, diverse infectious complications, such as infective endocarditis, bacteraemia and septicaemia, occur in immunocompromised patients when S. mitis escapes from the colonizing site and shifts from a commensal to a pathogenic microorganism (6–8). A recent study using multilocus sequence analysis revealed that severe clinical diseases are more likely to occur in cancer patients with S. mitis than in patients with Streptococcus oralis (9). S. mitis resists certain antibiotics and induces infective endocarditis in immunocompromised patients (10). Its infection often combines with other pathogenic factors and appears to cause various complications in patients with variable symptoms and signs, leading to difficulties in treatment. Furthermore, there is a lack of effective therapeutic strategies targeting these complications (11,12). Therefore, developing an effective vaccine to reduce the incidence of diseases induced by S. mitis pathogenicity in immunocompromised patients is considered to be important.
Establishing the complete genome sequence of a free-living organism enabled the development of reverse vaccinology (RV), a novel approach to vaccine design against bacterial infections that relies on deciphering the information contained in the genome of the bacterium. Marked progress has been made in understanding the biology of pathogens and in vaccine development as a result of advances in genomics and RV (13). RV has been applied to group B Streptococcus (14), S. pneumoniae (15) and human herpes simplex viruses (16). In addition, Rickettsia prowazekii T-cell antigens have been identified by combining RV technology and in vivo screening (17). RV has also facilitated the identification of vaccine candidates in Rhipicephalus microplus (18). Consequently, numerous antigen candidates for these pathogens have been acquired, which demonstrates the significance and power of RV. In addition to guiding vaccine design, RV has promoted understanding of the pathogenesis of meningococcus (13). The aim of the present study was to identify potential antigens suitable for use in an effective vaccine. The candidate antigens of the pathogenic bacterium were screened using RV based on whole genome sequencing of S. mitis321A. The biological functions and signaling pathways of the predicted genes in the genome were also analyzed.

Materials and methods

Sample collection

The clinical strain S. mitis321A was collected using pharyngeal swabs from a 70-year-old male patient with chronic obstructive pulmonary disease in a stable state (moderate severity) at the Institute of Respiratory Disease, Xinqiao Hospital of Third Military Medical University in February 2011. The S. mitis321A strain was seeded onto blood agar plates containing 5% sheep blood and grown overnight at 37°C. A single colony was subsequently cultured and grown to mid-logarithmic phase in Todd-Hewitt broth (THB) supplemented with 0.5% yeast extract at 37°C [5–6 h; optical density (OD)=0.5–0.6 at a wavelength of 600 nm]. Bacterial DNA was extracted from overnight broth cultures using a QIAamp DNA mini kit (Qiagen AG, Basel, Switzerland) according to the manufacturer's protocol. The patient provided informed consent prior to the present study.

Preprocessing and DNA sequencing

Large DNA fragments were sheared into small fragments (≤800 bp) using a high-throughput sonication instrument (Covaris or BioRuptor). The sticky ends of the small DNA fragments were converted into blunt ends using T4 DNA polymerase, Klenow DNA polymerase and T4 polynucleotide kinase (Illumina, Inc., San Diego, CA, USA), followed by ligation of adaptors to the ends. Subsequently, the blunt-ended DNA fragments were subjected to electrophoretic separation (2% agarose gel in TAE buffer; 120 V; 60 min) to recover the target DNA products, followed by polymerase chain reaction (PCR) amplification according to the manufacturer's instructions. Briefly, the PCR reaction mix included DNA (1 µg), Phusion DNA polymerase (Finnzymes; Thermo Fisher Scientific, Inc., Waltham, MA, USA), PCR primer 1.1 (1 µl; Illumina, Inc.), PCR primer 2.1 (1 µl; Illumina, Inc.) and deionized water (22 µl). The amplification protocol was as follows: 98°C for 30 sec; 10 cycles of 98°C for 10 sec, 65°C for 30 sec and 72°C for 30 sec; and a final extension at 72°C for 5 min. When the DNA library was ready, DNA clusters were formed and sequenced on an Illumina HiSeq 2000 instrument (Illumina, Inc.).

Raw data purification

The genomic DNA was used for constructing 500- and 6,000-bp random sequencing libraries. For DNA filtering, low-quality data were deleted from the raw data generated on the sequencing platform to increase the accuracy and reliability of subsequent analyses, yielding clean data. The 500- and 6,000-bp libraries were handled as follows: i) bases 1–90 were extracted from read1 and read2; ii) reads containing >36 consecutive bases with a quality score ≤20 were deleted (default 40%; the cutoff was set as 36 bp); iii) reads in which the number of N bases reached a set limit were removed (default 10%; the cutoff was set as 9 bp); iv) adapter sequences were deleted (default: adapter sequence has a 15-bp overlap with read sequences); and v) duplicated sequences were removed. Following this process, 10–20% of the data (small fragment DNA data) were removed and clean data were obtained. K-mer frequency distribution analysis of the DNA sequencing reads (19,20) was performed as a preliminary step to evaluate the size of the genome prior to DNA sequence assembly, which used the obtained clean data and the SOAPdenovo (http://soap.genomics.org.cn/soapdenovo.html) (-k:73) short read assembler (21).
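Filtering steps i), ii), iii) and v) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the Phred+33 quality encoding and the exact reading of the cutoffs (a read fails when its longest run of bases with quality ≤20 exceeds 36, or when it contains more than 9 N bases) are all assumptions, and adapter removal (step iv) is not shown.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def passes_filter(seq, qual, trim_len=90, q_cutoff=20, max_low_q=36,
                  max_n=9, phred_offset=33, consecutive=True):
    """Apply the trimming, low-quality and N-content rules to one read.

    phred_offset=33 is an assumption; older Illumina pipelines used 64.
    With consecutive=False, low-quality bases are counted anywhere in the read.
    """
    seq, qual = seq[:trim_len], qual[:trim_len]          # step i: keep bases 1-90
    low = [ord(ch) - phred_offset <= q_cutoff for ch in qual]
    low_q = longest_run(low) if consecutive else sum(low)
    n_count = seq.upper().count("N")
    return low_q <= max_low_q and n_count <= max_n       # steps ii and iii


def filter_pairs(pairs, trim_len=90):
    """Keep pairs in which both mates pass, dropping exact duplicate pairs (step v)."""
    seen, kept = set(), []
    for (seq1, qual1), (seq2, qual2) in pairs:
        if not (passes_filter(seq1, qual1) and passes_filter(seq2, qual2)):
            continue
        key = (seq1[:trim_len], seq2[:trim_len])
        if key in seen:
            continue
        seen.add(key)
        kept.append(((seq1, qual1), (seq2, qual2)))
    return kept
```

Because the text gives both a percentage default (40%) and a "consecutive" wording, the `consecutive` flag lets either reading be tried on the same data.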

Genome analysis

Glimmer software is specifically designed to find genes in the DNA of microorganisms, such as bacteria and viruses. Compared with previous versions, Glimmer 3.02 (http://ccb.jhu.edu/software/glimmer/index.shtml) (-l linear) is more powerful for predicting the initiation site and coding region, with improved accuracy for GC-rich sequences and a lower false positive rate (22). In the present study, Glimmer 3.02 was used to predict genes in the reconstructed sequences following DNA sequence assembly. Tandem Repeat Finder (TRF) 4.04 (2 7 7 80 10 50 2000 -d -h) was applied to predict tandem repeats, from which mini- and microsatellite sequences were screened according to the length and number of repeats.

Functional annotation

Functional annotation of the obtained DNA sequences was conducted using Basic Local Alignment Search Tool (BLAST) analysis against the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.kegg.jp/) (23), Cluster of Orthologous Groups of proteins (COG, http://www.ncbi.nlm.nih.gov/COG) (24), SwissProt (http://www.expasy.org/sprot/) (25), non-redundant protein database (NR, http://www.ncbi.nlm.nih.gov/RefSeq/) (26), Gene Ontology (GO, http://www.geneontology.org/) (27), InterProScan (http://www.ebi.ac.uk/interpro/scan.html) and TrEMBL (http://www.expasy.org/sprot/). Specifically, the query amino acid sequence corresponding to the obtained DNA was mapped to the known amino acid sequences in these databases; the function of the query protein was inferred from the known sequence that resembled it above a certain threshold. The COG database contains 2,091 COGs, covers 56–83% of the gene products extracted from the complete bacterial and archaeal genomes and facilitates protein classification. The SWISS-PROT protein knowledgebase aims to provide detailed annotation information for amino acid sequences, including the function, domain structure, variants and post-translational modifications. TrEMBL is a supplement to the SWISS-PROT database. InterProScan is a tool that predicts the functions of a given protein sequence based on known information concerning protein domains and functional sites (28).

Screening vaccine antigen

RV integrated with bioinformatics approaches was utilized to screen genes encoding antigens of S. mitis321A that may elicit a protective immune response in the human body. There is a common consensus that the cell surface antigens, secreted proteins and virulence proteins of pathogenic microorganisms may serve as antigens for vaccine development (29,30); thus, genome sequences encoding secreted proteins, cell surface anchoring proteins and virulence factors were selected in the study as follows. Firstly, all information associated with secreted proteins was downloaded from Cell-PLoc (http://www.csbio.sjtu.edu.cn/bioinf/Cell-PLoc-2/), an online package of multiple web servers that holds extensive information on the subcellular locations of proteins in diverse organisms (31). The downloaded secreted protein sequences were compared with the retained target protein sequences to determine the protein sequences sharing significant homology with the secreted proteins (E-value, Subsequently, the Pfam database (https://pfam.xfam.org/) provides detailed information associated with multiple sequence alignments and families of proteins (33). By searching the Pfam database for cell surface-expressed anchoring protein families, the Ecm33 (glycosyl phosphatidyl inositol-anchored cell wall organization protein) family and its protein sequence were obtained and then compared with the target protein sequences of S. mitis321A to determine significant protein sequences (E-value, The Virulence Searcher (http://www.hpa-bioinfotools.org.uk/help/virfactfind_help.html) online tool allows for convenient searches for putative genes encoding virulence factors (34). Using this tool, motif information associated with virulence factors was acquired and integrated with the functional annotation results of S. mitis321A to identify possible genes encoding virulence factors.
Finally, the obtained secreted proteins, anchoring proteins and virulence factors were compared with the known protective antigens of S. mitis321A that had been reported in previous studies using BLAST to screen significant vaccine candidate genes, which resembled the known protective antigens above the threshold value (E-value,
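The homology screens in this section all amount to keeping BLAST hits whose E-value falls below a cutoff. A minimal sketch of that step is given below, assuming the BLAST results are available in the standard 12-column tabular format (query, subject, identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score). The function name `best_hits` and all identifiers in the example are hypothetical, and the cutoff is passed in explicitly rather than hard-coded; the value used in the example is a placeholder, not the threshold used in the study.

```python
def best_hits(blast_lines, evalue_cutoff):
    """Return {query_id: (subject_id, evalue, bitscore)} for the best hit
    of each query whose E-value does not exceed evalue_cutoff."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > evalue_cutoff:
            continue  # not significant at the chosen cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical example rows (not results from the study).
example = [
    "geneA\tantigen1\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-50\t550",
    "geneA\tantigen2\t60.0\t280\t100\t4\t5\t284\t10\t289\t1e-20\t200",
    "geneB\tantigen3\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25",
]
candidates = best_hits(example, evalue_cutoff=1e-5)  # placeholder cutoff
```

Ranking by bit score rather than E-value avoids ties when several hits report an E-value of zero.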

Essential gene screening for developing antibacterial drugs

As essential genes are crucial for bacterial survival and may serve as target genes for developing potent antimicrobial drugs, the essential genes of S. mitis321A were screened to identify potential target genes. Initially, BLAST was used with target gene sequences against the essential gene sequences in the Database of Essential Genes (DEG) (E-value, A multiple sequence alignment of all the candidate essential gene sequences was produced using the ClustalW2 program (http://www.ebi.ac.uk/Tools/msa/clustalw2/) (36). From the alignment, a phylogenetic tree was generated and visualized using the PHYLogeny Inference Package (version 3.695; http://atgc.lirmm.fr/phyml) (37). A bootstrap analysis with 1,000 replications was conducted to evaluate the robustness of the tree. All gaps and low-confidence regions of the alignment were excluded from the phylogenetic analysis.
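The two alignment-handling steps described above, removal of gapped columns and bootstrap resampling, can be illustrated with a short Python sketch. It is not the procedure implemented in the phylogeny software: the function names and the toy alignment are hypothetical, and tree building on each replicate is left to the phylogeny package.

```python
import random


def strip_gap_columns(alignment):
    """Drop every alignment column that contains a gap ('-') in any sequence."""
    names = list(alignment)
    length = len(alignment[names[0]])
    keep = [i for i in range(length)
            if all(alignment[name][i] != "-" for name in names)]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Yield pseudo-alignments built by sampling columns with replacement."""
    rng = random.Random(seed)
    names = list(alignment)
    length = len(alignment[names[0]])
    for _ in range(n_replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(alignment[name][i] for i in cols) for name in names}


# Toy alignment (hypothetical sequences).
toy = {"a": "AC-GT", "b": "ACTGT", "c": "A-TGA"}
clean = strip_gap_columns(toy)
replicates = list(bootstrap_replicates(clean, n_replicates=1000))
```

A tree is then inferred from each replicate, and the support for a branch is the fraction of replicate trees in which that branch appears.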

Results

A total of 332 Mb of DNA sequence data was retained following purification, and the details of the purification are presented in Table I.

Table I.

Genome sequencing data of Streptococcus mitis321A.

Sample name	Insert size (bp)	Reads length (bp)	Raw data (Mb)	Adapter (%)	Duplication (%)	Total reads	Filtered reads (%)	Low quality filtered reads (%)	Clean data (Mb)
321A	464	(90:90)	227	0.07	0.53	2,527,772	2.86	1.14	221
321A	600	(90:90)	125	2.07	0.71	1,378,728	11.13	1.94	111

K-mer frequency distribution was analyzed to calculate the genome size. As shown in Fig. 1, no apparent heterozygosity peak or repeat peak was observed, indicating a low degree of heterozygosity and repetition in the DNA sequences. The result of the DNA sequence assembly is presented in Table II. The assembled genome size was 2,110,680 bp, with 40.12% GC, 6 scaffolds and 9 contigs.

Figure 1.

K-mer frequency distribution. The x-axis represents the k-mer depth, and the y-axis represents the frequency at each depth as a percentage of the total frequency. Typically, the K-mer frequency distribution follows a Poisson distribution. A peak at half the depth of the main peak denotes heterozygosity, and a peak at an integer multiple of the depth of the main peak denotes a degree of repetition.
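A common way to turn such a distribution into a genome size estimate is to divide the total number of k-mers by the depth of the main peak. The sketch below illustrates this on toy data; it is a simplified assumption of how the estimate can be made, not the tool used in the study. The function names, the choice to ignore depth-1 k-mers when locating the peak, and the toy genome are hypothetical, and reverse-complement k-mers are not merged.

```python
import random
from collections import Counter


def kmer_depth_histogram(reads, k):
    """Return {depth: number of distinct k-mers observed at that depth}."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return Counter(counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Estimate genome size as (total k-mers) / (depth of the main peak)."""
    candidates = [depth for depth in histogram if depth >= min_depth]
    peak_depth = max(candidates, key=lambda depth: histogram[depth])
    total_kmers = sum(depth * n for depth, n in histogram.items())
    return total_kmers / peak_depth, peak_depth


# Toy data (hypothetical): a random 1,000-bp "genome" read in overlapping 50-bp windows.
rng = random.Random(1)
genome = "".join(rng.choice("ACGT") for _ in range(1000))
reads = [genome[i:i + 50] for i in range(len(genome) - 50 + 1)]

size, peak = estimate_genome_size(kmer_depth_histogram(reads, k=15))
print(f"peak depth = {peak}, estimated size = {size:.0f} bp")
```

On real data the low-depth end of the histogram is dominated by sequencing errors, which is why the peak search skips the lowest depths.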

Table II.

DNA sequence assembly.

Index	Scaffold	Contig
Total number (>500 bp)	6	9
Total length (bp)	2,110,680	2,109,125
N50 length (bp)	2,100,529	1,460,616
N90 length (bp)	2,100,529	636,033
Max. length (bp)	2,100,529	1,460,616
Min. length (bp)	515	515
Sequence GC (%)	40.12	40.12

N50/N90, the length of the shortest contig or scaffold among the longest sequences that together cover at least 50%/90% of the total assembly length.
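The N50 and N90 statistics in Table II can be computed from a list of sequence lengths as sketched below. The function name `nx` and the toy lengths are hypothetical; the individual contig and scaffold lengths of the assembly are not listed in the table.

```python
def nx(lengths, fraction):
    """Return the Nx length: walking from the longest sequence downwards,
    the length at which the running total first reaches `fraction`
    of the summed length (fraction=0.5 gives N50, 0.9 gives N90)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]


# Toy lengths in bp (hypothetical, not taken from the assembly).
toy_lengths = [100, 80, 50, 30, 20, 20]
n50 = nx(toy_lengths, 0.5)
n90 = nx(toy_lengths, 0.9)
```

In Table II, the largest scaffold (2,100,529 bp) alone accounts for about 99.5% of the total scaffold length (2,110,680 bp), which is why the scaffold N50 and N90 both equal the maximum scaffold length.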

A total of 1,944 protein-encoding genes were predicted from the genome DNA, with a mean gene length of 946 bp and a GC content of 40.9%. The predicted genes occupied 87.1% of the whole genome, and the GC content of the intergenic regions was 34.86%. The findings from the tandem repeat evaluation are presented in Table III. In total, 119 tandem repeats, 56 microsatellite DNAs, 10 minisatellite DNAs and 154 transposons were determined. The percentages of transposons and tandem repeats in the whole genome were 0.6941 and 2.6943%, respectively. Although the number of transposons was larger than the number of tandem repeats, the tandem repeats accounted for a larger percentage of the whole genome length than the transposons.

Table III.

Tandem repeats analysis.

Category	Number	Repeat size (bp)	Total length (bp)	In genome (%)
Transposon	154	13–674	14,651	0.6941
Tandem repeat finder	119	6–1,353	56,867	2.6943
Minisatellite DNA	56	15–60	19,574	0.9274
Microsatellite DNA	10	6–10	451	0.0214
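The "In genome (%)" column of Table III is the total length of each category divided by the assembled genome size (2,110,680 bp). The short check below recomputes it from the table values; the variable and function names are illustrative only.

```python
GENOME_SIZE_BP = 2_110_680  # assembled genome size reported in Table II

# Total lengths (bp) taken from Table III.
repeat_total_length_bp = {
    "Transposon": 14_651,
    "Tandem repeat finder": 56_867,
    "Minisatellite DNA": 19_574,
    "Microsatellite DNA": 451,
}


def percent_of_genome(length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Share of the genome covered by a feature class, in percent."""
    return 100.0 * length_bp / genome_size_bp


for category, length_bp in repeat_total_length_bp.items():
    print(f"{category}: {percent_of_genome(length_bp):.4f}%")
```

The same function applied to the predicted genes (1,944 genes with a mean length of 946 bp) gives approximately 87.1%, in agreement with the gene coverage reported in the text.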

Function analysis

KEGG pathways involving the predicted genes were identified and classified (Fig. 2). Of all the identified pathways, membrane transport (environmental information processing) contained the largest number of matched genes; other important pathways with large numbers of genes comprised xenobiotic biodegradation and metabolism, carbohydrate metabolism, amino acid metabolism, translation, transcription, replication and repair, and infectious disease pathways.

Figure 2.

Predicted gene-associated KEGG pathway classification. The number beside the horizontal bars indicates the number of genes matched to each given pathway. KEGG, Kyoto Encyclopedia of Genes and Genomes.

Regarding the result of COG database analysis (Fig. 3), more genes were clustered in cell wall/membrane/envelope biogenesis (M), signal transduction mechanisms (T) and defense mechanisms (V) when compared with other function classes, and the exact function of a proportion of identified genes remained undefined (function unknown; S).

Figure 3.

COG function classification. COG, Cluster of Orthologous Groups of proteins.

BLAST alignment of the obtained genes encoding putative surface proteins, secreted proteins and virulence factors against the protective antigens reported in previous studies identified 321AGL000253, 321AGL000282, 321AGL000444, 321AGL000958 and 321AGL001626 as protective antigen candidates. Detailed functional information for the five identified sequences is displayed in Table IV: 321AGL000253 was closely associated with Xaa-Pro aminopeptidase and hydrolase activity; 321AGL000282 was associated with sensor histidine kinase and signal transduction; 321AGL000444 was linked to competence damage-inducible protein A (CinA); 321AGL000958 was associated with manganese ABC transporter substrate-binding lipoprotein and the metal ion transport system; and 321AGL001626 was linked to glutathione reductase (NADPH) and glutathione-disulfide reductase activity.

Table IV.

Functional annotation information of the five sequences based on NR, KEGG, COG, SwissProt, TrEMBL, InterProScan and GO databases.

Gene_Id	321AGL000253	321AGL000282	321AGL000444	321AGL000958	321AGL001626
NR	[X-Pro aminopeptidase (Streptococcus mitis)]	[Sensor histidine kinase (Streptococcus mitis)]	[Damage-inducible protein, CinA (Streptococcus mitis)]	[Manganese ABC transporter substrate-binding lipoprotein (Streptococcus mitis SK564)]	[Glutathione-disulfide reductase (Streptococcus mitis SK564)]
KEGG	[K01262 pepP Xaa-Pro aminopeptidase 3.4.11.9 metabolism; enzyme families; peptidases (BR:ko01002)]	[K07718 yesM two-component system, sensor histidine kinase YesM 2.7.13.3 metabolism; enzyme families; protein kinases (BR:ko01001) environmental information processing; signal transduction; two-component system (PATH:ko02020) environmental information processing; signal transduction; two-component system (BR:ko02022)]	(NA)	[K09818 ABC.MN.S manganese/iron transport system substrate-binding protein; environmental information processing; membrane transport; transporters (BR:ko02000)]	[K00383 E1.8.1.7, GSR, glutathione oxidoreductase, glutathione reductase 1.8.1.7 metabolism; metabolism of other amino acids; glutathione metabolism (PATH:ko00480)]
COG	(COG0006 Xaa-Pro aminopeptidase E amino acid transport and metabolism)	(COG2972 Predicted signal transduction protein with a C-terminal ATPase domain T signal transduction mechanisms)	(NA)	(COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin P inorganic ion transport and metabolism)	[COG1249 pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and associated enzymes C energy production and conversion]
SwissProt	[YQHT_BACSU uncharacterized peptidase yqhT OS=Bacillus subtilis (strain 168) GN=yqhT PE=3 SV=1]	(NA)	(NA)	(MTSA_STRAP manganese ABC transporter substrate-binding lipoprotein OS=Streptococcus anginosus GN=psaA PE=3 SV=1)	(GSHR_STRTR glutathione reductase OS=Streptococcus thermophilus GN=gor PE=3 SV=1)
TrEMBL	(G6NMA3_STRPN Xaa-Pro aminopeptidase OS=Streptococcus pneumoniae GA07643 GN=pepP PE=3 SV=1)	(I0T163_STRMT histidine kinase OS=Streptococcus mitis SK575 GN=HMPREF1048_1531 PE=4 SV=1)	(F9MKF5_STRMT putative uncharacterized protein OS=Streptococcus mitis SK569 GN=HMPREF9959_0223 PE=4 SV=1)	(E1LLV4_STRMT manganese ABC transporter substrate-binding lipoprotein OS=Streptococcus mitis SK564 GN=SMSK564_0925 PE=3 SV=1)	(E1LK57_STRMT glutathione-disulfide reductase OS=Streptococcus mitis SK564 GN=gor PE=3 SV=1)
InterProScan	(IPR000587; creatinase IPR000994; peptidase M24, structural domain IPR001131; peptidase M24B, X-Pro dipeptidase/aminopeptidase P, conserved site)	(IPR003594; ATPase-like, ATP-binding domain IPR003660; HAMP linker domain IPR010559; signal transduction histidine kinase, internal region)	(NA)	(IPR006127; ABC transporter, metal-binding lipoprotein IPR006128; Adhesion lipoprotein IPR006129; Adhesin B)	[IPR004099; pyridine nucleotide-disulphide oxidoreductase, dimerization IPR006322; glutathione reductase, eukaryote/bacterial IPR012999; pyridine nucleotide-disulphide oxidoreductase, class I, active site IPR013027; flavin adenine dinucleotide (FAD)-dependent pyridine nucleotide-disulphide oxidoreductase IPR016156; FAD/NAD-linked reductase, dimerization IPR023753; pyridine nucleotide-disulphide oxidoreductase, FAD/NAD(P)-binding domain]
GO	(GO:0009987; cellular process; biological process GO:0016787; hydrolase activity; molecular function)	[GO:0000155; two-component sensor activity; molecular function GO:0000160; two-component signal transduction system (phosphorelay); biological process GO:0004871; signal transducer activity; molecular function GO:0005524; ATP binding; molecular function GO:0007165; signal transduction; biological process GO:0016021; integral to membrane; cellular component]	(NA)	(GO:0007155; cell adhesion; biological process GO:0030001; metal ion transport; biological process GO:0046872; metal ion binding; molecular function)	(GO:0004362; glutathione-disulfide reductase activity; molecular function GO:0005737; cytoplasm; cellular component GO:0006749; glutathione metabolic process; biological process GO:0016491; oxidoreductase activity; molecular function GO:0016668; oxidoreductase activity, acting on a sulfur group of donors, NAD or NADP as acceptor; molecular function GO:0045454; cell redox homeostasis; biological process GO:0050660; flavin adenine dinucleotide binding; molecular function GO:0050661; NADP binding; molecular function GO:0055114; oxidation-reduction process; biological process)

GO terms are classified into three types as follows: Molecular function, biological process and cellular component. KEGG, Kyoto Encyclopedia of Genes and Genomes; COG, Cluster of Orthologous Groups of proteins; NR, non-redundant protein database; GO, Gene Ontology.

Phylogenetic tree analysis

Following alignment using ClustalW2, 27 essential genes were identified. The identified genes were subjected to phylogenetic tree analysis, which showed that essential genes on the same branch (321AGL000129 and 321AGL000299) belonged to the same phylogenetic lineage and may act as the same type of antibacterial drug target genes (Fig. 4).

Figure 4.

Phylogenetic tree of essential genes. Essential genes on the same branch belong to the same phylogenetic lineage and may act as the same type of antibacterial drug target genes.

Discussion

Generally, S. mitis is considered to be a commensal oral Streptococcus posing little immunological threat to the majority of individuals; however, elderly and immunocompromised individuals, as well as cancer patients undergoing cytotoxic chemotherapy, are susceptible to it (38). In addition, it may occasionally affect normal healthy infants and adults (8). Therefore, the aim of the present study was to establish antigen candidates for developing potent vaccines against the S. mitis pathogen. In the current study, 332 Mb of clean sequence data were assembled into a 2.11-Mb S. mitis321A genome, which was predicted to encode a total of 1,944 genes with 40.9% GC content. By comparison, genome sequencing of S. mitis B5 yielded a 2.15-Mb sequence with a mean GC content of 39.98%, which is similar to the genome of S. pneumoniae (2.04–2.24 Mb and ~40% GC) (3). Different strains of S. mitis displayed varied genomes. The predicted genes of S. mitis321A were closely associated with the membrane transport (environmental information processing), carbohydrate metabolism, amino acid metabolism, translation, transcription, and replication and repair KEGG pathways. Of most importance was the membrane transport pathway, with 335 matched genes. Consistently, more genes were involved in the cell wall/membrane/envelope biogenesis function class when compared with the other function classes, as demonstrated in the COG classification analysis, supporting the view that genes encoding putative membrane proteins are critical for the pathogenicity of S. mitis. Another consideration of the present study was that the xenobiotic biodegradation and metabolism, carbohydrate metabolism, amino acid metabolism, translation, transcription, and replication and repair pathways appeared to be associated with a large number of genes, leading to the hypothesis that S. mitis may worsen the condition of vulnerable patients by impairing energy metabolism and interrupting DNA synthesis, transcription and translation processes in host cells, thus triggering severe clinical consequences.
Consistently, the COG classification analysis indicated that the amino acid transport and metabolism, carbohydrate transport and metabolism, transcription, and replication, recombination and repair function classes were closely linked to the identified genes. Furthermore, through BLAST analysis, 321AGL000253, 321AGL000282, 321AGL000444, 321AGL000958 and 321AGL001626 were identified as candidate antigens of S. mitis321A. The putative biological functions of the five sequences appeared to be varied. As suggested by the present study, 321AGL000253 was closely associated with Xaa-Pro aminopeptidase and hydrolase activity. Xaa-Pro aminopeptidase hydrolyzes Xaa-Pro bonds. A previous study has shown that Xaa-Pro aminopeptidase is involved in aminolysis reactions in Lactococcus lactis (39). However, to the best of our knowledge, the Xaa-Pro aminopeptidase in S. mitis has not previously been defined; thus, it requires further investigation to clarify its association with vaccine design and development. Additionally, 321AGL000282 was associated with sensor histidine kinase and signal transduction, while 321AGL001626 was linked to glutathione reductase (NADPH) and glutathione metabolism. Histidine kinases are a multifunctional transferase family implicated in the upstream signal transduction of various virulence pathways (40). Histidine kinase has been demonstrated to be a critical component of the virulence of certain fungal strains (41). Furthermore, it has been revealed that glutathione peroxidase may contribute to the virulence of S. pyogenes (42). These findings indicated that 321AGL000282 and 321AGL001626 may be virulence factors of S. mitis321A. Furthermore, 321AGL000444 was linked to CinA, which has been found to mediate membrane association in Helicobacter pylori and S. pneumoniae (1,43). 321AGL000958 was associated with manganese ABC transporter substrate-binding lipoprotein, which is a transmembrane protein for adenosine triphosphate (44,45).
This evidence indicates that 321AGL000444 and 321AGL000958 may encode membrane anchoring proteins of the bacterium. Essential genes are defined as genes pivotal for organism survival, which are often involved in metabolism, DNA replication and translation into proteins (46). Notably, they are increasingly recognized as potential target genes for developing novel agents against various pathogenic microorganisms (47,48). Numerous studies based on genome analysis have provided a selection of essential genes, which is promising for selecting and validating antimicrobial agents (49,50). Thus, essential genes of S. mitis321A were screened based on the DEG database, not including genes encoding surface proteins, secreted proteins or virulence factors. As a result, 27 essential genes were obtained. Phylogenetic tree analysis was used to analyze the homology of the essential genes. The majority of essential genes appeared to belong to the same phylogenetic lineage, with the exception of 321AGL000176, 321AGL001082 and 321AGL001586. Essential genes on the same branch, such as 321AGL000129 and 321AGL000299, may be target genes for the same type of antibacterial agents. The findings of this preliminary study require validation with experimental data. Subsequent trials will evaluate the efficacy of vaccines that target the putative antigens provided by the present study, and provide insight into the biological function of the antigen targets and the differences in genomes between S. mitis321A and other strains of S. mitis. In conclusion, the genome sequencing of S. mitis321A predicted 1,944 genes with 40.9% GC content. The predicted genes were associated with a variety of signaling pathways and biological functions regarding membrane transport and energy metabolism.
Five gene sequences encoding putative surface proteins, secreted proteins and virulence factors, and several essential genes were determined to be antigen candidates for developing potent vaccines to prevent the diseases driven by the S. mitis321A pathogen.

49 in total

Review 1. Molecular genetic basis of antimicrobial agent resistance in Mycobacterium tuberculosis: 1998 update.

Authors: S Ramaswamy; J M Musser
Journal: Tuber Lung Dis Date: 1998

2. Vaccines, reverse vaccinology, and bacterial pathogenesis.

Authors: Isabel Delany; Rino Rappuoli; Kate L Seib
Journal: Cold Spring Harb Perspect Med Date: 2013-05-01 Impact factor: 6.915

3. Identification of a universal Group B streptococcus vaccine by multiple genome screen.

Authors: Domenico Maione; Immaculada Margarit; Cira D Rinaudo; Vega Masignani; Marirosa Mora; Maria Scarselli; Hervé Tettelin; Cecilia Brettoni; Emilia T Iacobini; Roberto Rosini; Nunzio D'Agostino; Lisa Miorin; Scilla Buccato; Massimo Mariani; Giuliano Galli; Renzo Nogarotto; Vincenzo Nardi-Dei; Vincenzo Nardi Dei; Filipo Vegni; Claire Fraser; Giuseppe Mancuso; Giuseppe Teti; Lawrence C Madoff; Lawrence C Paoletti; Rino Rappuoli; Dennis L Kasper; John L Telford; Guido Grandi
Journal: Science Date: 2005-07-01 Impact factor: 47.728

4. Complete genome sequence of a virulent isolate of Streptococcus pneumoniae.

Authors: H Tettelin; K E Nelson; I T Paulsen; J A Eisen; T D Read; S Peterson; J Heidelberg; R T DeBoy; D H Haft; R J Dodson; A S Durkin; M Gwinn; J F Kolonay; W C Nelson; J D Peterson; L A Umayam; O White; S L Salzberg; M R Lewis; D Radune; E Holtzapple; H Khouri; A M Wolf; T R Utterback; C L Hansen; L A McDonald; T V Feldblyum; S Angiuoli; T Dickinson; E K Hickey; I E Holt; B J Loftus; F Yang; H O Smith; J C Venter; B A Dougherty; D A Morrison; S K Hollingshead; C M Fraser
Journal: Science Date: 2001-07-20 Impact factor: 47.728

5. Penicillin-resistant Streptococcus mitis as a cause of septicemia with meningitis in febrile neutropenic children.

Authors: D R Balkundi; D L Murray; M J Patterson; R Gera; A Scott-Emuakpor; R Kulkarni
Journal: J Pediatr Hematol Oncol Date: 1997 Jan-Feb Impact factor: 1.289

6. A systematic, functional genomics, and reverse vaccinology approach to the identification of vaccine candidates in the cattle tick, Rhipicephalus microplus.

Authors: Christine Maritz-Olivier; Willem van Zyl; Christian Stutzer
Journal: Ticks Tick Borne Dis Date: 2012-04-20 Impact factor: 3.744

7. Contribution of glutathione peroxidase to the virulence of Streptococcus pyogenes.

Authors: Audrey Brenot; Katherine Y King; Blythe Janowiak; Owen Griffith; Michael G Caparon
Journal: Infect Immun Date: 2004-01 Impact factor: 3.441

8. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.

Authors: Ruibang Luo; Binghang Liu; Yinlong Xie; Zhenyu Li; Weihua Huang; Jianying Yuan; Guangzhu He; Yanxiang Chen; Qi Pan; Yunjie Liu; Jingbo Tang; Gengxiong Wu; Hao Zhang; Yujian Shi; Yong Liu; Chang Yu; Bo Wang; Yao Lu; Changlei Han; David W Cheung; Siu-Ming Yiu; Shaoliang Peng; Zhu Xiaoqian; Guangming Liu; Xiangke Liao; Yingrui Li; Huanming Yang; Jian Wang; Tak-Wah Lam; Jun Wang
Journal: Gigascience Date: 2012-12-27 Impact factor: 6.524

9. Genome-wide prediction of vaccine targets for human herpes simplex viruses using Vaxign reverse vaccinology.

Authors: Zuoshuang Xiang; Yongqun He
Journal: BMC Bioinformatics Date: 2013-03-08 Impact factor: 3.169

10. Evolution of Streptococcus pneumoniae and its close commensal relatives.

Authors: Mogens Kilian; Knud Poulsen; Trinelise Blomqvist; Leiv S Håvarstein; Malene Bek-Thomsen; Hervé Tettelin; Uffe B S Sørensen
Journal: PLoS One Date: 2008-07-16 Impact factor: 3.240