Qiao Zhang, Kexiong Lin, Changzheng Wang, Zhi Xu, Li Yang, Qianli Ma.
Abstract
Streptococcus mitis (S. mitis) may transform into highly pathogenic bacteria. The aim of the present study was to identify potential antigen targets for designing an effective vaccine against the pathogenic S. mitis 321A. The genome of S. mitis 321A was sequenced using an Illumina HiSeq 2000 instrument. Subsequently, Glimmer 3.02 and Tandem Repeats Finder (TRF) 4.04 were used to predict genes and tandem repeats, respectively, and DNA sequence function was analyzed using the Basic Local Alignment Search Tool (BLAST) against the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups of proteins (COG) databases. Putative gene antigen candidates were screened with BLAST prior to phylogenetic tree analysis. The DNA sequence assembly was 2,110,680 bp in size, with 40.12% GC content, 6 scaffolds and 9 contigs. In total, 1,944 genes were predicted, and 119 TRF, 56 microsatellite DNA, 10 minisatellite DNA and 154 transposons were identified. The predicted genes were associated with various pathways and functions concerning membrane transport and energy metabolism. Multiple putative genes encoding surface proteins, secreted proteins and virulence factors, as well as essential genes, were identified. The majority of essential genes belonged to the same phylogenetic lineage, while 321AGL000129 and 321AGL000299 were located on the same branch. The current study provides useful information regarding the biological function of the S. mitis 321A genome and recommends putative antigen candidates for developing a potent vaccine against S. mitis.
Year: 2018 PMID: 29620181 PMCID: PMC5983942 DOI: 10.3892/mmr.2018.8799
Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952
Genome sequencing data of Streptococcus mitis 321A.
| Sample name | Insert size (bp) | Read length (bp) | Raw data (Mb) | Adapter (%) | Duplication (%) | Total reads | Filtered reads (%) | Low-quality filtered reads (%) | Clean data (Mb) |
|---|---|---|---|---|---|---|---|---|---|
| 321A | 464 | (90:90) | 227 | 0.07 | 0.53 | 2,527,772 | 2.86 | 1.14 | 221 |
| 321A | 600 | (90:90) | 125 | 2.07 | 0.71 | 1,378,728 | 11.13 | 1.94 | 111 |
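
The clean data volumes in the table give a rough idea of the sequencing depth of the assembly. The following is a minimal sketch of that arithmetic, assuming that "Mb" denotes 10^6 bases and that the depth is simply the total clean bases divided by the assembly size of 2,110,680 bp; the function and variable names are illustrative and do not come from the study.

```python
# Assembly size reported in the source (bp).
GENOME_SIZE_BP = 2_110_680

# Clean data (Mb) per library, keyed by insert size (bp), taken from the table.
CLEAN_DATA_MB = {464: 221, 600: 111}


def sequencing_depth(clean_mb_by_library, genome_size_bp):
    """Return the approximate fold depth: total clean bases / genome size.

    Assumption: 1 Mb = 1,000,000 bases.
    """
    total_bases = sum(clean_mb_by_library.values()) * 1_000_000
    return total_bases / genome_size_bp


depth = sequencing_depth(CLEAN_DATA_MB, GENOME_SIZE_BP)
print(f"Approximate sequencing depth: {depth:.1f}x")
```

Under these assumptions, the two libraries together correspond to a depth of roughly 157-fold.
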
Figure 1. K-mer frequency distribution. The y-axis represents the frequency at each depth as a percentage of the total frequency. Typically, the K-mer frequency distribution follows a Poisson distribution. A peak at half the x-axis value (depth) of the main peak denotes heterozygosity, and a peak at an integer multiple of the x-axis value of the main peak indicates a degree of repetition.
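
A K-mer depth distribution of the kind described in the Figure 1 caption can be sketched as follows. This is an illustrative example only: the K-mer size, the toy reads and the function names are assumptions, and the y-axis is interpreted here as the percentage of distinct K-mers observed at each depth, which may differ from the definition used to draw the figure.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count how many times each K-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def depth_distribution(counts):
    """Map each depth to the percentage of distinct K-mers seen at that depth."""
    depth_hist = Counter(counts.values())  # depth -> number of distinct K-mers
    total = sum(depth_hist.values())
    return {depth: 100.0 * n / total for depth, n in sorted(depth_hist.items())}


# Toy reads (hypothetical, not from the study) with k = 4.
toy_reads = ["ACGTACGT", "CGTACGTA"]
distribution = depth_distribution(kmer_counts(toy_reads, 4))
for depth, percent in distribution.items():
    print(depth, percent)
```

With real sequencing reads, plotting depth against percentage would give the main peak, and any half-depth or multiple-depth peaks, described in the caption.
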
DNA sequence assembly.
| Index | Scaffold | Contig |
|---|---|---|
| Total number (>500 bp) | 6 | 9 |
| Total length (bp) | 2,110,680 | 2,109,125 |
| N50 length (bp) | 2,100,529 | 1,460,616 |
| N90 length (bp) | 2,100,529 | 636,033 |
| Max. length (bp) | 2,100,529 | 1,460,616 |
| Min. length (bp) | 515 | 515 |
| Sequence GC (%) | 40.12 | 40.12 |
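N50/N90, statistics calculated from the set of contig or scaffold lengths: the length of the shortest sequence among the longest sequences that together cover at least 50% (N50) or 90% (N90) of the total length.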
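
The N50 and N90 values in the table can be reproduced from a list of sequence lengths. The sketch below uses the standard definition (sort the lengths in descending order and return the length at which the running total first reaches the given percentage of the total). Only the largest scaffold (2,100,529 bp), the smallest (515 bp), the count (6) and the total (2,110,680 bp) come from the table; the four intermediate lengths are hypothetical placeholders chosen solely so that the total matches.

```python
def nxx(lengths, percent):
    """Return the Nxx statistic (e.g. percent=50 for N50) of a set of lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Largest, smallest and total are from the table; the middle four values
# are hypothetical placeholders, not reported lengths.
scaffold_lengths = [2_100_529, 4_000, 3_000, 1_636, 1_000, 515]

print("N50:", nxx(scaffold_lengths, 50))
print("N90:", nxx(scaffold_lengths, 90))
```

Because the largest scaffold alone accounts for more than 90% of the assembly, N50, N90 and the maximum length coincide, whatever the lengths of the smaller scaffolds.
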
Tandem repeats analysis.
| Category | Number | Repeat size (bp) | Total length (bp) | In genome (%) |
|---|---|---|---|---|
| Transposon | 154 | 13–674 | 14,651 | 0.6941 |
| Tandem repeat finder | 119 | 6–1,353 | 56,867 | 2.6943 |
| Minisatellite DNA | 56 | 15–60 | 19,574 | 0.9274 |
| Microsatellite DNA | 10 | 6–10 | 451 | 0.0214 |
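
The "In genome (%)" column follows from dividing each total repeat length by the assembly size. The sketch below recomputes it from the values in the table, assuming the denominator is the scaffold total length of 2,110,680 bp; the names are illustrative.

```python
# Scaffold total length from the assembly table (bp).
GENOME_SIZE_BP = 2_110_680

# Total length (bp) per repeat category, from the table.
REPEAT_TOTAL_BP = {
    "Transposon": 14_651,
    "Tandem repeat finder": 56_867,
    "Minisatellite DNA": 19_574,
    "Microsatellite DNA": 451,
}


def percent_of_genome(total_bp, genome_bp=GENOME_SIZE_BP):
    """Return the share of the genome covered, as a percentage."""
    return 100.0 * total_bp / genome_bp


for category, total_bp in REPEAT_TOTAL_BP.items():
    print(f"{category}: {percent_of_genome(total_bp):.4f}%")
```
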
Figure 2. Predicted gene-associated KEGG pathway classification. The number beside each horizontal bar indicates the number of genes matched to the given pathway. KEGG, Kyoto Encyclopedia of Genes and Genomes.
Figure 3. COG function classification. COG, Cluster of Orthologous Groups of proteins.
Functional annotation information of the five sequences based on NR, KEGG, COG, GO, InterProScan and TrEMBL databases.
| Gene_Id | 321AGL000253 | 321AGL000282 | 321AGL000444 | 321AGL000958 | 321AGL001626 |
|---|---|---|---|---|---|
| NR | [X-Pro aminopeptidase | [Sensor histidine kinase | [Damage-inducible protein, CinA | [Manganese ABC transporter substrate-binding lipoprotein ( | [Glutathione-disulfide reductase ( |
| KEGG | [K01262 pepP Xaa-Pro aminopeptidase 3.4.11.9 metabolism; enzyme families; peptidases (BR:ko01002)] | [K07718 yesM two-component system, sensor histidine kinase YesM 2.7.13.3 metabolism; enzyme families; protein kinases (BR:ko01001) environmental information processing; signal transduction; two-component system (PATH:ko02020) environmental information processing; signal transduction; two-component system (BR:ko02022)] | (NA) | [K09818 ABC.MN.S manganese/iron transport system substrate-binding protein environmental information processing; membrane transport; transporters (BR:ko02000)] | [K00383 E1.8.1.7, GSR, glutathione oxidoreductase, glutathione reductase 1.8.1.7 metabolism; metabolism of other amino acids; glutathione metabolism (PATH:ko00480)] |
| COG | (COG0006 Xaa-Pro aminopeptidase E amino acid transport and metabolism) | (COG2972 Predicted signal transduction protein with a C-terminal ATPase domain T signal transduction mechanisms) | (NA) | (COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin P inorganic ion transport and metabolism) | [COG1249 pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and associated enzymes C energy production and conversion] |
| SwissProt | [YQHT_BACSU uncharacterized peptidase yqhT organism= | (NA) | (NA) | (MTSA_STRAP Manganese ABC transporter substrate-binding lipoprotein OS= | (GSHR_STRTR glutathione reductase; organism= |
| TrEMBL | (G6NMA3_STRPN XAA-pro aminopeptidase organism= | (I0T163_STRMT histidine kinase OS= | (F9MKF5_STRMT putative uncharacterized protein organism= | (E1LLV4_STRMT manganese ABC transporter substrate-binding lipoprotein OS= | (E1LK57_STRMT glutathione-disulfide reductase organism= |
| InterProScan | (IPR000587; creatinase IPR000994; peptidase M24, structural domain IPR001131; peptidase M24B, X-Pro dipeptidase/aminopeptidase P, conserved site) | (IPR003594; ATPase-like, ATP-binding domain IPR003660; HAMP linker domain IPR010559; signal transduction histidine kinase, internal region) | (NA) | (IPR006127; ABC transporter, metal-binding lipoprotein IPR006128; Adhesion lipoprotein IPR006129; Adhesin B) | [IPR004099; pyridine nucleotide-disulphide oxidoreductase, dimerization IPR006322; glutathione reductase, eukaryote/bacterial IPR012999; pyridine nucleotide-disulphide oxidoreductase, class I, active site IPR013027; flavin adenine dinucleotide (FAD)-dependent pyridine nucleotide-disulphide oxidoreductase IPR016156; FAD/NAD-linked reductase, dimerization IPR023753; pyridine nucleotide-disulphide oxidoreductase, FAD/NAD (P)-binding domain] |
| GO | (GO:0009987; cellular process; biological process GO:0016787; hydrolase activity; molecular function) | [GO:0000155; two-component sensor activity; molecular function GO:0000160; two-component signal transduction system (phosphorelay); biological process GO:0004871; signal transducer activity; molecular function GO:0005524; ATP binding; molecular function GO:0007165; signal transduction; biological process GO:0016021; integral to membrane; cellular component] | (NA) | (GO:0007155; cell adhesion; biological process GO:0030001; metal ion transport; biological process GO:0046872; metal ion binding; molecular function) | (GO:0004362; glutathione-disulfide reductase activity; molecular function GO:0005737; cytoplasm; cellular component GO:0006749; glutathione metabolic process; biological process GO:0016491; oxidoreductase activity; molecular function GO:0016668; oxidoreductase activity, acting on a sulfur group of donors, NAD or NADP as acceptor; molecular function GO:0045454; cell redox homeostasis; biological process GO:0050660; flavin adenine dinucleotide binding; molecular function GO:0050661; NADP binding; molecular function GO:0055114; oxidation-reduction process; biological process) |
GO terms are classified into three types: molecular function, biological process and cellular component. KEGG, Kyoto Encyclopedia of Genes and Genomes; COG, Cluster of Orthologous Groups of proteins; NR, non-redundant protein database; GO, Gene Ontology.
Figure 4. Phylogenetic tree of essential genes. Essential genes on the same branch belong to the same phylogenetic lineage and may act as the same type of antibacterial drug target genes.