
Common and pathogen-specific virulence factors are different in function and structure.

Chao Niu, Dong Yu, Yuelan Wang, Hongguang Ren, Yuan Jin, Wei Zhou, Beiping Li, Yiyong Cheng, Junjie Yue, Zhixian Gao, Long Liang.

Abstract

During host-pathogen interactions, bacterial pathogens employ specialized genes, e.g., virulence factors (VFs), to interact with the host and cause damage or disease. Many VFs that confer the ability to cause various types of damage or disease have been identified in bacterial pathogens. However, some of the identified VFs are also encoded in the genomes of nonpathogenic bacteria, and this finding has given rise to considerable controversy about the definition of a virulence factor. Here, 1988 virulence factors from 51 sequenced pathogenic bacterial genomes were collected from the virulence factor database (VFDB), and an orthologous comparison to a non-pathogenic bacteria protein database was conducted using the reciprocal-best-BLAST-hits approach. Six hundred and twenty pathogen-specific VFs and 1368 common VFs (present in both pathogens and nonpathogens) were identified, accounting for 31.19% and 68.81% of the total VFs, respectively. The distribution of pathogen-specific VFs and common VFs in pathogenicity islands (PAIs) was systematically investigated, and pathogen-specific VFs were more likely to be located in PAIs than common VFs. The functions of the two classes of VFs were also analyzed and compared in depth. Our results indicated that most but not all type III secretion system (T3SS) proteins are pathogen-specific. T3SS effector proteins tended to be pathogen-specific VFs, whereas T3SS translocation proteins, apparatus proteins, and chaperones tended to be common VFs. We also observed that exotoxins were found among both pathogen-specific and common VFs. In addition, the domain architecture of the two classes of VFs was compared; common VFs had a higher domain number and a lower domain coverage value, indicating that common VFs tend to be more complex and less compact proteins.

Keywords:  bacterial pathogens; common virulence factor; pathogen-specific virulence factor

Year:  2013        PMID: 23863604      PMCID: PMC5359729          DOI: 10.4161/viru.25730

Source DB:  PubMed          Journal:  Virulence        ISSN: 2150-5594            Impact factor:   5.882


Introduction

Bacterial pathogens cause various epidemic diseases that threaten human health and lives. A growing number of researchers have studied bacterial pathogens and made great progress in the field. Bacterial pathogens are parasitic organisms with specialized adaptations that allow them to interact with hosts. During host–pathogen interactions, bacterial pathogens utilize a number of mechanisms, including adherence, invasion, antiphagocytosis, and protein or toxin secretion. In other words, bacterial pathogens employ special factors, so-called “virulence factors”, during these interactions so that they can grow, reproduce, spread, and cause host damage ranging from mild annoyance to death. Over the past two decades, the term “virulence factor” (VF) has been defined as a factor that is produced by a pathogen and causes disease. A number of VFs, based on the different types of mechanisms employed by bacterial pathogens, have been identified using molecular biological techniques. However, as the number and diversity of completed bacterial genomes increased, some identified VFs were also found to be encoded in the genomes of non-pathogenic or commensal bacteria. Biological experiments have revealed the same phenomenon; e.g., microarray analyses showed that many of the known virulence factors in pathogenic Escherichia coli and Neisseria spp. are present in the closely related commensal Escherichia coli and non-pathogenic Neisseria lactamica. Two views on the definition of a virulence factor have now emerged, and they differ on whether a virulence factor must be absent from non-pathogenic bacteria.
The first definition is based on comparative genomics and holds that virulence factors should not be found in non-pathogens, whereas the second is based on genetic techniques and models of infection. Confronted with this unresolved debate, some researchers have proposed the hypothesis of host interactions, namely that many so-called “virulence genes” are most likely involved in more general interactions between the microorganism and the host or the environment. Generally, host–bacteria interactions can be classified as symbiotic, commensal, or pathogenic. These divisions should be viewed as a “homeostasis”, an equilibrium that is constantly evolving. According to the ecological and evolutionary view of bacterial pathogenomics, pathogenic, symbiotic, and commensal bacteria often share their habitats with bacteriophages and other bacteria. In this mixed ecology, almost all of these bacteria utilize similar strategies and molecular systems to interact with eukaryotic hosts, and the homeostasis they maintain is usually disrupted by horizontal gene transfer or gene loss. For example, pathogenic strains of Enterococcus faecium have evolved from a commensal species via horizontal gene transfer. Therefore, it is not surprising that genes encoding “virulence factors” are present in both pathogenic and nonpathogenic bacteria. Sui et al. performed the first systematic analysis across diverse genera and found that VFs are disproportionately associated with genomic islands (GIs, clusters of genes in a prokaryotic genome of probable horizontal origin). Both pathogen-associated VFs and common VFs (having homologs in both pathogens and non-pathogens) were identified by sequence similarity analysis, and all were associated with GIs.
In the current paper, ortholog prediction between pathogenic and non-pathogenic bacteria was adopted. A total of 1988 virulence factors were collected from the virulence factor database (VFDB), and their orthologous proteins in a non-pathogenic bacteria protein database were identified using the reciprocal-best-BLAST-hits (RBH) approach. Each VF was classified as pathogen-specific (no orthologous protein in non-pathogenic bacteria) or common (one or more orthologous proteins in non-pathogenic bacteria). The distributions of the identified pathogen-specific and common VFs across pathogenicity islands, functional categories, and protein architectures were systematically investigated. We believe that this research will be instructive for studying virulence factors in bacterial pathogens and for elucidating the pathogenic mechanisms and evolution of pathogenic bacteria.

Results

Identification of pathogen-specific VFs and common VFs

The 1988 VFs and the proteins from the non-pathogenic bacteria protein database were compared using the reciprocal-best-BLAST-hits (RBH) approach. If a VF had one or more orthologous proteins in the non-pathogenic bacteria protein database, it was identified as a common VF, i.e., one found in both pathogenic and non-pathogenic bacteria. Otherwise, it was identified as a pathogen-specific VF, i.e., one present only in bacterial pathogens. The e-value cutoff was set to 1e−7, and reciprocal best-hit protein pairs with e-values below this cutoff were considered orthologous pairs. Among the 1988 VFs, we identified 620 pathogen-specific VFs and 1368 common VFs, which account for 31.19% and 68.81% of the total VFs, respectively. Moreover, 1239 of the 1368 common VFs (90.57%) had two or more orthologous proteins in the non-pathogenic bacterial protein database, which supports the reliability of most of the ortholog predictions.
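The classification rule above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the tuple layout, the function names `best_hits` and `classify_vfs`, and the use of bit score to pick the best hit are assumptions. It also treats the non-pathogen database as a single pool, whereas the reported counts of VFs with two or more orthologs suggest the comparison may have been run per genome.

```python
from typing import Dict, Iterable, List, Tuple

# One BLAST hit: (query id, subject id, e-value, bit score).
# The field layout is an assumption made for this sketch.
Hit = Tuple[str, str, float, float]

E_VALUE_CUTOFF = 1e-7  # cutoff stated in the text


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject among hits below the cutoff."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue, bitscore in hits:
        if evalue >= E_VALUE_CUTOFF:
            continue  # hit does not pass the e-value cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def classify_vfs(vf_ids: List[str],
                 forward_hits: Iterable[Hit],
                 reverse_hits: Iterable[Hit]) -> Dict[str, str]:
    """Label a VF 'common' if it has a reciprocal best hit, else 'pathogen-specific'.

    forward_hits: VF proteins searched against non-pathogen proteins.
    reverse_hits: non-pathogen proteins searched against VF proteins.
    """
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    labels = {}
    for vf in vf_ids:
        partner = forward.get(vf)
        is_reciprocal = partner is not None and reverse.get(partner) == vf
        labels[vf] = "common" if is_reciprocal else "pathogen-specific"
    return labels
```

A VF whose only hits have e-values at or above the cutoff, or whose best partner prefers a different VF in the reverse search, ends up labeled pathogen-specific under this sketch.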

Pathogenicity islands (PAIs) contained a higher proportion of pathogen-specific VFs

It is well known that pathogenicity islands (PAIs), which are involved in virulence and were most likely acquired by horizontal gene transfer (HGT), are a subclass of genomic islands. Many novel virulence factors of pathogenic bacteria, e.g., adhesins, invasins, toxins, secretion systems (especially the type III secretion system [T3SS] and type IV secretion system [T4SS]), iron uptake systems, and others, have been identified in pathogenicity islands, and these findings suggest that horizontal gene transfer, especially that mediated by pathogenicity islands, has played a key role in the evolution of bacterial pathogens. To investigate the distribution of pathogen-specific VFs and common VFs in pathogenicity islands, we tabulated the numbers of pathogen-specific VFs and common VFs inside and outside of PAIs in a 2 × 2 contingency table according to the VFDB classification and applied a Chi-square test with Yates continuity correction. The result showed that pathogen-specific VFs were more likely to be located in PAIs, and common VFs were more likely to be located outside of PAIs (361 of 620 vs. 474 of 1368 in PAIs; P < 2.20e−16; Table 1), which implies that pathogen-specific VFs might have been acquired by horizontal gene transfer, e.g., that mediated by pathogenicity islands. The results also imply that pathogen-specific VFs might be more closely connected to pathogenicity and might play key roles in the evolution of bacterial virulence.

Table 1. Distributions of the VFs from the VFDB inside vs. outside of PAIs

| | In PAIs | Outside of PAIs | SUM |
|---|---|---|---|
| Pathogen-specific VFs | 361 | 259 | 620 |
| Common VFs | 474 | 894 | 1368 |
| SUM | 835 | 1153 | 1988 |

Pearson Chi-square test with Yates continuity correction. χ-squared = 96.3863, df = 1, P < 2.2e−16.
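As a worked check of the statistic in Table 1, the following sketch computes a Yates-corrected Pearson chi-square for a 2 × 2 table using only the standard library. The function name `chi2_yates_2x2` is hypothetical, and the P value relies on the identity that, for one degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).

```python
import math


def chi2_yates_2x2(a: int, b: int, c: int, d: int):
    """Yates-corrected Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    # Yates correction: shrink each |observed - expected| by 0.5, not below 0.
    stat = sum(max(abs(o - e) - 0.5, 0.0) ** 2 / e
               for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts from Table 1: rows are pathogen-specific and common VFs,
# columns are in PAIs and outside of PAIs.
stat, p = chi2_yates_2x2(361, 259, 474, 894)
print(f"chi-squared = {stat:.4f}, df = 1, p = {p:.3g}")
```

With the Table 1 counts, the statistic should come out close to the reported 96.3863, and the P value falls far below the 2.2e−16 reporting floor.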


The distribution of pathogen-specific VFs and common VFs in each functional category was different

According to its classification scheme, the VFDB contains 49 different virulence functional categories, e.g., flagella, capsule, toxin, etc. (see Table 2). To investigate the distribution of pathogen-specific VFs and common VFs in each functional category, we tabulated the number of pathogen-specific VFs and the number of common VFs in each functional category in a 49 × 2 contingency table according to the VFDB classification, and then applied a Chi-square test with Yates continuity correction, followed by correction for multiple testing.

Table 2. The distribution of pathogen-specific and common VFs in each functional category according to the VFDB classification

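| VFDB classification | Pathogen-specific VFs, # | Pathogen-specific VFs, % (a) | Common VFs, # | Common VFs, % (a) | P value (b) |
|---|---|---|---|---|---|
| Functional categories with a higher percentage of pathogen-specific VFs | | | | | |
| Exotoxin | 73 | 11.77 | 37 | 2.70 | 1.49e−14* |
| Type IV secretion system (T4SS) | 76 | 12.26 | 56 | 4.09 | 4.01e−10* |
| Unclassified protein (T3SS) (c) | 90 | 14.52 | 99 | 7.24 | 9.84e−06* |
| Effector protein (T3SS) | 31 | 5.00 | 18 | 1.32 | 1.41e−05* |
| Pathogenicity island (d) | 171 | 27.58 | 267 | 19.52 | 3.34e−04* |
| Antiphagocytosis-associated protein | 4 | 0.65 | 1 | 0.07 | 1.24e−01 |
| Chaperone (T3SS) | 8 | 1.29 | 7 | 0.51 | 2.58e−01 |
| Protease | 7 | 1.13 | 7 | 0.51 | 3.85e−01 |
| Type VII secretion system | 14 | 2.26 | 18 | 1.32 | 4.30e−01 |
| Plasminogen activator | 2 | 0.32 | 1 | 0.07 | 4.92e−01 |
| Translocation protein (T3SS) | 5 | 0.81 | 5 | 0.37 | 6.16e−01 |
| Anti-proteolysis | 1 | 0.16 | 0 | 0.00 | 6.11e−01 |
| Afimbrial adhesin | 28 | 4.52 | 49 | 3.58 | 6.68e−01 |
| Actin-based motility | 1 | 0.16 | 1 | 0.07 | 8.60e−01 |
| Proinflammatory effect | 1 | 0.16 | 1 | 0.07 | 8.32e−01 |
| Exoenzyme | 12 | 1.94 | 22 | 1.61 | 1 |
| Secretion apparatus protein (T3SS) | 12 | 1.94 | 25 | 1.83 | 1 |
| Categories with a higher percentage of common VFs | | | | | |
| Flagella | 1 | 0.16 | 146 | 10.67 | 1.08e−14* |
| Capsule | 11 | 1.77 | 122 | 8.92 | 7.70e−08* |
| Endotoxin or lipopolysaccharide (LPS) | 4 | 0.65 | 70 | 5.12 | 5.50e−07* |
| Iron uptake | 9 | 1.45 | 88 | 6.43 | 1.90e−05* |
| Regulation | 0 | 0.00 | 33 | 2.41 | 2.62e−05* |
| Type VI secretion system (T6SS) | 1 | 0.16 | 37 | 2.70 | 1.01e−04* |
| Type II secretion system (T2SS) | 2 | 0.32 | 30 | 2.19 | 6.24e−03* |
| Stress protein | 0 | 0.00 | 12 | 0.88 | 8.71e−02 |
| Cell metabolism | 0 | 0.00 | 9 | 0.66 | 2.11e−01 |
| Unclassified | 6 | 0.97 | 30 | 2.19 | 2.64e−01 |
| Urease | 0 | 0.00 | 7 | 0.51 | 2.90e−01 |
| Immune evasion | 1 | 0.16 | 10 | 0.73 | 4.42e−01 |
| Cell wall | 1 | 0.16 | 10 | 0.73 | 4.22e−01 |
| Biofilm formation | 0 | 0.00 | 4 | 0.29 | 5.97e−01 |
| Intracellular survival | 0 | 0.00 | 4 | 0.29 | 5.75e−01 |
| Invasion | 3 | 0.48 | 13 | 0.95 | 7.05e−01 |
| Magnesium uptake | 0 | 0.00 | 3 | 0.22 | 8.52e−01 |
| IgA1 protease | 0 | 0.00 | 3 | 0.22 | 8.26e−01 |
| Serum resistance | 0 | 0.00 | 3 | 0.22 | 8.02e−01 |
| Fimbriae | 44 | 7.10 | 103 | 7.53 | 1 |
| Molecular mimicry | 1 | 0.16 | 3 | 0.22 | 1 |
| Manganese uptake | 0 | 0.00 | 1 | 0.07 | 1 |
| Complement protease | 0 | 0.00 | 2 | 0.15 | 1 |
| Nutrient acquisition | 0 | 0.00 | 1 | 0.07 | 1 |
| Biosurfactant | 0 | 0.00 | 2 | 0.15 | 1 |
| Peptidase | 0 | 0.00 | 1 | 0.07 | 1 |
| Enzyme | 0 | 0.00 | 1 | 0.07 | 1 |
| Heat-shock protein | 0 | 0.00 | 1 | 0.07 | 1 |
| Pigment | 0 | 0.00 | 2 | 0.15 | 1 |
| Bile resistance | 0 | 0.00 | 1 | 0.07 | 1 |
| Complement resistance | 0 | 0.00 | 1 | 0.07 | 1 |
| Resistance to antimicrobial peptides | 0 | 0.00 | 1 | 0.07 | 1 |
| SUM | 620 | | 1368 | | |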

(a) The percentage of pathogen-specific or common VFs in a given functional category. (b) Pearson Chi-square test with Yates continuity correction (see Materials and Methods). Asterisks indicate statistical significance (P value < 0.05). (c) The number of genes involved with the T3SS, excluding the effector proteins, chaperones, translocation apparatus proteins, and secretion apparatus proteins of the T3SS. (d) The number of genes involved in PAIs, not including the virulence factors counted in other functional categories, e.g., the genes encoding T3SS or T4SS in PAIs.
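One way to read the per-category P values in Table 2 is as a series of 2 × 2 tests (in the category vs. not in the category, by VF class) followed by a multiple-testing adjustment. The sketch below follows that reading. It is an assumption: the exact procedure is given in the Materials and Methods, which are not reproduced here, and the Benjamini-Hochberg adjustment is swapped in as one common choice, not as the confirmed method. The function names are hypothetical.

```python
import math

TOTAL_SPECIFIC = 620   # all pathogen-specific VFs (Table 2, SUM row)
TOTAL_COMMON = 1368    # all common VFs (Table 2, SUM row)


def category_test(n_specific: int, n_common: int):
    """Yates-corrected 2x2 chi-square: category membership by VF class."""
    a, b = n_specific, TOTAL_SPECIFIC - n_specific
    c, d = n_common, TOTAL_COMMON - n_common
    n = TOTAL_SPECIFIC + TOTAL_COMMON
    col_in, col_out = a + c, b + d
    stat = 0.0
    for obs, row, col in ((a, TOTAL_SPECIFIC, col_in), (b, TOTAL_SPECIFIC, col_out),
                          (c, TOTAL_COMMON, col_in), (d, TOTAL_COMMON, col_out)):
        exp = row * col / n
        stat += max(abs(obs - exp) - 0.5, 0.0) ** 2 / exp
    return stat, math.erfc(math.sqrt(stat / 2.0))  # df = 1 upper tail


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Three rows from Table 2 as (pathogen-specific count, common count).
# Adjusting over 3 tests instead of all 49 will not match the table's values.
rows = {"Exotoxin": (73, 37), "Flagella": (1, 146), "Fimbriae": (44, 103)}
raw = {name: category_test(*counts)[1] for name, counts in rows.items()}
adj = benjamini_hochberg(list(raw.values()))
for name, p_adj in zip(raw, adj):
    print(f"{name}: adjusted p = {p_adj:.3g}")
```

Reproducing the table's values would require running all 49 categories through the same adjustment, and only if the assumed procedure matches the one the authors used.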

In our statistical results (see Table 2), proteins belonging to the exotoxin, T4SS, T3SS unclassified protein, T3SS effector protein, and PAI categories, which are all directly associated with virulence, were more inclined to be pathogen-specific VFs, suggesting that pathogen-specific VFs are closely connected with virulence. Conversely, flagella, capsule, endotoxins, iron uptake proteins, regulation proteins, the type VI secretion system (T6SS), and the type II secretion system (T2SS), which are apt to be involved in host interactions, were more inclined to be common VFs, indicating that common VFs might be involved in host interaction. In addition, for some functional categories the difference in distribution between pathogen-specific VFs and common VFs was not statistically significant; these categories included some antagonistic proteins, proteases, immune evasion, the general secretion system, bacterial adherence, cell structure proteins, and invasion. We discuss the functional categories found among pathogen-specific VFs and common VFs in more detail below.

Exotoxins were included among both the pathogen-specific and common VFs

Many bacterial pathogens can synthesize exotoxins, which are toxic to host cells and often play a central role in the pathogenesis of microbial diseases. In our statistical results, most of the exotoxins were specific to pathogens and had no orthologous proteins in non-pathogenic bacteria (P = 1.49e−14, see Table 2). However, there were still some exotoxins among the common VFs. According to the VFDB classification, exotoxins can be further classified into three functional categories: membrane-acting toxins, membrane-damaging toxins, and intracellular toxins (see Table 3). Membrane-acting toxins bind to a receptor on the cell surface and stimulate intracellular signaling pathways, membrane-damaging toxins exhibit hemolysin or cytolysin activity in vitro, and intracellular toxins possess enzymatic activity and affect internal cellular mechanisms or inhibit protein synthesis. As shown in Table 3, the pathogen-specific exotoxins mainly included the membrane-acting superantigens and enterotoxins, the membrane-damaging pore-forming toxins except for the RTX toxin (repeat in structural toxin), and most of the intracellular toxins (e.g., N-glycosidase, neurotoxin, adenylate cyclase, ADP-ribosyltransferase toxins, etc.).

Table 3. Proportions of pathogen-specific exotoxins from the VFDB according to the VFDB classification

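| Exotoxin classification | Subclassification | Total, # | Pathogen-specific exotoxins, # | Pathogen-specific exotoxins, % (a) |
|---|---|---|---|---|
| Membrane-acting toxin | Superantigen | 19 | 19 | 100 |
| | Enterotoxin | 3 | 3 | 100 |
| Membrane-damaging toxin | Pore-forming | 4 | 4 | 100 |
| | Channel-forming involving α-helix-containing toxin | 1 | 1 | 100 |
| | Channel-forming involving β-sheet-containing toxin | 7 | 7 | 100 |
| | Cholesterol-dependent cytolysin (CDC) | 4 | 4 | 100 |
| | RTX toxin (repeat in structural toxin) | 14 | 0 | 0 |
| | Phospholipase C | 7 | 1 | 14.29 |
| Intracellular toxin | Adenylate cyclase | 4 | 3 | 75 |
| | ADP-ribosyltransferase | 26 | 18 | 69.23 |
| | DnaseI | 3 | 2 | 66.67 |
| | Neurotoxin | 2 | 2 | 100 |
| | N-glycosidase | 6 | 6 | 100 |
| | Deamidase | 2 | 0 | 0 |
| | Glucosyltransferase | 1 | 0 | 0 |
| Other toxins | Murine toxin | 1 | 1 | 100 |
| | Hemolysin/bacteriocin: Biofilm formation | 4 | 2 | 50 |
| | Accessory cholera enterotoxin | 1 | 0 | 0 |
| | Zona occludens toxin | 1 | 0 | 0 |
| SUM | | 110 | 73 | |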

(a) The percentage of pathogen-specific exotoxins in a given functional category.

However, some membrane-damaging toxins, for example, pore-forming RTX toxins and membrane-damaging phospholipases C, were also included among the common VFs (see Table 3). The prototype of the RTX toxins is Escherichia coli α-hemolysin (HlyA), the best-characterized RTX protein, which is secreted by a type I secretion system. The synthesis, activation, and secretion of E. coli HlyA are controlled by the hlyCABD operon, which comprises the hlyC, hlyA, hlyB, and hlyD genes. In this operon, hlyC encodes a fatty acid acyltransferase that is responsible for the acylation of pro-HlyA and is not required for the secretion of HlyA; hlyA encodes the structural toxin; and hlyB and hlyD encode components of the type I secretion apparatus for HlyA. In our research, the 14 RTX toxin genes contained in our data were all identified as common VFs, which is consistent with studies of RTX toxins in nonpathogens. Among the 14 genes, four were structural toxin genes with the same function as hlyA, and the rest had the same functions as hlyC, hlyB, or hlyD. Previous work has shown that HlyA is an α-hemolysin characterized by a domain of tandemly arranged glycine-rich nonameric repeats near the protein C terminus, which is responsible for Ca2+ binding. However, membrane insertion of α-hemolysin is independent of membrane lysis, and calcium binding is essential for toxin activity. This suggests that the RTX toxins are not directly involved in virulence, and their roles in non-pathogenic bacteria need to be studied further. As for the membrane-damaging phospholipase C toxins, they are synthesized by many widespread bacteria and possess enzymatic activity. So far, it has not been substantiated that all phospholipases C have lethal properties.
In addition, a growing body of research indicates that measuring the cytolytic potential or lethality of phospholipases C cannot accurately indicate their roles in the pathogenesis of disease. An investigation of the genetic diversity of the four Mycobacterium tuberculosis phospholipase C-encoding genes (plcA, plcB, plcC, and plcD) suggested that the plcD region is significantly associated with the pathogenesis of tuberculosis and that the plcD gene might play a more important role in the pathogenesis of thoracic tuberculosis. From the above analysis, we found that most of the exotoxins that are directly involved in virulence were pathogen-specific, which indicates that the pathogen-specific exotoxins are essential to bacterial virulence and play a central role in pathogenicity. As for the common “exotoxins” (e.g., RTX toxins, phospholipase C toxins, etc.), some are associated with the secretion of exotoxins, and whether the others are directly involved in virulence is still in dispute.

Type III secretion system proteins belonged to different classes between pathogen-specific and common VFs

Many gram-negative bacteria use type III secretion systems (T3SS) to secrete virulence factors into the cytosol of host cells. The gene clusters encoding T3SS are often located on virulence plasmids or in pathogenicity islands. Sui et al. analyzed VFDB functional classes and demonstrated that type III and type IV secretion systems were pathogen-associated VFs and associated with GIs. T3SS proteins are grouped into four classes: bacterial membrane apparatus proteins, translocon proteins, effector proteins, and type III chaperones. We noticed that not all classes of T3SS proteins were pathogen-specific. In fact, T3SSs have also been discovered in commensal and symbiotic bacteria. In our results, among the T3SS proteins, only effector proteins and T3SS unclassified proteins tended to be pathogen-specific VFs, whereas most of the T3SS translocation proteins, type III chaperone proteins, and secretion apparatus proteins were not pathogen-specific and had many orthologous proteins in nonpathogenic bacteria in our ortholog prediction (see Table 2). T3SS effector molecules are injected into eukaryotic host cells, where they specifically interfere with and unbalance host cell functions. Previous studies have indicated that type III secretion systems were assembled from core components of the flagellar machinery, which may explain why T3SS translocation proteins and secretion apparatus proteins were not pathogen-specific. The chaperones in T3SSs function mainly in protein folding and stress repair and possess nearly no virulence properties. From the above analysis, T3SS effector proteins, which are directly involved in virulence, were inclined to be pathogen-specific VFs, whereas the translocation proteins, secretion apparatus proteins, and T3SS chaperones, which assist the secretion of effector proteins, were inclined to be common VFs.

Most type IV secretion system effector proteins were pathogen-specific

Type IV secretion systems (T4SS) are versatile systems that mediate conjugal transfer in a variety of bacteria. Like T3SS, T4SS are also used by bacterial pathogens to secrete “effectors” into host plant or animal cells, and they are found in pathogenicity islands acquired by horizontal gene transfer. For example, the T4SS in H. pylori is encoded by the cag PAI. To date, T4SSs have been discovered and identified in some bacterial pathogens, such as Bordetella pertussis, Bartonella spp., Legionella pneumophila, Brucella spp., and Helicobacter pylori. However, the components of T4SS are not as well characterized as those of T3SS, and the identified components are mainly effector proteins. In our results (see Table 2), most of the T4SS-encoding genes tended to be pathogen-specific VFs (P = 4.01e−10). However, many T4SS-encoding genes (56; Table 2) were also included among the common VFs. Effector proteins often play a critical role in bacterial pathogenicity. To examine whether effector proteins secreted by T4SSs were included in our identified pathogen-specific VFs, we collected the identified effector proteins of four well-studied bacterial pathogens from the related literature (see Table 4). Some of these effector proteins were not listed as definitive effector proteins in the VFDB. As shown in Table 4, 83.87% (26 of 31) of the effector proteins from the four bacterial pathogens were pathogen-specific. Because the archetypal T4SSs are bacterial conjugation machines, which are widespread in bacteria, it is not surprising that some components of T4SSs were not pathogen-specific.

Table 4. T4SSs of four well-studied pathogenic bacteria and their proportions of pathogen-specific effectors

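| Bacterium | T4SS | SUM (a) | Pathogen-specific effector proteins, # | Pathogen-specific effector proteins, % (b) | References |
|---|---|---|---|---|---|
| Bartonella spp. | VirB/VirD4 | 7 | 4 | 57.14 | 56 |
| Bordetella pertussis | Ptl | 5 | 5 | 100 | 57 |
| Helicobacter pylori | cag PAI | 1 | 1 | 100 | 58 |
| Legionella pneumophila | Dot/Icm | 18 | 16 | 88.89 | 59 and 60 |
| SUM | | 31 | 26 | 83.87 | |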

(a) The number of all known effectors in a given bacterium. (b) The percentage of pathogen-specific effectors in a given bacterium.


Common VFs tended to be involved in general host interaction

During bacterial evolution, vertical descent and duplication can be considered the primary events of genome evolution. Orthologs and paralogs are two types of homologous sequences. Orthologs in different species are commonly defined as genes that have evolved by vertical descent from a common ancestor and tend to perform the same function. In our research, the VFs that had one or more orthologous proteins among non-pathogenic bacteria were identified as common VFs. In our results, the VFs in the flagella, capsule, endotoxin, iron uptake, regulation, T6SS, and T2SS categories tended to be found among the common VFs (Table 2). These findings indicate that these VFs are universal in both pathogenic and non-pathogenic bacteria and tend to carry out the same functions. In Table 5, we list the characteristics and functions of the functional categories that were more inclined to be found among common VFs; these suggest that common VFs are more likely to be involved in general host interaction.

Table 5. The characteristics and functions of each functional category that was more inclined to be found in common VFs

| Functional categories | Characteristics | Functions | References |
|---|---|---|---|
| Flagella | Surface organelle | Flagella are used for motility and chemotaxis in bacteria | 61 |
| Capsule | Primarily structural component of gram-positive cell wall | Protect and avoid phagocytosis | 62 |
| Lipopolysaccharide (LPS) or endotoxin | Components of the outer membrane of the cell wall of gram-negative bacteria | Activate the host complement pathway | 63 |
| Iron uptake | Mediate the release of host iron for parasitic consumption | Used for iron uptake and heme utilization | 64 |
| Regulation | Regulate the expression of various genes | Adapt to the host surroundings | 3 and 53 |
| Type VI secretion system | T6SSs are widespread in gram-negative proteobacteria | A secretory system that plays a general role in mediating host interaction | 65 |
| Type II secretion system | T2SSs are encoded by genes of the general secretion pathway (gsp) and are widely distributed in gram-negative bacteria | Main terminal branch of the general secretory pathway | 66 |

To further verify the relationship between common VFs and general host interaction, we collected 278 non-pathogenic bacterial strains together with information on their habitats. Of these, 65 were host-associated and 213 were non-host-associated non-pathogenic strains. According to our ortholog prediction, 1169 (85.45%) of the 1368 common VFs had orthologous proteins in the “host-associated” non-pathogenic bacterial strains, which suggests that most of the common VFs are inclined to be involved in general host interaction.

Domain architecture of VF proteins

Domains are considered the basic units of protein folding, evolution, and function. Decomposing each protein into modular domains is a basic prerequisite for the accurate functional classification of biological molecules. The function of a protein is determined by its structure, which is mostly embodied in its domain architecture. To investigate the protein domain characteristics of pathogen-specific VFs and common VFs, we determined the domain number (DN) and calculated the domain coverage (DC) for each VF. Our results showed that common VFs had a larger DN than pathogen-specific VFs on average (see Fig. 1), and the proportion of multi-domain proteins among common VFs was higher than that among pathogen-specific VFs (540 of 1300 vs. 104 of 422; χ-squared = 38.1187, df = 1, P = 6.657e−10, see Table 6). This indicates that common VFs were more likely to be multi-domain proteins, whereas pathogen-specific VFs were more likely to be single-domain proteins. The DC of both common VFs and pathogen-specific VFs varied over a wide range, with slightly different medians. The average DC values were 0.66 for common VFs and 0.70 for pathogen-specific VFs (see Fig. 2).

Figure 1. The proportion of multi-domain proteins among the VF proteins.

Table 6. Distributions of the VFs from the VFDB in single-domain vs. in multi-domain proteins

| | Proteins with one annotated domain | Proteins with two or more annotated domains | SUM |
|---|---|---|---|
| Pathogen-specific VFs | 318 | 104 | 422 |
| Common VFs | 760 | 540 | 1300 |
| SUM | 1078 | 644 | 1722 |

Pearson Chi-square test with Yates continuity correction. χ-squared = 38.1187, df = 1, P value = 6.657e−10.

Figure 2. The DC of common and pathogen-specific VF proteins.

DN can be regarded as an indicator of the complexity of a protein structure: the larger the DN, the more complex the protein. DC can be used as a parameter representing the structural compactness of a protein: the smaller the DC, the looser the protein. From our results, it can be inferred that pathogen-specific VF proteins tend to have simpler structures than common VF proteins. The fact that the common VFs had a higher DN than pathogen-specific VFs but slightly lower DC values indicates that common VFs tend to be more complex and less compact proteins than pathogen-specific VFs.
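To make the two measures concrete, here is a minimal sketch of how DN and DC could be computed from domain annotations. The exact definitions are given in the Materials and Methods, which are not reproduced here, so this sketch assumes that DN is the count of annotated domains and that DC is the fraction of residues covered by at least one annotated domain. The function name `domain_stats` and the coordinate convention (1-based, inclusive) are hypothetical.

```python
from typing import List, Tuple


def domain_stats(protein_length: int,
                 domains: List[Tuple[int, int]]) -> Tuple[int, float]:
    """Return (DN, DC) for one protein.

    domains: (start, end) residue positions, 1-based and inclusive.
    DN is the number of annotated domains; DC is the fraction of the
    protein covered by domains, counting overlapping residues once.
    Both definitions are assumptions made for this sketch.
    """
    dn = len(domains)
    covered = 0
    last_end = 0  # last residue already counted as covered
    for start, end in sorted(domains):
        start = max(start, last_end + 1)  # skip residues counted earlier
        if end >= start:
            covered += end - start + 1
            last_end = end
    return dn, covered / protein_length


# A hypothetical 200-residue protein with three annotated domains,
# two of which overlap.
dn, dc = domain_stats(200, [(10, 50), (40, 60), (101, 150)])
print(dn, round(dc, 3))
```

Under these assumptions, a protein with many short domains separated by long unannotated stretches would have a high DN and a low DC, which is the pattern the text describes for common VFs.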

Discussion

Pathogen-specific VFs distinguish pathogens from non-pathogens and are found exclusively in pathogens. Conversely, common VFs are shared by pathogenic and non-pathogenic bacteria and are involved in general host interaction and in survival or maintenance of basic functions in the host. For example, the surface organelles of bacteria, flagella and fimbriae, are primarily structural components of the organisms. Within the host, using the motility and chemotaxis provided by flagella, bacteria can move to their destinations or target tissues. At the same time, to avoid phagocytosis, bacteria have evolved surface components that prevent attachment and engulfment by macrophages and other host cellular immune responses. Gram-positive bacteria are naturally surrounded by a thick cell wall (capsule), whereas gram-negative bacterial lipopolysaccharide (LPS) or “endotoxin” can protect against complement-mediated lysis. However, bacterial capsule and lipopolysaccharide are primarily cell wall structural components. In addition to the common capsule and lipopolysaccharides, several antigens are pathogen-specific among bacterial pathogens and can inhibit adsorption, such as streptococcal protein M and staphylococcal protein A. In addition, bacteria usually use adhesins to adhere to specific tissues or host cells. Bacterial adhesins can be divided into two major types: pili (fimbriae) and nonpilus adhesins (afimbrial adhesins). Fimbriae are mainly structural components. To colonize the human gut mucosa, EHEC O157:H7 and the commensal E. coli K12 use a common pilus adherence factor, the E. coli common pilus (ECP), for epithelial cell colonization, as shown in previous experiments. With respect to afimbrial adhesins, many pathogenic bacteria, e.g., Staphylococcus aureus and Streptococcus pyogenes, share the ability to adhere to distinct components of the extracellular matrix (ECM).
However, the same specific binding occurs between lactobacilli and components of the ECM, including collagen and fibronectin. After bacteria enter a host, the host environment changes continuously and may not always be ideal for bacterial survival. To adapt to the host surroundings, bacteria have to evolve strategies. For example, iron uptake factors, e.g., transferrins, hemoglobin protease, hemolysins, and siderophores, are used by bacteria for iron uptake and heme utilization and are indispensable for survival in the host. At the same time, as the host environment changes, bacteria have to use regulatory factors to regulate the expression of various genes and, ultimately, to adapt to the new niche. Bacterial secretion systems are mainly used to deliver toxins or effector proteins into eukaryotic host cells and to modulate the interactions of bacteria with their environments. Currently, seven different types of secretion systems (referred to as Type I–VII or T1SS–T7SS) have been identified, most of which have been carefully investigated. The toxins or effector proteins secreted by secretion systems play a central role in pathogenesis. However, the presence of secretion systems such as T6SS and T2SS in nonpathogenic bacteria suggests that the involvement of secretion systems is not limited to virulence and that such systems may also be implicated in functions such as host/symbiont communication, exchange, and cell–cell communication.

Conclusion

In this paper, pathogen-specific VFs and common VFs were systematically identified in bacterial pathogens by ortholog prediction between the VFs from VFDB and non-pathogenic bacteria. In VFDB, most VFs (more than 68%) were common to both bacterial pathogens and non-pathogens, whereas only approximately 31% of VFs were pathogen-specific. The VFs that were directly involved in virulence, such as exotoxins, T3SS effector proteins, T4SS effector proteins, and PAIs, tended to be distributed among pathogen-specific VFs. Conversely, the VFs that were closely associated with pathogenicity but did not directly cause damage to host cells, such as T3SS translocation proteins, T3SS apparatus proteins and chaperones, flagella, capsule, endotoxins, iron uptake proteins, regulatory proteins, T6SS, and T2SS, tended to be located among the common VFs and might be associated with the general interaction between bacteria and host. In addition, the common VFs had a higher DN and a lower DC, which indicated that common VFs tend to be more complex and less compact proteins.

Materials and Methods

Data

Several public virulence factor databases are available, including PRINTS, VFDB, and MvirDB. Of these databases, the virulence factor database (VFDB) is the highest-quality data set of bacterial VFs. The VFDB contains experimentally demonstrated bacterial VFs; for each genus, these were first collected from the original research papers appearing in PubMed to form the primary database. At present, the VFDB contains 24 PAIs and over 2294 virulence-related genes from 24 different pathogen genera, including the most well-known medically important pathogens. Each VF entry is grouped into functional categories and is accompanied by relevant original literature articles or important reviews accessible through direct links to PubMed, as well as detailed information about related genes, keywords, structural features, functions, and pathogenic mechanisms. Of those VFs, 1988 were in bacteria whose genome sequences had been completed, and they came from 51 medically significant bacterial pathogens.

We chose 278 well-defined non-pathogenic bacterial genome sequences from the non-pathogenic bacteria protein database to identify VF orthologous proteins in nonpathogenic bacteria. None of the 278 strains is pathogenic to humans, animals, or plants, and the habitat status of each is known. Owing to the complexity of the evolution of bacterial pathogens, opportunistic bacterial pathogens were excluded from our non-pathogenic bacteria protein database. The Genomic Standards Consortium (GSC) introduced the minimum information about a genome sequence (MIGS) specification, and “habitat” is a key metadata descriptor in the proposed MIGS specification. The GSC defined habitat as the place or environment where an organism naturally or normally lives and grows. Currently, GenBank and the Genomes Online Database (GOLD) are the two major data sources for the MIGS specification. The phenotypes of the 278 non-pathogenic bacterial strains and their habitat status were obtained from the GenBank (http://www.ncbi.nlm.nih.gov/Genbank/) prokaryotic attributes table (e.g., the “pathogenic in”, “disease”, and “environment: habitat” fields) and from the Genomes Online Database (GOLD) (http://www.genomesonline.org/) organism information (e.g., the phenotype, disease, and habitat fields). We classified the habitat status into two categories: host-associated and non-host-associated (including aquatic, terrestrial, specialized, and multiple). Of the 278 non-pathogenic bacterial strains, 65 were host-associated and 213 were non-host-associated.
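The two-way habitat grouping described above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the names `habitat_class` and `summarize`, the exact spelling of the habitat labels, and the example records are all assumptions made for the sketch.

```python
from collections import Counter

# Habitat labels follow the categories listed in the text; the exact
# spelling used in GenBank/GOLD exports is an assumption of this sketch.
NON_HOST_HABITATS = {"aquatic", "terrestrial", "specialized", "multiple"}


def habitat_class(habitat):
    """Collapse a habitat label into one of the two categories."""
    label = habitat.strip().lower()
    if label == "host-associated":
        return "host-associated"
    if label in NON_HOST_HABITATS:
        return "non-host-associated"
    # Fail loudly rather than guess a category for an unknown label.
    raise ValueError(f"unrecognized habitat label: {habitat!r}")


def summarize(habitats):
    """Count strains per category for an iterable of habitat labels."""
    return Counter(habitat_class(h) for h in habitats)


# Hypothetical records, for illustration only (not data from the study).
example = ["Aquatic", "Host-associated", "Multiple", "Terrestrial"]
print(summarize(example))
```

Raising an error on an unknown label, instead of silently assigning a default category, keeps mislabeled strains from being counted in either group.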

Identification of the VFs' orthologous proteins

Ortholog prediction is paramount when conducting whole genome comparisons. Generally, orthologous genes are identified by phylogenetic analysis. However, sophisticated phylogenetic analysis is not easily automated and is not high-throughput. Therefore, ortholog prediction for large genome-scale data sets is typically performed using a reciprocal-best-BLAST-hits (RBH) approach, and numerous orthologous resources use this method, including the Clusters of Orthologous Groups (COG) database, the Institute for Genomic Research (TIGR)’s EGO database, and INPARANOID. In this paper, we mainly adopted the RBH approach to identify the VF orthologous proteins. With the RBH method, genes from species A and species B are predicted to be orthologs if each is the “best BLAST hit” of the other when all genes from species A are compared with all genes from species B by BLAST analysis. The cutoff e-value was set to 1e−7 to exclude distant homologs.
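The RBH rule described above can be illustrated with a short sketch. It is not the authors' pipeline: it assumes the BLAST results have already been parsed into `(query, subject, e-value, bit score)` tuples, and the function names, the use of bit score to break e-value ties, and the inclusive treatment of the 1e−7 cutoff are assumptions made for the example.

```python
E_VALUE_CUTOFF = 1e-7  # cutoff stated in the text


def best_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Map each query to its best subject among hits passing the cutoff.

    `hits` is an iterable of (query, subject, evalue, bitscore) tuples.
    "Best" means lowest e-value; ties are broken by highest bit score
    (the tie-break is an assumption of this sketch).
    """
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > cutoff:
            continue  # distant homolog, excluded
        key = (evalue, -bitscore)
        if query not in best or key < best[query][0]:
            best[query] = (key, subject)
    return {query: subject for query, (_, subject) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, cutoff=E_VALUE_CUTOFF):
    """Return sorted (a, b) pairs where a and b are each other's best hit."""
    forward = best_hits(a_vs_b, cutoff)
    backward = best_hits(b_vs_a, cutoff)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
vf_vs_nonpathogen = [
    ("vf1", "np1", 1e-50, 200.0),
    ("vf1", "np2", 1e-20, 90.0),
    ("vf2", "np2", 1e-3, 30.0),  # fails the e-value cutoff
]
nonpathogen_vs_vf = [
    ("np1", "vf1", 1e-48, 195.0),
    ("np2", "vf1", 1e-19, 88.0),
]
print(reciprocal_best_hits(vf_vs_nonpathogen, nonpathogen_vs_vf))
```

Under this scheme, a VF with at least one reciprocal best hit in the non-pathogen set would be counted as common, and a VF with none as pathogen-specific.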

Identification of domain composition and calculation of domain coverage

The protein domain definitions used in this study came from Pfam. Domain assignments were made by scanning libraries of HMMs against the protein sequences using HMMER-2.0s. A domain was assigned to a region of a query protein if a match to a domain HMM with an e-value lower than 0.001 was observed. To obtain a non-overlapping domain architecture for multi-domain proteins, we resolved overlapping domains according to the following rules. We defined two domains as overlapping if more than 10% of the predicted domain locations overlapped (based on the relative length of the domains). If, for two overlapping domains, the e-value difference was larger than 5 (on a –log10 scale), we kept the domain with the highest value on that scale, that is, the more significant match. If the difference was smaller, we kept the longest model. If both overlapping models had the same length, we considered the difference in e-value. Based on the non-overlapping domain architecture, the domain number (DN) and domain coverage (DC) of the VFs were calculated. The DN of a protein is the number of annotated non-overlapping domains in its sequence, whereas the DC of a protein is the total length of all identified domains expressed as a percentage of the whole sequence length. The procedure for the DN and DC analyses employed in this study has been previously described.
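The overlap rules and the DN and DC definitions above can be sketched as follows. This is one possible reading and not the authors' implementation. The assumptions are: the 10% overlap is measured against the shorter of the two domains; the e-value rule keeps the more significant match; hits are processed in order of start position; and DC counts each covered residue once. All names (`Hit`, `resolve`, and so on) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    name: str
    start: int      # 1-based, inclusive
    end: int        # inclusive
    evalue: float

    @property
    def length(self):
        return self.end - self.start + 1


def score(hit):
    """E-value on a -log10 scale (floored to avoid log10(0))."""
    return -math.log10(max(hit.evalue, 1e-300))


def overlaps(a, b, min_fraction=0.10):
    """True if the shared region exceeds 10% of the shorter domain (assumption)."""
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    return shared > min_fraction * min(a.length, b.length)


def preferred(a, b, gap=5.0):
    """Pick one of two overlapping hits using the rules in the text."""
    if abs(score(a) - score(b)) > gap:
        return a if score(a) > score(b) else b    # clearly more significant
    if a.length != b.length:
        return a if a.length > b.length else b    # otherwise the longer model
    return a if score(a) >= score(b) else b       # same length: use e-value


def resolve(hits, max_evalue=1e-3):
    """Return a non-overlapping architecture, processing hits by start position."""
    kept = []
    candidates = [h for h in hits if h.evalue < max_evalue]
    for hit in sorted(candidates, key=lambda h: (h.start, h.end)):
        rivals = [k for k in kept if overlaps(hit, k)]
        if all(preferred(hit, k) is hit for k in rivals):
            kept = [k for k in kept if k not in rivals] + [hit]
    return sorted(kept, key=lambda h: h.start)


def domain_number(domains):
    return len(domains)


def domain_coverage(domains, protein_length):
    """Percentage of residues covered by at least one domain."""
    covered = set()
    for d in domains:
        covered.update(range(d.start, d.end + 1))
    return 100.0 * len(covered) / protein_length


# Hypothetical hits on a 400-residue protein, for illustration only.
hits = [
    Hit("domA", 1, 100, 1e-30),
    Hit("domB", 81, 200, 1e-12),   # overlaps domA and is far less significant
    Hit("domC", 301, 400, 1e-8),
    Hit("domD", 210, 260, 1e-2),   # fails the 0.001 e-value threshold
]
architecture = resolve(hits)
print(domain_number(architecture), domain_coverage(architecture, 400))
```

Counting covered residues as a set, instead of summing domain lengths, avoids double counting where two retained domains still overlap by 10% or less.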

Statistical analysis

In the tables of this paper, categories with small values (<5) were analyzed with the Fisher exact test instead of the chi-square test. When multiple categories were examined in parallel, the Benjamini and Hochberg false discovery rate correction for multiple testing was applied to all functional category analyses. We considered P values smaller than 0.05 to be significant. All statistical analyses were performed using the R statistics package.
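The testing scheme can be illustrated with a small, dependency-free sketch. The study itself used R, so this Python version is only a stand-in for the logic and not the authors' code. It assumes that "small values (<5)" refers to the observed cell counts of a 2×2 table, and the function names are hypothetical.

```python
from math import comb, erfc, sqrt


def yates_chi_square(a, b, c, d):
    """Chi-square statistic (df = 1) with Yates correction for [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_p_value(statistic):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2.0))


def fisher_exact_p_value(a, b, c, d):
    """Two-sided Fisher exact P value: sum over tables no more likely than observed."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= observed * (1 + 1e-7))
    return min(1.0, p)


def compare_2x2(a, b, c, d, small=5):
    """Use Fisher's exact test when any observed count is below `small` (assumption)."""
    if min(a, b, c, d) < small:
        return "fisher", fisher_exact_p_value(a, b, c, d)
    return "chi-square", chi_square_p_value(yates_chi_square(a, b, c, d))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):           # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts and P values, for illustration only.
print(compare_2x2(10, 20, 30, 40))
print(compare_2x2(3, 1, 1, 3))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

As a consistency check, passing the statistic reported for Figure 1 (38.1187, df = 1) to `chi_square_p_value` gives a value close to the reported P value of 6.657e−10.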