Christophe Noroy, Damien F Meyer.
Abstract
During infection, some intracellular pathogenic bacteria use a dedicated multiprotein complex known as the type IV secretion system to deliver type IV effector (T4E) proteins inside the host cell. These T4Es allow the bacteria to evade host defenses and to subvert host cell processes to their own advantage. Ehrlichia chaffeensis is a tick-transmitted obligate intracellular pathogenic bacterium that causes human monocytic ehrlichiosis. Using comparative whole-genome analysis, we determined the relationships among eight available E. chaffeensis genomes isolated from humans and showed that these genomes are highly conserved. We identified the candidate core type IV effectome of E. chaffeensis and some conserved intracellular adaptive strategies. We assigned the West Paces strain to genetic group II and predicted the repertoires of T4Es encoded by E. chaffeensis genomes, as well as some putative host cell targets. We demonstrated that predicted T4Es are preferentially distributed in gene-sparse regions of the genome. In addition to the two known type IV effectors of Anaplasmataceae, we identified two novel candidate T4Es, ECHLIB_RS02720 and ECHLIB_RS04640, which are not present in all E. chaffeensis strains and could explain some variations in inter-strain virulence. We also identified another novel candidate T4E, ECHLIB_RS02720, a hypothetical protein exhibiting EPIYA and NLS domains as well as a classical type IV secretion signal, suggesting an important role inside the host cell. Overall, our results agree with current knowledge of Ehrlichia molecular pathogenesis and reveal novel candidate T4Es that require experimental validation. This work demonstrates that comparative effectomics enables identification of important host pathways targeted by the bacterial pathogen. Our study, which focuses on the type IV effector repertoires among several strains of the E. chaffeensis species, is an original approach and provides rational putative targets for the design of alternative therapeutics against intracellular pathogens. The collection of putative effectors of E. chaffeensis described in our paper could serve as a roadmap for future studies of the function and evolution of effectors.
Keywords: Ehrlichia chaffeensis; comparative genomics; genome plasticity; host-pathogen interactions; type IV effectors
Year: 2017 PMID: 28180111 PMCID: PMC5263134 DOI: 10.3389/fcimb.2016.00204
Source DB: PubMed Journal: Front Cell Infect Microbiol ISSN: 2235-2988 Impact factor: 5.293
Main biological and genetic characteristics of the eight E. chaffeensis strains.
| 1991 | 1999 | 1997 | 1998 | 1997 | 1996 | 1997 | 2000 | |
| Arkansas | Nebraska | Florida | Florida | Florida | Georgia | Florida | Tennessee | |
| 21-year-old male | Human | 51-year-old woman | Human | Human | 52-year-old, HIV+ | Human | Human | |
| Fever, headache, pharyngitis, nausea, vomiting, and dehydration. Cervical lymphadenopathy, splenomegaly, thrombocytopenia, leukopenia with left shift, elevations in serum aspartate transaminase concentration. | HME, no clinical description available. | Fever, non-productive cough, nausea, vomiting, and diarrhea, profoundly lethargic. Thrombocytopenia, leukopenia, elevations in serum aspartate transaminase concentration, doxycycline therapy, cerebrospinal fluid mononuclear pleocytosis, pulmonary oedema, hypotension, and anuria. The patient died in hospital on day 6. | Acute HME, no clinical description available. | Acute HME, no clinical description available. | Fever, headache, myalgia, nausea, and vomiting, orthostatic hypotension, thrombocytopenia, leukopenia with left shift, elevations in serum aspartate transaminase concentration, doxycycline therapy. Lobar pneumonia and acute renal failure. The patient died. | Acute HME, no clinical description available. | Acute HME, no clinical description available. | |
| Mild, chronic | UN | UN | Acute, severe | UN | UN | Acute, lethal | UN | |
| I/4/4 | II/3/3 | III/4/4 | III/4/4 | I/4/3 | II/3/3 | II/6/4 | II/3/3 | |
| Dawson et al., | Sumner et al., | Paddock et al., | Sumner et al., | Sumner et al., | Paddock et al., | Sumner et al., | Cheng et al., |
Human isolates of E. chaffeensis were classified in three genetic groups according to the 28-kDa major outer membrane gene cluster, the number of TRP32 repeats (variable-length PCR target gene, NCBI accession version # .
Figure 1. Comparative genomics of 8 E. chaffeensis genomes. (A) Phylogenetic tree of 8 E. chaffeensis strains. The tree was built with FastTree from the Mauve alignment of the whole genomes of the 8 E. chaffeensis strains. The node values indicate the local support values of the Shimodaira-Hasegawa test. The number outside the tree shows the genetic group of each strain; the West Paces strain was assigned to genetic group II because of its high level of conservation with the Heartland strain. (B) Alignments of 8 E. chaffeensis genomes generated using Mauve software (Darling et al., 2010) (http://gel.ahabs.wisc.edu/mauve/). Locally collinear blocks (LCBs), shown as rounded rectangles, represent regions with no rearrangement of homologous sequences across genomes. The forward or reverse orientation of an LCB is indicated by its position above or below the line, respectively. Lines between the genomes trace orthologous LCBs. With default parameters, which result in a minimum LCB weight of 70, there are 7 LCBs across all the genomes. The LCB weight defines the minimum number of matching nucleotides in a collinear region for it to be considered homologous across genomes and not the result of a spurious match. Regions outside LCBs were too divergent in at least one genome to be aligned successfully. Inside each LCB, vertical bars represent the similarity profile of the genome sequence. The height of each bar corresponds to the average level of conservation in that region of the genome sequence. (C) Shared and specific gene content among 8 E. chaffeensis strains. Each colored petal represents a different E. chaffeensis genome. The number in the center of the diagram is the number of orthologous genes shared by all the genomes, which defines the E. chaffeensis core genome. The number inside each individual petal is the number of genes in that genome that are absent from the core genome, and the number in brackets is the number of genes specific to that strain. The number outside each petal shows the genetic group of each strain.
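The counts described for panel (C) can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: it assumes each genome has already been reduced to a set of ortholog-group identifiers, and the strain labels, group IDs, and the function name `flower_plot_counts` are hypothetical.

```python
from typing import Dict, Set, Tuple


def flower_plot_counts(
    genomes: Dict[str, Set[str]],
) -> Tuple[Set[str], Dict[str, Dict[str, int]]]:
    """Return the core genome and, per strain, the petal counts.

    `genomes` maps a strain name to the set of ortholog-group IDs it carries
    (an assumed input format, not one defined in the source).
    """
    # Core genome: ortholog groups present in every genome.
    core = set.intersection(*genomes.values())
    counts: Dict[str, Dict[str, int]] = {}
    for strain, groups in genomes.items():
        # Union of everything found in the other genomes.
        others: Set[str] = set()
        for other_strain, other_groups in genomes.items():
            if other_strain != strain:
                others |= other_groups
        counts[strain] = {
            "non_core": len(groups - core),    # number inside the petal
            "specific": len(groups - others),  # number in brackets
        }
    return core, counts


# Toy data with made-up strain names and ortholog-group IDs.
toy = {
    "strainA": {"og1", "og2", "og3"},
    "strainB": {"og1", "og2", "og4", "og5"},
    "strainC": {"og1", "og2", "og4"},
}
core, counts = flower_plot_counts(toy)
print(sorted(core), counts)
```

In this toy input the core genome holds two groups, and only `strainA` and `strainB` carry a strain-specific group.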
Putative type IV effectors (T4Es) identified by the S4TE algorithm.
| ECH_RS02870 | ECHLIB_RS01940 | ECHWAK_RS01950 | ECHWP_RS02750 | Hypothetical protein | 229 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | GSRs |
| ECH_RS03425 | ECHLIB_RS01385 | ECHWAK_RS01390 | ECHWP_RS03295 | Hypothetical protein | 164 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | GDRs |
| ECH_RS04335 | ECHLIB_RS00490 | ECHWAK_RS00485 | ECHWP_RS00485 | Gamma carbonic anhydrase family protein | 151 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | GSRs |
| ECH_RS02750 | ECHLIB_RS02060 | ECHWAK_RS02070 | ECHWP_RS02630 | Hypothetical protein | 141 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | GDRs |
| ECH_RS02620 | ECHLIB_RS02190 | ECHWAK_RS02200 | ECHWP_RS02500 | Alpha/beta hydrolase | 139 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | GSRs |
| ECH_RS03745 | ECHLIB_RS01065 | ECHWAK_RS01070 | ECHWP_RS03615 | AI-2E family transporter | 122 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | GSRs |
| ECH_RS00450 | ECHLIB_RS04345 | ECHWAK_RS04360 | ECHWP_RS04320 | Hypothetical protein | 118 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | GSRs |
| ECH_RS01210 | ECHLIB_RS03585 | ECHWAK_RS03600 | ECHWP_RS01105 | DNA ligase (NAD(+)) LigA | 117 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | IBRs |
| ECH_RS04225 | ECHLIB_RS00595 | ECHWAK_RS00600 | ECHWP_RS00590 | Hypothetical protein | 115 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | GSRs |
| | ECHLIB_RS02720 | | | Hypothetical protein | 114 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | GDRs |
| ECH_RS02365 | ECHLIB_RS02435 | ECHWAK_RS02455 | ECHWP_RS02250 | Translation initiation factor IF-2 | 114 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | GDRs |
| ECH_RS03205 | ECHLIB_RS01605 | ECHWAK_RS01615 | ECHWP_RS03075 | Diguanylate cyclase response regulator | 109 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | IBRs |
| ECH_RS03195 | ECHLIB_RS01615 | ECHWAK_RS01625 | ECHWP_RS03065 | NAD-glutamate dehydrogenase | 108 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | IBRs |
| ECH_RS02495 | ECHLIB_RS02315 | ECHWAK_RS02330 | ECHWP_RS02375 | Peptide chain release factor 1 | 105 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | IBRs |
| ECH_RS04685 | ECHLIB_RS04640 | ECHWAK_RS04650 | | Hypothetical protein | 103 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | GSRs |
| ECH_RS01570 | ECHLIB_RS03225 | ECHWAK_RS03240 | ECHWP_RS01465 | Hypothetical protein | 101 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | GDRs |
| ECH_RS04230 | ECHLIB_RS00590 | ECHWAK_RS00595 | ECHWP_RS00585 | Hypothetical protein | 99 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | GSRs |
| ECH_RS02945 | ECHLIB_RS01860 | ECHWAK_RS01875 | ECHWP_RS02825 | Transcriptional regulator | 98 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | IBRs |
| ECH_RS02385 | ECHLIB_RS02415 | ECHWAK_RS02435 | ECHWP_RS02270 | Hypothetical protein | 98 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | GSRs |
| ECH_RS03860 | ECHLIB_RS00950 | ECHWAK_RS00955 | ECHWP_RS03725 | Hypothetical protein | 97 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | IBRs |
| ECH_RS04650 | ECHLIB_RS00175 | ECHWAK_RS00175 | ECHWP_RS00175 | Protein translocase subunit SecA | 93 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | GSRs |
| ECH_RS02080 | ECHLIB_RS02725 | ECHWAK_RS02740 | ECHWP_RS01965 | Hypothetical protein | 93 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | GDRs |
| ECH_RS02075 | ECHLIB_RS02730 | ECHWAK_RS02745 | ECHWP_RS01960 | Conjugal transfer protein TrbI | 92 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | GDRs |
| ECH_RS03040 | ECHLIB_RS01765 | ECHWAK_RS01780 | ECHWP_RS02910 | Peptidylprolyl isomerase | 91 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | GSRs |
| ECH_RS02255 | ECHLIB_RS02545 | ECHWAK_RS02565 | ECHWP_RS02140 | 16S rRNA (uracil(1498)-N(3))-methyltransferase | 88 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | GDRs |
| ECH_RS03605 | ECHLIB_RS01205 | ECHWAK_RS01210 | ECHWP_RS03475 | Hypothetical protein | 87 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | IBRs |
| ECH_RS03515 | ECHLIB_RS01295 | ECHWAK_RS01300 | ECHWP_RS03385 | Hypothetical protein | 87 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | GSRs |
| ECH_RS02340 | ECHLIB_RS02460 | ECHWAK_RS02480 | ECHWP_RS02225 | Hypothetical protein | 87 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | IBRs |
| ECH_RS01565 | ECHLIB_RS03230 | ECHWAK_RS03245 | ECHWP_RS01460 | Exodeoxyribonuclease V subunit beta | 87 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | IBRs |
| ECH_RS01140 | ECHLIB_RS03660 | ECHWAK_RS03675 | ECHWP_RS01035 | Hypothetical protein | 87 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | GSRs |
| ECH_RS03890 | ECHLIB_RS00920 | ECHWAK_RS00925 | ECHWP_RS03755 | DNA-directed RNA polymerase subunit beta | 85 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | GDRs |
| ECH_RS00205 | ECHLIB_RS04580 | ECHWAK_RS04590 | ECHWP_RS04555 | Type IV secretion system protein VirD4 | 85 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | IBRs |
| ECH_RS03630 | ECHLIB_RS01180 | ECHWAK_RS01185 | ECHWP_RS03500 | DNA processing protein DprA | 84 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | GSRs |
| ECH_RS03440 | ECHLIB_RS01370 | ECHWAK_RS01375 | ECHWP_RS03310 | Phage capsid protein | 82 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | GSRs |
| ECH_RS03530 | ECHLIB_RS01280 | ECHWAK_RS01285 | ECHWP_RS03400 | Hypothetical protein | 81 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | IBRs |
| ECH_RS01950 | ECHLIB_RS02855 | ECHWAK_RS02870 | ECHWP_RS01835 | Molecular chaperone DnaK | 81 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | IBRs |
| ECH_RS03610 | ECHLIB_RS01200 | ECHWAK_RS01205 | ECHWP_RS03480 | Hypothetical protein | 80 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | GSRs |
| ECH_RS03260 | ECHLIB_RS01550 | ECHWAK_RS01560 | ECHWP_RS03130 | abc-ATPase UvrA | 77 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | GSRs |
| ECH_RS03895 | ECHLIB_RS00915 | ECHWAK_RS00920 | ECHWP_RS03760 | 50S ribosomal protein L7/L12 | 74 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | GDRs |
| ECH_RS02525 | ECHLIB_RS02285 | ECHWAK_RS02300 | ECHWP_RS02405 | Glutamate-tRNA ligase | 74 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | GSRs |
| ECH_RS00785 | ECHLIB_RS04010 | ECHWAK_RS04025 | ECHWP_RS03985 | Hypothetical protein | 74 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | GSRs |
| ECH_RS00330 | ECHLIB_RS04455 | ECHWAK_RS04465 | ECHWP_RS04430 | 1-acyl-sn-glycerol-3-phosphate acyltransferase | 74 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | GSRs |
| ECH_RS00255 | ECHLIB_RS04530 | ECHWAK_RS04540 | ECHWP_RS04505 | Hypothetical protein | 74 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | IBRs |
| ECH_RS03415 | ECHLIB_RS01395 | ECHWAK_RS01400 | ECHWP_RS03285 | NAD+ synthetase | 73 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | GSRs |
| ECH_RS02490 | ECHLIB_RS02320 | ECHWAK_RS02335 | ECHWP_RS02370 | GTP-binding protein | 73 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | IBRs |
| ECH_RS00505 | ECHLIB_RS04290 | ECHWAK_RS04305 | ECHWP_RS04265 | Citrate (Si)-synthase | 73 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | GDRs |
| ECH_RS03555 | ECHLIB_RS01255 | ECHWAK_RS01260 | ECHWP_RS03425 | Hypothetical protein | 72 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | GSRs |
This table shows the candidate T4Es identified by S4TE software in four E. chaffeensis strains. The Liberty strain is used as a reference to sort predicted effectors, and the homolog candidate effectors are ranked by S4TE scores. Each T4E is defined by the gene ID, Name, and S4TE features.
Figure 2. (A) Distribution of E. chaffeensis str. Arkansas genes according to the length of their flanking intergenic regions (FIRs). All E. chaffeensis genes were sorted into 2-dimensional bins according to the length of their 5′ (y-axis) and 3′ (x-axis) FIRs. The number of genes in each bin is represented by a color-coded density graph. Genes whose FIRs are both longer than the median FIR length were considered gene-sparse region (GSR) genes. Genes whose FIRs are both below the median value were considered gene-dense region (GDR) genes. In-between region (IBR) genes are genes with a long 5′ FIR and a short 3′ FIR, or the reverse. Candidate effectors predicted using the S4TE algorithm were plotted on this distribution according to their own 3′ and 5′ FIRs. A color is assigned to each of the three groups: red to GDRs, orange to IBRs, and blue to GSRs. (B) Distribution of genes in GDRs, IBRs, and GSRs of E. chaffeensis strains. The proportion of the genome and of the effectome that occurs in GDRs (red), IBRs (orange), and GSRs (blue) is indicated.
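The GSR/GDR/IBR assignment described in this caption can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: it assumes a single median pooled over all 5′ and 3′ FIR lengths, treats a FIR equal to the median as short, and uses hypothetical gene names and lengths.

```python
from collections import Counter
from statistics import median
from typing import Dict, Tuple


def classify_by_fir(genes: Dict[str, Tuple[int, int]]) -> Dict[str, str]:
    """Label each gene GSR, GDR, or IBR from its (5' FIR, 3' FIR) lengths in bp."""
    # Assumption: one median pooled over every 5' and 3' FIR in the genome.
    pooled = [length for pair in genes.values() for length in pair]
    threshold = median(pooled)
    labels: Dict[str, str] = {}
    for gene_id, (fir5, fir3) in genes.items():
        # Assumption: a FIR exactly equal to the median counts as short.
        long5, long3 = fir5 > threshold, fir3 > threshold
        if long5 and long3:
            labels[gene_id] = "GSR"  # both flanking regions long
        elif not long5 and not long3:
            labels[gene_id] = "GDR"  # both flanking regions short
        else:
            labels[gene_id] = "IBR"  # one long, one short
    return labels


# Toy input: made-up gene IDs with (5' FIR, 3' FIR) lengths in bp.
toy_genes = {
    "gene1": (500, 800),
    "gene2": (50, 40),
    "gene3": (600, 30),
    "gene4": (20, 700),
}
labels = classify_by_fir(toy_genes)
region_counts = Counter(labels.values())
print(labels, region_counts)
```

Running the same function on the full gene set and on the predicted effectors alone, then comparing the two `Counter` results, would give the genome and effectome proportions of the kind shown in panel (B).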
Figure 3. Protein-protein interaction network between the predicted E. chaffeensis T4Es and human proteins. Subcellular locations were predicted with the S4TE algorithm (http://sate.cirad.fr) for Ehrlichia candidate effectors (left) and with CELLO2GO software (http://cello.life.nctu.edu.tw/cello2go/) for human proteins (right). Blue and red circles represent predicted T4Es located in the cytoplasm and in the nucleus of the host cell, respectively. Blue, red, pink, green, purple, yellow, and turquoise hexagons represent the different locations of targeted human proteins in the host cell. A hexagon carries several colors when CELLO2GO predicts several probable subcellular locations.