Pieter Meysman, Aminael Sánchez-Rodríguez, Qiang Fu, Kathleen Marchal, Kristof Engelen.
Abstract
Escherichia coli K12 is a commensal bacterium and one of the best-studied model organisms. Salmonella enterica serovar Typhimurium, on the other hand, is a facultative intracellular pathogen. These two prokaryotic species can be considered phylogenetically related, and they share a large amount of their genetic material, which is commonly termed the "core genome." Despite their shared core genome, the two species display very different lifestyles, and it is unclear to what extent the core genome, apart from the species-specific genes, plays a role in this lifestyle divergence. In this study, we focus on the differences in expression domains for the orthologous genes in E. coli and S. Typhimurium. The iterative comparison of coexpression methodology was used on large expression compendia of both species to uncover the conservation and divergence of gene expression. We found that gene expression conservation occurs mostly independently of amino acid similarity. According to our estimates, more than one quarter of the orthologous genes have a different expression domain in E. coli than in S. Typhimurium. Genes involved in key cellular processes are most likely to have conserved their expression domains, whereas genes showing diverged expression are associated with metabolic processes that, although present in both species, are regulated differently. The expression domains of the shared "core" genome of E. coli and S. Typhimurium, consisting of highly conserved orthologs, have been tuned to help accommodate the differences in lifestyle and the pathogenic potential of Salmonella.
Keywords: Escherichia coli; Salmonella; expression conservation; expression divergence; gene expression; pathogenesis
Year: 2013; PMID: 23427276; PMCID: PMC3649669; DOI: 10.1093/molbev/mst029
Source DB: PubMed; Journal: Mol Biol Evol; ISSN: 0737-4038; Impact factor: 16.240
Figure. Distribution of the EC score between the orthologous genes of Escherichia coli and Salmonella enterica serovar Typhimurium depicted by its kernel smoothed density estimate (blue line). The distribution of the EC scores for gene pairs with randomized expression values, which represent the estimated score given no conservation of expression, is shown as a red line. The distribution of the EC scores resulting from comparison of the E. coli compendium to itself with data from different experiments is shown as a green line and represents the estimated score given perfect conservation of expression.
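The EC score in this caption comes from the iterative comparison of coexpression named in the abstract. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes that the EC score of an ortholog pair is the weighted Pearson correlation between that gene's row in each species' gene-by-gene correlation matrix, and that the weights are the EC scores from the previous iteration, clipped at zero. The function names (`pearson`, `correlation_matrix`, `ec_scores`), the clipping, and the fixed iteration count are illustrative assumptions.

```python
import math


def pearson(x, y, w=None):
    """Weighted Pearson correlation; w=None means equal weights."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    if sw == 0:
        return 0.0  # no informative genes left to compare against
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    if vx == 0 or vy == 0:
        return 0.0  # constant vector: correlation undefined, report 0
    return cov / (math.sqrt(vx) * math.sqrt(vy))


def correlation_matrix(profiles):
    """Gene-by-gene Pearson correlations within one species' compendium."""
    n = len(profiles)
    return [[pearson(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def ec_scores(expr_a, expr_b, n_iter=10):
    """expr_a[i] and expr_b[i] are the expression profiles of ortholog
    pair i in species A and B (the two compendia may differ in size)."""
    ca = correlation_matrix(expr_a)
    cb = correlation_matrix(expr_b)
    n = len(ca)
    ec = [1.0] * n  # start by trusting every ortholog pair equally
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            others = [j for j in range(n) if j != i]
            # Assumption: pairs with low EC contribute less (weight >= 0).
            weights = [max(ec[j], 0.0) for j in others]
            updated.append(pearson([ca[i][j] for j in others],
                                   [cb[i][j] for j in others],
                                   weights))
        ec = updated
    return ec
```

Under these assumptions, comparing a compendium with itself gives scores of 1 for every gene, which is the idealized counterpart of the green curve. A background in the spirit of the red curve could be estimated by shuffling the values within each profile of one species before calling `ec_scores`.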
GO Enrichment of Conserved and Divergent Genes.
| GO term | P value |
|---|---|
| Genes with divergent expression | |
| Phospholipid biosynthetic process | 4.87 × 10⁻⁶ |
| Lipid A biosynthetic process | 1.81 × 10⁻⁵ |
| Biosynthetic process | 3.15 × 10⁻⁵ |
| Catabolic process | 1.44 × 10⁻⁵ |
| Metabolic process | 6.55 × 10⁻⁶ |
| Genes with conserved expression | |
| Translation | 1.42 × 10⁻⁵² |
| Regulation of translation | 6.79 × 10⁻⁷ |
| Translational termination | <1 × 10⁻⁶⁰ |
| Translational elongation | <1 × 10⁻⁶⁰ |
| Protein metabolic process | 4.99 × 10⁻³³ |
| Gene expression | 8.67 × 10⁻¹⁹ |
| Transcription termination | <1 × 10⁻⁶⁰ |
| tRNA metabolic process | 4.65 × 10⁻⁵ |
| tRNA aminoacylation for protein translation | 1.37 × 10⁻⁶ |
| ncRNA metabolic process | 9.04 × 10⁻⁷ |
| tRNA aminoacylation | 2.68 × 10⁻⁶ |
| Ribosome biogenesis | 3.80 × 10⁻⁵ |
| Macromolecule metabolic process | 1.85 × 10⁻¹⁶ |
| Macromolecular complex subunit organization | 1.01 × 10⁻⁷ |
| Primary metabolic process | 1.23 × 10⁻¹⁹ |
| Metabolic process | 2.82 × 10⁻²³ |
| Nucleotide biosynthetic process | 1.25 × 10⁻⁶ |
| Nucleoside biosynthetic process | 1.80 × 10⁻⁶ |
| Purine ribonucleotide biosynthetic process | 3.65 × 10⁻⁵ |
| Purine ribonucleoside biosynthetic process | 4.13 × 10⁻⁵ |
| Ribonucleotide biosynthetic process | 8.55 × 10⁻⁷ |
| Ribonucleoprotein complex biogenesis | 3.8 × 10⁻⁵ |
| Amino acid derivative metabolic process | 1.87 × 10⁻⁵ |
| Fatty acid biosynthetic process | 1.24 × 10⁻⁷ |
Figure. Expression correlation matrices of the genes in the core genome of Escherichia coli (A) and Salmonella enterica serovar Typhimurium (B). Each value presented in the heatmap is the Pearson correlation coefficient between the expression profile of the gene in the row and the expression profile of the gene in the column from the compendium of the given species. The rows and columns are sorted according to a hierarchical clustering, and functional expression classes were created at a cutoff of 110 distance units. The classes are represented by colored boxes with their hierarchical relationship given in the tree to the left. Each class is labeled E. coli class (Ecl) or S. Typhimurium class (Scl) appended by a number.
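Cutting a hierarchical clustering at a fixed distance, as this caption describes, can be illustrated with a small sketch. The caption does not state the distance metric or the linkage rule, so the Euclidean distance between rows and the complete linkage used here are assumptions, as is the function name `cluster_rows`. A naive pure-Python loop like this is only meant to show the logic; for thousands of genes a library routine such as SciPy's `scipy.cluster.hierarchy.fcluster` with `criterion='distance'` would be the practical choice.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two rows (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster_rows(rows, cutoff):
    """Agglomerative clustering with complete linkage (assumed rule).

    Clusters are merged pairwise, closest first, until the closest
    remaining pair is farther apart than `cutoff`. Returns a list of
    clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(euclidean(rows[i], rows[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > cutoff:
            break  # every remaining merge would exceed the cutoff
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

In the setting of the figure, `rows` would be the rows of one species' correlation matrix and `cutoff` the chosen distance threshold; each returned cluster then corresponds to one expression class.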
Functional Evaluation of the E. coli Expression Classes.
| | Ecl1 | Ecl2 | Ecl3 |
|---|---|---|---|
| No. genes | 1,094 genes | 734 genes | 1,058 genes |
| GO enrich. | Chemotaxis | Cell division | Multiorganism processes |
| | Energy metabolism | Cell wall assembly | Cell adhesion |
| | Amino acid metabolism | Carbohydrate biosynthesis | Carbohydrate catabolism |
| | Nucleotide metabolism | Nucleotide biosynthesis | Transport proteins |
| | Cation/osmotic stress | Transcription | Acidity stress |
| Translation | Starvation stress | ||
| Toxin stress | |||
| Oxidative stress | |||
| Funct. div. | Anabolism | Anabolism | Catabolism |
| 8.51E-6 enrich. p-val | |||
| Central metabolism | |||
| Ess. genes | 60 essential genes | 204 essential genes | 8 essential genes |
| TF targets | FlhDC | LexA | CRP |
| | (TrpR) | SoxS | IHF |
| | (TyrR) | DnaA | FhlA |
| | (Lrp) | PurR | NarP |
| GadE | CysB | ||
| SF targets | σ28 | σ70 | σ38 |
| (σ70) | (σ24) | ||
| SF present | σ28 | σ54 | σ24 |
| σ32 | σ38 | ||
| σ70 |
ᵃ Summary of GO enrichment results; the full listing is available in supplementary table S2, Supplementary Material online.
ᵇ Enriched functional divisions, using the annotation provided by Seshasayee et al. (2009).
ᶜ Essential E. coli genes.
ᵈ Target genes of the given transcription factor (TF) or sigma factor (SF) that are enriched in the cluster. TFs in parentheses were not significant according to the multiple testing criterion; the full listing is available in supplementary table S3, Supplementary Material online.
ᵉ Gene encoding the SF is present in the cluster.
Functional Evaluation of the S. Typhimurium Expression Classes.
| | Scl1 | Scl2 | Scl3 | Scl4 | Scl5 |
|---|---|---|---|---|---|
| No. genes | 833 genes | 545 genes | 741 genes | 445 genes | 321 genes |
| GO enrich. | Amino acid biosynthesis | Cell cycle | Transport proteins | Aerobic respiration | Response to stress |
| Sulfur metabolism | Cellular component biosynthesis | Cell adhesion | Nitrogen compound biosynthesis | ||
| Vitamin biosynthesis | Lipid biosynthesis | Cell motility | |||
| Transcription | |||||
| Translation | |||||
| Infection genes | 11 inf. genes | 5 inf. genes | 28 inf. genes | 5 inf. genes | 7 inf. genes |
| Essential genes | 30 ess. genes | 64 ess. genes | 28 ess. genes | 28 ess. genes | 12 ess. genes |
| TF pred. targets | (FadR) | (ArgP) | IclR | IscR | Fis |
| | (TyrR) | (Fur) | NanR | FlhDC | (PhoP) |
| | (GntR) | (DnaA) | (FNR) | (GalR) | (FruR) |
| (CRP) | (MelR) | (GlpR) | |||
| (ArcA) | |||||
| (H-NS) | |||||
| (SoxS) | |||||
| ArcA pot. targets | 36 targets | 27 targets | 87 targets | 46 targets | 25 targets |
| SF present | σ24 | σ28 | σ32 | ||
| σ54 | σ38 | ||||
| σ70 |
ᵃ Summary of GO enrichment results; the full listing is available in supplementary table S4, Supplementary Material online.
ᵇ Genes required for long-term infection, as identified by Lawley et al.
ᶜ Essential S. Typhimurium genes.
ᵈ Target genes of the given transcription factor (TF) that are enriched in the cluster. TFs in parentheses were not significant according to the multiple testing criterion; the full listing is available in supplementary table S5, Supplementary Material online.
ᵉ Genes directly or indirectly regulated by ArcA, as identified by Evans et al.
ᶠ Gene coding for the sigma factor (SF) is present in the cluster.
Figure. Histogram of the EC score distribution for the orthologous genes split by the functional expression classes found in the expression compendia of the core genomes. (A) The distribution of EC scores for each of the three Escherichia coli classes (Ecl1: green, Ecl2: red, and Ecl3: blue). (B) The distribution of EC scores for each of the five Salmonella enterica serovar Typhimurium classes (Scl1: gray, Scl2: red, Scl3: cyan, Scl4: orange, and Scl5: blue).
Figure. Overlap between the functional expression classes of Escherichia coli (columns) and Salmonella enterica serovar Typhimurium (rows). Reported is the number of orthologous gene pairs in each combination of classes. Numbers printed in bold are overlaps between classes that are significantly enriched (P value < 0.01) and those that are faded out are significantly depleted for each other (P value < 0.01).
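The caption does not say which statistical test produced the enrichment and depletion P values. One common choice for this kind of class overlap is the hypergeometric test, and the sketch below assumes it purely for illustration. The function names and the argument layout are hypothetical: `total` is the number of orthologous gene pairs, `size_a` and `size_b` are the sizes of the two classes being compared, and `overlap` is the number of pairs that fall in both.

```python
from math import comb


def hypergeom_pmf(k, total, size_a, size_b):
    """Probability of exactly k shared pairs when size_b pairs are drawn
    at random from `total`, of which size_a belong to the other class."""
    return (comb(size_a, k) * comb(total - size_a, size_b - k)
            / comb(total, size_b))


def overlap_pvalues(overlap, size_a, size_b, total):
    """Return (p_enriched, p_depleted) for the observed overlap.

    p_enriched: chance of an overlap at least this large.
    p_depleted: chance of an overlap at most this large.
    """
    lowest = max(0, size_a + size_b - total)   # smallest possible overlap
    highest = min(size_a, size_b)              # largest possible overlap
    p_enriched = sum(hypergeom_pmf(k, total, size_a, size_b)
                     for k in range(overlap, highest + 1))
    p_depleted = sum(hypergeom_pmf(k, total, size_a, size_b)
                     for k in range(lowest, overlap + 1))
    return p_enriched, p_depleted
```

Under this assumed test, a cell of the overlap table would be marked as enriched when `p_enriched` falls below 0.01 and as depleted when `p_depleted` does. Any correction for testing many class combinations is left out of the sketch.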
Figure. EC score of the co-orthologous gene pairs of Escherichia coli and Salmonella enterica serovar Typhimurium, which were not included in our core genome. EC scores were calculated by integrating each co-ortholog gene pair in turn into the expression correlation matrices and recalculating the EC. The resulting score assigned to the co-ortholog gene pair is considered as its EC score. (A) Histogram of the distribution of the EC scores of the MS co-orthologs (blue), which have the highest protein similarity of the gene pairs between the two species in the same co-ortholog cluster, and the LS co-orthologs (red), which are the remainder of the co-ortholog gene pairs. (B) Direct comparison of the MS co-ortholog EC (x axis) and the LS co-ortholog (y axis) EC of the same co-ortholog cluster. The yellow band along the diagonal indicates the segment of the plot where both co-orthologs seem to have diverged their expression at the same rate (within an error margin of 0.2).