Ian Goodhead, Frances Blow, Philip Brownridge, Margaret Hughes, John Kenny, Ritesh Krishna, Lynn McLean, Pisut Pongchaikul, Rob Beynon, Alistair C Darby.
Abstract
The majority of bacterial genomes have high coding efficiencies, but there are some genomes of intracellular bacteria that have low gene density. The genome of the endosymbiont Sodalis glossinidius contains almost 50 % pseudogenes containing mutations that putatively silence them at the genomic level. We have applied multiple 'omic' strategies, combining Illumina and Pacific Biosciences Single-Molecule Real-Time DNA sequencing and annotation, stranded RNA sequencing and proteome analysis to better understand the transcriptional and translational landscape of Sodalis pseudogenes, and potential mechanisms for their control. Between 53 and 74 % of the Sodalis transcriptome remains active in cell-free culture. The mean sense transcription from coding domain sequences (CDSs) is four times greater than that from pseudogenes. Comparative genomic analysis of six Illumina-sequenced Sodalis isolates from different host Glossina species shows pseudogenes make up ~40 % of the 2729 genes in the core genome, suggesting that they are stable and/or that Sodalis is a recent introduction across the genus Glossina as a facultative symbiont. These data shed further light on the importance of transcriptional and translational control in deciphering host-microbe interactions. The combination of genomics, transcriptomics and proteomics gives a multidimensional perspective for studying prokaryotic genomes with a view to elucidating evolutionary adaptation to novel environmental niches.
Keywords: Sodalis glossinidius; endosymbiont; pseudogenes; transcriptome
Year: 2020 PMID: 31922467 PMCID: PMC7067036 DOI: 10.1099/mgen.0.000285
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Fig. 1. (a) Sense and antisense mean transcription of CDSs (blue) and pseudogenes (red) in three cell-free cultures of Sodalis: early log phase (ELP), late log phase (LLP) and late stationary phase (LSP). Transcripts per million (TPM), derived from EdgeR counts per million, have been transformed to log(TPM+1) so that zero transcription can be displayed. (b, c) Density plots of expression [log(TPM+1)] showing transcription for all three cell-free culture conditions, grouped by CDS (blue) or pseudogene (red). Lines mark TPM=1 and TPM=10, two alternative minimum thresholds for considering a gene active. Panel (b) shows sense transcription only; panel (c) shows all transcription. CDSs (blue) tend to show higher expression levels than pseudogenes (red). The overlap between low-expression CDSs and pseudogenes highlights the difficulty of identifying pseudogenes from transcription levels alone.
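The log(TPM+1) transform and the two activity thresholds described in the Fig. 1 caption can be illustrated with a minimal Python sketch. The logarithm base (10), the use of "greater than or equal to" at the threshold, and the gene names and TPM values are all assumptions made for illustration; none of them is taken from the paper.

```python
import math


def log_tpm(tpm: float) -> float:
    """Return log10(TPM + 1).

    Adding 1 keeps genes with zero transcription on the plot (log10(1) = 0).
    Base 10 is an assumption; the caption does not state the base.
    """
    return math.log10(tpm + 1.0)


def is_active(tpm: float, threshold: float) -> bool:
    """Treat a gene as active if its TPM meets the threshold (>= is an assumption)."""
    return tpm >= threshold


# Hypothetical TPM values, for illustration only.
example_tpm = {"gene_a": 0.0, "gene_b": 4.0, "gene_c": 99.0}

transformed = {gene: log_tpm(value) for gene, value in example_tpm.items()}
active_at_1 = [g for g, v in example_tpm.items() if is_active(v, 1.0)]
active_at_10 = [g for g, v in example_tpm.items() if is_active(v, 10.0)]
```

With these made-up values, a gene at TPM=4 counts as active under the TPM=1 threshold but not under TPM=10, which is the kind of sensitivity to threshold choice that the two lines in panels (b) and (c) are meant to expose.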
Fig. 2. Flagellum and symbiosis region expression, summarized by general ‘region’ (left bar). Two genes not covered by general region bars are fliU (top) and flk (bottom). CDSs (blue) and pseudogenes (red) are displayed as the first coloured column. Early log phase (ELP), late log phase (LLP) and late stationary phase (LSP) expression is displayed as a heatmap of log fold change relative to average expression, where red signifies upregulation and blue represents downregulation.
Fig. 3. Log2FC differential expression plotted against negative logFDR (false discovery rate), determined from EdgeR. (a) LLP and (b) LSP transcription. Whilst CDSs (blue) represent the majority of differentially expressed genes, some pseudogenes (red) are significantly differentially expressed in either condition.
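The axes of a plot like Fig. 3 can be sketched as follows: each gene is placed at its log2 fold change on the x-axis and the negative logarithm of its FDR on the y-axis. This is only an illustrative sketch. The base-10 logarithm, the 0.05 FDR cut-off and the input values are assumptions, not details reported in the caption.

```python
import math
from typing import Tuple


def volcano_point(log2_fc: float, fdr: float) -> Tuple[float, float]:
    """Return (x, y) plot coordinates: x = log2 fold change, y = -log10(FDR).

    Base 10 is an assumption; the caption only says 'negative logFDR'.
    """
    if not 0.0 < fdr <= 1.0:
        raise ValueError("FDR must be in the interval (0, 1]")
    return log2_fc, -math.log10(fdr)


def is_significant(fdr: float, cutoff: float = 0.05) -> bool:
    """Flag a gene as differentially expressed; the 0.05 cut-off is hypothetical."""
    return fdr < cutoff


# Hypothetical values for one gene, for illustration only.
x, y = volcano_point(log2_fc=1.5, fdr=0.01)
```

Smaller FDR values map to larger y values, so the most confidently differentially expressed genes, whether CDSs or pseudogenes, sit towards the top of the plot.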
Fig. 4. Pan-genome analysis of CDSs (blue) and pseudogenes (red) from the genomes of six isolates derived from four different tsetse species. The plot represents the number of CDSs/pseudogenes in the ROARY-derived pan-genome. Core, 7 genomes; soft core, 2–6 genomes; cloud, 1 genome.
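The three pan-genome categories in the Fig. 4 caption amount to a simple rule on how many genomes carry a gene. The sketch below applies the cut-offs exactly as the caption states them (7, 2–6 and 1 genomes); the function name, the gene names and the counts are hypothetical, and this is not ROARY's own implementation.

```python
from collections import Counter

TOTAL_GENOMES = 7  # as stated in the caption ("Core, 7 genomes")


def pan_genome_category(n_genomes: int, total: int = TOTAL_GENOMES) -> str:
    """Classify a gene by the number of genomes in which it is present."""
    if not 1 <= n_genomes <= total:
        raise ValueError(f"presence count must be between 1 and {total}")
    if n_genomes == total:
        return "core"
    if n_genomes >= 2:
        return "soft core"
    return "cloud"


# Hypothetical presence counts (gene -> number of genomes containing it).
presence = {"gene_a": 7, "gene_b": 5, "gene_c": 1, "gene_d": 7}

# Tally genes per category, as would be plotted for CDSs or pseudogenes.
category_counts = Counter(pan_genome_category(n) for n in presence.values())
```

Running the same tally separately for CDSs and for pseudogenes would give the pair of bars per category that the figure describes.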