Richard P Meisel1, Danial Asgari1, Florencia Schlamp2, Robert L Unckless3. 1. Department of Biology and Biochemistry, University of Houston, Houston, TX 77204-5001, USA. 2. Department of Medicine, NYU Grossman School of Medicine, New York, NY 10016, USA. 3. Department of Molecular Biosciences, University of Kansas, Lawrence, KS 66045, USA.
Abstract
Sex chromosomes frequently differ from the autosomes in the frequencies of genes with sexually dimorphic or tissue-specific expression. Multiple hypotheses have been put forth to explain the unique gene content of the X chromosome, including selection against male-beneficial X-linked alleles, expression limits imposed by the haploid dosage of the X in males, and interference by the dosage compensation complex on expression in males. Here, we investigate these hypotheses by examining differential gene expression in Drosophila melanogaster following several treatments that have widespread transcriptomic effects: bacterial infection, viral infection, and abiotic stress. We found that genes that are induced (upregulated) by these biotic and abiotic treatments are frequently under-represented on the X chromosome, but so are those that are repressed (downregulated) following treatment. We further show that whether a gene is bound by the dosage compensation complex in males can largely explain the paucity of both up- and downregulated genes on the X chromosome. Specifically, genes that are bound by the dosage compensation complex, or close to a dosage compensation complex high-affinity site, are unlikely to be up- or downregulated after treatment. This relationship, however, could partially be explained by a correlation between differential expression and breadth of expression across tissues. Nonetheless, our results suggest that dosage compensation complex binding, or the associated chromatin modifications, inhibit both up- and downregulation of X chromosome gene expression within specific contexts, including tissue-specific expression. We propose multiple possible mechanisms of action for the effect, including a role of Males absent on the first, a component of the dosage compensation complex, as a dampener of gene expression variance in both males and females. This effect could explain why the Drosophila X chromosome is depauperate in genes with tissue-specific or induced expression, while the mammalian X has an excess of genes with tissue-specific expression.
Sex chromosomes frequently differ from the autosomes in the frequencies of genes with sexually dimorphic or tissue-specific expression. Multiple hypotheses have been put forth to explain the unique gene content of the X chromosome, including selection against male-beneficial X-linked alleles, expression limits imposed by the haploid dosage of the X in males, and interference by the dosage compensation complex on expression in males. Here, we investigate these hypotheses by examining differential gene expression in Drosophila melanogaster following several treatments that have widespread transcriptomic effects: bacterial infection, viral infection, and abiotic stress. We found that genes that are induced (upregulated) by these biotic and abiotic treatments are frequently under-represented on the X chromosome, but so are those that are repressed (downregulated) following treatment. We further show that whether a gene is bound by the dosage compensation complex in males can largely explain the paucity of both up- and downregulated genes on the X chromosome. Specifically, genes that are bound by the dosage compensation complex, or close to a dosage compensation complex high-affinity site, are unlikely to be up- or downregulated after treatment. This relationship, however, could partially be explained by a correlation between differential expression and breadth of expression across tissues. Nonetheless, our results suggest that dosage compensation complex binding, or the associated chromatin modifications, inhibit both up- and downregulation of X chromosome gene expression within specific contexts, including tissue-specific expression. We propose multiple possible mechanisms of action for the effect, including a role of Males absent on the first, a component of the dosage compensation complex, as a dampener of gene expression variance in both males and females. This effect could explain why the Drosophila X chromosome is depauperate in genes with tissue-specific or induced expression, while the mammalian X has an excess of genes with tissue-specific expression.
Many animal species, as well as some plants and other eukaryotes, have sex chromosomes, which are often under different transcriptional regulation than the autosomes. Sex chromosomes can be grouped into several different categories, with XY and ZW systems amongst the most common in animals (Bachtrog ). X and Z chromosome gene expression is often controlled by transcriptional regulators and histone modifications that are unique from the autosomes (Lucchesi ; Ferrari ; Gu ). For example, 1 copy of the mammalian X chromosome is silenced (via the recruitment of facultative heterochromatin) in somatic tissues of XX females by a combination of noncoding RNAs and proteins (Lyon 1961; Brown ; Chow ). In contrast, the Drosophila dosage compensation complex (DCC) upregulates gene expression on the X chromosome in males using a combination of RNAs and proteins (Lucchesi and Kuroda 2015). The DCC only assembles in male somatic tissues, where it initiates the acetylation of lysine 16 in histone H4 (H4K16ac) specifically on the X chromosome, compensating for the haploid dose (Gelbart ). Furthermore, there is evidence for silencing of the single X chromosome in the male germline of some animal species (Lifschytz and Lindsley 1972; Turner 2007), although the extent of this meiotic sex chromosome inactivation (MSCI) varies across taxa (Bean ; Meiklejohn ; Turner 2015).The unique transcriptional and chromatin environments of X chromosomes, along with their hemizygosity in males, create selection pressures on X-linked genes that differ from the autosomes, resulting in X-autosome differences in gene content that are taxon-specific. For example, the mammalian X chromosome is enriched for genes that are expressed specifically in male reproductive tissues, such as the prostate and testis (Wang ; Lercher ; Mueller , 2013; Meisel ). In contrast, the Drosophila melanogaster X chromosome contains very few genes that are expressed primarily in the male-specific accessory gland, a reproductive organ analogous to the mammalian prostate (Swanson ; Ravi Ram and Wolfner 2007; Meisel ). The Drosophila X chromosome also contains a paucity of genes with male-biased expression (i.e. upregulated in males relative to females) relative to the autosomes (Parisi ; Sturgill ). Taxon-specific X-autosome differences in gene content further extend to genes with nonreproductive functions. In D. melanogaster, for instance, the X chromosome is deficient for genes that have narrow expression in nonreproductive tissues, whereas the mammalian X is enriched for genes with tissue-specific expression (Lercher ; Mikhaylova and Nurminsky 2011; Meisel ).Multiple hypotheses have been proposed to explain the differences in gene content between X chromosomes and autosomes (Table 1). One of these hypotheses is based upon the prediction that sexually antagonistic selection will favor recessive male-beneficial mutations (or dominant female-beneficial alleles) on the X chromosome (Rice 1984; Charlesworth ). This sexual antagonism hypothesis has numerous limitations (Fry 2010), including the inability to explain differences between Drosophila and mammalian X chromosomes in their deficiency or enrichment, respectively, of genes expressed in male reproductive tissues (Meisel ). A second hypothesis focuses specifically on the male germline, where MSCI silences the X chromosome (Lifschytz and Lindsley 1972) and may favor duplication of genes to the autosomes (Betrán ; Emerson ; Potrzebowski ; Vibranovski ). However, there is not a deficiency of genes with testis-biased expression on the D. melanogaster X chromosome (Meiklejohn and Presgraves 2012; Meisel ), limiting the ability of MSCI to explain the unique gene content of the Drosophila X chromosome. Third, the haploid dose of the X in males may impose a maximal gene expression level lower than the autosomes, selecting against X-linked genes with high expression (Wolfner ; Vicoso and Charlesworth 2009; Hurst ). This “dosage limit” hypothesis may even apply in species where the haploid X is dosage compensated by upregulation of X-linked expression. For example, in D. melanogaster there may be a transcriptional limit beyond which expression cannot be exceeded or some genes may not be dosage compensated in males (Meisel ).
Table 1.
Hypotheses to explain the unique gene content of the Drosophila X chromosome.
Hypothesis
Predicted X-autosome differences
Sexual antagonism
Excess or deficiency of X-linked genes with sex-specific functions and/or expressed in reproductive tissues
MSCI
Deficiency of X-linked testis-expressed genes
Dosage limit
Deficiency of highly expressed or induced X-linked genes
DCC-interference
Deficiency of DCC-bound genes highly expressed in males
Variance dampening
Deficiency of X-linked genes up- or downregulated in specific contexts (e.g. tissues, infections, abiotic treatments)
Hypotheses to explain the unique gene content of the Drosophila X chromosome.Here, we focus on the effect of the DCC on X chromosome expression and gene content in D. melanogaster. The DCC most strongly binds to more than 100 so-called chromatin entry or high affinity sites (HAS), from which it is thought to spread across the X chromosome (Kelley ; Alekseyenko ; Straub ). Bachtrog observed that genes near an HAS or bound by the DCC are less likely to have male-biased expression, and genes further from an HAS have a larger magnitude of male-biased expression. This led them to hypothesize that the DCC interferes with acquisition of male-biased expression on the X chromosome.There is mixed evidence for the hypothesis that DCC-interference is responsible for the unique gene content of the X chromosome. Consistent with the DCC-interference hypothesis, when Belyi measured expression of a reporter construct that was inserted at random locations on the X chromosome, they found reduced expression in male somatic tissues for transgenes inserted at chromosomal loci closer to endogenous DCC binding sites. However, when genes with testis-biased expression are excluded or when somatic tissues are analyzed separately, there is no relationship between male-biased expression and distance from an HAS for endogenous genes (Vensko and Stone 2014; Gallach and Betrán 2016). In addition, genes with male-biased expression in brain or head are over-represented on the D. melanogaster X chromosome and closer to DCC binding sites (Huylmans and Parsch 2015), which is opposite of what is predicted by the DCC-interference hypothesis.Our analysis addresses a fifth hypothesis, specifically whether the Drosophila DCC creates an unfavorable environment for X-linked genes that are differentially expressed (DE) in specific contexts. The D. melanogaster X chromosome is depauperate in genes with narrow expression in specific tissues (Mikhaylova and Nurminsky 2011; Meisel ), and X-linked genes further from an HAS or not bound by the DCC have more tissue-specific expression (Meisel ). In contrast, X-linked genes with female-biased expression, which also tend to be broadly expressed (Meisel 2011), are more likely to be bound by the DCC (Gallach and Betrán 2016). This suggests that the DCC creates an unfavorable environment for X-linked genes that are up- or downregulated in specific tissues, possibly because the DCC prevents the differential regulation of gene expression across contexts. Consistent with this hypothesis, there is evidence that Males absent on the first (Mof), one of the proteins in the DCC, dampens transcriptional variation on the D. melanogaster X chromosome (Lee ). Moreover, genes that are bound by the DCC have less genetic variation for gene expression than X-linked unbound genes (Meisel ), and transgenes inserted on the D. melanogaster X chromosome have less intralocus expression variation in males than females (Belyi ). Both of these observations are also consistent with the DCC dampening transcriptional variance. This “variance dampening” by the DCC, or Mof specifically, may inhibit context-dependent gene expression by reducing the ability of transcription factors to regulate expression subsequent to DCC-associated chromatin modifications (Table 1).We used bacterial infection, viral infection, and abiotic stressors as model systems to test the hypothesis that the Drosophila DCC is a variance dampener that reduces differential expression of X-linked genes in specific contexts. Biotic and abiotic stress represents a notable contrast to previous studies of context-dependent expression involving X-autosome comparisons of genes with tissue-specific expression (e.g. Mikhaylova and Nurminsky 2011; Meisel ). Bacterial infection, for example, causes the dramatic induction of gene expression, including effectors of the humoral immune system that are expressed more than 100 times higher within 12 h (De Gregorio ; Troha ; Schlamp ). Curiously, none of the 30–40 D. melanogaster genes encoding antimicrobial peptides (AMPs, a class of immune effectors) are found on the X chromosome (Hill-Burns and Clark 2009), providing a priori evidence that the X chromosome is a suboptimal location for genes induced by infection. We analyzed multiple RNA-seq studies of gene expression after biotic and abiotic treatments to test the hypothesis that the DCC inhibits context-dependent differential expression, which would explain the paucity of genes with tissue- or environment-specific expression on the Drosophila X chromosome.
Materials and methods
General statistical analysis
The analyses and figure generation were performed in R (R Core Team 2019) using the following packages: boot (Davison and Hinkley 1997; Canty and Ripley 2021), corpcor (Schäfer and Strimmer 2005; Schafer ), ggplot2 (Wickham 2016), ggridges (Wilke 2021), and cowplot (Wilke 2020). Additional R packages were used for specific analyses, as described below.
RNA-seq data analysis
We analyzed available RNA-seq data to test for differential expression between control D. melanogaster and flies that received a bacterial, viral, or abiotic treatment (Supplementary File 1). In one dataset, D. melanogaster adult males were infected with one of 10 different bacteria or a control treatment (Troha ). For that experiment, we used RNA-seq data from 12 h post-treatment for only live (not heat-killed) bacteria, which we downloaded as fastq files from the NCBI sequence read archive (BioProject PRJNA428174). We only included bacterial treatments with at least 50 DE genes relative to the control condition (see below for methods used to identify DE genes), which was 5/10 treatments. In another dataset, we compared gene expression 8 h following injection of Providencia rettgeri with uninfected flies, in males and females separately (Duneau ). In a third dataset, we analyzed the response to infection at 19 different timepoints from 1 to 120 h after injection of Escherichiacoli-derived crude lipopolysaccharide (Schlamp ). Other datasets include exposure to one of 2 different viral infections, copper, starvation, radiation, and cocaine (Palmer ; Prasad and Hens 2018; Harsh ; Baker ; de Oliveira ; Green ). A full list of accession numbers is provided in Supplementary File 1.Raw RNA-seq reads were assigned to annotated D. melanogaster transcripts (r6.22) using kallisto v0.44.0 (Bray ). We extracted transcripts per kilobase per million mapped reads (TPM) and read counts per transcript from the kallisto output. TPM values and read counts for all transcripts from each gene were summed to obtain gene-level expression estimates, and the counts per gene were then rounded to the nearest integer. For a given treatment, we only considered genes with at least 10 mapped reads total across all replicates from control and treatment samples. The integer counts were used as input into DESeq2 (Love ), which we used to identify DE genes between the treatment and control samples (see below). We performed a principal component analysis on regularized log transformed read counts to identify replicate samples that were outliers relative to other replicates of the same sample type. We identified one outlier male control replicate in the Kallithea virus data, which we excluded from all subsequent analyses.We identified DE genes based on the false-discovery rate corrected P-value (PADJ) and log2-fold-change of treatment relative to control expression (log2FC). Genes were considered upregulated in a treatment if PADJ < 0.05 and log2FC > 1 (i.e. a significant 2x increase upon treatment). Similarly, genes were considered downregulated if PADJ < 0.05 and log2FC < −1 (i.e. a significant 2x decrease upon treatment). DE genes are those that are either up- or downregulated in a given treatment (i.e. PADJ < 0.05 and |log2FC| > 1). We only included datasets with at least 50 DE genes between treatment and control samples.We also analyzed the bacterial infection data considering all Gram-negative or Gram-positive bacteria from Troha as a treatment group. In this case, we used a statistical model that examined the effect of treatment (bacteria or control), with strain nested within bacterial treatment. This was done separately for all Gram-negative bacteria and all Gram-positive bacteria. DE genes, as well as up- and downregulated genes, were identified using the same PADJ and log2FC criteria described above.In cases where both female and male RNA-seq data were available (cocaine and Kallithea virus), we performed 2 separate analyses focusing on the effect of the treatment (i.e. not on the effect of sex) on gene expression. First, we analyzed the male and female data together using a linear model that included the effect of treatment and the interaction of treatment and sex. From this analysis, we extracted genes that were DE based on the treatment effect. Second, we analyzed data from the 2 sexes separately using a model that only included the effect of treatment (i.e. the same way we analyzed data from other treatments without separate sex samples).We did not analyze raw RNA-seq data (i.e. Illumina sequence reads) for 2 of the infection datasets. First, we obtained previously identified DE genes from the time course analysis of expression following infection (Schlamp ). Second, raw data were not available from an experiment in which male and female D. melanogaster were infected with bacteria (Duneau ), but processed data were available from FlySexsick-seq (http://flysexsick.buchonlab.com). For those data, we compared gene expression between unchallenged flies and 8 h following injection of P.rettgeri. Only expression levels (and no P-values) were provided for these data, and we therefore considered genes to be DE based on a variety of log2FC cutoffs.We determined a null expectation for the number of X-linked DE genes by multiplying the fraction of autosomal genes that are DE by the total number of X-linked genes with expression measurements. Similar calculations were performed to determine a null expectation for up- or downregulated X-linked genes.
DCC binding
Data on DCC binding in the D. melanogaster genome was obtained from a published chromatin immunoprecipitation followed by microarray (ChIP-chip) experiment in which genes were classified as bound by the DCC in SL2 embryonic cells, clone 8 wing imaginal disc cells, and embryos (Alekseyenko ). For the purpose of our analysis, we considered a gene to be bound by the DCC if it was bound in at least one of the 3 samples. We also obtained HAS locations from 2 different published datasets (Alekseyenko ; Straub ). Using these HAS locations, we calculated the distance of each X chromosome gene to the nearest HAS in nucleotides.
Tissue-specific expression
We obtained microarray measurements of gene expression from 14 unique adult D. melanogaster tissues from FlyAtlas (Chintapalli ). Of the 14 tissues, 4 are sex-specific (testis and accessory gland from males, and ovary and spermatheca from females), and the remaining 10 tissues (brain, eye, thoracicoabdominal ganglion, salivary gland, crop, midgut, malpighian tubule, hindgut, heart, and fat body) are shared by both males and females (i.e. nonsex-specific). We used these data to calculate expression breadth () for each gene:
where is the number of tissues analyzed (10 for the nonsex-specific tissues and 14 for all adult tissues), is the gene expression level in tissue (measured as average signal intensity for all microarray probes assigned to that gene), and is the maximum across all tissues (Yanai ). All were set to 1 so that , as done previously (Larracuente ; Meisel 2011; Meisel ). Values for spermatheca from mated and unmated females were averaged to create a single for spermatheca (Meisel 2011). We calculated separately for all 14 unique adult tissues and for the 10 nonsex-specific tissues. We also identified the tissue where expression is highest for every gene that has and where expression was detected for at least one probe in all 4 replicate arrays in that tissue.
Results
Genes DE after infection are under-represented on the Drosophila melanogaster X chromosome
We tested if the D. melanogaster X chromosome is depauperate for genes induced (i.e. upregulated) by bacterial infection regardless of functional annotation. To those ends, we analyzed RNA-seq data in which D. melanogaster males were infected with one of 10 different bacteria vs a control (Troha ). From those infection experiments, we selected results from the 5 bacterial treatments with >50 DE genes, in order to have sufficient power to detect X-autosome differences. For 3 out of 5 bacterial infections we considered, there was a significant deficiency of induced genes on the X chromosome (Fig. 1a). For the remaining 2 bacterial infections, the observed number of induced genes on the X chromosome was less than expected, although the difference was not significant. It is unlikely to observe fewer induced X-linked genes than expected for all 5 treatments, assuming a null hypothesis of equal proportions above and below the expectation (P = 0.031 in a binomial exact test). In addition, when we considered all Gram-positive or Gram-negative bacteria (with a statistical model that has bacterial strain nested in treatment) from the experiment together, there was a significant deficiency of induced genes on the X chromosome in both cases (Fig. 1a). Moreover, genes that were upregulated by at least 1, 2, 3, or 4 different bacteria were also significantly under-represented on the X chromosome (Fig. 1b). Therefore, genes that are induced by bacterial infection are generally under-represented on the D. melanogaster X chromosome regardless of the criteria used to categorize induction. This is consistent with the deficiency of X-linked AMP genes (Hill-Burns and Clark 2009).
Fig. 1.
Genes that were DE after bacterial infection in D. melanogaster males are under-represented on the X chromosome. Bar graphs show the number of X-linked genes that were upregulated (a, b), downregulated (c, d), or DE (e, f) after infection with individual bacteria, combined Gram-negative (Gram–) bacteria, combined Gram-positive (Gram+) bacteria (a, c, e), or across multiple bacterial treatments (b, d, f). Dots connected by broken lines show the expected number of X-linked genes in each category based on the fraction of autosomal genes that are DE after infection. Black asterisks represent significant differences between observed and expected counts in a Fisher’s exact test (*P < 0.05, **P < 0.005, ***P < 0.0005).
Genes that were DE after bacterial infection in D. melanogaster males are under-represented on the X chromosome. Bar graphs show the number of X-linked genes that were upregulated (a, b), downregulated (c, d), or DE (e, f) after infection with individual bacteria, combined Gram-negative (Gram–) bacteria, combined Gram-positive (Gram+) bacteria (a, c, e), or across multiple bacterial treatments (b, d, f). Dots connected by broken lines show the expected number of X-linked genes in each category based on the fraction of autosomal genes that are DE after infection. Black asterisks represent significant differences between observed and expected counts in a Fisher’s exact test (*P < 0.05, **P < 0.005, ***P < 0.0005).There was not a significant deficiency of X-linked D. melanogaster genes downregulated after infection with any of the 5 bacterial treatments or when we considered all Gram-positive or Gram-negative bacteria (Fig. 1c). Similarly, genes that were downregulated by one or more different bacteria were not significantly under-represented on the X chromosome (Fig. 1d). However, in most cases, the number of downregulated X-linked genes was less than the expectation, albeit not significant. The failure to detect a significant deficiency of downregulated X-linked genes may have been caused by low statistical power—there were fewer downregulated genes than upregulated genes in most bacterial treatments. We examine this further by considering other experiments with more downregulated genes below.The total number of DE genes (either induced or repressed after infection) on the X chromosome was significantly less than the expectation for 4 of 5 individual bacterial treatments, Gram-positive bacteria, and Gram-negative bacteria (Fig. 1e). The X chromosome also has significantly fewer genes that were DE in at least one or more of the bacterial treatments (Fig. 1f). Therefore, both upregulated and DE genes are under-represented on the D. melanogaster X chromosome.To further evaluate if the X chromosome is anomalous, we tested if any individual autosomes had a deficiency (or excess) of induced, repressed, or DE genes following bacterial infection. Notably, the left arm of the second chromosome (2L) had an excess of induced genes in every treatment (Supplementary Figs. 1 and 2). This is surprising because none of the annotated D. melanogaster AMP genes are on chromosome 2L, and only 10/74 immune effector genes are found on 2L (Sackton ). Chromosome 2L therefore has an excess of genes induced by bacterial infection (Supplementary Figs. 1 and 2), despite having a significant deficiency of effector genes (P = 0.006 comparing effector and noneffector immune genes on chromosome 2L with the other chromosomes in Fisher’s exact test). The right arm of the third chromosome (3R), in contrast, has a deficiency of DE genes (Supplementary Figs. 1 and 2), even though it contains at least 5 AMP genes and has neither an excess nor a deficiency of effector genes (P = 0.7 comparing effectors and noneffectors between 3R and other chromosomes in Fisher’s exact test). We next ranked each chromosome arm within each treatment by the % of induced, repressed, or DE genes (excluding the diminutive chromosome 4 because it has <100 genes). On average, the X chromosome has the lowest percentage of induced, repressed, or DE genes across all treatments, and chromosome 3R has the second lowest (Supplementary Figs. 3 and 4). Therefore, the X chromosome is anomalous from each of the autosomes in its deficiency of upregulated and DE genes after bacterial infection.If the X chromosome has a maximal expression that prevents upregulation of individual genes (i.e. a dosage limit), we should observe a difference in the distribution of log2FC of treatment vs control between X-linked and autosomal genes (Meiklejohn ; Meiklejohn and Presgraves 2012). If we do not observe such a difference, it would suggest that there is not a dosage limit that selects against induced X-linked genes (Table 1). The X chromosome did not have a significantly lower log2FC than the autosomes in any of the 5 bacterial treatments (Supplementary Fig. 5a). In 2 of the 5 bacterial treatments (M. luteus and S. marcescens Db11), X-linked genes possess a higher median log2FC than autosomal genes (Supplementary Fig. 5a), which is opposite of the direction predicted by the dosage limit hypothesis. Only when we considered genes with a log2FC > 0 did the autosomes have a significantly higher log2FC than the X chromosome in a single bacterial treatment (Supplementary Fig. 5). There was not a significant difference in log2FC for any treatment when we considered genes with log2FC < 0.We further tested the dosage limit hypothesis by comparing the expression levels (TPM values) of genes, rather than the log2FC. We performed this analysis because fold-change measures a ratio of expression, which may not accurately assess if there is an absolute limit to X chromosome expression. We found that TPM values of X-linked genes were lower than autosomal genes for the control samples and all 5 bacterial treatments (Supplementary Fig. 6). This can be explained by the fact that whole male flies (including testes) were sampled for the RNA-seq experiment, and X-linked genes are expressed lower in testes than autosomal genes (Meiklejohn and Presgraves 2012). To overcome the confounding effect of lower expression of the X in testis, we separately analyzed X-linked and autosomal genes. To that end, we divided genes into those with log2FC > 0 and those with log2FC < 0, regardless of significance, and compared TPM values. Under the dosage limit hypothesis, we would expect lower TPM for X-linked genes with log2FC > 0 (i.e. upregulated genes). We did not observe this pattern; X chromosome TPM values were higher for genes with log2FC > 0 in 2 treatments, and there was not a significant difference in the remaining 3 treatments (Supplementary Fig. 7). For autosomal genes, in comparison, there was no consistent difference in TPM between genes with log2FC > 0 and those with log2FC < 0 (Supplementary Fig. 7). These results are inconsistent with the expectation under the dosage limit hypothesis. Therefore, the paucity of X-linked genes upregulated after infection cannot be explained by an overall reduced dose of X chromosome gene expression.We next tested if genes that are induced or repressed in female D. melanogaster following infection are also under-represented on the X chromosome. To those ends, we identified DE genes in D. melanogaster males and females 8 h after infection with P. rettgeri (Duneau ), one of the bacteria that induced a deficiency of X-linked genes in males (Fig. 1a). We report results for multiple log2FC cutoffs because raw data or P-values are not available for this experiment. Surprisingly, we observed a significant deficiency of X-linked genes upregulated in males at only one of the 8 log2FC cutoffs we considered (Fig. 2a). In contrast, at 5 of the 8 log2FC cutoffs, there was a significant deficiency of X-linked genes upregulated in females after infection (Fig. 2a). Similarly, at 5 of 8 log2FC cutoffs there was a deficiency of X-linked genes downregulated in females (Fig. 2b). Considering genes that were DE after infection, regardless of up- or downregulation, there was a significant deficiency on the X chromosome at 7 and 4 log-fold-change cutoffs in females and males, respectively (Fig. 2c). Therefore, there is a significant deficiency of X-linked DE genes after infection in both male and female D. melanogaster.
Fig. 2.
Genes that were DE after bacterial infection in D. melanogaster are under-represented on the X chromosome. Bar graphs show the number of X-linked genes that were upregulated (a), downregulated (b), or DE (c) after infection with P. rettgeri in either males (top) or females (bottom). Dots connected by broken lines show the expected number of X-linked genes in each category based on the fraction of autosomal genes that are DE after infection. Black asterisks represent significant differences between observed and expected counts in a Fisher’s exact test (*P < 0.05).
Genes that were DE after bacterial infection in D. melanogaster are under-represented on the X chromosome. Bar graphs show the number of X-linked genes that were upregulated (a), downregulated (b), or DE (c) after infection with P. rettgeri in either males (top) or females (bottom). Dots connected by broken lines show the expected number of X-linked genes in each category based on the fraction of autosomal genes that are DE after infection. Black asterisks represent significant differences between observed and expected counts in a Fisher’s exact test (*P < 0.05).We also tested if there is a paucity of X-linked genes induced at different time points following immune challenge by analyzing RNA-seq data from 1 to 120 h after D. melanogaster males were injected with E. coli-derived crude lipopolysaccharide (Schlamp ). There was a significant deficiency of X-linked genes upregulated at 16 of 19 time points (Fig. 3a). Similarly, at 9 of 19 timepoints, there was a significant deficiency of X-linked genes downregulated (Fig. 3b). Furthermore, there was a significant deficiency of X-linked DE genes (regardless of up- or downregulation) at all time points (Fig. 3c). Therefore, both up- and downregulated genes are under-represented on the X chromosome across the full temporal spectrum during the response to infection.
Fig. 3.
Genes that were DE at different time points after immune challenge in D. melanogaster males are under-represented on the X chromosome. Bar graphs show the number of X linked genes that were upregulated (a), downregulated (b), or DE (c) after D. melanogaster males were injected with E. coli-derived crude lipopolysaccharide. Dots connected by broken lines show the expected number of X-linked genes in each category based on the fraction of autosomal genes that are DE after infection. Black asterisks represent significant differences between observed and expected counts in a Fisher’s exact test (*P < 0.05).
Genes that were DE at different time points after immune challenge in D. melanogaster males are under-represented on the X chromosome. Bar graphs show the number of X linked genes that were upregulated (a), downregulated (b), or DE (c) after D. melanogaster males were injected with E. coli-derived crude lipopolysaccharide. Dots connected by broken lines show the expected number of X-linked genes in each category based on the fraction of autosomal genes that are DE after infection. Black asterisks represent significant differences between observed and expected counts in a Fisher’s exact test (*P < 0.05).These results show that there is a paucity of DE genes on the X chromosome after infection across a wide range of bacterial pathogens, in both sexes, and across a dense sampling of timepoints. The paucity of X-linked DE genes can be attributed to a deficiency of both up- and downregulated genes. This suggests the deficiency of X-linked DE genes is robust to experimental variation, and also that gene dosage in males cannot fully explain the pattern.
Genes DE under viral or abiotic stress are usually under-represented on the Drosophila melanogaster X chromosome
We next tested if genes induced by viral infection are under-represented on the D. melanogaster X chromosome. We found fewer X-linked genes than expected were induced by Zika or Kallithea virus, albeit at insignificant differences (Fig. 4a). Neither viral infection resulted in a significant deviation from the expected number of X-linked downregulated genes either (Fig. 4b). The Kallithea virus data were collected from both males and females, and genes that were upregulated in females after Kallithea infection were significantly under-represented on the X chromosome (Fig. 4a). There was also a deficiency of X-linked downregulated genes following Kallithea virus infection in males (Fig. 4b). In addition, genes that were DE after Zika virus infection were under-represented on the X chromosome (Fig. 4c).
Fig. 4.
Genes that are DE after viral or abiotic treatments can be under- or over-represented on the X chromosome. Bar graphs show the number of X-linked genes that were up regulated (a), downregulated (b), or DE (c) after D. melanogaster were subjected to a viral or abiotic treatment. The treatments are listed on the x-axis, and bars are colored based on the sex of the flies used in the experiment (males are light blue, females are pink, both sexes are purple, and unkown sex is gray). Dots connected by broken lines show the expected number of X-linked genes in each category based on the fraction of autosomal genes that are DE after treatment. Black asterisks represent significant differences between observed and expected counts in a Fisher’s exact test (*P < 0.05; *****P < 0.000005).
Genes that are DE after viral or abiotic treatments can be under- or over-represented on the X chromosome. Bar graphs show the number of X-linked genes that were up regulated (a), downregulated (b), or DE (c) after D. melanogaster were subjected to a viral or abiotic treatment. The treatments are listed on the x-axis, and bars are colored based on the sex of the flies used in the experiment (males are light blue, females are pink, both sexes are purple, and unkown sex is gray). Dots connected by broken lines show the expected number of X-linked genes in each category based on the fraction of autosomal genes that are DE after treatment. Black asterisks represent significant differences between observed and expected counts in a Fisher’s exact test (*P < 0.05; *****P < 0.000005).We also tested if genes induced by 4 different abiotic treatments are under-represented on the X chromosome. These data include copper treatment for genotypes that are sensitive to copper and those that are resistant, which we analyzed separately. The data also included starvation treatments in which gene expression was measured in whole flies, adult brains after complete starvation, and adult brains after sugar starvation. Both radiation and sugar starvation resulted in a significant deficiency of X-linked induced genes (Fig. 4a). Complete starvation (with expression measured in the brain) was the only abiotic treatment that resulted in a significant deficiency of X-linked downregulated genes (Fig. 4b). There was also a deficiency of X-linked DE genes after both radiation and starvation (Fig. 4c). In contrast to all other biotic and abiotic treatments, there was an excess of X-linked genes upregulated after complete starvation and cocaine treatment (Fig. 4a). There was also an excess of X-linked downregulated and DE genes after exposure to cocaine, regardless of the sex of the flies (Fig. 4). Both the complete starvation and cocaine treatments measured gene expression in the brain, suggesting that the brain may be an outlier with an excess, rather than a deficiency, of X-linked upregulated genes (or DE genes in general) following abiotic stress.We further tested if any individual autosomes had a deficiency (or excess) of DE genes following viral infection or abiotic treatments. None of the autosomal chromosome arms had a consistent excess or deficiency of induced, repressed, or DE genes after viral or abiotic treatment (Supplementary Fig. 8). When we ranked each chromosome arm within each treatment by the % of induced, repressed, or DE genes, the X chromosome had the lowest percentage of induced, repressed, or DE genes when averaged across all treatments (Supplementary Fig. 9). This provides additional evidence that the X chromosome is an outlier with a deficiency of DE genes.In summary, out of 6 total viral and abiotic treatments, the observed number of X-linked upregulated genes was less than expected in most treatments, and significantly so in 3 treatments (Fig. 4a). Downregulated and DE genes were also under-represented on the X chromosome in some treatments (Fig. 4, b and c). In addition, we observed similar patterns regardless of sex or genotype of the flies used in the experiments, although sex did affect whether differences were statistically significant (Fig. 4). The notable exceptions to this pattern are the effects of starvation or cocaine on gene expression in the brain, which were the only treatments (biotic or abiotic) that resulted in a significant excess of X-linked genes that were upregulated (Fig. 4).
DE genes are less likely to be bound by the DCC
We evaluated the hypothesis that the DCC prevents the induction of X-linked genes by testing if there is a relationship between induction and DCC binding. In total, a smaller fraction of DCC-bound genes were upregulated than X-linked unbound genes for nearly all bacterial, viral, and abiotic treatments (Fig. 5a), if we do not consider whether the difference is significant. Within individual treatments, DCC-bound genes were significantly less likely to be upregulated, relative to X-linked unbound genes, following P. entomophila infection, radiation treatment, starvation (in brain), copper exposure (for resistant flies), and cocaine feeding (Supplementary Fig. 10a). The negative effect of DCC binding on upregulation following cocaine was observed for both male and female flies. These results are in accordance with what would be expected if the DCC prevents induction of X-linked genes.
Fig. 5.
Genes that are DE after treatment are less likely to be bound by the dosage compensation complex (DCC). The % of genes that are DE given that they are autosomal (white), X-linked and bound by the DCC (black), or X-linked and unbound (gray) is plotted for each treatment. Data are shown for genes upregulated after a treatment (a), genes downregulated after a treatment (b), and all DE genes (c). Red asterisks show significant differences between the number of genes that are DE and not DE when comparing autosomal vs DCC bound or unbound X-linked genes (*P < 0.05 in Fisher’s exact test).
Genes that are DE after treatment are less likely to be bound by the dosage compensation complex (DCC). The % of genes that are DE given that they are autosomal (white), X-linked and bound by the DCC (black), or X-linked and unbound (gray) is plotted for each treatment. Data are shown for genes upregulated after a treatment (a), genes downregulated after a treatment (b), and all DE genes (c). Red asterisks show significant differences between the number of genes that are DE and not DE when comparing autosomal vs DCC bound or unbound X-linked genes (*P < 0.05 in Fisher’s exact test).We next tested if DCC binding could explain the paucity of upregulated genes on the X chromosome. To those ends, we determined if there is a difference in the proportion of upregulated genes when we compare the autosomes with either DCC-bound or unbound genes on the X chromosome. If DCC binding explains the paucity of upregulated genes, we expect a “V-shaped” pattern when we plot the %DE genes amongst autosomes, DCC-bound X-linked genes, and unbound X-linked genes (Fig. 5). There was a significant deficiency of DCC-bound upregulated genes relative to the autosomes across 7 different treatments (Serratia marcescens Db11, P. rettgeri, Pseudomonas entomophila, Staphylococcus aureus, radiation, copper, and cocaine), among genes upregulated in at least one bacterial treatment, and for genes upregulated by Gram-positive bacteria (Fig. 5a). This deficiency is the bottom of the V-shape. In contrast, there was only one treatment (Kallithea virus in females) in which there was a significant deficiency of upregulated X-linked genes unbound by the DCC relative to autosomal genes (Fig. 5a). Most other treatments had the V-shape, with no significant differences in the fraction of DE genes between autosomal and unbound X-linked genes. In addition, unbound X-linked genes were significantly more likely to be upregulated by cocaine than autosomal genes (Fig. 5a). Therefore, the evidence for a paucity of X-linked upregulated genes is much greater for DCC-bound than unbound genes (creating the V-shape in Fig. 5a), which is consistent with the expectation if the DCC prevents the induction of X-linked genes.We also observed that X-linked genes bound by the DCC were less likely to be downregulated after treatment than X-linked unbound genes (Fig. 5b). In 3 different viral or abiotic treatments (Kallithea virus, radiation, and copper), X-linked downregulated genes were significantly less likely to be DCC-bound than unbound (Supplementary Fig. 10b). The same results were observed for Kallithea virus when we considered female samples only, and similar trends were observed for males (although not significant because of small sample sizes of downregulated genes). Therefore, the DCC appears to interfere with both up- and downregulation of gene expression. The cumulative effect of the DCC on both up- and downregulation can be seen in the significant deficiency of DCC-bound DE genes for 5 unique treatments (Supplementary Fig. 10c).DCC-bound genes were also significantly less likely to be downregulated than autosomal genes in 4 different treatments—Kallithea virus, radiation, starvation, and copper (Fig. 5b). Similarly, DCC-bound genes were less likely to be DE than autosomal genes after most treatments (Fig. 5c). In contrast, X-linked unbound genes were significantly more likely to be downregulated or DE than autosomal genes after Kallithea virus infection, copper treatment, or cocaine (Fig. 5). Therefore, downregulated and DE genes also tend to have the V-shaped distribution. These results are all consistent with the expectations if the DCC prevents downregulation of X-linked genes.Our results provide consistent evidence that DCC binding can largely explain the paucity of X-linked upregulated, downregulated, and DE genes. Notably, we observed much stronger evidence for a deficiency of X-linked DE genes when we considered DCC-bound genes, and only weak (or no) evidence for unbound genes (the V-shapes in Fig. 5). In addition, genes bound by the DCC have a reduced magnitude of log2FC than X-linked unbound genes, regardless of whether the genes are significantly up- or downregulated, across all treatments (Supplementary Figs. 11–13). These results are consistent with the hypothesis that the DCC is a variance dampener that prevents both up- and downregulation of X-linked genes. However, there is a deficiency of upregulated X-linked unbound genes relative to autosomal genes in some treatments (Fig. 5a), suggesting that DCC binding alone cannot completely explain the paucity of upregulated genes on the X chromosome. Therefore, other factors, such as a dosage limit, may also be necessary to explain the exclusion of upregulated genes from the Drosophila X chromosome.
DE genes are further from DCC HAS
A complementary way to assess the effect of the DCC on differential gene expression is to measure the distance to the nearest DCC HAS for each gene. We cannot test for differences in distance to HAS between DE genes and non-DE genes because there are too few X-linked DE genes for statistical testing. Instead, we calculated the correlation between distance to the nearest HAS and log2FC for all genes, regardless of whether they are significantly DE.First, we considered |log2FC| as a measure of the extent of differential expression, regardless of up- or downregulation. In nearly all combinations of treatments, HAS datasets, and sexes, there was a positive correlation between |log2FC| and distance from an HAS (Fig. 6a). Therefore, X-linked genes further from an HAS were more DE following bacterial infection, viral infection, or abiotic treatment. This is consistent with the hypothesis that the DCC inhibits differential expression (i.e. both up- and downregulation).
Fig. 6.
Correlations between distance from a dosage compensation complex HAS and the log2 fold-change in expression between treatment and control (log2FC). Each dot is the rank order correlation (ρ) between distance to the nearest HAS and log2FC. Error bars show the 95% confidence interval determined by bootstrap resampling the data 1,000 times. The X-axis shows the specific treatment. Dots and error bars are colored based on the sex of the flies used in the experiment (see legend). HAS were obtained from 2 different datasets (Alekseyenko et al. 2008; Straub et al. 2008), with results from the 2 different datasets shown separately in the 2 columns. Correlations are plotted with |log2FC| values for all genes (a), only genes with log2FC > 0 (b), and only genes with log2FC < 0 (c).
Correlations between distance from a dosage compensation complex HAS and the log2 fold-change in expression between treatment and control (log2FC). Each dot is the rank order correlation (ρ) between distance to the nearest HAS and log2FC. Error bars show the 95% confidence interval determined by bootstrap resampling the data 1,000 times. The X-axis shows the specific treatment. Dots and error bars are colored based on the sex of the flies used in the experiment (see legend). HAS were obtained from 2 different datasets (Alekseyenko et al. 2008; Straub et al. 2008), with results from the 2 different datasets shown separately in the 2 columns. Correlations are plotted with |log2FC| values for all genes (a), only genes with log2FC > 0 (b), and only genes with log2FC < 0 (c).To specifically evaluate if proximity to an HAS affects upregulation, downregulation, or both, we separately considered genes with log2FC > 0 and log2FC < 0 (regardless of whether the deviation from 0 is significant). When we considered only genes with log2FC > 0, there was evidence for a positive correlation between log2FC and distance from an HAS for most treatments (Fig. 6b). In comparison, when we considered genes with log2FC < 0, there was a negative correlation between log2FC and distance from an HAS for most treatments (Fig. 6c). Both of these correlations indicate that genes further from an HAS were more likely to be either up- or downregulated after bacterial infection, viral infection, or abiotic treatment. These results are consistent with the hypothesis that the DCC inhibits both up- and downregulation of gene expression.
Differential expression, dosage compensation, and expression breadth
We next considered if expression breadth could explain the correlations between log2FC and distance from an HAS. This analysis was motivated by the previously described observation that genes bound by the DCC or closer to an HAS are narrowly expressed in fewer tissues than X-linked unbound genes and those further from an HAS (Meisel ). We quantified expression breadth using , which ranges from 0 (for genes expressed in many tissues) to 1 (for genes highly expressed in a single tissue) (Yanai ). We found that there is a positive correlation between the magnitude of log2FC and for all of our treatments, regardless of the chromosomal locations of genes (Supplementary Figs. 14 and 15). This demonstrates that the more DE a gene, the more narrowly it is expressed. Unsurprisingly, genes induced by bacterial infection were narrowly expressed in the fat body, which is a primary organ of the humoral immune response (Lemaitre and Hoffmann 2007). The pairwise correlations between log2FC, , and distance from an HAS suggest that relationships between DCC binding and differential expression could be confounded by correlations with expression breadth.To address confounding effects in the pairwise correlations, we calculated partial correlations (Schäfer and Strimmer 2005) between log2FC, distance from an HAS, and expression breadth (). We confirmed the positive correlation between and distance from an HAS, even when log2FC is included in the analysis (Supplementary Figs. 16–119). We also confirmed the positive correlation between the magnitude of log2FC and (Supplementary Figs. 16–19). This suggests that there is a relationship between expression breadth and differential expression that is independent of the DCC.When we considered the correlation with , many partial correlations between log2FC and distance from an HAS were no longer significantly different from zero (Fig. 7; Supplementary Fig. 20). This is true regardless of whether sex-specific reproductive tissues are included in the calculation of . Therefore, the correlations between log2FC and distance from an HAS could often be explained by the correlations between and both log2FC and distance from an HAS.
Fig. 7.
Partial correlations between log2 fold-change in expression between treatment and control (log2FC) and distance from a dosage compensation complex HAS. Partial correlations were calculated based on rank order correlations between log2FC, distance from an HAS, and tissue expression breadth (ρ). Each dot shows the partial correlation between log2FC and distance from an HAS, with the error bars representing 95% confidence intervals from 1,000 bootstrap replicates of the data. The X-axis shows the specific treatment. Dots and error bars are colored based on the sex of the flies used in the experiment (see legend). HAS were obtained from the Alekseyenko et al. (2008) dataset; results from the Straub data are shown in Supplementary material. Expression breadth was calculated using microarray data from either 14 unique adult tissues (left) or 10 adult tissues that are not sex-specific (right). Partial correlations are plotted with log2FC values for all genes (a), |log2FC| values for all genes (b), only genes with log2FC > 0 (c), and only genes with log2FC < 0 (d).
Partial correlations between log2 fold-change in expression between treatment and control (log2FC) and distance from a dosage compensation complex HAS. Partial correlations were calculated based on rank order correlations between log2FC, distance from an HAS, and tissue expression breadth (ρ). Each dot shows the partial correlation between log2FC and distance from an HAS, with the error bars representing 95% confidence intervals from 1,000 bootstrap replicates of the data. The X-axis shows the specific treatment. Dots and error bars are colored based on the sex of the flies used in the experiment (see legend). HAS were obtained from the Alekseyenko et al. (2008) dataset; results from the Straub data are shown in Supplementary material. Expression breadth was calculated using microarray data from either 14 unique adult tissues (left) or 10 adult tissues that are not sex-specific (right). Partial correlations are plotted with log2FC values for all genes (a), |log2FC| values for all genes (b), only genes with log2FC > 0 (c), and only genes with log2FC < 0 (d).Nonetheless, when we considered the effect of expression breadth, some partial correlations between log2FC and distance from an HAS still remained significantly different from 0 (Fig. 7). These significant correlations were almost always in a direction consistent with genes further from an HAS being more upregulated or more downregulated (i.e. a positive correlation for |log2FC| or log2FC > 0, or a negative correlation for log2FC < 0). The one exception to this rule was a positive partial correlation between log2FC and distance from an HAS for genes with log2FC < 0 after copper treatment in resistant flies (Fig. 7d). This negative correlation is suggestive that genes closer to an HAS are more downregulated following copper treatment. However, there were very few genes downregulated after copper treatment in resistant flies (Figs. 4 and 5), suggesting that this positive correlation may be an artifact of a small range of negative log2FC values.
Discussion
We analyzed RNA-seq data from D. melanogaster subjected to bacterial, viral, or abiotic treatments. We found that DE genes—both up- and downregulated—were often depauperate on the X chromosome regardless of the sex of the flies or the type of treatment (Figs. 1–4). We additionally determined that DE genes were less likely to be bound by the DCC, and that DCC binding can largely explain the deficiency of X-linked DE genes (Fig. 5). In addition, genes that are further from an HAS were more DE, regardless of whether the genes were up- or downregulated (Fig. 6). However, much of the relationship between differential expression and distance from an HAS could be explained by both variables being correlated with tissue-specific gene expression (Supplementary Figs. 16–19). Nonetheless, a significant correlation between log2FC and distance from an HAS remained for some treatments after controlling for expression breadth across tissues (Fig. 7), suggesting that the DCC dampens X-linked gene expression variance across treatments.
Chromosome 2L has a complementary gene content to the X chromosome
An excess of chromosome 2L genes were upregulated in many conditions, in contrast to the deficiency of X-linked DE genes. We found an excess of chromosome 2L genes induced in every bacterial treatment (Supplementary Figs. 1–4), even though no AMP genes (and only 10 of 74 immune effectors) are on chromosome 2L (Sackton ). An excess of chromosome 2L genes were also induced after copper treatment (Supplementary Fig. 8). Previous analyses revealed that chromosome 2L (also known as Muller element B) has an excess of genes with male-biased expression in multiple Drosophila species, in contrast to the paucity of X-linked genes upregulated in males (Parisi ; Meisel ). In addition, chromosome 2L has an excess of genes encoding accessory gland proteins, while the X chromosome is depauperate (Ravi Ram and Wolfner 2007). The expression level in accessory gland is also higher for chromosome 2L genes, and lower for X-linked genes, than other chromosomes (Meisel ). Moreover, chromosome 2L genes have narrower expression in specific tissues than genes on other chromosomes, in contrast to X-linked genes that tend to be more broadly expressed (Meisel ). Future work could address why chromosome 2L has a complementary pattern to the X chromosome.
Minimal support for the dosage limit hypothesis
Our analysis allowed us to test the dosage limit hypothesis, which predicts that highly expressed genes (i.e. those that are upregulated in specific contexts) are under-represented on the X chromosome because of its haploid dose in males (Vicoso and Charlesworth 2009; Meisel ; Hurst ). We determined that D. melanogaster genes that are either up- or downregulated following biotic or abiotic treatments are frequently under-represented on the X chromosome (Figs. 1–4). The paucity of downregulated X-linked genes is inconsistent with the dosage limit hypothesis, which only predicts that upregulated genes will be under-represented on the X chromosome (Table 1). Moreover, we frequently observed a deficiency of upregulated X-linked genes in females, but not males, when data from both sexes were available (Figs. 2 and 4). A female-specific paucity of X-linked upregulated genes is also not predicted by the dosage limit hypothesis. Furthermore, there was an excess of X-linked upregulated genes following cocaine treatment in both males and females (Fig. 4), which is also inconsistent with the dosage limit hypothesis. Our results thus provide strong evidence that the dosage limit hypothesis cannot completely explain the unique gene content of the Drosophila X chromosome.The dosage limit hypothesis may be required to explain some aspects of the unique gene content of the Drosophila X chromosome. For example, we cannot explain the paucity of X-linked upregulated genes based on proximity to an HAS alone (Fig. 5). In addition, dosage limits may be especially important in testis, where the haploid X chromosome does not appear to be compensated (Meiklejohn ). Specifically, the reduced dosage of the X chromosome could explain the biased duplication of genes from the X to the autosomes, with the autosomal derived paralogs expressed primarily in testis in order to compensate for under-expression of the X (Betrán ; Meisel , 2010; Vibranovski ). In addition, the dosage limit hypothesis is also consistent with the observation that genes expressed specifically in the male accessory gland are under-represented on the X chromosome (Swanson ; Ravi Ram and Wolfner 2007; Meisel ). Therefore, both dosage limits and DCC binding may act in concert to exclude upregulated genes from the Drosophila X chromosome.
Evidence for the DCC as a variance dampener that inhibits X-linked DE genes
Our results add evidence in support of the hypothesis that the DCC is a variance dampener (Lee ), which prevents both up- and downregulation of gene expression on the Drosophila X chromosome. First, there is a significant deficiency of downregulated genes on the X chromosome, in addition to a paucity of X-linked upregulated genes, in a variety of treatments (Figs. 2–4). Second, the X-linked genes that are up- or downregulated tend not to be bound by the DCC, and DCC-binding largely explains the X-autosome differences in upregulated, downregulated, and DE genes (Fig. 5). Lastly, the magnitude of differential expression is correlated with distance from an HAS for both up- and downregulated genes (Figs. 6 and 7). Because we observe similar results for both up- and downregulated genes, we hypothesize that the DCC is a variance dampener that inhibits both induction and repression of gene expression (Table 1).We further hypothesize that the variance dampening effect of the DCC can explain other aspects of X chromosome gene content. For example, there is a paucity of genes with tissue-specific expression on the Drosophila X chromosome (Mikhaylova and Nurminsky 2011; Meisel ). Tissue-specific expression can be thought of as a form of context-dependence in which the context is developmental, rather than environmental. X-linked genes with tissue-specific expression are less likely to be DCC-bound than broadly expressed X-linked genes (Meisel ), suggesting that the variance dampening effect of the DCC may prevent tissue-specific induction, or repression, of gene expression. In contrast to that predicted effect, when Argyridou inserted reporter constructs with tissue-specific expression randomly throughout the D. melanogaster genome, X-linked insertions did not differ in the amount of protein produced relative to autosomal insertions. However, they did not consider whether insertions were DCC bound or near an HAS, and they measured protein expression instead of transcripts. Additional experimentation is therefore needed to evaluate if and how the DCC dampens expression variance across tissues.An alternative explanation for the relationship between DCC binding and differential expression is that genes expressed in specific contexts may not require a global mechanism of male upregulation. Genes with tissue-specific expression and genes that are DE in other contexts (i.e. biotic or abiotic treatments) may therefore not be under selection to acquire DCC binding sites. This alternative hypothesis could explain the relationships between DCC binding and both tissue-specific expression and differential expression for X-linked genes. However, this hypothesis cannot explain the paucity of X-linked genes with tissue-specific expression (Mikhaylova and Nurminsky 2011; Meisel ), nor can it explain the deficiency of X-linked DE genes that we observed. We thus conclude that the DCC, or some other factor, is required to explain the exclusion of genes with context-specific expression from the Drosophila X chromosome.The general exclusion of genes with context-specific expression from the X chromosome by the DCC could explain the variation in partial correlations between log2FC, expression breadth, and distance from an HAS across treatments (Supplementary Figs. 16–19). We hypothesize that the DCC dampens gene expression variance, which should prevent both tissue-specific regulation and induction/repression in specific treatments. We further found that log2FC and expression breadth are correlated, such that genes with tissue-specific expression were more likely to be DE following either biotic or abiotic treatments (Supplementary Figs. 14 and 15). If the DCC prevents both tissue-specific expression and induction/repression, the correlation of distance from an HAS with both expression breadth and log2FC could therefore be confounded by their correlation with each other. These confounding correlations could explain why we only observe significant partial correlations between log2FC and distance from an HAS for some treatments, yet we always observe significant correlations between expression breadth and distance from an HAS (Supplementary Figs. 16–19).The variance dampening hypothesis could also explain the differences in gene content between mammalian and Drosophila X chromosomes. The mammalian X has an excess of genes with tissue-specific expression (Lercher ; Meisel ), in contrast to the deficiency of genes with context-dependent expression on the D. melanogaster X chromosome. A dosage limit hypothesis has been proposed to explain the excess of narrowly expressed genes on the mammalian X chromosome because broadly expressed genes have a higher maximal expression (Hurst ). Evidence for large-scale upregulation of the mammalian X is much weaker than the evidence that the Drosophila DCC upregulates X-linked expression (Xiong ; Lin ; Pessia ; Deng ; Mank 2013; Gu and Walters 2017; Deng and Disteche 2019). Therefore, the taxon-specific peculiarities of dosage compensation could explain the differences in X chromosome gene content between Drosophila and mammals. Specifically, only in Drosophila is there evidence for a variance dampener that targets X-linked genes, whereas dosage limits may be more pronounced in mammals.
Mechanistic and evolutionary explanations for the effects of the DCC on X chromosome gene content
There are multiple explanations for how the variance dampening effect of the DCC could lead to a deficiency of genes with context-specific expression on the X chromosome. First, in what we will call the mechanistic explanation, the DCC itself may affect gene expression in the experiments from which we obtained data. While this mechanism is feasible for gene expression in males, it is less obvious how it could explain the deficiency of X-linked DE genes in females (Figs. 2 and 4) because the DCC does not assemble in females (Lucchesi and Kuroda 2015). The lack of a complete DCC in females occurs because the Male-specific lethal 2 (Msl-2) protein is only expressed in males (Kelley ). However, Mof, the component of the DCC responsible for H4K16ac, is expressed in both sexes, and there is evidence that it serves as a variance dampener in both males and females (Lee ). In addition, the Drosophila X chromosome has a distinct chromatin environment from the autosomes in females, including histone marks associated with dosage compensation (Zhang and Oliver 2010). These chromatin modifications, possibly mediated by Mof, may serve to dampen variance in X-linked gene expression in both sexes, and therefore reduce differential expression on the X chromosome. Furthermore, there is a complex interplay between heterochromatin and dosage compensation in Drosophila (Makki and Meller 2021), which could also contribute to a variance dampening effect in both sexes.Alternatively, in what we refer to as evolutionary explanations, we may have observed an evolved difference between the X and autosomes in the experimental data we analyzed. These evolved differences could be a result of selection against genes with context-dependent gene expression on the X chromosome as a result of the variance dampening effect of the DCC, or perhaps only Mof. This explanation is attractive because it allows for selection in males only (where the DCC assembles) to shape the gene expression profile of the X chromosome in both sexes, where we observe the paucity of X-linked genes with context-dependent expression. This is analogous to how a shared genome prevents the evolution of sexual dimorphism because of intersexual phenotypic correlations (Rowe ). Consistent with the evolutionary explanation, DCC bound genes have slower evolving gene expression levels than X-linked unbound genes (Meisel ), suggesting that the DCC may inhibit the evolution of gene expression. Similarly, X-linked genes whose expression levels changed more during a laboratory evolution experiment were further from an HAS (Abbott ). This evolutionary explanation assumes that up- or downregulation of gene expression after biotic or abiotic stress is an adaptive response that mitigates the negative effects of stress (e.g. induction of AMPs and other effectors after bacterial infection).An alternative evolutionary explanation arises if we consider that differential gene expression could reflect a deleterious effect of the stress itself. Changes in gene expression across conditions can result via nonadaptive, or “passive,” responses to the environment, which are not necessarily aligned with the adaptive response (van Kleunen and Fischer 2005). These passive gene expression changes could instead be maladaptive dysregulation (i.e. it would be beneficial to the organism for there to be no change to gene expression). Under this alternative evolutionary explanation, selection would favor the X-linkage of genes that are vulnerable to dysregulation—where the DCC or Mof could buffer against expression variance—which could explain the reduced expression variance of the X chromosome in both males and females. It is worth noting, however, that the mechanistic and evolutionary explanations are not mutually exclusive, and additional work is necessary to evaluate how well each can explain the effect of the DCC, or Mof in particular, on the unique gene content of the Drosophila X chromosome.
Exceptions to the rules are also evidence for the DCC as a variance dampener
We observed exceptions to the general relationships between DE genes, X-linkage, and the DCC within specific treatments. For example, there was an excess of DE genes on the X chromosome after cocaine and starvation treatments (Fig. 4). In addition, some of the correlations between log2FC and distance from an HAS were in opposite directions depending on the treatment (Figs. 6 and 7). We discuss these exceptions below, explain how they are consistent with the variance dampener hypothesis despite the atypical patterns, and describe how they can help us discriminate between the mechanistic and evolutionary explanations for the reduced DE on the X chromosome.
Excess of X-linked DE genes in brain
In both of the brain samples analyzed (starvation and cocaine treatments), we observed an excess of X-linked DE genes (Fig. 4). The cocaine treatment resulted in an excess of both up- and downregulated genes, whereas the starvation treatment only caused an excess of upregulated genes. There is a similar enrichment of X-linked genes with male-biased expression in D. melanogaster brain (Huylmans and Parsch 2015). Multiple genes that encode DCC proteins, including msl-2 and mle, are highly expressed in brain (Chintapalli ; Straub ; Vensko and Stone 2014; Huylmans and Parsch 2015), as are the noncoding RNAs that assemble with the DCC (Amrein and Axel 1997; Meller ). If the DCC inhibits context-dependent differential expression, we may expect fewer X-linked DE genes in the brain because of the high expression of the DCC components. It is therefore surprising that we observe the opposite pattern in the brain.Despite the excess of X-linked upregulated genes in the brain after either starvation or cocaine, we observed that DE genes in both treatments are less likely to be bound by the DCC, consistent with most other treatments (Fig. 5). Moreover, X-linked DE genes after cocaine treatment were further from an HAS, similar to other treatments (Fig. 7). These observations are consistent with the hypothesis that the DCC is a variance dampener, but they do not explain why the unbound X-linked genes are more likely to be DE in the brain than in other tissues.One explanation for an excess of X-linked DE genes in the brain is the heterogeneity of dosage compensation across the X chromosome in brain cells. Belyi observed greater position effects on brain gene expression than expression in nonbrain head tissues for transgenes inserted on the X chromosome. This suggests that the X chromosome is more of a patchwork of DCC-bound and unbound regions in the brain than in other tissues. The unbound regions may allow for context-dependent transcriptional regulation of X-linked genes in the brain more than in other tissues where DCC-binding is more uniform.
Excess of X-linked DE genes after cocaine treatment
The excess of X-linked DE genes after cocaine treatment points to a possible mechanism to explain the context-dependent transcriptional regulation of X-linked genes unbound by the DCC. Cocaine affects chromatin in neural cells by reversing trimethylation of histone H3 at lysine 9 (H3K9me3), a repressive chromatin mark, leading to de-repression of genes and repetitive elements that would normally be silenced (Kumar ; Renthal ; Covington ; Maze ). We observed that the majority of X-linked genes that were upregulated following cocaine treatment in D. melanogaster brain were not bound by the DCC (Fig. 5a), and previous work showed that unbound genes are more likely to be associated with repressive chromatin marks than DCC-bound genes (Meisel ). Therefore, the X-linked DE genes after cocaine treatment are probably located in chromatin that was unbound by the DCC, and cocaine induced a conversion from repressive to transcriptionally active chromatin. Importantly, this interpretation is consistent with the hypothesis that the DCC is a variance dampener and X-linked genes unbound by the DCC are more likely to have context-dependent expression. Future work should evaluate if chromatin state was in fact altered in X-linked unbound genes in neuronal cells following cocaine treatment.
Genotypic differences help discriminate between mechanistic and evolutionary explanations
We observed different patterns following copper treatment depending on whether the flies were sensitive or resistant. Many more X-linked genes were downregulated in sensitive flies, and this was accompanied by an equivalent enrichment of downregulated autosomal genes (Fig. 4b). The downregulated X-linked genes were extremely biased toward those unbound by the DCC (Fig. 5b), consistent with our hypothesis that the DCC is a variance dampener. However, there was a difference between sensitive and resistant flies in the correlation of log2FC and distance from an HAS. In sensitive flies, the correlation was positive when we considered |log2FC| or genes with log2FC > 0 (Fig. 7, a and b), consistent with more differential expression further from an HAS. In resistant flies, the correlation was positive for genes with log2FC < 0 (Fig. 7c), suggesting that genes further from an HAS were less downregulated. One possible explanation for this discrepancy is that, because there were so few DE genes in resistant flies, the correlation between log2FC and distance from an HAS differed from other treatments (possibly because few genes have extreme negative log2FC values). This explanation is not well supported because other treatments had even fewer DE genes (e.g. Zika virus and starvation), yet the correlations between log2FC and distance from an HAS in those other treatments were in the same directions as in the treatments with more DE genes (Figs. 6 and 7). Alternatively, resistance to copper (or any treatment) may affect the relationship between gene expression and the DCC.Regardless of the causes of the differences between copper sensitive and resistant flies, the comparison is informative for evaluating the mechanistic and evolutionary explanations for the relationship between DE genes and DCC binding. Copper resistant flies are likely to resemble genotypes that would emerge following adaptation to copper exposure. As described above, X-linked DE genes in copper resistant flies were not necessarily further from an HAS (Fig. 7). In contrast, X-linked DE genes in copper sensitive flies were further from an HAS, similar to what was observed in most other treatments (Fig. 7). Therefore, adaptation may not necessarily lead to the evolved relationship between X-linked DE genes and the DCC that we observed, which could be interpreted as support for the mechanistic explanation for X-autosome differences. However, there are clear evolved differences between the X and autosomes, such as the absence of X-linked AMP genes (Hill-Burns and Clark 2009) and deficiency of X-linked accessory gland expressed genes (Swanson ; Ravi Ram and Wolfner 2007; Meisel ), which cannot be explained by the mechanistic explanation. This leads us to conclude that both mechanistic effects of the DCC (or Mof) and evolved X-autosome differences are responsible for the paucity of X-linked DE genes. The evolved X-autosome differences are likely the result of selection against X-linked genes with context-dependent expression in response to both the variance dampening effects of the DCC/Mof and dosage limits of the haploid X chromosome in males.
Data availability
The authors affirm that all data necessary for confirming the conclusions of the article are present within the article, figures, and tables.Supplemental material is available at G3 online.Click here for additional data file.Click here for additional data file.
Authors: Liuqi Gu; Patrick F Reilly; James J Lewis; Robert D Reed; Peter Andolfatto; James R Walters Journal: Curr Biol Date: 2019-11-14 Impact factor: 10.834
Authors: Doris Bachtrog; Judith E Mank; Catherine L Peichel; Mark Kirkpatrick; Sarah P Otto; Tia-Lynn Ashman; Matthew W Hahn; Jun Kitano; Itay Mayrose; Ray Ming; Nicolas Perrin; Laura Ross; Nicole Valenzuela; Jana C Vamosi Journal: PLoS Biol Date: 2014-07-01 Impact factor: 8.029