Literature DB >> 32653899

Contributions of cis- and trans-Regulatory Evolution to Transcriptomic Divergence across Populations in the Drosophila mojavensis Larval Brain.

Kyle M Benowitz1, Joshua M Coleman1,2, Carson W Allan1, Luciano M Matzkin1,3,4.   

Abstract

Natural selection on gene expression was originally predicted to result primarily in cis- rather than trans-regulatory evolution, due to the expectation of reduced pleiotropy. Despite this, numerous studies have ascribed recent evolutionary divergence in gene expression predominantly to trans-regulation. Performing RNA-seq on single isofemale lines from genetically distinct populations of the cactophilic fly Drosophila mojavensis and their F1 hybrids, we recapitulated this pattern in both larval brains and whole bodies. However, we demonstrate that improving the measurement of brain expression divergence between populations by using seven additional genotypes considerably reduces the estimate of trans-regulatory contributions to expression evolution. We argue that the finding of trans-regulatory predominance can result from biases due to environmental variation in expression or other sources of noise, and that cis-regulation is likely a greater contributor to transcriptional evolution across D. mojavensis populations. Lastly, we merge these lines of data to identify several previously hypothesized and intriguing novel candidate genes, and suggest that the integration of regulatory and population-level transcriptomic data can provide useful filters for the identification of potentially adaptive genes.
© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Entities:  

Keywords:  RNA-seq; cactophilic; local adaptation; pleiotropy; transcriptional regulation

Mesh:

Year:  2020        PMID: 32653899      PMCID: PMC7495911          DOI: 10.1093/gbe/evaa145

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Significance

Gene expression evolution can be driven by changes in the focal gene itself (cis-regulation) or by changes in the genes that regulate it (trans-regulation). The importance of these two processes, and their contribution to adaptation specifically, remains under debate. Through a novel integration of data on genetic variation in gene expression with data on gene regulatory evolution, we found increased evidence for a primary role of cis-regulation in both total expression evolution as well as adaptive expression evolution. These results inject nuance into the discussion of how regulatory processes influence evolution within species and outline an approach for using expression data to address adaptive hypotheses.

Introduction

Statistical correlations between phenotypes impose fundamental constraints on phenotypic evolution (Lande 1979). As such, selection may disfavor the propagation of especially pleiotropic mutations whose causal effects alter many traits (Otto 2004). This idea has led to considerable speculation on the precise molecular effects of successful mutations. Vigorous debate regarding the relative importance of coding sequence and gene regulatory evolution hinged on claims regarding the respective pleiotropic consequences of these types (Hoekstra and Coyne 2007; Carroll 2008). Within the category of regulatory mutations, however, further distinctions are likely to be relevant in this context. Specifically, trans-regulatory changes, which are primarily a consequence of changes in expression and/or structure of transcription factors, are expected to affect large networks of target genes and therefore be highly pleiotropic (Gibson 1996; Wittkopp 2007; but see Lynch and Wagner [2008]). In contrast, cis-regulatory mutations, occurring in promoters or enhancers of the target genes themselves, might affect only single genes in specific contexts (Stern 2000; Prud’homme et al. 2007). In an early and thorough theoretical treatment of the subject, Wray et al. (2003) did not equivocate in hypothesizing that cis-regulatory evolution should primarily be responsible for the evolution of gene expression phenotypes. In the years since that prediction, the accumulation of evidence regarding the prevalence of cis- and trans-regulatory effects in evolution has led to a far murkier picture. This may in part reflect the methodological diversity of studies approaching the question (reviewed in Signor and Nuzhdin [2018]). Some experiments, such as chromosomal substitutions (Hughes et al. 2006; Osada et al. 2006), crosses utilizing the diversity of a reference panel (Genissel et al. 2007; Fear et al. 2016; Osada et al. 2017), and eQTL mapping studies (Massouras et al. 2012; King et al. 2014) have generally, but not always (Lemos et al. 2008; Wang et al. 2008) corroborated the hypothesis, finding greater contributions of cis-effects to intrapopulation variation. On the other hand, results from another frequently used experimental design, which we will henceforth call the F1 hybrid design, have consistently led to the opposite conclusion. The F1 hybrid design requires expression data from two parental lines and their F1 hybrids. Cis-regulatory effects are measured using the differential expression of allele-specific reads within the hybrid samples, whereas trans-regulatory effects are calculated by subtracting the cis-regulatory effect from the overall differential expression between the parental lines (Wittkopp et al. 2004). Usage of the F1 hybrid design has repeatedly found that trans-regulation dominates expression variability within species, whereas cis-regulation plays a greater role in interspecific differences (Graze et al. 2009; Wittkopp et al. 2004, 2008; McManus et al. 2010; Suvorov et al. 2013; Coolon et al. 2014; Metzger et al. 2017; Glaser-Schmitt et al. 2018). Given the power of the F1 hybrid design and its applicability to a wide range of study systems and biological contexts, closer attention to the interpretations stemming from this approach is merited. As such, recent work has begun to approach the F1 hybrid paradigm with increased nuance. Glaser-Schmitt et al. (2018) perform a tissue-specific study, filling an important gap given the focus of previous work on whole-body samples. Taking this one step further, Combs and Fraser (2018) estimate fine-scale spatial variation in allele-specific expression within embryos. From a different angle, two recent commentaries (Fraser 2019; Zhang and Emerson 2019) make salient points regarding potential biases in the estimation of trans-regulatory divergence given that it cannot be estimated independently of cis-regulatory and parental divergence using this approach, and stress the need for replication to mitigate this. Here, we build from these efforts and probe the initial findings from an across-population F1 hybrid study using two simple experiments. First, we conduct a tissue-specific study in parallel with a whole-body study, to directly estimate the effects of sample heterogeneity on the estimation of regulatory type. Second, we supplement our measures of parental divergence with further sampling of genotypes from each parental population, to gain more confidence in patterns of within and between-population variation in transcription. We apply these experiments to an investigation of gene expression evolution in larval brains across two populations of the cactophilic fly Drosophila mojavensis. This combination of organism and tissue lends itself to a strong hypothesis of predominant cis-regulatory evolution, for two reasons. First, D. mojavensis is predicted to have experienced strong differential selection pressures across populations due to variable ecological conditions. The two populations studied here, from Santa Catalina Island, CA, and the Sonoran Desert (Guaymas, Sonora, Mexico and Organ Pipe National Monument, Arizona), are genetically distinct (Reed et al. 2006) primarily utilize highly divergent cactus species, the prickly pear Opuntia littoralis and the columnar Stenocereus thurberi, respectively (Heed 1978; Ruiz et al. 1990). These host cacti form unique chemical and nutritional environments (Kircher 1982; Starmer and Phaff 1983), and detoxification genes in particular have seen substantial expression and coding sequence evolution across these populations (Matzkin et al. 2006; Allan and Matzkin 2019). In addition to selection from the host, these populations experience vastly different temperature and humidity regimes, which is expected to generate selection broadly on phenology and organismal physiology (Matzkin 2014). We choose to focus on brains here in part because we previously identified larval behavioral differences related to locomotion and pupation (Coleman et al. 2018), indicating the potential for the evolution of expression changes in the brain, as well as muscle and fat body. Second, despite this potential for selection, there are also a priori expectations that transcriptome-wide evolution should actually be reduced. Brain gene expression is highly conserved in many animals, including Drosophila (Brawand et al. 2011; Catalán et al. 2012; Uebbing et al. 2016). Additionally, gene expression in larvae is more conserved than in later developmental stages in Drosophila (Artieri and Singh 2010). The pairing of strong directional and strong stabilizing selection across genes is precisely the scenario that should result in transcriptional fine-tuning due to cis-regulatory evolution. Thus, our expectation was to uncover a greater role for cis-regulatory changes than observed in other intraspecific studies using similar experimental designs.

Materials and Methods

Sample Collection and Sequencing

For initial analysis of population divergence and analyses of allele-specific expression, we used single genome-sequenced isofemale lines of D. mojavensis from Santa Catalina Island, CA (Drosophila 12 Genomes Consortium 2007) and Guaymas, Sonora, Mexico (Allan and Matzkin 2019). These lines have been maintained as isofemale lines without direct inbreeding in the laboratory on banana-molasses media (Coleman et al. 2018) since 2002 and 1999, respectively. We generated F1 hybrids between these two lines by placing 20 virgin genome-line Catalina Island males and 20 virgin genome-line Sonora females in vials containing banana-molasses media, and performed the reciprocal cross in an identical manner. For analyses of genotypic variation in expression, we selected seven additional isofemale lines from Santa Catalina Island and seven isofemale lines from the Sonoran Desert population from Organ Pipe National Monument, AZ, which were collected between 2007 and 2009 (Coleman et al. 2018) and maintained as described earlier. We collected all samples during the third-instar wandering stage. For whole-body samples, we collected five larvae per replicate, washing each larva in deionized water before storing them on ice in trisEDTA buffer. For brain samples, we dissected ten brains per replicate in trisEDTA before storing them on ice. We then froze samples at −80 °C for storage. We collected three biological replicates for each genome line and hybrid (brain and body) and single replicates of each additional isofemale line (brain only). We ground samples in TRIzol (Thermo Fisher, Waltham, MA) and used Qiagen RNEasy columns (Qiagen, Hilden, Germany) to extract RNA, prepared libraries using Illumina TruSeq kits (Illumina, San Diego, CA), and sequenced samples as 150-bp paired-end reads on an Illumina HiSeq. Information on sample identity and sequencing can be found in supplementary table S1, Supplementary Material online.

Bioinformatic Analysis

We removed Illumina adapters and low-quality sequence using Trimmomatic (Bolger et al. 2014) and used NextGenMap (Sedlazeck et al. 2013) with default parameters to separately map all reads to both the original Catalina Island genome (Drosophila 12 Genomes Consortium 2007; FlyBase version r1.04_FB2018_06) and the same genome templated with Sonora genomic reads (Allan and Matzkin 2019). We calculated total read counts at the gene level for each sample using HTSeq-count (Anders et al. 2015), using the reads mapped to the Catalina Island genome for analysis. We then downsampled reads to 11,908,854 reads over 13,410 genes in brain samples and 14,001,634 reads over 13,628 genes in whole-body samples, to match the lowest coverage sample in each tissue. For the additional brain isofemale lines, which were more highly covered, we downsampled to 18,665,415 reads over 13,565 genes. From these gene sets, we analyzed only genes with at least ten total reads in each sample. After comparing the consistency between biological replicates within each group using Spearman’s correlation coefficients, we discarded three samples from the genome lines as outliers: one Sonora brain sample, one Sonora (f)×Catalina Island (m) hybrid body sample, and one Catalina Island (f)×Sonora (m) body sample. In the analysis of genotypic variation in the brain, we discarded an additional Sonora sample as an outlier based on the same criteria. We included only a single randomly chosen replicate from each genome-sequenced line in the analysis of genotypic variation to avoid pseudoreplication. For allele-specific counts, we used SAMtools mpileup (Li et al. 2009) and VarScan2 (Koboldt et al. 2013) to identify informative variants for allele-specific expression analysis. We first removed all SNPs where we found, in any of the parental genotype samples the allele from the other parental genome at >5% frequency (Glaser-Schmitt et al. 2018). This step helps to avoid analyzing heterozygous sites, which will lead to inaccuracy in the assignment of reads to parental genomes. We then compared the remaining SNPs from the mapping results to both reference genomes and removed sites with substantially differing allele frequencies in the two resulting data sets, following previous work (Benowitz et al. 2019). In this way, we removed sites potentially affected by mapping bias, which, although not a major problem here (supplementary table S1, Supplementary Material online), would result in overestimation of allele-specific expression of genes containing those sites. We then filtered all bam files (mapped to the Catalina Island reference) for informative reads using VariantBam (Wala et al. 2016) and output these reads as text files using sam2tsv (https://lindenb.github.io/jvarkit/Sam2Tsv.html; last accessed July 2020). We then counted allele-specific reads overlapping each informative SNP, generating gene-level counts after accounting for reads overlapping multiple variants in R 3.4 (R Core Team 2018). We ran this pipeline independently for brain and body samples. We randomly downsampled allele-specific reads in each brain sample to 4,541,638 reads, matching the reads in the lowest coverage sample. These reads covered SNPs in 7,933 genes. For the whole-body samples, which covered 8,584 genes, we downsampled all read counts to 4,939,804 total reads, to preserve the ratio of reads per gene between brains and whole bodies. In both data sets, we analyzed only genes containing at least ten total reads in each sample. The same three samples identified as outliers above were also outliers in this data set and we accordingly discarded them for allele-specific expression analysis as well.

Statistical Analysis

We calculated per-gene parental divergence from the total (not allele-specific) expression counts as log2(PCI/PSON), where PCI and PSON are either parental genome-sequenced line means (when comparing single genotypes; in brains and whole bodies) or parental population means (when comparing all genotypes; brains only). We calculated transcriptome-wide differentiation across populations as 1−ρ, where ρ is the Spearman’s correlation of expression divergences between populations. We estimated 95% confidence intervals of 1−ρ from 10,000 bootstrapped replicates. We calculated cis-regulatory divergence, following previous studies, as log2(HCI/HSON), where HCI and HSON are averages of allele-specific counts across all F1 hybrid replicates. We then calculated trans-regulatory divergence as the difference between parental divergence (between the genome-sequenced lines) and cis-regulatory divergence, log2(PCI/PSON)−log2(HCI/HSON). We independently assessed the contributions of cis- and trans-regulatory divergence to both metrics of parental divergence using Spearman’s correlation coefficient ρ. As above, we assessed 95% confidence intervals from 10,000 bootstrap replicates. We visualized correlations using least-squares regression lines and 95% confidence regions around those regressions using the R package ggplot2. We estimated differential expression between parental populations using FDR-corrected negative binomial tests using the R package NBPseq (Di et al. 2011). We also used negative binomial tests comparing allele-specific counts to assess the significance of cis-regulatory effects. To estimate the significance of trans-regulatory effects at the gene level, we used Fisher’s exact tests comparing the ratio of allele-specific expression differences to total gene expression differences in the parental samples. For evolutionary analysis of regulatory evolution in the brain, we used previously published dN/dS values across the four D. mojavensis populations (Allan and Matzkin 2019). To estimate network connectivity, we identified the closest Drosophila melanogaster ortholog for each gene and used the in-degree metric calculated in Marbach et al. (2012) and used previously in a similar analysis in Yang and Wittkopp (2017). Briefly, in-degree quantifies the number of transcription factors found to have significant regulatory interactions with each gene. We analyzed dN/dS and in-degree between genes in different regulatory categories using Mann–Whitney U test with the R function pairwise.wilcox.test, using Holm’s method to correct for multiple comparisons. For these analyses, we defined trans-regulated genes via the brain analysis using multiple parental genotypes. We performed all statistical analyses in R 3.4 (R Core Team 2018).

Results

Analysis of Single Parental Genotypes in Brains and Whole Bodies

Examining a single genotype per population, transcriptome-wide expression differentiation across populations was lower in brains (1−ρ  =  0.017; 95% CI = [0.016, 0.018]) than in bodies (1−ρ  =  0.036; 95% CI = [0.034, 0.038]), as expected. In contrast, we found evidence for considerably more significantly differentially expressed (DE) genes in brains than in bodies (table 1 and supplementary table S2, Supplementary Material online). The lack of statistical support for many DE genes in bodies despite increased overall expression differences reflects substantially greater intragenotypic variation in whole-body data (supplementary fig. S1, Supplementary Material online).
Table 1

Numbers of Differentially Expressed, Cis-Regulated, and Trans-Regulated Genes Using Both the Single-Genotype Parental Data Set and the Multigenotype Parental Data Set (brains only)

Single Parental Genotype
Multiple Parental Genotypes
Whole BodiesBrainsBrains
DE genes231530308
Cis-regulated genes20143
Trans-regulated genes1072467265

Note.—The calculation of cis-regulated genes relies only on F1 hybrids, thus there is no recalculation of the number of cis-regulated genes in the multiparent genotype data set.

Numbers of Differentially Expressed, Cis-Regulated, and Trans-Regulated Genes Using Both the Single-Genotype Parental Data Set and the Multigenotype Parental Data Set (brains only) Note.—The calculation of cis-regulated genes relies only on F1 hybrids, thus there is no recalculation of the number of cis-regulated genes in the multiparent genotype data set. We then correlated measures of divergence across all genes with measures of cis- and trans-regulatory divergence as estimated from F1 hybrids between these two lines (Coolon et al. 2014; Metzger et al. 2017). This correlation broadly estimates the contributions of each regulatory type to total expression divergence without relying on thresholds of statistical significance. We found that trans-effects were more closely associated with parental divergence than cis-effects to parental divergence in both brains (trans: ρ  =  0.577; 95% CI = [0.561, 0.591], cis: ρ  =  0.455; 95% CI = [0.438, 0.472], fig. 1) and whole bodies (trans: ρ  =  0.643; 95% CI = [0.629, 0.657], cis: ρ  =  0.338; 95% CI = [0.319, 0.356], fig. 1). Lastly, we found more individual genes displaying evidence of regulation in trans than in cis in both brains and bodies, although this trend was much more dramatic in whole bodies (table 1 and supplementary tables S3 and S4, Supplementary Material online).
. 1.

The relationship between cis- and trans-regulatory divergence and divergence between parental genotypes in (A) brains and (B) whole bodies. Bold points indicate significantly differentially expressed genes in each data set. Trend lines represent least-squares regressions surrounded by 95% confidence intervals.

The relationship between cis- and trans-regulatory divergence and divergence between parental genotypes in (A) brains and (B) whole bodies. Bold points indicate significantly differentially expressed genes in each data set. Trend lines represent least-squares regressions surrounded by 95% confidence intervals.

Analysis of Multiple Parental Genotypes in Brains

The above comparison, using only single parental genotypes, provides a limited estimate of expression evolution across populations. To more confidently assess expression evolution across populations, we analyzed brain RNA-seq data from seven additional Catalina Island genotypes and six additional Sonoran genotypes. Specifically, we expected the inclusion of multiple genotypes to reduce sampling error and result in lower expression differentiation between populations. Indeed, parental divergence across populations was lower in this data set (1−ρ  =  0.005; 95% CI = [0.004, 0.005]) and the number of significantly DE genes was reduced (table 1 and supplementary table S2, Supplementary Material online). We then examined correlations between parental divergence using multiple genotypes with the identical cis- and trans-regulatory divergence values calculated above. We now found the opposite result: cis-effects were more related to population divergence as measured by multiple genotypes than trans-effects (cis: ρ  =  0.362; 95% CI = [0.343, 0.380], trans: ρ  =  0.287; 95% CI = [0.268, 0.306], fig. 2). We also used the multigenotype data set to recalculate the number of trans-regulated genes, and found far fewer than in the single genotype analysis, although still considerably more than the number of cis-regulated genes (table 1 and supplementary tables S3 and S4, Supplementary Material online).
. 2.

The relationship between cis- and trans-regulatory divergence and divergence between parental genotypes in brains, where parental divergence is measured using all genotypes. Cis- and trans-regulatory divergence data are the same as in figure 1. Bold points indicate significantly differentially expressed genes. Trend lines display least-squares regressions surrounded by 95% confidence intervals.

The relationship between cis- and trans-regulatory divergence and divergence between parental genotypes in brains, where parental divergence is measured using all genotypes. Cis- and trans-regulatory divergence data are the same as in figure 1. Bold points indicate significantly differentially expressed genes. Trend lines display least-squares regressions surrounded by 95% confidence intervals.

Evolutionary and Candidate Gene Analysis

We found that cis-regulated genes had higher evolutionary rates within D. mojavensis than did trans-regulated genes (P = 0.0010) or genes that were either conserved or lacking a clear regulatory pattern (P = 0.0018). We also found that the in-degree (number of transcriptional regulators), as inferred from D. melanogaster orthologs, of cis-regulated genes in our data set was lower than that of either trans-regulated genes (P = 0.0034) or those with no identified regulatory type (P = 1.1e−5). The distributions of dN/dS and in-degree values for genes in each regulatory classification are shown in figure 3. To examine potential evolutionary hypotheses on a more granular level, we also compiled a list of candidate genes displaying two criteria: differential expression between the two populations in the multiple genotype brain data set, and a statistically significant pattern of cis- and/or trans-regulatory evolution. About 68 genes met these criteria, of which 27 where cis-regulated, 35 were trans-regulated, and six had significant cis- and trans effects (table 2). Of these six, five showed evidence for compensatory (cis×trans) evolution, whereas only one showed evidence for combined (cis+trans) evolution.
. 3.

Relationships between regulatory classification and gene-level metrics. (A) Comparison of evolutionary rate (dN/dS) among Drosophila mojavensis populations (from Allan and Matzkin [2019]). (B) Comparison of the number of transcriptional regulators (in degree) as inferred from D. melanogaster orthologs (Marbach et al. 2012). P values for significant pairwise comparisons are indicated, all other comparisons are nonsignificant.

Table 2

The Genes Displaying Both Differential Expression across Populations in the Brain Multigenotype Data Set and Statistical Evidence for Cis- and/or Trans-Regulatory Evolution in the Brain

Drosophila mojavensis FlyBase ID Drosophila melanogaster Gene Name P Value (Parental DE) P Value (Regulatory Type)Population with Higher ExpressionRegulatory Classification
FBgn0140302Cyp28a56.75E-304.3E-10SON cis
FBgn0147102GstD11.67E-288.25E-09CI cis
FBgn0139804Ugt36Bc2.81E-225.14E-08SON cis
FBgn0280601NA2.62E-190.00000471CI cis
FBgn0143628NA1.22E-160.00159SON cis
FBgn0147447Cyp9f27.89E-130.000000667CI cis
FBgn0136497CG145672.41E-115.24E-10CI cis
FBgn0133237RdhB1.82E-082.58E-08CI cis
FBgn0133140CG53796.73E-080.000332SON cis
FBgn0136870CG335210.0000005350.00000103SON cis
FBgn0136131CG22110.00000140.000000357CI cis
FBgn0146788CG185470.00000250.0195SON cis
FBgn0140494CG317770.00004440.0000392SON cis
FBgn0139666CG53160.00004741.67E-09SON cis
FBgn0136235CG339690.00008350.000765CI cis
FBgn0142519Phr0.0001620.00133SON cis
FBgn0145297CG105500.0009170.00377SON cis
FBgn0140061NA0.001492.15E-11SON cis
FBgn0147596CG344090.003730.0169CI cis
FBgn0146770Cyp12e10.004470.000129CI cis
FBgn0085888ST6Gal0.007380.009SON cis
FBgn0142451CG93440.008850.0111CI cis
FBgn0280848NA0.01990.0486SON cis
FBgn0136999CG80860.02080.00551SON cis
FBgn0132957CG101650.0210.00602CI cis
FBgn0141924CG35110.02440.0322CI cis
FBgn0132961Acyp20.04340.0195CI cis
FBgn0134299Fbp13.37E-261.72E-46CI trans
FBgn0145801Lsp1beta4.18E-246.82E-35CI trans
FBgn0143382Lsp27.92E-182.74E-22CI trans
FBgn0142713Sans1.53E-160.00527SON trans
FBgn0140799Lsp1gamma1.71E-162.23E-08CI trans
FBgn0140406Fbp21.35E-122.53E-15CI trans
FBgn0135991NA2.29E-120.0000427CI trans
FBgn0135183CAH24.69E-120.000012CI trans
FBgn0134843CG320372.93E-100.0223CI trans
FBgn0139371GstE21.4E-090.00239CI trans
FBgn0146682TyrRII2.96E-090.00385SON trans
FBgn0139800Mhc3.12E-090.0346SON trans
FBgn0139723Cg25C8.58E-090.00847CI trans
FBgn0138516Sirup2.74E-080.0179SON trans
FBgn0147600MtnA0.0000008964.35E-39CI trans
FBgn0142365CG79970.000001240.000746CI trans
FBgn0141762CG35200.00000190.000176CI trans
FBgn0142390Bru0.00001870.000000015SON trans
FBgn0147345CG142910.00007350.0226SON trans
FBgn0141634CG304600.0002170.00153SON trans
FBgn0146346NA0.0003430.00903CI trans
FBgn0142161CG137420.0005490.000399CI trans
FBgn0146625CG312780.001110.0197CI trans
FBgn0138139Nocte0.001962.85E-33CI trans
FBgn0132831CG63640.002330.00353CI trans
FBgn0134066CG180810.007950.000801CI trans
FBgn0146517Snap250.01110.0000444CI trans
FBgn0140557CG171240.01180.00327CI trans
FBgn0138450CG147850.01780.00243SON trans
FBgn0132826CG67230.01790.0000984SON trans
FBgn0143281RpL230.02670.00682CI trans
Fbgn0145013CG343770.0270.000183CI trans
FBgn0144093NA0.03130.00176SON trans
FBgn0145765Npc2b0.03630.000000643SON trans
FBgn0145063Obp99b0.03970.0289CI trans
FBgn0146132Obp99a1.67E-28 cis: 4.74E-04CI cis + trans
trans: 4.30E-06
FBgn0141410CG180672.95E-11 cis: 7.01E-13SON cis x trans
trans: 1.12E-04
FBgn0146495Spartin0.0000296 cis: 7.89E-03CI cis x trans
trans: 6.74E-11
FBgn0084818NA0.00145 cis: 1.02E-05CI cis x trans
trans: 7.85E-04
FBgn0147549NANS0.0039 cis: 2.13E-03SON cis x trans
trans: 1.03E-05
FBgn0141293CG156510.0235 cis: 4.28E-16SON cis x trans
trans: 1.11E-06
Relationships between regulatory classification and gene-level metrics. (A) Comparison of evolutionary rate (dN/dS) among Drosophila mojavensis populations (from Allan and Matzkin [2019]). (B) Comparison of the number of transcriptional regulators (in degree) as inferred from D. melanogaster orthologs (Marbach et al. 2012). P values for significant pairwise comparisons are indicated, all other comparisons are nonsignificant. The Genes Displaying Both Differential Expression across Populations in the Brain Multigenotype Data Set and Statistical Evidence for Cis- and/or Trans-Regulatory Evolution in the Brain

Discussion

The measurement of allele-specific expression in F1 hybrid offspring has been one of the primary approaches for understanding genome-wide patterns of cis- and trans-regulatory evolution both within and between species. Although other methodologies for quantifying these effects have been used effectively, the advantage of the F1 hybrid approach, in our opinion, is its simplicity and potential applicability to a wide range of study systems and evolutionary contexts. However, as with any other genome-scale approach to evolution, the conclusions stemming from F1 hybrid studies come with biases and limitations. Here, our goal was to investigate how two straightforward modifications to this common experimental design affect the evolutionary interpretations regarding the prevalence of cis- and trans-regulation in natural populations. Furthermore, we aimed to leverage the ecological and evolutionary information from our model system, D. mojavensis, to examine how successfully the integration of complex regulatory data can uncover adaptive gene expression changes across populations.

The Effects of Tissue Specificity on Estimation of Regulatory Type

Most of the original genome-wide studies of cis- and trans-regulatory evolution used gene expression measurements taken from whole organisms. This experimental design may blunt the ability to detect cis-regulatory changes, if those changes are only realized in a subset of tissues. Here, we performed allele-specific expression experiments in both whole bodies and brains of larval D. mojavensis in parallel, to determine if and how much the use of heterogeneous tissue samples affects quantification of regulatory type. We found clear evidence that our analysis of whole-body samples both overestimated trans effects and underestimated cis effects. This is reflected both in correlations between regulatory and parental divergences (fig. 1) as well as in the numbers of genes statistically categorized as cis- or trans-regulated (table 1). We cannot say precisely how much of this difference is due to the problems of using heterogeneous tissue samples and how much is due to regulatory properties specific to the larval brain. A systematic data set of allele-specific expression in multiple tissues and life stages collected would be needed to robustly address this question.

The Effects of Using Multiple Parental Genotypes on Estimation of Regulatory Type

It is well known that the F1 hybrid approach can bias estimates of trans-regulatory evolution, because they cannot be estimated independently of the measurement of parental expression divergence. Fraser (2019) pointed out how this issue, when combined with error in the estimation of allele-specific expression, can lead to overestimation of cis–trans compensatory evolution. By the same logic, we hypothesized that simple errors in the measurement of parental expression evolution might lead to inflation of the degree of trans-regulatory effects. To address this issue, we simply compared the quantification of cis- and trans-regulatory effects in brains using two measures of parental expression divergence across populations: one measured from only the single genotype utilized in the allele-specific expression experiment, and one using seven additional genotypes from each population. Using population expression values has two obvious potential consequences for each gene. First, it should reduce noise in the estimation of population expression means coming from the small sample size of using only a few samples of a single genotype. This should lead to reduced estimates of trans-regulatory evolution. However, there will also be a subset of genes whose expression in the focal genotype substantially differs from the mean of its population. Thus, for some number of genes, our method should result in the artificial detection of trans-regulatory effects because the parental divergence will be mismatched with the allele-specific expression data. Despite this, we find that using estimates of parental population divergence using multiple genotypes considerably reduces the genome-wide estimate of trans-regulatory evolution. Our data present mixed results, however, on the question of whether cis- or trans-regulation is primarily responsible for expression evolution between our populations. Although the correlation analysis suggests that cis-regulatory divergence is more closely related to population divergence (fig. 2), our per-gene hypothesis tests maintain nearly twice as many genes with evidence of trans-regulatory divergence (table 1). However, we argue that the numbers of genes displaying evidence for trans-regulation here is an overestimate for three reasons. First, as mentioned above, the inclusion of multiple parental genotypes will induce false positives in cases where intrapopulation variation in gene expression is substantial. Second, it is likely that the difference in power between the methods to detect cis- and trans effects contributes to the number of genes detected in each category (Graze et al. 2009; Coolon et al. 2014; but see Glaser-Schmitt et al. [2018]). Third, the estimation of parental expression differences, and therefore trans-regulatory effects, are more error-prone due simply to the biology of our samples. We compared larval samples across two populations that develop at different rates (egg-pupation time [h]: CI = 275.20 ± 5.48; SON = 315.21 ± 5.96; Benowitz KM, Unpublished data), making it impossible to guarantee that sampling occurred at precisely the same exact developmental stage. Thus, some ontogenetic or environmental variation in gene expression is inevitable here. The usage of multiple genotypes should mitigate this problem but is unlikely to resolve it completely, and thus we may be generating false positives for this reason. For example, we find significant and consistent population differences in expression of two fat body proteins (Fbp1; Fbp2) and three larval serum proteins (Lsp1beta; Lsp1gamma; Lsp2) that are all clearly attributed to trans-regulatory evolution (table 2). Fat body proteins and larval serum proteins interact in a key pathway for nutritional storage prior to pupation (Burmester et al. 1999), and therefore could lead us to a hypothesis of adaptation via trans-regulation to variable nutritive environments. However, it is also well established that all four of these genes undergo rapid increases in expression during the third-instar wandering stage (Burmester et al. 1999). Our results are therefore equally consistent with the possibility that the expression differences were due to slight variations in the developmental stages sampled, and that expression patterns of these genes have not meaningfully evolved at all. In contrast, the estimation of cis-regulatory effects is completely controlled for any such environmental variability because it is measured within individuals (Pastinen 2010), and is therefore inherently less prone to similar errors.

Integrating Regulatory Data to Address Evolutionary Hypotheses

Recent work has sought to identify the evolutionary and structural properties associated with genes evolving via cis- and trans-regulation. Here, we demonstrate that in D. mojavensis larval brain cis-regulated genes tend to display faster rates of coding evolution. Furthermore, we show that cis-regulated genes also tend to occupy less central positions within transcriptional networks, confirming the results of Yang and Wittkopp (2017) and supporting their generality. Notably, we reached this conclusion using D. melanogaster network data, given that similar data are unavailable for D. mojavensis. Thus, our results suggest that gross network properties may be conserved across significant lengths of evolutionary time. Considered together, our findings linking regulatory type to evolutionary rate and network connectivity indicate that the genes experiencing cis-regulatory evolution are relatively unconstrained compared with trans-regulated genes. This perhaps contrasts with our predictions, which were that cis-regulation should be predominant due precisely to the presence of such constraints. However, it is not clear whether errors regarding the determination of trans-regulation discussed above may be obscuring any potential statistical relationships. Ideally, the determination of regulatory evolution will also help identify adaptively regulated genes (Fraser 2011; Delbare and Clark 2018). RNA-seq experiments in ecology and evolution nearly always result in hundreds if not thousands of DE genes, many of which are likely false positives (Todd et al. 2016; Bengston et al. 2018). Truly DE genes must have experienced cis- or trans-regulatory evolution; therefore, the corroboration provided by a statistically significant regulatory effect estimated from an independent sample may help weed out noisy or environmentally variable genes. Thus, allele-specific expression data have been used to supplement studies of gene expression adaptation at the genome-wide (Juneja et al. 2016; Verta and Jones 2019) and candidate gene (Bendesky et al. 2017) levels. We thus turned our attention to the identities of the genes displaying clear patterns of both divergence and regulation, and compared between those displaying cis- and trans-regulation. Previous transcriptomic (Matzkin et al. 2006; Matzkin 2012; Matzkin and Markow 2013; Smith et al. 2013) and genomic (Allan and Matzkin 2019) investigations of D. mojavensis have identified detoxification and chemosensory genes as important classes of genes likely related to adaptation to the alternative chemical environments provided by their hosts. Taking this as an a priori hypothesis, we examined the identities of genes identified here to search for candidates fitting these categories. We are most interested in the cis-regulated genes, which have the cleanest interpretations in this data set. Among the 33 cis-regulated candidate genes are 28 with D. melanogaster orthologs, of which 12 have described functions. Noteworthy among these is GstD1, a detoxification gene with considerable evidence for a functional role in adaptation across D. mojavensis populations (Matzkin et al. 2006; Matzkin 2008). Here, we find that expression differences in GstD1 are clearly attributable to cis-regulatory evolution between these populations, leading to increased expression in the Catalina Island population. Four other genes, including three cytochrome p450s and one UDP-glycosyltransferase, have well-characterized roles in detoxification of plant chemicals as well (Heckel 2014). We also find a single chemosensory gene, Obp99a. Among the 19 characterized genes displaying trans-regulation, we find the detoxification gene GstE1 as well as the chemosensory genes Obp99a (regulated in cis and trans) and Obp99b. Thus, although we are less confident in the trans-regulated gene set as a whole, this confirmation suggests that the regulatory evolution of at least a subset of these genes is accurately represented in this data set. Although it is unsurprising that chemosensory genes are expressed and have evolved specifically in the brain, it is not as immediately clear why the expression of detoxification genes should be important in brain tissue. Detoxification is usually associated with tissues such as the midgut, Malpighian tubule, and fat body (Chung et al. 2009) and the blood–brain barrier tends to shield the brain from harmful chemicals (Stork et al. 2008; Hindle and Bainton 2014). However, the Drosophila blood–brain barrier is not completely impermeable to xenobiotics (Zhang et al. 2018), and important detoxification processes have been demonstrated in the brain in other insects (Zhu et al. 2010). Thus, it is at least plausible, given the chemical cocktail that D. mojavensis is exposed to within organ pipe and prickly pear necroses (Kircher 1982; Starmer and Phaff 1983), that some compounds may enter the brain. Alternatively, it is possible that we are witnessing the consequences of indirect selection. For example, strong selection on GstD1 expression in the midgut might have resulted in a cis-regulatory change to a binding site for a transcription factor that is also highly expressed in brains. If the resulting change in brain expression is neutral or nearly neutral it may then persist without having any adaptive function. Given the ability of our approach to recapitulate a priori hypotheses about expression evolution, we also asked whether this approach might lead to novel predictions about phenotypic and genetic adaptation. Among the remaining cis-regulated genes are photorepair (phr) and CG5316 (ortholog of human aprataxin), which function to repair UV-damaged DNA (Boyd and Harris 1987; Hirano et al. 2007). In D. melanogaster, selection has generated adaptive differences in DNA repair mechanisms between tropical and temperate populations, and has resulted in both coding and noncoding genetic changes (Svetec et al. 2016). Here, both phr and CG5316 are upregulated in the Sonoran Desert population, where presumably intense UV exposure is a more pressing environmental challenge than in the cooler and wetter clime of Santa Catalina Island. Interestingly, phr was not among the D. melanogaster candidate genes differentiated in sequence or expression among populations, whereas CG5316 showed evidence of protein-coding but not expression evolution (Svetec et al. 2016). Thus, even when the predictability of DNA repair evolution pathways to similar environmental variables may extend to the gene, the type of genetic change itself may still be unpredictable. Broadly, our usage of a replicated, tissue-specific data set and requirement of gene to display a clear regulatory pattern, especially one of cis-regulation, has led us to a manageable set of a few dozen highly intuitive genes that may be adaptively regulated across D. mojavensis populations associated with local ecological conditions. We propose that deeper understanding of patterns of regulatory evolution in ecological model systems, where there are strong predictions regarding selection, will be essential for a robust understanding of the differing roles of cis- and trans-regulation in local adaptation and evolution.

Supplementary Material

Supplementary data are available at Genome Biology and Evolution online. Click here for additional data file.
  72 in total

Review 1.  Genome-wide allele-specific analysis: insights into regulatory variation.

Authors:  Tomi Pastinen
Journal:  Nat Rev Genet       Date:  2010-06-22       Impact factor: 53.242

2.  QUANTITATIVE GENETIC ANALYSIS OF MULTIVARIATE EVOLUTION, APPLIED TO BRAIN:BODY SIZE ALLOMETRY.

Authors:  Russell Lande
Journal:  Evolution       Date:  1979-03       Impact factor: 3.694

3.  DNA single-strand break repair is impaired in aprataxin-related ataxia.

Authors:  Makito Hirano; Aya Yamamoto; Toshio Mori; Li Lan; Taka-aki Iwamoto; Masashi Aoki; Keiji Shimada; Yoshiko Furiya; Shingo Kariya; Hirohide Asai; Akira Yasui; Tomohisa Nishiwaki; Kyoko Imoto; Nobuhiko Kobayashi; Takao Kiriyama; Tetsuya Nagata; Noboru Konishi; Yasuto Itoyama; Satoshi Ueno
Journal:  Ann Neurol       Date:  2007-02       Impact factor: 10.422

4.  Characterization of Drosophila melanogaster cytochrome P450 genes.

Authors:  Henry Chung; Tamar Sztal; Shivani Pasricha; Mohan Sridhar; Philip Batterham; Phillip J Daborn
Journal:  Proc Natl Acad Sci U S A       Date:  2009-03-16       Impact factor: 11.205

5.  Complex genetic interactions underlying expression differences between Drosophila races: analysis of chromosome substitutions.

Authors:  Hurng-Yi Wang; Yonggui Fu; Mary Sara McPeek; Xuemei Lu; Sergey Nuzhdin; Anlong Xu; Jian Lu; Mao-Lien Wu; Chung-I Wu
Journal:  Proc Natl Acad Sci U S A       Date:  2008-04-22       Impact factor: 11.205

6.  A Circadian Clock in the Blood-Brain Barrier Regulates Xenobiotic Efflux.

Authors:  Shirley L Zhang; Zhifeng Yue; Denice M Arnold; Gregory Artiushin; Amita Sehgal
Journal:  Cell       Date:  2018-03-08       Impact factor: 41.582

7.  Molecular evidence for increased regulatory conservation during metamorphosis, and against deleterious cascading effects of hybrid breakdown in Drosophila.

Authors:  Carlo G Artieri; Rama S Singh
Journal:  BMC Biol       Date:  2010-03-31       Impact factor: 7.431

8.  Tempo and mode of regulatory evolution in Drosophila.

Authors:  Joseph D Coolon; C Joel McManus; Kraig R Stevenson; Brenton R Graveley; Patricia J Wittkopp
Journal:  Genome Res       Date:  2014-02-24       Impact factor: 9.043

9.  HTSeq--a Python framework to work with high-throughput sequencing data.

Authors:  Simon Anders; Paul Theodor Pyl; Wolfgang Huber
Journal:  Bioinformatics       Date:  2014-09-25       Impact factor: 6.937

10.  The Adaptive Significance of Natural Genetic Variation in the DNA Damage Response of Drosophila melanogaster.

Authors:  Nicolas Svetec; Julie M Cridland; Li Zhao; David J Begun
Journal:  PLoS Genet       Date:  2016-03-07       Impact factor: 5.917

View more
  2 in total

1.  Molecular Evolution across Mouse Spermatogenesis.

Authors:  Emily E K Kopania; Erica L Larson; Colin Callahan; Sara Keeble; Jeffrey M Good
Journal:  Mol Biol Evol       Date:  2022-02-03       Impact factor: 16.240

2.  Unraveling regulatory divergence, heterotic malleability, and allelic imbalance switching in rice due to drought stress.

Authors:  Nelzo C Ereful; Antonio Laurena; Li-Yu Liu; Shu-Min Kao; Eric Tsai; Andy Greenland; Wayne Powell; Ian Mackay; Hei Leung
Journal:  Sci Rep       Date:  2021-06-29       Impact factor: 4.996

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.