David M Rand1, Jim A Mossman1, Lei Zhu1, Leann M Biancani1,2, Jennifer Y Ge1,3,4,5. 1. Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA. 2. Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA. 3. Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA. 4. Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA. 5. Harvard-MIT Division of Health Sciences and Technology, Harvard Medical School, Boston, MA, USA.
Abstract
Mitochondrial function requires the coordinated expression of dozens of gene products from the mitochondrial genome and hundreds from the nuclear genomes. The systems that emerge from these interactions convert the food we eat and the oxygen we breathe into energy for life, while regulating a wide range of other cellular processes. These facts beg the question of whether the gene-by-gene interactions (G x G) that enable mitochondrial function are distinct from the gene-by-environment interactions (G x E) that fuel mitochondrial activity. We examine this question using a Drosophila model of mitonuclear interactions in which experimental combinations of mtDNA and nuclear chromosomes generate pairs of mitonuclear genotypes to test for epistatic interactions (G x G). These mitonuclear genotypes are then exposed to altered dietary or oxygen environments to test for G x E interactions. We use development time to assess dietary effects, and genome wide RNAseq analyses to assess hypoxic effects on transcription, which can be partitioned in to mito, nuclear, and environmental (G x G x E) contributions to these complex traits. We find that mitonuclear epistasis is universal, and that dietary and hypoxic treatments alter the epistatic interactions. We further show that the transcriptional response to alternative mitonuclear interactions has significant overlap with the transcriptional response to alternative oxygen environments. Gene coexpression analyses suggest that these shared genes are more central in networks of gene interactions, implying some functional overlap between epistasis and genotype by environment interactions. These results are discussed in the context of evolutionary fitness, the genetic basis of complex traits, and the challenge of achieving precision in personalized medicine.
Mitochondrial function requires the coordinated expression of dozens of gene products from the mitochondrial genome and hundreds from the nuclear genomes. The systems that emerge from these interactions convert the food we eat and the oxygen we breathe into energy for life, while regulating a wide range of other cellular processes. These facts beg the question of whether the gene-by-gene interactions (G x G) that enable mitochondrial function are distinct from the gene-by-environment interactions (G x E) that fuel mitochondrial activity. We examine this question using a Drosophila model of mitonuclear interactions in which experimental combinations of mtDNA and nuclear chromosomes generate pairs of mitonuclear genotypes to test for epistatic interactions (G x G). These mitonuclear genotypes are then exposed to altered dietary or oxygen environments to test for G x E interactions. We use development time to assess dietary effects, and genome wide RNAseq analyses to assess hypoxic effects on transcription, which can be partitioned in to mito, nuclear, and environmental (G x G x E) contributions to these complex traits. We find that mitonuclear epistasis is universal, and that dietary and hypoxic treatments alter the epistatic interactions. We further show that the transcriptional response to alternative mitonuclear interactions has significant overlap with the transcriptional response to alternative oxygen environments. Gene coexpression analyses suggest that these shared genes are more central in networks of gene interactions, implying some functional overlap between epistasis and genotype by environment interactions. These results are discussed in the context of evolutionary fitness, the genetic basis of complex traits, and the challenge of achieving precision in personalized medicine.
Genotype by Genotype interaction, implying epistasisGenotype by Environment interactionThree‐way interaction between epistasis (G × G) and Environment. Evidence that epistasis is modified by the environmentmitochondrial DNADrosophila Genetic Reference Panel
INTRODUCTION
Mitochondria provide a fascinating link between macroevolution and microevolution, and between evolutionary and medical genetics. Three billion years ago there were no eukaryotes, but two billion years later virtually all eukaryotes had DNA‐carrying mitochondria fully integrated into their cellular functions while being spatially separated from the DNA housed in each cell's nucleus. There is general agreement that mitochondria evolved from free‐living eubacterial cells that formed an alliance with a distinct microbial lineage 1, 2. There is, however, a vigorous debate over the timing and patterns of gene transfer from symbiont to host during the 1 to 2 billion‐year window of mitochondrial macroevolution 3, 4.The ongoing evolution of mtDNA presents a rich set of microevolutionary questions that are contingent on this macroevolutionary history 5. The genes retained in animal mtDNA encode core subunits of the oxidative phosphorylation (OXPHOS) system and ATP synthesis, which should be under strong selection to maintain proper functions that are central to organismal fitness. The functions of mtDNA genes require coordinated expression of hundreds of nuclear encoded gene products, a legacy of mitochondrial macroevolution. In animals, the high mutation and substitution rate for mtDNA, coupled with maternal inheritance and limited recombination, predispose it to mutational decay by Muller's ratchet 6, 7. Yet mitochondria still retain mtDNA 8. Selection apparently has prevented the elimination of truly essential genes in mitochondria, through strict purifying selection, or through a coevolutionary dynamic wherein deleterious mtDNA mutations are rescued by compensatory mutations in the nuclear genome, which enjoys the benefits of sex and recombination and can more rapidly adapt to altered interacting gene products from the mitochondria 9, 10. Through more than a billion years of coevolution, these mitonuclear interactions that regulate the core processes of energy transduction, signaling, nutrient sensing, and apoptosis, are a compelling example of a coadapted gene complex. Mitonuclear genomes must accommodate a wide distribution of fitness effects from positive and negative mutations, different recombinational landscapes and modes of transmission, and coordinate signaling pathways that allow organisms to adapt to mutational and environmental stressors that challenge their survival.These fundamental biological processes are why mitochondria also provide a fascinating link between evolutionary and medical genetics 11. Most traits of biological and medical significance exhibit continuous, quantitative variation in natural and clinical populations. Understanding the genetic basis of this phenotypic variation is one of the fundamental goals of biology 12. The central challenge here is to determine mechanistic models of how nucleotide variation among individual genomes generates the phenotypic variation that is pervasive in all populations. With the development of genome‐wide approaches to scan for nucleotides associated with trait variation (genome‐wide association studies or GWAS), there was hope that the genotype–phenotype problem would be solved. These studies have identified many gene variants that play significant roles in human diseases and traits of interest to plant and animal breeders. However, GWAS methods have produced two unexpected results 13: the most significant genes in these studies account for only a small fraction of the total genetic variation for traits, and many of the GWAS hits are in non‐coding regions of the genome. The first result, known as the “missing heritability” problem 14, implies that many genes of small effect, or gene interactions, that contribute to trait variation remain undetected. The second result implies that mutations affecting gene regulation, rather than amino acid sequence in proteins, are a more important component of trait variation. One implication from the missing heritability result is to simply “look harder”: GWAS projects with larger and larger sizes will eventually find all the genetic variation for traits 15. This view follows Fisher's 16 infinitesimal model that most variation is “additive.” An alternative view is that missing heritability is wrapped up in gene–gene (G x G, or epistatic) and gene–environment (G x E) interactions. Boyle et al. 13 proposed an “omnigenic” model in which “core” genes at the hubs of regulatory networks are connected through numerous “peripheral” genes that modify core, cellular functions. Since peripheral genes outnumber core genes 100‐fold, the omnigenic model seeks to integrate the puzzle of missing heritability with the phenotypic effects of noncoding SNPs. While this view extends the infinitesimal “additive” model to an extreme, it is our opinion that interactions between core and peripheral genes are realized mechanistically by G x G and G x E interactions.In this article, we examine the influence of epistasis and G x E on complex traits using a mitonuclear system in Drosophila that captures aspects of macro and microevolution, and evolutionary and medical genetics 17, 18, 19, 20. We note that virtually all GWA analyses ignore mtDNA variation, and frequently X‐linked variation, in models linking genotype to phenotype 21. Mitochondrial function is by nature a problem in gene interaction: there are 37 genes encoded in mtDNA and >1,000 nuclear encoded proteins that are needed for mitochondrial biogenesis 22. And mitochondrial function is by nature a problem in G x E: the cellular processes governed by each individual's unique mitonuclear genotype have been selected to catabolize nutrients acquired from the environment and pass their electrons to oxygen supplied by the environment. There are dozens, if not hundreds, of mitochondrial diseases that are associated with mutations in either mtDNA or nuclear genes 23 (http://www.umdf.org), many of which are exacerbated by environmental stressors. Since G x G and G x E effects can limit the power to detect individual gene effects in GWAS, we are curious to learn the contribution of mitonuclear G x G and G x E effects on complex traits.Here we use a panel of mitonuclear genotypes in Drosophila to examine the joint effect of mtDNA and nuclear genomes on development time and genome wide transcript abundance. We ask how different environmental treatments (diet or hypoxia, respectively) alter these traits in orthogonal designs. Development time is a key fitness trait and is a common marker of disease when development is delayed. Transcript abundance, as measured by genome wide RNAseq, is a very efficient means of quantifying thousands of quantitative traits. The level of each RNA transcript is a continuous trait influenced by dozens of interacting cis‐ and trans‐regulatory elements that are sensitive to the internal genomic and external environmental context in the production of mature RNAs 24. Since nuclear transcripts are unlinked from mtDNA genes, the impact of alternative mtDNA variants on RNA transcript levels serves as trans‐modifying factors that are a distinct class of gene interactions. We present new analyses of data from two experiments 19, 20 that explore the common features of mitonuclear G x G and G x E by recognizing that the mtDNA genome is a cellular environmental factor for the nucleus, and vice versa. Any external environmental factors that trigger signaling pathways that initiate cellular responses are likely to work through shared components of complex mitonuclear signaling.
EXPERIMENTAL PROCEDURES
Genetic Stocks
We report data from two sets of Drosophila genotypes. The “mito‐DGRP” stocks were used to study mitonuclear epistasis and genotype × environment effects of diet on development time; the Oregon‐Austria stocks were used to study mitonuclear epistasis and genotype × environment effects of hypoxia on transcriptional profiles. The mito‐DGRP lines are a panel of 72 genotypes constructed from six different mtDNAs each placed on 12 nuclear chromosomal backgrounds 19. The mtDNAs are from three lines of Drosophila melanogaster (OregonR, Zimbabwe 53, and Austria W132) and three lines of D. simulans (siI from Hawaii, siII (from stock C167.4 = “sm21”), and siIII from Madagascar (= maI)). Each mtDNA was placed on to the nuclear backgrounds of 12 D. melanogasterDrosophila Genetic Reference Panel lines (DGRP; 25, 26: (DGRP‐304, ‐313, ‐315, ‐358, ‐375, ‐517, ‐707, ‐712, ‐714, ‐765, ‐786, and ‐820), as described previously 19. Each of these mtDNAs was independently placed on the nuclear backgrounds of D. melanogaster OregonR and Austria W132, as decried previously 17, 20. Genotype notation is as follows: mtDNA;nuclearDNA, so Zim53;315 is the Zimbabwe 53 mtDNA on DGRP‐315 nuclear chromosomal background and siI;OreR is the siI mtDNA from D. simulans on the OregonR nuclear chromosomal background.In each case, initial line construction used balancer chromosome stocks to replace the nuclear chromosomes on to each cytoplasm to prevent recombination and possible selection for mitonuclear allelic combinations. Following mitonuclear construction, several generations of backcrossing to male DGRP, OregonR or Austria lines was done to homogenize nuclear backgrounds across mtDNA genotypes. All lines were treated with tetracycline to remove Wolbachia prior to construction, and were confirmed to not carry Wolbachia after lines were built. The choice of mtDNAs was intended to capture both mtDNA polymorphism within species and divergence between species to measure the impact of mtDNA nucleotide variation on phenotypes. The three mtDNAs from D. melanogaster differ by 18 nonsynonymous changes and 40 synonymous changes across the coding regions of the molecule; those for D. simulans differ by 24–45 nonsynonymous and ~247–298 synonymous changes, and the divergence between D. melanogaster and D. simulans is 90–103 nonsynonymous and 401–438 synonymous changes. Pairwise differences for other classes of nucleotides are reported in 17, 27. The orthogonal pairing of each mtDNA with a different nuclear background tests the hypothesis that the phenotypic effects of mtDNA effects are dependent on nuclear genetic background, which can be assessed using ANOVAs for main and interaction effects.
Analysis of Development Time
Each of the 72 mito‐DGRP lines was scored for development time from egg to adult on four different culture media that varied the yeast:sugar ratio. All stocks were cultured at controlled density for two generations on standard laboratory food prior to the assay. Thirty eggs from each stock were picked from grape plates after a fixed period of egg laying, and placed into each of 12 replicate vials. The eggs for these replicated culture vials were collected over 3 days from two independent egg laying chambers containing ~50 females and 50 males, neither of which proved significant is subsequent analyses, so data were pooled across collection days. Some of the genotypes did not produce enough eggs to establish 12 vials with 30 eggs per vial, so 12 replicates were set up with fewer than 30. The effect of egg density was modeled in the analyses and did not improve the fit of the statistical models when the number of emerging adults was analyzed as a covariate. Additional details of the experimental set up are described in 19. All experimental vials were housed in the same environmental chamber at 25 °C and 12:12 light:dark cycle. Racks of vials were randomized across the chamber, and moved during culture to avoid spatial effects in the chamber. All vials were scored for emerging adults twice each day (9 am and 5 pm), with the time of day and number of males and females in each vial recorded. The values from each vial over the eclosion period provided estimates of mean development time, mean egg‐to‐adult viability and the coefficient of variation for these traits for males and females of each of the 72 genotypes. Analyses of these phenotypes have been reported 19; here we present new analyses of how different dietary environments influence the phenotypes of these 72 mitonuclear genotypes.
Dietary Environments
All genotypes were quantified for development time in four dietary environments that differed in the ratio of sugar and yeast in the media. Three of the diets are approximately isocaloric, and the fourth diet is the standard laboratory food on which the files are normally maintained. The three isocaloric diets were reported previously 28. The High P:C, Equal P:C, and Low P:C diets have 32, 20, and 8 g of SAF yeast, and 8, 20, 32, g of sucrose, respectively, for varying ratios of Protein:Carboydrate. All remaining ingredients are the same in each diet (1 g agar, 9 g yellow cornmeal, 0.45 g tegosept in 4.5 ml of 95% ethanol, cooked in 200 ml of distilled water). The standard lab food contains 1.8 g agar, 5 g SAF yeast, 10.4 g yellow cornmeal, 22 g sucrose, 0.9 g tegosept in 4.5 ml 95% ethanol, cooked in 200 ml of distilled water. The Matzkin et al. recipe uses methyl paraben as an antifungal agent, but we substituted tegosept as that is used in our standard food recipe. The food was prepared on a stovetop, with the agar dissolved first followed by the remaining ingredients, and the tegosept solution added only after the cooked food had cooled to 55 °C. Glass vials were filled with 10 ml of cooked food, and no additional dry yeast added on top. Eggs from each genotype were collected from grape plates, as described above, and placed in to replicate vials of each diet. Thus egg hatching, larval development, pupariation and adult eclosion took place in these different nutritional environments.
Hypoxia Exposure
Transcriptional profiles were determined for four mitonuclear genotypes exposed to normoxic or hypoxic environments, in a fully factorial design of two different mtDNAs (OreR and siI), two different nuclear backgrounds (OreR and Austria), and three oxygen environments (room air, or 6% oxygen for 30 or 120 min). The mitonuclear genotypes were OreR;OreR, siI;OreR, OreR;Austria and siI;Austria. All flies were cultured at controlled density for two generations on normal laboratory food prior to the experiment. Adult flies were allowed to eclose, mature, and mate for 3–5 days, and then were separated in to single sex vials of 30 flies. Replicate vials were placed in a Biospherix Animal chamber (Biospherix, Lacon, NY) equilibrated at 6% O2 by streaming N2 gas into the chamber using a BioSpherix ProOx Model 110 gas oxygen controller (BioSpherix, Lacona, NY). Replicate control vials remained at room air (21% O2) in a duplicate chamber with the door wide open, and all experimental vials had screen mesh covering the vial opening to allow free exchange of air. Vials were introduced into and removed from the hypoxia chamber through a sleeve attached to a port in the side of the chamber so O2 levels remained stable during the exposure process. Air pressure was ambient, so this system creates physiological hypoxia by reducing the partial pressure of O2 to ~5.8 kPa, from the normal level of ~20.3 kPa in room air at sea level. After 30 and 120 min of exposure to 6% O2, replicate vials were removed from the hypoxia chamber and the flies in each vial were immediately flash frozen in liquid N2 for later RNA extraction. The room air control samples are referred to as time point t = 0.
RNA Extraction and Sequencing
RNA extraction, Illumina sequencing and downstream data analyses were performed as described previously 20, 29. RNA was extracted from each sample of 30 adult flies. There were three replicate samples for each 4‐genotype × 3‐hypoxia x 2‐sex treatment conditions, with the exception of only two replicates from the female siI;Austria genotype at time point t = 0, resulting in a total of 71 RNA libraries. Analyses reported here are for 35 libraries from female samples. RNA extraction followed procedures described in 30 for mRNA purification, fragmentation, first and second strand cDNA synthesis, adapter ligation, and PCR enrichment. Nucleic acid quantification at each stage was performed using a Qubit fluorimeter using Qubit reagents. Libraries for RNAseq were prepared from amplified cDNA fragments ranging from 334–500 bp using a Caliper Lab Chip XT apparatus with DNA 750 chips (Caliper Life Sciences, Hopkinton, MA). Quantification of transcript levels was performed using 50 bp single end sequence reads on an Illumina HiSeq 2000 instrument at the Brown University Genomics Core Facility.
RNAseq Data Analyses
Raw sequence reads from the Illumina instrument were processed for quality control and adaptor removal using fastq quality filter tools in the FASTX‐Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastq_quality_filter_usage, and http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastx_clipper_usage) which removed reads in which 80% of the base calls have a quality score <20. Reads passing these filters were mapped to the dm3 D. melanogaster reference sequence using Tophat v.2.0.12 31: https://ccb.jhu.edu/software/tophat/index.shtml) and Bowtie2 v2.2.3 (http://bowtie-bio.sourceforge.net/index.shtml). The BAM files generated from Tophat were converted to. SAM files with Samtools 32 (http://samtools.sourceforge.net/), and reads mapping to specific genes in the dm3 annotation were counted using HTseq‐count 33 (accessed in November 2016 at http://www-huber.embl.de/users/anders/HTSeq/doc/count.html, using version 0.6.1; but now moved to http://htseq.readthedocs.io/en/release_0.10.0/). The read count table from HTseq was used for downstream analyses in edgeR 34. Initial results from these data have been reported 20 and details are provided there. Briefly, for differential expression (DE) analyses in edgeR with multi factor analyses such as those used in our design, sequential estimations of dispersion are recommended 34, 35. Thus we estimated a common dispersion, followed by trended dispersion, followed by a tagwise dispersion of the expression counts. These steps, coupled with the exclusion of genes whose read count fell below one count per million in at least three libraries, should restrict the analyses to genes with comparable patterns of variation among our experimental treatments (see Supporting Information Fig. S1). For main effects of mtDNA, nuclear DNA or hypoxic condition on expression levels, we fit negative binomial generalized linear models (glmFit) in edgeR where the model matrix included the intercept and the term against the model (e.g., mtDNA, nuclear DNA or hypoxia time point). We then conducted likelihood ratio tests using the glmLRT function, with the contrast between the baseline (e.g., nDNA OreR), against the nDNA coefficient (e.g., nDNA AutW132).
Statistical Tests for Shared G x G and G x E Effects
Here we present analyses using edgeR that draw on specific models of mitonuclear epistasis (G x G interactions) and genotype x environment interactions (G x E) using the hypoxia data set. We want to test the null model that genes that are sensitive to mitonuclear epistasis are independent of genes that are sensitive mitonuclear G x E. To identify transcripts that show significant G x G or G x E effects, we fit generalized linear models as described above, but include interaction terms in the design matrix. The likelihood ratio tests were performed on the interaction effect against the model intercept. So, to identify transcripts that show significant mitonuclear G x G effects we fit the following model:where M the term for mtDNA genotype (OreR or siI), N is the term for nuclear genotypes (OreR or Austria) and μ and ε are model mean and error. Using likelihood ratio tests, we seek the significance of the interaction term (M x N) against the intercept (using mtDNA = OreR, nuclear DNA = OreR, and hypoxia time = 0 as baseline). We tested this model using combined data from two hypoxic conditions (t = 0 and t = 30 min), and thus the hypoxic environmental treatments are collapsed in each genotype term (M, N, or M x N, each with 1 degree of freedom). This design may have reduced power to identify significant transcripts because of this pooled data set, but produced a list of significantly differentially expressed genes.Likewise, to identify transcripts that exhibit a significant mitonuclear G x E, we fit the following model:where G is the term for the mitonuclear genotype (which has four levels: OreR;OreR, siI;OreR, OreR;Austria, siI;Austria), H is the term for hypoxia (time = 0 or 30, with 0 being the room air control). As before, using likelihood ratio tests, we seek the significance of the interaction term (G x H) against the intercept (using mtDNA = OreR, nuclear DNA = OreR, and hypoxia time = 0 as baseline). We tested this model with the same data set described above (t = 0 and 30 min), and note that the degrees of freedom for G are df = 3 while that for H is df = 1, and there is no pooled data in any factor in this design. These analyses produced a list of genes with significant G × E terms from the model. To test the hypothesis that gene lists derived from these two models are independent (have a 5% overlap of gene names), we used a G‐test of heterogeneity with a 5% overlap as the null expectation and the observed overlap as the alternative.Recognizing that the above models are not fully comparable, we performed a more complete test using one full, three‐way model that allows for comparisons of genes responding to the mitonuclear epistatic interaction (G × G) and genes responding to the combine mitonuclear genotype × environment interaction (G x G x E). The data consisted of all libraries from the three hypoxia time points (t = 0, 30, and 120) and the four mitonuclear genotypes. We fit the following model in edgeR by including all interaction terms in the design matrix:where M, N, and H are as defined above. The significance of each interaction term was inferred using likelihood ratio tests against the model intercept with mtDNA = OreR, nuclear DNA = OreR, and hypoxia time = 0 as the baseline. The numbers of genes that were significant under these two tests differed, with very different FDR cutoff values. We compared the top 500 and 800 genes based on raw P values from each list. As before, we tested the null hypothesis that the lists were independent using a G‐test of heterogeneity with a 5% overlap of gene names as the null and the observed as the alternative.To capture the time course component of the transcriptional responses, we performed analyses using Weighted Gene Coexpression Network Analysis, WGCNA 36, 37. This approach identifies modules of coexpressed transcripts with common patterns across treatments, and can be used to assign significance to individual factors (e.g., mtDNA, nuclear DNA or hypoxia time point) that are associated with specific modules. Initial module discovery is unsupervised, other than specifying the minimum number of transcripts to be assigned to any module. The results of these analyses provide a means of assessing the shared functions among genes with similar temporal responses to hypoxia in our experimental design. It is likely that a transcript that goes up and then down over the time course will not be detected as significant by simple edgeR analyses, but may have common or conflicting responses over time in different mitonuclear genotypes that reflects underlying functional responses to the experimental manipulations.
RESULTS
Mitonuclear Coevolution and Epistasis for Development Time
A primary question we sought to test using the panel of 72 mitonuclear genotypes concerned mitonuclear coevolution or coadaptation. If mtDNAs are coadapted to their respective nuclear backgrounds we predicted that the genotypes carrying one of the three mtDNAs from D. melanogaster (“home team” mitonuclear combinations) would be more “fit” than those carrying one of the three mtDNAs from D. simulans (“away team” mitonuclear combinations). With ~100 amino acid substitutions between D. melanogaster and D. simulans in the mtDNA‐encoded proteins, the “away team” mitonuclear genotypes were expected to have delayed development time. The evidence for this is weak at best. Figure 1 shows that in only 2 of the 12 DGRP nuclear backgrounds do the D. simulans mtDNAs have delayed development time compared to the home team D. melanogaster mtDNAs. In the DGRP‐714 and ‐765 nuclear backgrounds, the D. simulans mtDNAs are significantly delayed compared to the D. melanogaster mtDNAs (in Fig. 1 green box plots = D. simulans mtDNA, yellow box plots = D. melanogaster mtDNA). But in 10 of the 12 nuclear backgrounds there is no significant “species” effect of mtDNA divergence.
Figure 1
Mitonuclear variation for development time in Drosophila. Each panel represents a different nuclear genetic background of the DGRP collection. Each box plot inside each panel represents the distribution of development time scores for flies carrying a different mtDNA in the given DGRP background. The green box plots represent three mtDNAs from (maI = siIII, siI, siII = sm21) and the yellow boxplots represents three mtDNAs from (Zim53, Austria, OreR). The y‐axis is development time in hours. The colored rectangles overlapping various genotypes (red, purple, blue) refer to three broad patterns of mtDNA effects across different nuclear genetic backgrounds.
Mitonuclear variation for development time in Drosophila. Each panel represents a different nuclear genetic background of the DGRP collection. Each box plot inside each panel represents the distribution of development time scores for flies carrying a different mtDNA in the given DGRP background. The green box plots represent three mtDNAs from (maI = siIII, siI, siII = sm21) and the yellow boxplots represents three mtDNAs from (Zim53, Austria, OreR). The y‐axis is development time in hours. The colored rectangles overlapping various genotypes (red, purple, blue) refer to three broad patterns of mtDNA effects across different nuclear genetic backgrounds.In contrast, one of the more common results was the appearance of “permissive” nuclear backgrounds that showed somewhat random differences in the development times for different mtDNA across nuclear genotypes (315, 358, 375, 707, 712, 820 in Fig. 1). There were two nuclear backgrounds that appeared to suppress or “canalize” the effects of mtDNA variation (517 and 786). Collectively, these data provide clear evidence that mitonuclear epistasis is a more general result than mitonuclear coadaptation for the evolution of mtDNA in the D. melanogaster subcomplex, at least for the fitness trait of egg‐to‐adult development time (see also Montooth et al., 2010). The same conclusion holds for egg‐to‐adult survival 19.
Relationship between Mitonuclear Epistasis and G x E for Development Time
Next we ask if mitonuclear epistasis is robust to dietary environment? Each of the 72 genotypes described in Figure 1 were cultured on three additional dietary environments with isocaloric combinations of protein and carbohydrate (P:C ratio from yeast and sucrose in the food medium; see Materials and Methods). The High P:C diet results in the fastest egg‐to‐adult development time, the Medium P:C diet is intermediate, the Low P:C diet slower, with the standard laboratory food, with the lowest protein concentration, resulting in the slowest development time 19. A three‐way ANOVA with main effects of nuclear genotype, mtDNA, and diet, reveals that all main effects and interaction terms were highly significant (see Table 2 in ref. 19). Diet contributed the most to variation in development time, but there were a number of mitonuclear genotypes that showed faster development time on a lower‐protein diet than other genotypes on the higher protein diets. These represent the G x G and G x E interaction terms, and are visualized as crossing reaction norms for development time of mitonuclear genotypes on all diets 19 (Figs. 1 and 2). These results imply that the genetic interaction defined by a mitonuclear genotype is sensitive to the environment in which development proceeds.
Figure 2
Mitonuclear epistasis for development time is modified by diet. a, Modified reaction norm plots for development time across six different mtDNA environments (x‐axes), in each of four nuclear backgrounds (each panel), with diet as the line displayed in each interaction plot. The black line is the lab food diet, and the yellow, blue and red lines are the different protein:carbohydrate diets. The among mtDNA variance in each nuclear background is highest for the lab food and lowest for the High P:C diet. b, Variance partitions for a nuclear x mtDNA ANOVA on development time in each diet. The nuclear x mtDNA component is highest in the lab food diet, as reflected in the panels in A.
Mitonuclear epistasis for development time is modified by diet. a, Modified reaction norm plots for development time across six different mtDNA environments (x‐axes), in each of four nuclear backgrounds (each panel), with diet as the line displayed in each interaction plot. The black line is the lab food diet, and the yellow, blue and red lines are the different protein:carbohydrate diets. The among mtDNA variance in each nuclear background is highest for the lab food and lowest for the High P:C diet. b, Variance partitions for a nuclear x mtDNA ANOVA on development time in each diet. The nuclear x mtDNA component is highest in the lab food diet, as reflected in the panels in A.Figure 2 presents a new analysis of these results, plotting “reaction norms” of dietary environments across mtDNA backgrounds within several nuclear DGRP backgrounds. Traditional norm of reaction plots display the phenotype for a genotype across a range of environment; this plots inverts that relationship and treats the mtDNA genotype as an “environment” for each nuclear genome, with the diets serving as the conditions that reveal sensitivity to mtDNA. A notable feature of each plot is that the increasing protein concentration alters the position and the shape of each line: the yellow line in Figure 2A (high P:C diet), is always lower than the black line (lab food), but generally flatter across mtDNAs in each nuclear background. The impact of the diet on the genetic basis of the variation in development time is quantified in Figure 2B, which shows the proportion variance explained by the main effects of Nuclear genotype, mtDNA, and the interaction terms Nuclear x mtDNA (i.e., the importance of mitonuclear epistasis). There is a striking difference in the variance explained by the mitonuclear epistatic (G x G) term in P:C diets compared to the lab food with the lowest protein concentration. It is evident that the increase in the contribution of mitonuclear epistasis reduces the residual variance explained by the model. This demonstrates that mitonuclear epistasis is sensitive to environmental condition. The patterns suggest that higher P:C diets are suppressing the epistatic relationship between mtDNA and nuclear background.
Relationship between Mitonuclear Epistasis and G x E for Gene Expression under Hypoxia
The important impacts of mitonuclear G x G and G x E for one complex trait (development time) raises the question of how general these patterns are for other complex traits. Here we consider gene expression, as defined by transcript abundance, as one source of information on thousands of complex traits that are integrative features of functioning genomes. The level of any transcript depends on a very large number of interacting cis‐ and trans‐factors ranging from nucleotide specific variation at transcription factor binding sites, to concentrations of transcriptional activators and repressors, to chromatin availability, all of which are sensitive to tissue specific factors, developmental stage, and external environmental cues 24. We test the null hypothesis that the genes whose expression is altered by differences in mitonuclear genotype are independent of the genes whose expression is altered by changes in environment. We use a set of four mitonuclear genotypes with two mtDNAs each placed on two nuclear chromosomal backgrounds: OreR;OreR, siI;OreR, OreR;Austria, siI;Austria. Adult flies of each genotype were placed in a hypoxic environment (6% O2) for 30 or 120 min, and transcriptional profiles were quantified by Illumina RNAseq (see Methods).Details of the patterns of gene expression variation in this Drosophila system have been reported previously 20, 29. Here we report the results of new statistical analyses that test the null hypothesis of independence of G x G and G x E effects. Figure 3 provides a summary of the results for females; the effects for males were similar but slightly less pronounced. Figure 3A is a multidimensional scaling (MDS) plot that displays the distinct patterns of nuclear, mtDNA and hypoxic effects on expression for the top 100 genes in an analysis of differential expression (DE) among 35 RNAseq libraries, using edgeR (one library from siI;Austria was removed from the analysis). The nuclear genetic effect is the largest source of variation in expression (log(fold change), on dimension 1), with mtDNA variation the second largest source of variation and the different time points contributing significant, but minor components of variation. Figure 3B,C demonstrate that each mitonuclear genotype has a unique set of genes that are either up‐ or down‐regulated (Fig. 3B,C, respectively) due to hypoxic exposure. Each Venn diagram displays a larger number in the unique regions that are restricted to individual mitonuclear genotypes, and smaller number of genes in the intersecting regions that are shared across genotypes. This is evidence for “private” or “personalized” mitonuclear responses to hypoxia 20.
Figure 3
Personalized expression profiles of Drosophila mitonuclear genotypes in a time course of hypoxia exposure. a, a multidimensional scaling plot of the top 100 genes in a differential expression analysis among 35 RNAseq libraries. The four mitonuclear genotypes are listed in distinct colors below the plot, and the shape of the symbol identifies the time in 6% oxygen. Each mitonuclear genotype falls in a distinct region of the plot, reflecting the unique contribution of nuclear (dimension 1) mtDNA (dimension 2) to expression variation. b and c, Each mitonuclear genotype has a unique set of up‐regulated (b) or down‐regulated (c) genes. Few genes are shared among mitonuclear genotypes in their response to hypoxia.
Personalized expression profiles of Drosophila mitonuclear genotypes in a time course of hypoxia exposure. a, a multidimensional scaling plot of the top 100 genes in a differential expression analysis among 35 RNAseq libraries. The four mitonuclear genotypes are listed in distinct colors below the plot, and the shape of the symbol identifies the time in 6% oxygen. Each mitonuclear genotype falls in a distinct region of the plot, reflecting the unique contribution of nuclear (dimension 1) mtDNA (dimension 2) to expression variation. b and c, Each mitonuclear genotype has a unique set of up‐regulated (b) or down‐regulated (c) genes. Few genes are shared among mitonuclear genotypes in their response to hypoxia.To test for the independence of mitonuclear G x G effects and G x E effects of hypoxia we compared the results of interaction effect models using edgeR. Results from two sets of models are presented. In the first set we identify G x G genes by fitting Model 1 (see Methods) that identifies transcripts with significant mtDNA x Nuclear interaction terms in the model comparing all 2 mtDNA × 2 Nuclear genome = 4 mitonuclear genotypes across pooled data for normoxia and 30 min of hypoxia. This analysis, with one degree of freedom in each level, identified 2,040 genes with DE transcripts at the P = 0.05 level. We compare the Model 1 gene list to a gene list resulting from Model 2 (see Methods) that identified transcripts with significant genotype × hypoxia interaction terms in the model. The same data set is used in Model 1 and 2, but in Model 2 “genotype” refers to one factor with four levels of the joint mitonuclear genotype (see above). Expression levels of these four genotypes (3 degrees of freedom) are compared between normoxia and 30 min of hypoxia (one degree of freedom). Model 2 identified 1,167 genes with DE transcripts at the P = 0.05 level. There are 285 genes shared between Model 1 and 2. This represents 13.9% of the G x G (Model 1) genes and 24.4% of the G x E (Model 2) genes (see Fig. 4A). If we assume a strict null of independence, 5% of the genes on each list should overlap (102/2,040 for G x G and 58/1,167 for G x E), but the overlap is 2.8–4.9 times higher than expected, respectively. Using a G‐tests of heterogeneity, with the observed values of 285 out of 2,040 compared to 102/2,040 for G x G, or 285 out of 1,167 for G x E compared to 58/1,167, the observed values are highly significantly different from the expected by chance for either comparison (G = 82.6, P < < 2.2e‐16, G = 144.5, P < 2.2e‐16).
Figure 4
Venn diagrams showing the number of genes with differential expression due to mitonuclear (G x G) or G x E manipulations, and their overlap. a, Results of a comparison of four mitonuclear genotypes in normoxia or 30 min of 6% hypoxia, using edgR and Model 1 and 2 in the text. b, Results of a comparison of four mitonuclear genotypes in normoxia, 30, and 120 min of 6% hypoxia, using edgR and Model 3 in the text. c, Results of a comparison of the effects of an mtDNA “swap” and a hypoxia treatment for each focal genotype. The overlaps are significantly greater than expected by chance, based on Fisher's exact test (see text).
Venn diagrams showing the number of genes with differential expression due to mitonuclear (G x G) or G x E manipulations, and their overlap. a, Results of a comparison of four mitonuclear genotypes in normoxia or 30 min of 6% hypoxia, using edgR and Model 1 and 2 in the text. b, Results of a comparison of four mitonuclear genotypes in normoxia, 30, and 120 min of 6% hypoxia, using edgR and Model 3 in the text. c, Results of a comparison of the effects of an mtDNA “swap” and a hypoxia treatment for each focal genotype. The overlaps are significantly greater than expected by chance, based on Fisher's exact test (see text).One problem with this test is the different models have different degrees of freedom, and Model 1 pools the libraries from different hypoxic environment (t = 0 and t = 30) into each mtDNA or nuclear genome factor in a 2 x 2 G x G test. No such pooling is done in Model 2, where four genotypes are compared between two environments. To test if these models are generating spuriously high (or low) patterns of shared gene lists, we analyzed Model 3, which fits one full three‐way model of mtDNA x nuclear DNA x hypoxia, and uses all three hypoxia time points (see Methods). The respective G x G and G x G x E terms in this model are each tested against the intercept to generate gene lists that have significant mitonuclear epistasis (G x G) or significant mitonuclear G x E, respectively. The advantage of this approach is that there is one model shared for the comparison and we are testing for significance against the same intercept, which should standardize the comparisons. Using this approach, the G x G test identified 2,291 genes with P < 0.05 and 794 that survived a FDR cutoff of 5% (Supporting Information Table S1). The G x G x E test identified 494 genes with P < 0.05 but none survived the FDR cutoff, due to the very different distribution of P‐values from this term in the model (Supporting Information Table S2). We use the unadjusted P‐values to select gene lists for comparison. Using the top 500 genes from each list, there are 169 genes shared between G x G and G x G x E (34% shared, see Fig. 4B). Using the top 800 genes there are 303 shared between G x G and G x G x E (38% shared). Both of these estimates far exceed the null expectation of 5% shared genes (G = 102.5, P < 2.2e‐16 for the top 500 genes, and G = 192.7, P < 2.2e‐16 for the top 800 genes).To examine if this approach is biased towards enrichment of overlapping genes, we simulated shared lists by randomizing the rankings of the genes derived from the two contrasts of Model 3, and examined the number of shared genes on pairs of such random lists. With increasing sample size of genes in the sample, this comparison will always converge to 100% overlap when all genes from both lists are considered. If there is no signal of shared expression effects for G x G and G x G x E in our data, the proportion of overlapping genes should increase linearly with increasing shared list size. In clear contrast to this null expectation, the proportion of overlapping genes on the two lists accumulates much faster than by random chance (see Fig. 5A). These analyses indicate some common features of gene expression variation due to mitonuclear genetic interactions, and the interaction of genotypes with hypoxic environments.
Figure 5
Plots of simulated and observed data sets displaying the proportion of shared genes between two lists. The straight black lines represent four simulations where the P‐value rank of genes is randomized, and increasing sample sizes are chosen to compare shared genes. The colored lines show similar accumulation curves, using observed G x G and G x E gene lists, ranked by P‐value. There are more genes than expected at the list sizes observed in our analyses. By definition, sharing will be 100% when both samples include all genes in the genome (upper right corner).
Plots of simulated and observed data sets displaying the proportion of shared genes between two lists. The straight black lines represent four simulations where the P‐value rank of genes is randomized, and increasing sample sizes are chosen to compare shared genes. The colored lines show similar accumulation curves, using observed G x G and G x E gene lists, ranked by P‐value. There are more genes than expected at the list sizes observed in our analyses. By definition, sharing will be 100% when both samples include all genes in the genome (upper right corner).As a third approach to examine the shared nature of mtDNA‐mediated and hypoxia‐mediated effects on gene expression, four genotype‐specific tests were conducted in edgeR. For each mitonuclear genotype, genes were identified that were significantly differentially expressed by “swapping” the mtDNA in the same nuclear background under normoxic conditions (a ΔmtDNA effect comparing, e.g., OreR;OreR with siI;OreR at t = 0). This is essentially a one‐way model like Model 1, with no nuclear term. For comparison, genes were identified that were significantly differentially expressed by the hypoxia treatments to OreR;OreR (a ΔO2 effect, e.g., OreR;OreR across t = 0, 30 and 120 min in 6% oxygen). This is essentially a one‐way model like Model 2, with no genotype term. These comparisons are conservative as they seek to identify differentially expressed genes that are shared between different data sets. Figure 4C shows the results, which support shared features of gene expression regulation by hypoxia and by mtDNA‐mediated trans‐effects, with a clear asymmetry of the effects. For genes differentially expressed by mtDNA, 7.7, 12.1, 9.8, and 27% were shared when compared to the genes differentially expressed by hypoxia. But the reverse comparison shows that genes differential expressed by hypoxia are strongly shared with genes differentially expressed by mtDNA (between 15% and 45%). The genes differentially expressed by mtDNA are a larger set of genes, and those differentially expressed by hypoxia are strongly enriched for genes relating to mitonuclear genetic interactions. The randomization tests show that this overlap is greater than expected by chance (Fig. 5B).
Gene Coexpression Analyses
The patterns of shared gene lists suggest that some underlying transcriptional mechanisms are shared for genomic integration between mtDNA and nuclear factors, and those processes that sense oxygen. To examine the underlying patterns of shared gene expression we performed weighted gene coexpression network analysis (WGCNA) 36, 37. These analyses revealed a large number of modules of genes with common profiles of expression across the four mitonuclear genotypes and the three hypoxia time points. One such example is illustrated in Figure 6. The heat map shows a module of genes that demonstrate the patterns of G x G x E effects for expression. The rows are individual genes, the columns are the individual RNAseq libraries, partitioned from left to right by mitonuclear genotype and hypoxic time point. The left half of the figure shows that the Austria nuclear background has lower expression than the OreR nuclear background (green vs. red). In the Austria background the OreR mtDNA (first nine columns) exhibits a clear temporal response to hypoxia as revealed by the eigengene plots below the heat map. The siI mtDNA in the Austria nuclear background is somewhat unresponsive to hypoxia. In contrast, these two mtDNAs in the OreR nuclear background appear to have opposite effects on this module of genes: the OreR mtDNA has a modest response to hypoxia, but the siI mtDNA show a clear linear response.
Figure 6
Heat maps and gene expression network for a module of differentially expressed genes reveled by WGCNA analyses. a, Gene expression levels for each gene in the module (rows) and replicate library and experimental treatments in the experiment (columns). The bar plot below the heat map displays the value of the eigengene from WGCNA, reflecting overall expression of the genes in that treatment. Note that the effects of hypoxia and mtDNA genotype in each nuclear background are unique to the particular combination of variable, reflecting G x G x E effects. b, Network of coexpressed genes in the module from a, with gene showing significant mitonuclear epistatic effects (G X G) highlighted in red. More of these genes are central than peripheral in the network. c, The same network in b with genes showing significant G x G x E effects on gene expression. The overlap of these significant genes is in the proportions represented in Figure 4. The module network layout was informed by edge weights using the prefuse force directed algorithm implemented in Cytoscape. The network topology is therefore identical in both figures.
Heat maps and gene expression network for a module of differentially expressed genes reveled by WGCNA analyses. a, Gene expression levels for each gene in the module (rows) and replicate library and experimental treatments in the experiment (columns). The bar plot below the heat map displays the value of the eigengene from WGCNA, reflecting overall expression of the genes in that treatment. Note that the effects of hypoxia and mtDNA genotype in each nuclear background are unique to the particular combination of variable, reflecting G x G x E effects. b, Network of coexpressed genes in the module from a, with gene showing significant mitonuclear epistatic effects (G X G) highlighted in red. More of these genes are central than peripheral in the network. c, The same network in b with genes showing significant G x G x E effects on gene expression. The overlap of these significant genes is in the proportions represented in Figure 4. The module network layout was informed by edge weights using the prefuse force directed algorithm implemented in Cytoscape. The network topology is therefore identical in both figures.When this module of genes is assembled in to a network using Cytoscape, we can compare the positions of the genes that are differentially expressed by mtDNA (G x G genes; Fig. 6B, top right), and those differentially expressed by hypoxia (G x E genes, Fig. 6C, bottom right). Several of these genes overlap, and in each case they seem to be more common in “core” positions, rather than “peripheral” positions in the network. The WGCNA analysis uncovered a number of other gene expression modules with comparable G x G x E patterns that differ in direction and degree. While there is not space to illustrate all of these heat maps and networks, the cluster shown in Figure 6 is representative of the overall picture of mitonuclear G x G and G x E integration.
DISCUSSION
The biology of mitochondria demands a joint macroevolutionary and microevolutionary perspective to appreciate the role they play in phenotypic variation. The functional integration of two genomes from distinct domains of life is tremendously complex due to at least three sources of variation. First, proper mitochondrial function requires the coordinated expression of 37 genes encoded in mtDNA inside mitochondria and over 1,000 nuclear‐encoded genes 22. This intracellular, intergenomic communication problem is compounded by variation among sexes, tissues, and age groups 38. Second, all genes exhibit variation in natural populations such that every individual has a unique mitonuclear genotype. The high mutation rate for mtDNA 7, 39, and the large mutational target of nuclear genes introduces additional complexity to the fitness effects of nuclear‐mitochondrial genetic interactions in natural populations. Third, it is well known that environmental conditions alter how a genotype is expressed as a phenotype 40, 41, and this is true for nuclear‐mitochondrial interactions as well 18, 19, 42. Not surprisingly, the genetic bases of complex mitochondrial disease, and more generally the genetic variation for mitochondrial performance in natural populations, are very poorly understood. There are a growing number of examples that mitonuclear coevolution can drive adaptive evolution 10, 43. With the advent of mitochondrial replacement therapies that place an F1 nucleus from the biological parents into a cytoplasm with nondisease mtDNAs from a third, healthy parent, the fitness consequences of mitonuclear interactions are of increasing medical relevance 44, 45.In this article, we report new analyses of a Drosophila model that seeks to dissect the properties of mitonuclear gene x gene (G x G) interactions and genotype x environment (G x E) interactions. These analyses ask a more general question: do epistatic G x G interactions have anything in common with G x E interactions? We present two different tests of this idea: mitonuclear effects on development time on different diets, and mitonuclear effects on transcriptional profiles under hypoxia. These experiments are directly relevant to mitochondrial biology, but we feel that they bear more generally on the question of the genetic architecture of complex traits, the debates over the missing heritability 13, 46, and the challenges of developing meaningful ways of advancing personalized genomic, or precision, medicine.Our earlier studies documented the details of a mitonuclear epistasis for development time in Drosophila, and its modification by dietary environments 19. We have reanalyzed these data in an “inverted” form of norm of reaction plots, and shown that the variation among mtDNAs in different nuclear genetic backgrounds (itself a G x G effect), is altered by high versus low protein:carbohydrate diets. The patterns of mitonuclear genetic variation across normal laboratory food is more pronounced, and with increasing protein concentrations in the food, development time decreases. The genotypes with the more striking epistatic interactions appear to be suppressed by high P:C diets (see Fig. 2). This is indirect evidence that the mechanisms that are altered by mismatched mtDNAs and nuclear genomes are further modified by changing nutrient levels during development.The literature is mixed on the effects of stress on the relative importance of direct selection or epistatic contributions to variation 47. An implication of our results is the complex signaling pathways that maintain proper mitonuclear function are part of the same mechanisms that allow organisms to respond to environmental stress. It is intuitive that dietary environments should have a special connection with mitonuclear epistasis as these provide different fuels that ultimately find their way to the OXPHOS and β‐oxidation pathways and contribute to energy transduction. A metabolomics study of alternative mtDNAs in a common nuclear background showed shifts in a number of metabolites in the same direction as dietary rapamycin that should suppress TOR signaling, implying parallel effects of common pathways from genetic or environmental manipulations 48. Further transcriptional, biochemical and metabolic analyses of the mito‐DGRP lines described here are needed to determine the generality of these links between dietary and mtDNA effects.Just as dietary environments should have a close biochemical link with mitonuclear genetic variation, so should altered oxygen environments. Oxygen gas (O2) is the terminal acceptor of the electrons that are derived from dietary input, and cytochrome oxidase (complex IV of the OXPHOS chain) catalyzes this reaction. These facts motivated the experiments and analyses we report here, using a different set of mitonuclear genotypes. Under physiological conditions, mitochondria consume oxygen faster than it is delivered, so some level of hypoxia is pervasive, and is caused by mitochondrial activity. It follows that mitonuclear interactions might be central to the hypoxia transcriptional response, and thus provide a promising context to ask the question of overlap between mitonuclear G x G and G x E. Our earlier studies showed that each different mitonuclear genotype has a unique transcriptional response to hypoxia 20, 29, which is summarized in Figure 3. A surprising result is that our data fail to identify a “canonical” hypoxia response. If such a core system exists, it should have emerged as shared responses across all of the genotypes, but there are remarkably few shared genes that respond to hypoxia in different genotypes. Our analyses do uncover the hairy gene as an important differentially expressed gene in most genotypes, so we have confidence that our experiments are working through this transcriptional repressor that has been linked to hypoxia 49. But the identification of a number of other genes or pathways that are sensitive to hypoxia is not unique to our specific Drosophila mitonuclear system. In yeast, the transcriptional response to hypoxia is highly varied across genes and involves multiple regulatory pathways 50. This underscores the need to advance our understanding of how networks of interacting genes and environmental factors mediate phenotypic changes.The transcriptional data provide a rich resource of phenotypes (transcript abundance) to address the question of common features of the mitonuclear epistasis and G x E responses. Using edgeR to analyze RNAseq data from four mitonuclear genotypes exposed to different hypoxic conditions, we have generated lists of differentially expressed genes that are sensitive to mitonuclear G x G or G x E effects. We find a repeatable and highly significant overlap for genes that respond to mitonuclear epistatic variation, and genes that are responsive to hypoxic exposure. The degree of overlap varies somewhat with the different forms of analysis. If nonoverlapping subsets of the data are used for the G x G and the G x E effects, the overlap is somewhat lower (7%–25%, Fig. 4), but if the different interaction effects (G x G or G x G x E) are extracted from a single full, three‐way model, there is considerably more overlap (>30%, Fig. 4).We examined the data for possible sources of bias that might inflate the observed overlap above that assuming independence of G x G and G x G x E. All genes were subject to the same tagwise adjustment for dispersion of expression (Supporting Information Fig. S2A) and identical gene lists were present in both (G x G and G x G x E) analyses. The two lists have very similar ranges of total expression, and the distribution of significant genes in the G x G and G x G x E lists both showed similar proportions of significant genes across the range of total expression (see Supporting Information Fig. S2B,C). The G x G genes show a positive relationship between the proportion of significant genes and total expression level, that is not evident in the G x G x E genes, but these distributions are broadly overlapping (Supporting Information Fig. S2D). While it is possible that a 5% null is not the appropriate expected overlap value, the observed overlaps reported in Figure 4 appear not to be due to inherent biases in the data. If we were to relax this to a 10% null overlap, all of the tests presented in Figure 4A–C would still be highly significant, and at a 20% null overlap the tests in Figure 4B,C are still highly significant. We suggest that these overlaps are sufficiently high that they point to the existence of shared pathways that govern mitonuclear communication, and those that allow different genotypes to respond to altered environmental conditions.We explored the patterns of gene co‐expression among these data sets and uncovered a number of distinct modules of co‐expressed genes. The opposing directions of transcriptional response of four different mitonuclear genotypes and different hypoxic time points reveal the challenge of defining a simple direction and core pathway of hypoxic response (Fig. 6). Analyses of the network structure of these modules, and the placement of differentially expressed G x G genes or G x E genes on these networks, shows that these genes are more represented in the center of the network, than at the edges (Fig. 6B,C). We propose that the common functional basis of G x G and G x E genes is that they represent “core” genes that integrate signals from other genes. Figure 6 shows quite visually why a linear pathway approach to the connection between genotype and phenotype is likely to be difficult to interpret at best, and simply wrong at worst. The overlapping and somewhat central nature of G x G and G x E genes provides a clearer picture of how a whole organism might integrate these linear pathways when genotypes and environments vary. These patterns further illustrate that the mitonuclear genetic interactions fit well with the “omnigenic” model proposed by Boyle et al., involving core and peripheral genes 13. The primary modification we suggest to this model is that G x G and G x E interaction terms are critical links connecting core and peripheral genes, and that interaction effects are central to the additive, infinitesimal model.Figure 7 presents a simple model of how to integrate the different processes implied by G x G and G x E interaction. A G x E “gene” is one that is altered by the external environment (e.g., diet or oxygen or temperature). A G x G “gene” is one that alters the cellular environment for another gene. The union of these two processes is that stimuli coming from “outside” must transduce a signal through cellular processes, and the genes that are shared in these pathways can be both kinds of gene. Conceptually, different mtDNAs provide different cellular environments for the nucleus, and vice versa. This is very much the same as a different diet being a different environment for the mitonuclear genotype. As such, plots of crossing reaction norms can look very similar in each case.
Figure 7
A model of the integration of G x G and G x E genes. G x E genes are altered by the external environment. G x G genes alter the cellular environment of another gene. In the hypoxia pathway, the sensing of oxygen and the transduction of that signal will include common genes. In the context of a norm of reaction plot, alternative mtDNAs are different environments for the nucleus, leading to different phenotypes. Likewise, different physical environments alter the phenotype of different genotypes in traditional G x E contexts. Mitochondria are thus a hub that connects epistasis with G x E.
A model of the integration of G x G and G x E genes. G x E genes are altered by the external environment. G x G genes alter the cellular environment of another gene. In the hypoxia pathway, the sensing of oxygen and the transduction of that signal will include common genes. In the context of a norm of reaction plot, alternative mtDNAs are different environments for the nucleus, leading to different phenotypes. Likewise, different physical environments alter the phenotype of different genotypes in traditional G x E contexts. Mitochondria are thus a hub that connects epistasis with G x E.While additional transgenic studies are needed to validate the mechanisms implied by our results, the complexity of the relatively simple system we have developed suggests that this complexity needs to be understood to advance personalized genomic medicine or its offspring, precision medicine. It remains to be determined how much of human phenotypic and clinical variation fits the additive or G x G and G x E models for complex traits. Likewise, it remains to be determined how much of the “omnigenic” model 13 is strictly additive or a mixture of additive and interaction effects. Technology can help us be precise, but biology makes it challenging to be accurate. Medical treatments that work precisely for one genotype may not be accurate for other genotypes. Genetic and environmental variation present a special challenge for precise, personalized medicine, but insights from evolutionary genetic approaches can help address this challenge, with hypoxia therapy being a promising example 51. Our data suggest that there are more linkages between macro and microevolution, and between evolutionary and medical genetics, and the mitochondria lie at the hub of this complex problem in systems biology.Figure S1. A comparison of the filtered data set (panel A) and the complete data set of expression values (panel B). The figure shows the biological coefficient of variation for each transcript as a function of log(counts per million). Panel A shows those transcripts that remain after removing transcripts with fewer than 1 count per million in at least 3 of the 35 RNAseq libraries. Panel B shown all transcripts. Note the large spike of transcripts at low expression values with high coefficients of variation, illustrating the importance of removing transcripts with low signal to noise that skew the overall distribution of variances across transcripts.Click here for additional data file.Figure S2. Examination of the impact of expression value on the probability of detecting significant G x G or G x E effects. Panel A recaps the data shown in Figure S1A. Panels B and C show the distribution of total and significant G x G or G x G x E genes in the analyses. Note that the distribution of significant genes mirrors the distribution of all genes, suggesting that there is no bias towards detecting genes as a function of expression level. Panel D shows that there is a slight increase in the proportion of genes that were G x G sensitive (purple circles) with increasing expression (r = 0.65, P < 0.001). There was no such effect in the G x G x E results (black circles) (r = 0.31, P = 0.1).Click here for additional data file.Supporting Information Table S1Click here for additional data file.Supporting Information Table S2Click here for additional data file.
Authors: Cole Trapnell; Adam Roberts; Loyal Goff; Geo Pertea; Daehwan Kim; David R Kelley; Harold Pimentel; Steven L Salzberg; John L Rinn; Lior Pachter Journal: Nat Protoc Date: 2012-03-01 Impact factor: 13.491