Minako Izutsu, Atsushi Toyoda, Asao Fujiyama, Kiyokazu Agata, Naoyuki Fuse.
Abstract
Environmental adaptation is one of the most fundamental features of organisms. Modern genome science has identified some genes associated with adaptive traits of organisms, and has provided insights into environmental adaptation and evolution. However, how genes contribute to adaptive traits and how traits are selected under an environment in the course of evolution remain mostly unclear. To approach these issues, we utilize "Dark-fly", a Drosophila melanogaster line maintained in constant dark conditions for more than 60 years. Our previous analysis identified 220,000 single nucleotide polymorphisms (SNPs) in the Dark-fly genome, but did not clarify which SNPs of Dark-fly are truly adaptive for living in the dark. We found here that Dark-fly dominated over the wild-type fly in a mixed population under dark conditions, and based on this domination we designed an experiment for genome reselection to identify adaptive genes of Dark-fly. For this experiment, large mixed populations of Dark-fly and the wild-type fly were maintained in light conditions or in dark conditions, and the frequencies of Dark-fly SNPs were compared between these populations across the whole genome. We thereby detected condition-dependent selections toward approximately 6% of the genome. In addition, we observed the time-course trajectory of SNP frequency in the mixed populations through generations 0, 22, and 49, which resulted in notable categorization of the selected SNPs into three types with different combinations of positive and negative selections. Our data provided a list of about 100 strong candidate genes associated with the adaptive traits of Dark-fly.
Keywords: Drosophila; environmental adaptation; experimental evolution; genome-wide analysis; reselection experiment
Year: 2015 PMID: 26637434 PMCID: PMC4751556 DOI: 10.1534/g3.115.023549
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Competition assay for measuring relative fitness. (A) Four kinds of flies (competitor females carrying GFP, competitor males carrying DsRed, tester females, and tester males) were reared together and mating could occur in any combination of them. From the numbers of progeny, the relative fitness of tester parents was calculated. (B) An example of observations of progeny. The views under bright-field and fluorescent lights were merged in the image. According to the fluorescent markers, progeny were categorized into four groups. (C) The mean proportion of progeny in each test. Left: assays (n = 10) against the Oregon-R-S competitors (GFP-Oregon-R-S females and DsRed-Oregon-R-S males). Right: assays (n = 5) against the Urbana-S competitors (GFP-Urbana-S females and DsRed-Urbana-S males). The colors correspond to the combinations of parental flies shown in (A) and the terms used to designate the progeny group (Yellow, Green, Red, and White). *p-value < 0.05; **p-value < 0.01, Mann–Whitney U-test.
Figure 2. Mixed population experiment. (A) A schematic drawing of the mixed population experiment. The experiment started from hybrids of Dark-fly and Oregon-R-S. Those mixed populations were reared in LD and DD conditions during consecutive generations. (B) The history of the mixed population. The sizes of the mixed populations were estimated by measuring the weight of flies at every generation, except for generations 10 and 22. Three replicate populations reared in the LD condition are shown by dark-red, red, and light-red lines, and those reared in the DD condition are shown by dark-blue, blue, and light-blue lines, respectively.
Summary of genome sequencing of the mixed population
| Generation | Condition | Line | Read Length | Read Number | Mapped Read Number | Mapped Reads (%) | Mean Depth |
|---|---|---|---|---|---|---|---|
| 0 | — | — | 100 | 386775696 | 361193875 | 93.39 | 214 |
| 22 | LD | 1 | 100 | 266855274 | 254924548 | 95.53 | 151 |
| 22 | LD | 2 | 100 | 283632172 | 272484316 | 96.07 | 161 |
| 22 | LD | 3 | 100 | 269666516 | 259954602 | 96.40 | 154 |
| 22 | DD | 1 | 100 | 271880364 | 258818394 | 95.20 | 153 |
| 22 | DD | 2 | 100 | 265563806 | 254379699 | 95.79 | 151 |
| 22 | DD | 3 | 100 | 274437396 | 262753186 | 95.74 | 156 |
| 49 | LD | 1 | 100 | 351537438 | 329248183 | 93.66 | 195 |
| 49 | LD | 2 | 100 | 386629584 | 362194949 | 93.68 | 215 |
| 49 | LD | 3 | 100 | 269858658 | 243812205 | 90.35 | 144 |
| 49 | DD | 1 | 100 | 418334430 | 386895826 | 92.48 | 229 |
| 49 | DD | 2 | 100 | 345057530 | 317678449 | 92.07 | 188 |
| 49 | DD | 3 | 100 | 379634634 | 354977577 | 93.51 | 210 |
| Mean of total | — | — | — | 320758731 | 301485831 | 94.14 | 179 |
Populations are indicated by generation number, condition (LD or DD), and replicate ID number. Read length and read number obtained from the NGS data are shown for each population. Reads were mapped to the FlyBase Dmel 5.22 genome (168,736,537 bases), and basic mapping statistics are shown. The mean depth across all data was 179.
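The Mapped Reads (%) and Mean Depth columns appear consistent with simple ratios of the other columns. The following is a minimal sketch that recomputes them for the generation 0 row, assuming (this is not stated in the source) that mean depth is the number of mapped reads times the read length divided by the reference genome size. The function names are illustrative only.

```python
# Reference genome size in bases, taken from the table note.
GENOME_SIZE = 168_736_537
# Read length in bases, taken from the Read Length column.
READ_LENGTH = 100


def mapped_percent(read_number: int, mapped_read_number: int) -> float:
    """Percentage of reads that were mapped to the reference."""
    return 100.0 * mapped_read_number / read_number


def mean_depth(mapped_read_number: int,
               read_length: int = READ_LENGTH,
               genome_size: int = GENOME_SIZE) -> float:
    """Assumed definition: total mapped bases divided by genome size."""
    return mapped_read_number * read_length / genome_size


# Generation 0 row of the table.
gen0_reads, gen0_mapped = 386_775_696, 361_193_875
print(round(mapped_percent(gen0_reads, gen0_mapped), 2))  # 93.39
print(round(mean_depth(gen0_mapped)))                     # 214
```

Under this assumption the recomputed values match the tabulated 93.39% and depth of 214 for generation 0.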
Figure 3. Frequency of Dark-fly’s SNPs in the LD- and DD-reared populations. (A) Violin plot of the frequency of Dark-fly’s SNPs in the populations reared in LD and DD conditions at generations 0, 22, and 49. White points and black thick bars represent median values and interquartile ranges of data, respectively. (B) MDS analysis of the overall SNP frequency. Profiles of SNP frequency in each replicate population were plotted in two dimensions using MDS analysis (stress = 15.9). Dimension one divided populations at different time points, and dimension two divided populations in LD and DD conditions.
Figure 4. Comparison of SNP frequency in LD- and DD-reared populations. (A, B) Scatter plots comparing SNP frequency in LD and DD conditions at generation 22 (A) and 49 (B). The SNPs showing a significant difference in frequency by Fisher’s exact test (top 5% of p-values) and a higher frequency in the DD than in the LD condition were colored black at generation 22 and red, green, and blue (type 1, type 2, and type 3, respectively) at generation 49. Other SNPs were colored gray. (C, D) P-values (Fisher’s exact test) for each SNP were plotted as reverse logarithm values along chromosomal position at generation 22 (C) and 49 (D). Colors correspond to those shown in (A, B).
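The selection rule described in this caption (Fisher’s exact test per SNP, keep the top 5% of p-values, then keep SNPs with higher frequency in DD) can be sketched as follows. This is a hedged illustration, not the authors’ pipeline: how replicates were pooled and how the 2×2 tables were built is not described here, so the table layout (Dark-fly allele reads vs. other reads, DD vs. LD), the function names, and the toy read counts are all assumptions.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


def select_dd_enriched(snps: dict, top_fraction: float = 0.05) -> list:
    """Return SNP names in the top fraction of p-values with DD frequency > LD frequency.

    `snps` maps a name to (dd_alt, dd_ref, ld_alt, ld_ref) read counts (hypothetical layout).
    """
    scored = []
    for name, (dd_alt, dd_ref, ld_alt, ld_ref) in snps.items():
        p = fisher_exact_two_sided(dd_alt, dd_ref, ld_alt, ld_ref)
        dd_higher = dd_alt / (dd_alt + dd_ref) > ld_alt / (ld_alt + ld_ref)
        scored.append((p, name, dd_higher))
    scored.sort()  # smallest p-values first
    n_top = max(1, int(len(scored) * top_fraction))
    return [name for _, name, dd_higher in scored[:n_top] if dd_higher]


# Toy data: 19 SNPs with identical frequencies and one SNP enriched in DD.
toy = {f"snp{i}": (50, 50, 50, 50) for i in range(19)}
toy["hit"] = (90, 10, 40, 60)
print(select_dd_enriched(toy))
```

The test is implemented with exact integer binomial coefficients from the standard library so that the sketch has no external dependencies; in practice a statistics library would normally be used.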
Figure 5. Trajectory of frequency of SNPs during successive generations. Temporal changes of SNP frequency were analyzed for each SNP type (A: type 1, B: type 2, C: type 3). Blue and red lines represent data of each SNP in DD- and LD-reared populations, respectively. Black lines represent mean SNP frequency of each type in DD- and LD-reared populations.
Figure 6. Chromosomal regions selected in the mixed populations. (A) The AFC of each SNP (frequency in DD minus frequency in LD) at generation 49 was plotted along chromosomal position. The blue line represents AFC = 0, meaning equal frequency in LD- and DD-reared populations. (B) LOD scores of 1-kb windows at generation 49 were plotted along chromosomal position. The blue line represents the threshold (75) for detecting LOD peaks. Numbers in the graph indicate LOD peak numbers. (C) A magnified view of LOD peak nine. Red points indicate SNP frequency in LD and blue points indicate that in DD. The green line and gray bar represent the LOD score and the 90% credible interval span, respectively. (D) A view of the UCSC Genome Browser around the 90% credible interval span of LOD peak nine. Positions of Dark-fly’s SNPs and InDels are represented by bars.
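The AFC plotted in panel (A) is defined in the caption as the SNP frequency in DD minus the frequency in LD. A minimal sketch of that arithmetic is given below, assuming frequencies are estimated as the fraction of reads carrying the Dark-fly allele; the function names and read counts are hypothetical.

```python
def allele_frequency(dark_fly_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the Dark-fly allele (assumed estimator)."""
    return dark_fly_reads / total_reads


def afc(dd_dark_fly: int, dd_total: int, ld_dark_fly: int, ld_total: int) -> float:
    """AFC as defined in the caption: frequency in DD minus frequency in LD."""
    return (allele_frequency(dd_dark_fly, dd_total)
            - allele_frequency(ld_dark_fly, ld_total))


# Hypothetical site: 120 of 150 reads in DD and 60 of 150 reads in LD carry the allele.
value = afc(120, 150, 60, 150)
print(value > 0)  # a positive AFC means the allele is more frequent in DD
```

An AFC of zero corresponds to the blue reference line in panel (A), and positive values indicate Dark-fly alleles that are more frequent in the DD-reared populations.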
Candidate regions identified at LOD peaks
| LOD Peak No. | Chromosome | Peak Position | Credible Interval Start | Credible Interval End | Span (kb) | Genes | Selection Type |
|---|---|---|---|---|---|---|---|
| 1 | X | 1252000 | 1237000 | 1339000 | 102 | CG11417, CG11418, CG11448, CG12773, CG14770, CG14773, CG3056, CG32813, CG3719, SNF1A, futsch, png | 3 |
| 2 | X | 13237000 | 13211000 | 13250000 | 39 | CG15747, IP3K2, Jafrac1, RpS15Aa | 2 |
| 3 | 2L | 1515000 | 1476000 | 1559000 | 83 | CG14351, CG18131, CG18132, CG31661, CG31926, CG31928, CG33128, CG7420, Or22a, Or22b, halo | 2 |
| 4 | 2L | 13159000 | 13316000 | 13438000 | 122 | CG10859, CG16826, CG16848, CG16956, CG16957, CG16970, CG31855, CG6523, CG6565, CG7099, CG7110, CG9293, CG9302, CG9305, CG9306, CG9377, CG9395, Nnp-1, RpL24, Tap42, Tehao, Vm34Ca, beta’Cop, loqs | 1 |
| 7 | 3R | 9109000 | 9061000 | 9192000 | 131 | Ace, CG11686, CG15887, CG15888, CG32473, CG8449, CG8630, CG8773, CG8774, CG8784, CG8790, CG8795, CheA87a, Lip3, Osi22, Ravus, Su(var)3-7, mthl12, poly, wntD | 2 |
| 8 | 3R | 9608000 | 9601000 | 9638000 | 37 | CG42375, CG9286, CG9288, CG9297, Cht5, Dip-B, tRNA:CR31331, tRNA:CR31588, tal-1A, tal-2A, tal-3A, tal-AA | 2 |
| 9 | 3R | 25260000 | 25245000 | 25271000 | 26 | Ptp99A | 2 |
Seven candidate regions showing a significant difference in SNP frequency between LD and DD conditions and high LOD scores. Chromosome, peak position, start and end of the 90% credible interval, span, genes, and selection type (Figure 5) are given for each region.