Ray Tobler, Viola Nolte, Christian Schlötterer.
Abstract
The Y chromosome is a unique genetic environment defined by a lack of recombination and male-limited inheritance. The Drosophila Y chromosome has been gradually acquiring genes from the rest of the genome, with only seven Y-linked genes being gained over the past 63 million years (0.12 gene gains per million years). Using a next-generation sequencing (NGS)-powered genomic scan, we show that gene transfers to the Y chromosome are much more common than previously suspected: at least 25 have arisen across three Drosophila species over the past 5.4 million years (1.67 per million years for each lineage). The gene transfer rate is significantly lower in Drosophila melanogaster than in the Drosophila simulans clade, primarily due to Y-linked retrotranspositions being significantly more common in the latter. Despite all Y-linked gene transfers being evolutionarily recent (<1 million years old), only three showed evidence for purifying selection (ω ≤ 0.14). Thus, although the resulting Y-linked functional gene acquisition rate (0.25 new genes per million years) is double the longer-term estimate, the fate of most new Y-linked genes is defined by rapid degeneration and pseudogenization. Our results show that Y-linked gene traffic, and the molecular mechanisms governing these transfers, can diverge rapidly between species, revealing the Drosophila Y chromosome to be more dynamic than previously appreciated. Our analytical method provides a powerful means to identify Y-linked gene transfers and will help illuminate the evolutionary dynamics of the Y chromosome in Drosophila and other species.
Keywords: Drosophila; Y chromosome; evolution; retrocopies; transposition
Year: 2017 PMID: 29078298 PMCID: PMC5676891 DOI: 10.1073/pnas.1706502114
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. Schematic overview of the Y-linked transfer detection pipeline. In step 1, two separate Cochran–Mantel–Haenszel (CMH) tests were performed and then combined to identify outlier SNPs; the first CMH test contrasted sexes with the strains (different colored flies) as replicates (A), and the second CMH test contrasted the strains with sexes as replicates. For all three species, more outlier SNPs (StdDiff > 2) were detected for male-specific variants (M) than for female-specific variants (F) (B), indicating that our pipeline was accurately identifying Y transfers. In step 2, male-specific outlier SNPs were grouped into clusters (C). The annotation of genomic regions containing outlier SNPs is indicated by different color codes. Although GeTYs are broadly dispersed across the genome of each species, TE peaks typically cluster within heterochromatic regions. In step 3, reads containing Y-linked variants were used in the de novo assembly of the Y-linked haplotype for each incipient transfer. No anno, no recorded annotation; TE, transposable element. (D) Integrative Genomics Viewer (IGV) screenshot of the read alignment (Bottom) and the subsequent de novo assembled Y-linked haplotype [green (Velvet) and red (Transabyss) bars; Top], relative to the donor gene annotation for the GeTY (blue bars; Top). The final step of the pipeline involved the iterative aggregation of incipient transfers lying within 200 kb of one another into a single consensus transfer.
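The final aggregation step in the legend above can be illustrated with a minimal sketch. It assumes each incipient transfer is represented as a (chromosome, start, end) interval in base pairs and that "within 200 kb" means a gap of at most 200,000 bp between neighboring intervals on the same chromosome. The function name `merge_transfers`, the interval representation, and the example coordinates are hypothetical and are not taken from the authors' pipeline.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), coordinates in bp.
Interval = Tuple[str, int, int]


def merge_transfers(transfers: List[Interval], max_gap: int = 200_000) -> List[Interval]:
    """Merge intervals on the same chromosome separated by at most max_gap bp.

    Sorting by chromosome and start lets one pass resolve chains of nearby
    intervals, which gives the same result as merging pairs repeatedly
    until nothing changes.
    """
    merged: List[Interval] = []
    for chrom, start, end in sorted(transfers):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            prev_chrom, prev_start, prev_end = merged[-1]
            # Extend the current consensus interval to cover the new one.
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Made-up coordinates for illustration only.
example = [
    ("2L", 6_650_000, 6_652_000),
    ("2L", 6_800_000, 6_801_000),  # 148 kb after the previous interval: merged
    ("2L", 9_100_000, 9_103_000),  # more than 200 kb away: kept separate
    ("X", 100, 200),
]
consensus = merge_transfers(example)
```

Here `consensus` holds three intervals: the first two 2L entries collapse into one consensus transfer, while the distant 2L interval and the X interval remain separate.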
Summary of Y-linked transfers
| Transfer ID | Transfer type | Donor(s) | ω | Expression | Age (Ky) | |
| Dmau_2L_6.65 | Ambig | Hrb27C | 1.02,Filt | NA | 673 [309,1885] | |
| Dmau_2L_9.1 | Ambig | numb | nORF | NA | 460 [211,1287] | |
| Dmau_2R_0.15 | DNA | NA | NA | NA | 429 [197,1201] | |
| Dmau_2R_8.65 | DNA | L;ttv;LamC | nORF;nORF;0.14 | NA | 153(240) [70(110),429(672)] | |
| Dmau_2R_13.28 | DNA | CG7229 | Filt | NA | 80 [37,224] | |
| Dmau_2R_18.4 | RNA | CG3511 | 0.95 | NA | 147 [68,413] | |
| Dmau_2R_19.04 | DNA | NA | NA | NA | 305 [140,854] | |
| Dmau_3L_0.16 | RNA | CG13876 | 0.30 | NA | 158 [73,443] | |
| Dmau_3L_2.15 | DNA | NA | NA | NA | 98 [45,273] | |
| Dmau_3L_22.2 | DNA | NA | NA | NA | 957 [439,2679] | |
| Dmau_3R_0.34 | DNA | NA | NA | NA | 380 [174,1064] | |
| Dmau_3R_1.23 | DNA | Nmdar1;dmau_PG00479 | nORF;nORF | NA | 282 [130,790] | |
| Dmau_3R_2.13 | DNA | NA | NA | NA | 28 [13,78] | |
| Dmau_X_3.07 | DNA | CG16781;CG12206 | Filt,Filt;0.11 | NA | 683(778) [314(357),1913(2180)] | |
| Dmau_X_8.42 | Ambig | His3.3B | 2.83 | NA | 436 [200,1221] | |
| Dmau_X_20.09 | DNA | NA | NA | NA | 535 [246,1499] | |
| Dmel_2L_4.46 | DNA | Gs1l;RpL27A | nORF;nORF | NA | 45 [21,125] | |
| Dmel_2L_12.86 | DNA | NA | NA | NA | 174 [80,487] | |
| Dmel_2L_19.94 | DNA | sick | nORF | NA | 559 [256,1564] | |
| Dmel_2L_22.75 | DNA | NA | NA | NA | 697 [320,1951] | |
| Dmel_2R_0.09 | DNA | NA | NA | NA | 624 [287,1748] | |
| Dmel_2R_0.57 | DNA | NA | NA | NA | 303 [139,848] | |
| Dmel_2R_2.32 | DNA | NA | NA | NA | 315 [145,883] | |
| Dmel_3L_23.41 | DNA | NA | NA | NA | 443 [203,1241] | |
| Dmel_3L_24.3 | DNA | NA | NA | NA | 343 [158,962] | |
| Dmel_3R_17.04 | Ambig | CR43975 | nORF | NA | 516 [237,1444] | |
| Dmel_3R_20.95 | DNA | NA | 463(497) [213(391),1297(1391)] | |||
| Dmel_X_12.65 | DNA | ade5;CG12717 | nORF;Filt | NA | 725 [333,2031] | |
| Dmel_X_12.66 | DNA | NA | NA | NA | 430 [197,1204] | |
| Dsim_2L_11.91 | DNA | bru1 | Filt | 2/11 [3] | 169 [77,472] | |
| Dsim_2L_19.34 | DNA | NA | NA | NA | 351 [161,983] | |
| Dsim_2R_0.06 | DNA | NA | NA | NA | 267 [122,747] | |
| Dsim_3L_10.87 | RNA | Sod | 0.09 | 1/8 [10.4] | 182(477) [84(219),509(1335)] | |
| Dsim_3L_22.2 | DNA | NA | NA | NA | 408 [187,1142] | |
| Dsim_3R_4.18 | DNA | NA | NA | NA | 161 [74,452] | |
| Dsim_3R_12.73 | DNA | NA | NA | NA | 96 [44,269] | |
| Dsim_X_7.6 | RNA | Sdt | Filt | 11/30 [2.3] | 374 [172,1047] | |
| Dsim_X_15.22 | RNA | Cyp1 | 0.78 | 0/18 [4.5] | 222 [102,623] | |
| Dsim_X_20.2 | DNA | CG17450/CG32819/CG32820 | Filt | 0/8 [19.5] | 268 [123,750] |
A total of 45 unique Y-linked transfers were detected, arising either as retrocopies (RNA), as DNA translocations (DNA), or via an undetermined mode (Ambig). Twenty-five of the Y-linked transfers harbored at least one gene—i.e., were GeTYs [donor gene names in donor(s) column]—with six GeTYs being shared between species (italicized rows). In all columns, GeTYs comprising several genes have each gene name separated by semicolons, with those having identical gene models being separated by a forward slash. Purifying selection was detected for three GeTYs (ω column; genes having more than one detected ORF being further delineated by a comma), whereas others showed evidence of degeneration (ω column; nORF = no ORF; Filt = Y-linked or donor CDS lacked either a start or stop codon, contained an inactivating mutation, or >10% Y-linked codons were missing in donor alignment; Dataset S6). Two publicly available testes-specific RNA-Seq datasets (23, 24) revealed weak evidence for Y-linked GeTY expression, with most GeTYs having low relative expression (i.e., the fraction of diagnostic exonic sites where the Y-linked allele contributed >1% of the total coverage for that site; fraction shown in expression column) and low absolute expression (i.e., mean coverage of Y-linked alleles mapping to diagnostic exonic sites; values in square brackets in expression column). Point estimates of transfer times revealed evolutionarily recent origins, with the oldest transfer arising around 1 Mya (see age column; error margins in square brackets, age estimates for putatively functional genes after correcting for purifying selection shown in standard parentheses; Dataset S7). For some Y-linked transfers, multiple haplotypes were detected in male-specific short read data suggesting that these transfers likely underwent subsequent duplication on the Y chromosome, with GeTY Dsim_2R_9.41 also showing signs of an additional autosomal/X-linked duplication. 
Age and expression estimates may be unreliable for these Y-linked transfers.
Putative duplicated transfers.
Significant after multiple-testing correction (q < 0.05).
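The two expression measures defined in the table legend (relative expression as the fraction of diagnostic exonic sites where the Y-linked allele contributes >1% of coverage, and absolute expression as the mean coverage of Y-linked alleles at those sites) can be sketched as follows. The input format, the function names, and the choice to average over all diagnostic sites are assumptions made for illustration; the read counts are invented.

```python
from typing import List, Tuple


def expression_summary(
    sites: List[Tuple[int, int]], min_fraction: float = 0.01
) -> Tuple[int, int, float]:
    """Summarize Y-linked expression over diagnostic exonic sites.

    sites holds (y_allele_reads, total_reads) per site (hypothetical format).
    Returns (sites passing the >1% threshold, number of sites,
    mean Y-allele coverage across all sites).
    """
    n_sites = len(sites)
    n_expressed = sum(
        1 for y, total in sites if total > 0 and y / total > min_fraction
    )
    mean_y = sum(y for y, _ in sites) / n_sites if n_sites else 0.0
    return n_expressed, n_sites, mean_y


def format_expression(n_expressed: int, n_sites: int, mean_y: float) -> str:
    """Render the summary in the 'fraction [mean coverage]' style of the table."""
    return f"{n_expressed}/{n_sites} [{round(mean_y, 1):g}]"


# Invented counts: Y-allele fractions are 5%, 0%, 0.5%, and 12%.
sites = [(5, 100), (0, 80), (1, 200), (6, 50)]
summary = expression_summary(sites)
label = format_expression(*summary)
```

With these counts, two of the four sites pass the 1% threshold and the mean Y-allele coverage is 3 reads, so `label` is `"2/4 [3]"`.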
Fig. S2. Retrocopy validation. Y-linked retrocopies were validated using a mixture of in silico and in vitro techniques. Sashimi plots depicting GMAP alignments of reads carrying Y-linked variants from (A) GeTY Dsim_3L_10.87 and (B) GeTY Dsim_X_15.22. Read depth is shown on the y axis, and chromosomal position is shown on the x axis. The donor gene annotation is shown at the base of each panel (blue bars; thin lines indicate introns, internal bars indicate exons, and thick terminal bars indicate UTRs). In both cases a large number of reads are split across an annotated intron in the donor copy (see values positioned within the curved black lines), indicating that the intron is missing in the Y-linked copy, which arose from a retrocopy transfer. (C) Results from PCR assays of two GeTYs in D. simulans (lanes 2–5) and one in D. mauritiana (two different introns; lanes 6–9). Products were taken from both males (♂) and females (♀), with males showing an additional unique band that is consistent with an intronless GeTY. A GeneRuler 100 bp Ladder (Fermentas) is shown in lanes 1 and 10.
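The in silico signal described in this legend, reads that are split across an annotated donor intron, can be counted with a short sketch. It assumes the alignments are available as SAM-style records (1-based leftmost position plus CIGAR string) in which a skipped intron appears as an `N` operation, and it uses a strict criterion that the skipped region must match the annotated intron exactly. The function names and the toy reads are hypothetical; this is not the authors' validation code.

```python
import re
from typing import Iterable, List, Tuple

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def spliced_gaps(pos: int, cigar: str) -> List[Tuple[int, int]]:
    """Return reference intervals (1-based, inclusive) skipped by N operations."""
    gaps: List[Tuple[int, int]] = []
    ref = pos  # current reference coordinate
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            gaps.append((ref, ref + n - 1))
        if op in "MDN=X":  # operations that consume reference bases
            ref += n
    return gaps


def count_intron_spanning(
    reads: Iterable[Tuple[int, str]], intron: Tuple[int, int]
) -> int:
    """Count reads whose skipped region coincides exactly with the intron."""
    return sum(1 for pos, cigar in reads if intron in spliced_gaps(pos, cigar))


# Toy data: a donor intron at positions 151-250 and four reads.
intron = (151, 250)
reads = [
    (101, "50M100N50M"),  # skips 151-250: spans the intron
    (120, "100M"),        # unspliced
    (131, "20M100N80M"),  # skips 151-250: spans the intron
    (101, "40M100N60M"),  # skips 141-240: does not match the annotation
]
n_split = count_intron_spanning(reads, intron)
```

In this toy example `n_split` is 2. A real analysis would likely tolerate small offsets at the splice boundaries rather than requiring an exact match.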
Fig. S3. Intronless retrocopy validation. IGV screen capture showing the GMAP alignments of Y-linked reads to their donor genes for two GeTYs. (A) A GeTY in D. mauritiana whose reads terminate abruptly at the end of the annotated gene region, indicative of a retrocopy. (B) A GeTY whose reads terminate 9 bp downstream of the 3′ end of the annotation; this GeTY was therefore conservatively called ambiguous (that is, it cannot be confidently assigned as either a DNA or an RNA transfer).
Fig. S1. Autosomal/X-linked GeTY transfer. IGV screenshot showing evidence for a putative Y-to-autosome/X chromosome gene transfer involving GeTY Dsim_2R_9.41. The first panel shows the annotations for the donor regions and the alignment of GeTY Dsim_2R_9.41. The second through fourth panels show DNA read alignments and splicing patterns for two D. simulans female crosses (second and third panels) and the combined data for the two strains used to detect the GeTYs (fourth panel). The splicing patterns show that the intron that is absent in GeTY Dsim_2R_9.41, a retrocopy, is also absent in some of the reads from the two D. simulans females. The higher coverage of the exonic regions in these females implies that GeTY Dsim_2R_9.41 was involved in a Y-to-autosome transfer that is present in these lines.
Fig. S4. GeTY expression patterns. Box plot showing the log10-transformed absolute expression (i.e., coverage depth) for donor and Y-linked copies of the GeTYs detected in D. simulans. For each GeTY, expression was quantified for each exonic diagnostic outlier SNP.
Fig. 2. Origin of Y-linked gene transfers. Retrocopies (circles), DNA translocations (diamonds), and ambiguous transfers (squares) are indicated on the inferred branch of origin in the D. melanogaster clade. Divergence times are shown at the red nodes. Shared GeTYs are found only in the D. simulans clade. The D. simulans clade also contains a significant excess of GeTYs relative to D. melanogaster, which appears to be primarily driven by a surplus of retrocopy transfers. Note that the branch lengths are not to scale; both the D. melanogaster and D. simulans clade branches are truncated (depicted by the break points).