| Literature DB >> 24846630 |
Minhan Yi1, Feng Chen1, Majing Luo1, Yibin Cheng1, Huabin Zhao2, Hanhua Cheng3, Rongjia Zhou3.
Abstract
The Piwi-interacting RNA (piRNA) pathway is responsible for germline specification, gametogenesis, transposon silencing, and genome integrity. Transposable elements can disrupt genome and its functions. However, piRNA pathway evolution and its adaptation to transposon diversity in the teleost fish remain unknown. This article unveils evolutionary scene of piRNA pathway and its association with diverse transposons by systematically comparative analysis on diverse teleost fish genomes. Selective pressure analysis on piRNA pathway and miRNA/siRNA (microRNA/small interfering RNA) pathway genes between teleosts and mammals showed an accelerated evolution of piRNA pathway genes in the teleost lineages, and positive selection on functional PAZ (Piwi/Ago/Zwille) and Tudor domains involved in the Piwi-piRNA/Tudor interaction, suggesting that the amino acid substitutions are adaptive to their functions in piRNA pathway in the teleost fish species. Notably five piRNA pathway genes evolved faster in the swamp eel, a kind of protogynous hermaphrodite fish, than the other teleosts, indicating a differential evolution of piRNA pathway between the swamp eel and other gonochoristic fishes. In addition, genome-wide analysis showed higher diversity of transposons in the teleost fish species compared with mammals. Our results suggest that rapidly evolved piRNA pathway in the teleost fish is likely to be involved in the adaption to transposon diversity.Entities:
Keywords: evolution; positive selection; reproduction; teleost fish
Mesh:
Substances:
Year: 2014 PMID: 24846630 PMCID: PMC4079211 DOI: 10.1093/gbe/evu105
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
FPhylogenetic tree of Piwil1 in 30 vertebrates. Phylogenetic analysis was performed with PhyML and the lamprey (an order of jawless fish, also a very ancient lineage of vertebrates) Piwil1 was used as an outgroup. Bootstrap values (percentage) were estimated using bootstrap analysis with 100 replicates and the values above 50% are shown next to the nodes. The number of amino acid substitutions per site was used to assess evolutionary distances. The distance scale (0.1) is shown. The accession numbers of all sequences in 30 vertebrates are listed in supplementary table S1, Supplementary Material online.
FNonsynonymous (dN) and synonymous (dS) nucleotide distance, and dN/dS ratio for 18 genes between teleosts and mammals. The first 12 genes (Piwil1, Piwil2, Tdrd1, Tdrkh, Pld6, Ddx4, Asz1, Mov10l1, Fkbp6, Kif17, Prmt5, and Hen1) are involved in piRNA pathways. Five Ago genes (Ago1, Ago2, Ago3a, Ago3b, and Ago4), which are involved in miRNA/siRNA pathways, and β-actin were used as control genes. (A) Nonsynonymous (dN) and synonymous (dS) nucleotide distance between teleosts and mammals. Straight lines between two circles (dS) or two triangles (dN) show distance between teleosts and mammals. All values are mean ± SE. (B) dN/dS ratio between teleosts and mammals. Black and gray columns show dN/dS ratios of different genes in teleosts and mammals, respectively. Gene full names are as follows: Piwil1 (Piwi-like 1), Piwil2 (Piwi-like 2), Tdrd1 (Tudor domain containing 1), Tdrkh (Tudor and KH domain containing protein), Pld6 (Phospholipase D member 6), Ddx4 (DEAD box polypeptide 4), Asz1 (Ankyrin repeat, SAM and basic leucine zipper domain containing 1), Mov10l1 (Moloney leukemia virus 10-like 1), Fkbp6 (FK506 binding protein 6), Kif17 (Kinesin family member 17), Prmt5 (Protein arginine methyltransferase 5), Hen1 (HEN1 methyltransferase homolog 1), Ago1 (Argonaute RISC catalytic component 1), Ago2 (Argonaute RISC catalytic component 2), Ago3a (Argonaute RISC catalytic component 3a), Ago3b (Argonaute RISC catalytic component 3b), and Ago4 (Argonaute RISC catalytic component 4). The dN/dS ratios of piRNA pathway genes are significantly larger than those of miRNA/siRNA pathway genes (P < 0.01, t-test).
Branch model tests for piRNA pathway genes and Ago genes
| Gene | 2Δ | ω0 | ω1 | |
|---|---|---|---|---|
| 712.356 | 0.022 | |||
| 48.940 | 0.081 | |||
| 8.499 | 0.253 | |||
| 33.835 | 0.126 | |||
| 52.876 | 0.036 | |||
| 76.679 | 0.074 | |||
| 35.661 | 0.135 | |||
| 60.794 | 0.099 | |||
| 31.633 | 0.094 | |||
| 28.705 | 0.063 | |||
| 30.149 | 0.031 | |||
| 25.956 | 0.230 | |||
| 0.402 | 0.526 | 0.021 | 0.019 | |
| 27.078 | 0.009 | |||
| 45.255 | 0.006 | |||
| 9.874 | 0.025 | |||
| 79.917 | 0.003 | |||
| 3.359 | 0.066 | 0.013 | 0.018 |
Note.—Higher ω values are in bold.
aTwice the difference of likelihood values between two nested models (one-ratio model vs. two-ratio model).
bP values significant at the 1% threshold after Bonferroni correction in bold.
cω0 indicates a ratio of the mammalian clade (background).
dω1 indicates a different ratio of the teleost fish clade (foreground).
eAgo3b is a fish specific copy of Ago3.
Detection of positive selection for piRNA pathway genes and Ago genes
| Gene | 2ΔL | Estimates of the Parameters in the Modified Model A | Positively Selected Sites | |
|---|---|---|---|---|
| 12.997 | ω0 = 0.068; ω2 = 40.906 | |||
| 17.351 | ω0 = 0.083; ω2 = 10.057 | |||
| 28.065 | ω0 = 0.183; ω2 = 86.504 | 5F, 14L, 31P, | ||
| 20.404 | ω0 = 0.116; ω2 = 60.256 | 69H, 123Q, 126V, | ||
| 4.124 | 0.042 | ω0 = 0.065; ω2 = 6.764 | ||
| 9.197 | ω0 = 0.068; ω2 = 335.073 | |||
| 10.732 | ω0 = 0.155; ω2 = 40.485 | |||
| 17.078 | ω0 = 0.085; ω2 = 11.126 | |||
| 6.917 | ω0 = 0.109; ω2 = 46.815 | 79T, 96A, 118Q, | ||
| 8.569 | ω0 = 0.045; ω2 = 4.658 | |||
| 19.085 | ω0 = 0.036; ω2 = 97.992 | |||
| 4.658 | 0.031 | ω0 = 0.160; ω2 = 99.653 | 179C | |
| 6.078 | ω0 = 0.017; ω2 = 9.523 | 131R, | ||
| 2E-06 | 0.999 | ω0 = 0.012; ω2 = 1 | ||
| 0 | 1 | ω0 = 0.014; ω2 = 6.090 | ||
| 0 | 1 | ω0 = 0.030; ω2 = 12.715 | ||
| 0 | 1 | ω0 = 0.007; ω2 = 10.684 | ||
| 4.324 | 0.037 | ω0 = 0.012; ω2 = 998.998 | ||
aThe ancestral branch leading to the teleosts was assigned as the foreground branch. Twice the difference between the likelihood values of the null model and the alternative model is indicated as 2ΔL. The modified model A is used as the alternative model and the modified model A with ω2 fixed at 1 is the null model.
bP values significant at the 5% threshold after Bonferroni correction in bold.
cP0 is the proportion of codons that have ω0 in all branches, P1 is the proportion of codons that have ω1 = 1 in all branches, P2a is the proportion of codons that have ω0 in the background branches but ω2 in the foreground branches, and P2b is the proportion of codons that have ω1 in the background branches but ω2 in the foreground branches.
dSites with the BEB PPs higher than 90% are shown. Sites higher than 95% are single underlined and sites higher than 99% are double underlined. The sites of Mov10l1 and Fkbp6 are indexed by the amino acids at the sites in the tetraodon, whereas the sites of the other genes are indexed by the amino acids at the sites in the zebrafish.
FDistribution of the positively selected sites in functional domains. (A) Putatively positively selected sites with BEB PPs higher than 90% were shown. Functional domains in x axis: PAZ–MID (Piwi/Ago/Zwille-Middle), PIWI (P-element induced wimpy testis), Tudor, KH (K homology), PLDc (Phospholipase D family), DEAD (Asp-Glu-Ala-Asp box), Ank (Ankyrin repeat), DEXD (DEAD-like helicase superfamily ATP binding domain), AAA12 (ATPases Associated with a wide variety of cellular Activities superfamily), FKBPc (FKBP-type peptidyl–prolyl cis–trans isomerase), Kinesin (Kinesin motor domain), BBP1c (C-terminal domain of BBP1), Mtase (Arginine-N-methyltransferase). The numbers above stacked columns show the ratios of the sites distributed on functional domains/total positively selected sites for each gene. (B) Sliding window analysis shows ω value distribution (dN/dS) along the coding sequence of Piwil1 between teleosts (red line) and mammals (blue line). Functional domains of Piwil1: PAZ, MID, and PIWI. sDMA represents the binding sites for Tudor domain of Tdrd1 or Tdrkh.
FStructural modeling suggests functional importance of positively selected sites. NJ trees recapitulate vertebrate phylogeny. The amino acid alignments of nine representative species are shown in the middle panels. Homology modeling of zebrafish Piwil1 PAZ domain (from 275E to 387T) (A), zebrafish Piwil2 PAZ domain (from 459R to 574T) (B), the third Tudor domain (from 616L to 807G) of zebrafish Tdrd1 (C), and zebrafish Tdrkh Tudor domain (from 348S to 440C) (D). The binding pockets (piRNA 3′-end binds to PAZ; sDMA of Piwi binds to Tudor) are shown with dashed ellipse. Arrows (in amino acid alignments) or red spheres (in 3D models) indicate the positively selected sites. The modeling domains are colored by structural motifs: α helix, blue; β sheet, cyan; coil, purple. The modeling structures in Piwil1, Piwil2, and Tdrkh are not a whole domain because of incompleteness of corresponding template.
FBranch model analysis of the ω values for 12 genes in the teleost lineage. (A) Cloning and expression levels of five Ago genes and six piRNA pathway genes in the swamp eel detected by RT-PCR. Hprt was used as an inner control. Ago genes are expressed ubiquitously, whereas the piRNA pathway genes are mainly expressed in three kinds of gonads except Tdrkh. (B) The branch model analysis (one-ratio model against two-ratio model). Black columns indicate ω values of the genes in the swamp eel (foreground branch), and gray columns indicate ω values of the other fish species (background branch). *LRT, P < 0.05; **P < 0.01. Ago genes and β-actin were used as controls.
FAssociation between rapid evolution of piRNA pathway and TEs diversity in the teleost fish species. (A) Comparison of TE superfamilies with copy number >1,000 between teleost fish species and mammals. Blue columns, retrotransposon superfamily numbers (Class I); red columns, DNA transposon superfamily numbers (Class II). The data of human and mouse are from a previous study results (Lander et al. 2001; Mouse Genome Sequencing Consortium et al. 2002). As many as 12–25 superfamilies of retrotransposons have been observed in the teleost fish species, whereas only 10–12 superfamilies are detected in the genomes of both human and mouse. For Class II, 7–18 superfamilies are detected in the fish species, whereas there are 2–3 superfamilies in human and mouse, which are inactive in mammals. The red dots indicate the branches with rapid evolution. (B) Diverse TE loci of zebrafish piRNAs. Nonredundant piRNA sequences of zebrafish from Ketting’s study (Houwing et al. 2007) together with our data (Zhou et al. 2010) were used to search against TE superfamilies using Repeat-Masking with default parameters in Repbase. The number indicates percentage for each superfamily with respect to total TE-associated piRNAs. Different colors in pie diagram represent different kinds of the TE superfamily. TE superfamilies with percentages less than 1% are included in other TEs. DIRS, named for the founding member from Dictyostelium discoideum; ERV, endogenous retrovirus; LTR, long terminal repeat; L1, long interspersed element 1; L2, long interspersed element 2; non-LTR, nonlong terminal repeat; SINE, short interspersed elements. (C) A model of adaptive evolution of piRNA pathway to silence diverse TEs in fish genome. Diverse piRNAs transcribed from corresponding TE loci are loaded onto Piwi proteins, which have undergone an adaptive evolution in the lineage of the teleost fish. Besides Piwi proteins, other factors, such as Tdrd1 and Tdrkh, are also involved in piRNA pathway. Finally, piRNA pathway represses TE activation through DNA methylation to protect genome in germline cells.