Srilakshmy L Harikrishnan, Pascal Pucholt, Sofia Berlin.
Abstract
Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and the forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots of two genotypes. Among the 48,508 contigs, 3,778 pairs were predicted to be Salicoid duplicates based on fourfold synonymous third-codon transversion (4DTV) rates and syntenic positions. In most cases both copies were expressed in both tissues, and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving under purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed duplicates are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows.
Year: 2015 PMID: 26689951 PMCID: PMC4687058 DOI: 10.1038/srep18662
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of Illumina sequencing, assembly and mapping.
| | 520 leaves | % | 520 roots | % | 592 leaves | % | 592 roots | % | Sum | % |
|---|---|---|---|---|---|---|---|---|---|---|
| Total no. of reads | 207,824,532 | | 168,557,233 | | 187,381,473 | | 205,123,179 | | 768,886,417 | |
| Total no. of bases (Gbp) | 20.8 | | 16.9 | | 18.7 | | 20.5 | | 76.9 | |
| No. of reads after trimming | 174,156,958 | | 148,667,480 | | 168,291,008 | | 190,426,984 | | 681,542,430 | |
| Average read length after trimming | 98.4 | | 98.5 | | 97.9 | | 98.7 | | | |
| Total mapped reads (a, b) | 64,492,170 | 37.0 | 35,511,470 | 23.9 | 38,630,740 | 23.0 | 39,889,518 | 20.9 | 178,523,898 | 26.2 |
| Total unmapped reads (b, c) | 109,664,788 | 63.0 | 113,156,010 | 76.1 | 129,660,268 | 77.0 | 150,537,466 | 79.1 | 503,018,532 | 73.8 |
| Multi-position matches (d, e) | 19,530,798 | 30.3 | 8,648,426 | 24.4 | 9,284,534 | 24.0 | 10,379,248 | 26.1 | 47,843,006 | 26.8 |
| Unique matches (d, f) | 44,961,372 | 69.7 | 26,863,044 | 75.6 | 29,346,206 | 76.0 | 29,510,270 | 74.0 | 130,680,892 | 73.2 |

(a) Number of filtered sequencing reads that were aligned to the contigs in the filtered assembly.
(b) Percentage relative to the number of reads after trimming.
(c) Sequencing reads that could not be aligned to the contigs in the filtered assembly.
(d) Percentage relative to the number of mapped reads.
(e) Sequencing reads that aligned to two or more positions in the contigs in the filtered assembly.
(f) Sequencing reads that aligned to only one position in the contigs in the filtered assembly.
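The percentages in the table use two different denominators: mapped and unmapped reads are expressed relative to the reads after trimming (footnote b), while multi-position and unique matches are expressed relative to the mapped reads (footnote d). The following minimal Python sketch recomputes two of these percentages from the counts in the table. The names `SAMPLES`, `pct` and `mapping_summary` are illustrative and not taken from the source.

```python
# Counts copied from the table above; dictionary layout and names are illustrative.
SAMPLES = {
    "520 leaves": {"trimmed": 174_156_958, "mapped": 64_492_170, "unique": 44_961_372},
    "520 roots": {"trimmed": 148_667_480, "mapped": 35_511_470, "unique": 26_863_044},
    "592 leaves": {"trimmed": 168_291_008, "mapped": 38_630_740, "unique": 29_346_206},
    "592 roots": {"trimmed": 190_426_984, "mapped": 39_889_518, "unique": 29_510_270},
}


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal as in the table."""
    return round(100.0 * part / whole, 1)


def mapping_summary(counts):
    """Return (mapped %, unique %) for one sample."""
    mapped_pct = pct(counts["mapped"], counts["trimmed"])  # footnote b: relative to trimmed reads
    unique_pct = pct(counts["unique"], counts["mapped"])   # footnote d: relative to mapped reads
    return mapped_pct, unique_pct


if __name__ == "__main__":
    for name, counts in SAMPLES.items():
        print(name, mapping_summary(counts))
```

Keeping the denominators explicit in this way makes it clear why the "%" values in the mapped and unique rows cannot be compared directly.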
Figure 1. Length distribution of the contigs generated by the Trans-ABySS assembler.
The white distribution shows the contig lengths in the unfiltered assembly and the light grey distribution shows the contig lengths in the filtered assembly.
Figure 2. 4DTV rates between all predicted duplicates.
The predicted Salicoid duplicates are highlighted in light grey.
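As a rough illustration of what a 4DTV rate measures, the Python sketch below counts transversions at fourfold degenerate third-codon positions between two aligned, in-frame coding sequences. It is a simplified sketch under stated assumptions, not the procedure used in the study: the source does not describe its alignment, filtering or any multiple-substitution correction, so none is applied here, and the function name `fourfold_transversion_rate` and the example sequences are hypothetical.

```python
# Codon prefixes (first two bases) whose third position is fourfold degenerate
# in the standard genetic code: Leu (CT), Val (GT), Ser (TC), Pro (CC),
# Thr (AC), Ala (GC), Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transversion(a, b):
    """True if one base is a purine and the other a pyrimidine."""
    return (a in PURINES and b in PYRIMIDINES) or (a in PYRIMIDINES and b in PURINES)


def fourfold_transversion_rate(cds_a, cds_b):
    """Return (uncorrected 4DTV rate, number of fourfold sites compared).

    Assumes both sequences are codon-aligned, in frame and of equal length.
    Only codon pairs sharing the same fourfold degenerate prefix are used.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be codon-aligned and of equal length")
    sites = 0
    transversions = 0
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        prefix = codon_a[:2]
        if prefix != codon_b[:2] or prefix not in FOURFOLD_PREFIXES:
            continue  # not a shared fourfold degenerate site
        third_a, third_b = codon_a[2], codon_b[2]
        if third_a not in "ACGT" or third_b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if is_transversion(third_a, third_b):
            transversions += 1
    rate = transversions / sites if sites else float("nan")
    return rate, sites


if __name__ == "__main__":
    # Hypothetical toy alignment: GCT/GCA, CCA/CCG, GGA/GGC, TTT/TTC
    rate, sites = fourfold_transversion_rate("GCTCCAGGATTT", "GCACCGGGCTTC")
    print(rate, sites)
```

In the toy alignment, three codon pairs share a fourfold degenerate prefix and two of them differ by a transversion, while the TTT/TTC pair is skipped because its third position is not fourfold degenerate.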
Figure 3. Positions of Salicoid duplicate copies in the S. purpurea genome.
The lines connect the two copies in every pair. Most Salicoid duplicates are located on homeologous chromosomes originating from the Salicoid WGD, for example on chromosomes 8 and 10, 12 and 15, 5 and 7, and 2, 5 and 14. Homeologous chromosomes were defined in the P. trichocarpa genome [9]. The image was created with Circos [46].
Figure 4. Levels of differential expression between Salicoid duplicates in the two genotypes and tissues.
FC = fold change.
Figure 5. Distribution of Ka/Ks values between the Salicoid duplicates.
The median is indicated by the dashed line.
Spearman rank correlation coefficients (r) and the level of significance (P-value) between differential expression and Ka/Ks, Ka, Ks and 4DTV.
| | Ka/Ks | Ka | Ks | 4DTV |
|---|---|---|---|---|
| 520 leaves | 0.051. | 0.008, ns. | 0.001, ns. | 0.013, ns. |
| 520 roots | 0.089. | −0.009, ns. | 0.0014, ns. | −0.001, ns. |
| 592 leaves | 0.039. | 0.013, ns. | 0.016, ns. | 0.015, ns. |
| 592 roots | 0.087. | −0.013, ns. | −0.001, ns. | −0.003, ns. |
ns. = not significant.
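For readers unfamiliar with the statistic in the table, a Spearman rank correlation is a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a minimal pure-Python illustration. The function names and the toy values are hypothetical and are not data from the study, and no P-value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient; returns NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical toy values: expression difference and Ka/Ks for five pairs.
    expression_difference = [0.2, 1.5, 0.9, 3.1, 2.4]
    ka_ks = [0.10, 0.25, 0.31, 0.40, 0.22]
    print(spearman(expression_difference, ka_ks))
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale on which differential expression is measured.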
Figure 6. Enriched GO terms.
(A) All Salicoid duplicates vs. all contigs. (B) The non-differentially expressed Salicoid duplicates vs. all contigs. (C) The differentially expressed Salicoid duplicates vs. all contigs.