Yue Hao, Jacob D Washburn, Jacob Rosenthal, Brandon Nielsen, Eric Lyons, Patrick P Edger, J Chris Pires, Gavin C Conant.
Abstract
Genes that are inherently subject to strong selective constraints tend to be overretained in duplicate after polyploidy. They also continue to experience similar, but somewhat relaxed, constraints after that polyploidy event. We sought to assess for how long the influence of polyploidy is felt on these genes' selective pressures. We analyzed two nested polyploidy events in Brassicaceae: the At-α genome duplication that is the most recent polyploidy in the model plant Arabidopsis thaliana and a more recent hexaploidy shared by the genus Brassica and its relatives. By comparing the strength and direction of the natural selection acting at the population and at the species level, we find evidence for continued intensified purifying selection acting on retained duplicates from both polyploidies even down to the present. The constraint observed in preferentially retained genes is not a result of the polyploidy event: the orthologs of such genes experience even stronger constraint in nonpolyploid outgroup genomes. In both the Arabidopsis and Brassica lineages, we further find evidence for segregating mildly deleterious variants, confirming that the population-level data uncover patterns not visible with between-species comparisons. Using the A. thaliana metabolic network, we also explored whether network position was correlated with the measured selective constraint. At both the population and species level, nodes/genes tended to show similar constraints to their neighbors. Our results paint a picture of the long-lived effects of polyploidy on plant genomes, suggesting that even yesterday's polyploids still have distinct evolutionary trajectories.
Year: 2018 PMID: 29617811 PMCID: PMC5887293 DOI: 10.1093/gbe/evy061
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1.—Distributions of measures of selective constraints. (A and B) Selective constraints of Arabidopsis thaliana genes that retained both At-α duplicates, and genes that returned to single copy after At-α. (A) Red: distribution of At-Al Ka/Ks for A. thaliana genes that survived At-α, blue: distribution of within-species pN/pS for corresponding retained At-α duplicates that have orthologs in A. lyrata; (B) Red: distribution of Ka/Ks for At-Al 1:1 orthologs, blue: distribution of pN/pS for the corresponding A. thaliana genes. Dotted lines: fitted lognormal density distribution curves. The distributions of Ka/Ks from retained At-α duplicates and from single-copy genes are significantly different (P < 0.00001), as are the two distributions for pN/pS (P < 0.00001). (C and D) Selective constraints of Brassica rapa genes that survived the Br-α triplication, and those of the single-copy genes, as well as the selective constraints of their A. thaliana orthologs. (C) Purple: distribution of Ka/Ks for Br-Bo orthologs that preserved three copies in both B. rapa and B. oleracea, green: distribution of Ka/Ks for At-Br orthologs where A. thaliana genes are orthologous to the same B. rapa gene set (retained triplets). (D) Purple: distribution of Ka/Ks for single copy Br-Bo orthologs, green: distribution of Ka/Ks for single copy At-Br orthologs. Dotted lines are as for (A) and (B). The distributions of Ka/Ks for Br-Bo 3:3 and Br-Bo 1:1 orthologs are significantly different (P < 0.00001); the distributions of Ka/Ks for At-Br 1:3 and At-Br 1:1 orthologs are also significantly different (P < 0.00001). Arrows mark the average selective constraint of each distribution. See also table 1.
Table 1. Average Selective Constraints
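| Selective Constraints | Single-Copy Orthologs: Sample Size | Single-Copy Orthologs: Mean | Retained Duplicates/Triplicates: Sample Size | Retained Duplicates/Triplicates: Mean | Difference (%) |
|---|---|---|---|---|---|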
| At vs. Al | 11,966 | 0.2203 | 4,261 | 0.1914 | −13.13 |
| At vs. Br | 5,367 | 0.1843 | 5,069 | 0.1724 | −6.42 |
| Br vs. Bo | 7,604 | 0.2814 | 5,680 | 0.2454 | −12.79 |
| At 1135 ecotypes pN/pS | 14,293 | 0.4557 | 4,839 | 0.4078 | −10.50 |
| Br 126 accessions pN/pS | 1,316 | 0.1269 | 1,031 | 0.1285 | 1.19 |
Note.—See also figure 1.
Sample size for the calculation of mean selective constraint (b): for Ka/Ks this value corresponds to the number of orthologous pairs; for pN/pS to the number of genes.
Mean value of the measure of selective constraint in question (i.e., Ka/Ks or pN/pS, left).
The difference as a percentage of the selective constraints of single copy genes.
The average Ka/Ks computed between Arabidopsis thaliana and A. lyrata.
The average Ka/Ks computed between A. thaliana and Brassica rapa.
The average Ka/Ks computed between B. rapa and B. oleracea.
The average pN/pS for A. thaliana genes with an ortholog in A. lyrata. About 1,610 genes with pS = 0 and pN ≤ 1 were removed. pN/pS values for 150 genes with pS = 0 and pN > 1 were set to 1. The total number of genes after filtering was 25,806. Only genes with orthologs in A. lyrata were included in the analysis (as noted in a).
The average pN/pS for B. rapa genes with orthologs in B. oleracea. About 266 genes with pS = 0 and pN ≤ 1 were removed. pN/pS values for 129 genes with pS = 0 and pN > 1 were set to 1. The total number of genes after filtering was 5,128. Only genes with orthologs in B. oleracea were included in the analysis (as noted in a).
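The pN/pS filtering rule and the percentage difference described in the notes above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `filter_pn_ps` and `percent_difference` are hypothetical, and pN and pS are treated simply as the two numeric inputs named in the notes.

```python
from typing import Optional


def filter_pn_ps(pn: float, ps: float) -> Optional[float]:
    """Return pN/pS under the rule in the table notes, or None if the gene is dropped.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if ps == 0:
        if pn <= 1:
            # pS = 0 and pN <= 1: the gene is removed from the analysis.
            return None
        # pS = 0 and pN > 1: the ratio is set to 1.
        return 1.0
    return pn / ps


def percent_difference(single_copy_mean: float, retained_mean: float) -> float:
    """Difference of the retained mean from the single-copy mean, as a
    percentage of the single-copy mean."""
    return 100.0 * (retained_mean - single_copy_mean) / single_copy_mean


# Means taken from the "Br vs. Bo" row of the table.
br_bo_diff = round(percent_difference(0.2814, 0.2454), 2)
print(br_bo_diff)  # -12.79
```

Applied to the rounded means printed in the table, the formula reproduces the Br vs. Bo entry; other rows can differ in the last digit, presumably because the published percentages were computed from unrounded means.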
Fig. 2.—Notched box plots of log selective constraints among Al-At-Br-Bo syntenic orthologs. Colors indicate the loss/retention state of Arabidopsis orthologs after At-α WGD, and Brassica orthologs after Br-α WGT. “Lost”: genes returned to singleton state after polyploidy, “retained”: duplicated/triplicated copies are preserved in the genome. Subplots are boxplots of (A) log(Ka/Ks) for A. thaliana versus A. lyrata, (B) log(Ka/Ks) for B. rapa versus B. oleracea, (C) log(Ka/Ks) for A. thaliana versus B. rapa; and (D) log(pN/pS) for 1,135 A. thaliana ecotypes, (E) log(pN/pS) for 126 B. rapa accessions. The notches are 95% confidence intervals of the medians. Kruskal–Wallis multiple comparison tests were performed to evaluate significant differences across medians, P values: ***P < 0.0001, **P < 0.001, *P < 0.01, •P < 0.05. The black dots represent the log(mean) selective constraints. See also supplementary table S1, Supplementary Material online.
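For readers unfamiliar with the rank-based test named in the caption, the Kruskal–Wallis H statistic can be sketched as follows. This is only an illustration of the omnibus statistic, assuming average ranks for ties and no tie correction; it is not the multiple-comparison procedure the authors ran, and the names `average_ranks` and `kruskal_wallis_h` are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        # Sum of pooled ranks belonging to this group.
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


# Toy values standing in for log(Ka/Ks) of "lost" and "retained" genes.
lost = [1.0, 2.0, 3.0]
retained = [4.0, 5.0, 6.0]
h_stat = kruskal_wallis_h([lost, retained])
```

Under the null hypothesis H is approximately chi-square distributed with (number of groups − 1) degrees of freedom, which is how a P value would be obtained in practice.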
Spearman’s Correlations of Selective Constraints and Network Statistics
| Gene Sets | Number of Nodes | Number of Edges | Selection and Node Degree | Selection and Clustering Coefficient | Selection and Betweenness Centrality | ||||
|---|---|---|---|---|---|---|---|---|---|
| Spearman’s Correlation | Spearman’s Correlation | Spearman’s Correlation | |||||||
| 0.083 | 0.0975 | 0.002 | 0.002 | ||||||
| 0.0149 | 0.392 | 0.0986 | 0.008 | 0.010 | |||||
| 0.1216 | 0.001 | 0.0183 | 0.343 | 0.0757 | 0.040 | ||||
P value < 0.05.
Metabolic networks defined from three gene sets: 1) the Arabidopsis network with genes appearing in both the Ka/Ks and the pN/pS analyses, 2) the full Brassica network and 3) the reduced Brassica network with genes that appeared in both the Ka/Ks and the pN/pS analyses.
Number of nodes in the metabolic network. Each node is a biochemical reaction.
Number of edges in the metabolic network. An edge connects two nodes in the network if the reactions for those nodes share a metabolite.
Node degree is the number of edges connected to a node.
Clustering coefficient is defined as the ratio of existing links connecting a node’s neighbors to each other over the maximum possible number of such links.
Betweenness centrality is the number of the network’s shortest paths that pass through a node.
The Ka/Ks for each node, calculated by taking the average of Ka/Ks of enzyme-coding genes corresponding to the reaction of the node, computed between A. thaliana and A. lyrata.
The average pN/pS for each node calculated in a similar way, for A. thaliana genes with an ortholog in A. lyrata.
The average Ka/Ks for each node, computed between B. rapa and B. oleracea. Results for a subset of B. rapa genes for which At-Br Ka < 0.1127 are shown in parentheses.
The average Ka/Ks for each node, computed between B. rapa and B. oleracea, for an intersection set of genes that appeared in both the Ka/Ks and the pN/pS analyses. Results for a subset of B. rapa genes for which At-Br Ka < 0.1127 are shown in parentheses.
The average pN/pS for each node, for B. rapa genes with orthologs in B. oleracea. Results for a subset of B. rapa genes for which At-Br Ka < 0.1127 are shown in parentheses.
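The network definitions in the notes above (reaction nodes, edges for shared metabolites, node degree, clustering coefficient, per-node mean constraint) and the Spearman correlation reported in the table can be sketched on toy data. Everything here is an assumption for illustration: the reaction, metabolite, and gene names and the Ka/Ks values are invented, the function names are hypothetical, and betweenness centrality is omitted for brevity.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Set


def build_network(metabolites: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Link two reaction nodes whenever their metabolite sets overlap."""
    adj: Dict[str, Set[str]] = {r: set() for r in metabolites}
    for a, b in combinations(metabolites, 2):
        if metabolites[a] & metabolites[b]:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def clustering(adj: Dict[str, Set[str]], node: str) -> float:
    """Existing links among a node's neighbors over the maximum possible."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here for nodes with fewer than 2 neighbors
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


def node_means(genes: Dict[str, List[str]], ka_ks: Dict[str, float]) -> Dict[str, float]:
    """Average the constraint of the enzyme-coding genes for each reaction."""
    return {r: sum(ka_ks[g] for g in gs) / len(gs) for r, gs in genes.items()}


def ranks(xs: Sequence[float]) -> List[float]:
    """Average ranks (quadratic time, fine for a toy example)."""
    return [sum(v < x for v in xs) + (sum(v == x for v in xs) + 1) / 2 for x in xs]


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman's correlation as the Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (sx * sy)


# Invented toy data: four reactions, their metabolites, genes, and Ka/Ks values.
metabolites = {"R1": {"A", "B"}, "R2": {"B", "C"}, "R3": {"A", "C"}, "R4": {"C", "D"}}
genes = {"R1": ["g1", "g2"], "R2": ["g3"], "R3": ["g4"], "R4": ["g5"]}
ka_ks = {"g1": 0.1, "g2": 0.3, "g3": 0.15, "g4": 0.25, "g5": 0.4}

adj = build_network(metabolites)
degree = {r: len(nbrs) for r, nbrs in adj.items()}
constraint = node_means(genes, ka_ks)
nodes = sorted(adj)
rho = spearman([constraint[r] for r in nodes], [degree[r] for r in nodes])
```

The same `spearman` call would apply to clustering coefficients or any other per-node statistic in place of `degree`.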
Fig. 3.—The distributions of selective constraints in the Arabidopsis thaliana metabolic network. Nodes represent biochemical reactions and are colored with the average selective constraints of genes encoding enzymes for each reaction. The diameter of each node is in proportion to the number of genes for the node. Edges connect two nodes if the two reactions share compounds. (A) Nodes are colored by At-Al Ka/Ks, with red indicating a Ka/Ks below the network mean, and blue a Ka/Ks above that mean. The histogram shows the density distribution of At-Al Ka/Ks. (B) Nodes are colored by At pN/pS, with red indicating below-average constraints, and blue above. The histogram shows the density distribution of At pN/pS.