Tao Shi1,2, Razgar Seyed Rahmani3, Paul F Gugger4, Muhua Wang5, Hui Li1,2,6, Yue Zhang1,2,6, Zhizhong Li1,2,6, Qingfeng Wang1,2,7, Yves Van de Peer3,8,9,10, Kathleen Marchal3,11, Jinming Chen1,2.
Abstract
Multiple whole-genome duplications (WGDs) are found in most sequenced flowering plants. Genes duplicated by a WGD can have different fates: they may quickly be lost again, be retained for long(er) periods, or subsequently undergo small-scale duplications. However, how differences in expression, epigenetic regulation, and functional constraints are associated with these gene fates still requires further investigation, because successive WGDs in angiosperms complicate gene trajectories. In this study, we investigate lotus (Nelumbo nucifera), an angiosperm with a single WGD around the K-Pg boundary. Using improved intraspecific-synteny identification based on a chromosome-level assembly, together with transcriptome and bisulfite sequencing, we explore not only the fundamental distinctions in genomic features, expression, and methylation patterns of genes with different fates after a WGD but also the factors that shape post-WGD expression divergence and expression bias between duplicates. We found that, after a WGD, genes that returned to single copies show the highest levels and breadth of expression, gene body methylation, and intron numbers, whereas the long-retained duplicates exhibit the highest degree of protein-protein interactions, the greatest protein lengths, and the lowest methylation in gene flanking regions. For those long-retained duplicate pairs, the degree of expression divergence correlates with their sequence divergence, degree of protein-protein interactions, and expression level, whereas their biases in expression level, which reflect subgenome dominance, are associated with the bias of subgenome fractionation. Overall, our study on the paleopolyploid nature of lotus highlights the impact of different functional constraints on gene fate and duplicate divergence following a single WGD in plants.
Keywords: gene balance; gene expression; methylation; subgenome dominance; whole-genome duplication
Year: 2020 PMID: 32343808 PMCID: PMC7403625 DOI: 10.1093/molbev/msaa105
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Circos plot of lotus genome assembly. From outside to inside rings: (I) size (Mb) of the assembly for each chromosome; (II) density distribution of genes; (III) density distribution of sRNA − TEs; (IV) density distribution of sRNA + TEs; (V) dot plot of nucleotide diversity of CDS for each gene; (VI) methylation level of genes and flanking regions; (VII) gene expression level (log-transformed FPKM value); and (VIII) syntenic paralogs are linked by colored lines.
Fig. 2. Violin plots of expression, functional, and genomic features of genes from different gene groups (based on duplication status). (A) The average copy number of orthologs. (B) Coefficient of variation (CV) of copy number among taxa. (C) Ratio of orthologs as “angio-singles.” (D) The mean of log-transformed FPKM. (E) The ratio of silent genes. (F) Tissue specificity index (based on Tau index). (G) The average portion of the genic sequence deleted in tropical lotus compared with the reference genome (ratio of deletion). (H) Nucleotide diversity (π). (I) Length of the genic region. (J) Exon number. (K) The number of protein–protein interactions inferred from the closest orthologs in Arabidopsis. (L) CDS length. Black line: median; gray line: quantile.
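As an illustration of two of the summary statistics named in this caption, the sketch below computes a Tau tissue-specificity index and a coefficient of variation in plain Python. It uses the commonly cited Tau definition, the sum of (1 − x_i / x_max) over tissues divided by (n − 1). The function names, the handling of silent genes, and the use of the population standard deviation are assumptions made for this example; the record above does not state how the authors normalized expression or computed either statistic.

```python
import math


def tau_index(expression):
    """Tau tissue-specificity index for one gene.

    `expression` holds one non-negative expression value per tissue
    (assumed here to be already normalized, e.g. FPKM-derived).
    Returns 0.0 for uniform expression and 1.0 for expression in a
    single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("Tau needs at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        # Assumption: a gene silent in every tissue has no defined Tau.
        return float("nan")
    return sum(1 - x / x_max for x in expression) / (n - 1)


def coefficient_of_variation(values):
    """CV = standard deviation / mean.

    Assumption: the population standard deviation is used.
    """
    mean = sum(values) / len(values)
    if mean == 0:
        return float("nan")
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


# Hypothetical values for illustration only (not data from the study).
broad = tau_index([5.0, 5.0, 5.0, 5.0])      # broadly expressed gene
specific = tau_index([10.0, 0.0, 0.0, 0.0])  # tissue-specific gene
copy_cv = coefficient_of_variation([1, 3])   # copy number in two taxa
```

Under this definition a housekeeping-like gene scores near 0 and a gene expressed in one tissue scores near 1, which is the scale on which panel (F) would be read.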
Fig. 3. Differences in average CG, CHG, and CHH methylation level (ML) in lotus leaf along the gene and flanking regions among different gene groups based on the duplication status. (A–C) Methylation of all annotated genes. (D–F) Methylation of the genes with RNA-seq evidence.
Fig. 4. Violin plots of expression, functional, methylation, and evolutionary features of WGD-derived duplicate genes with different levels of expression divergence (group A, group B, group C, and group D). (A) Connectivity score. (B) dN, nonsynonymous mutation. (C) dS, synonymous mutation. (D) The number of protein–protein interactions inferred from the closest orthologs in Arabidopsis. (E) The mean of log-transformed FPKM. (F) Tissue specificity index (based on Tau index). (G–I) r (correlation coefficient) of CG methylation levels in tissues between duplicates for gene body (G), upstream (H), and downstream region (I). Black line: median; gray line: quantile.
Fig. 5. Subgenome fractionation and dominance in lotus. (A) Differences in the number of singlet (noncollinear) genes across 130 pairs of duplicate syntenic blocks. (B) The ratios of dominant copies in collinear genes between LF blocks and MF blocks across 41 RNA-seq samples. (C–H) Differences in average CG, CHG, and CHH methylation level in leaf along gene and flanking regions between duplicates that belong to LF blocks and MF blocks. (C–E) Methylation of all annotated genes. (F–H) Methylation of the genes with RNA-seq evidence.