Fengli Zhao1, Yuexing Wang2, Jianshu Zheng1, Yanling Wen3, Minghao Qu1, Shujing Kang1, Shigang Wu1, Xiaojuan Deng1, Kai Hong1, Sanfeng Li2, Xing Qin1, Zhichao Wu1, Xiaobo Wang1, Cheng Ai1, Alun Li1, Longjun Zeng1,4, Jiang Hu2, Dali Zeng2, Lianguang Shang1, Quan Wang1, Qian Qian1,2, Jue Ruan5, Guosheng Xiong6,7.
Abstract
BACKGROUND: Copy number variations (CNVs) are an important type of structural variation in the genome that usually affect gene expression levels through gene dosage effects. Understanding CNVs as part of genome evolution may provide insights into the genetic basis of important agricultural traits and contribute to future crop breeding. While available methods for detecting CNVs with next-generation sequencing technology have helped shed light on the prevalence and effects of CNVs, the complexity of crop genomes poses a major challenge and requires the development of additional tools.
Keywords: Asymmetric evolution; Copy number variation; Duplicated gene; Evolutionary fate; Gene expression
Year: 2020 PMID: 32591023 PMCID: PMC7318451 DOI: 10.1186/s12915-020-00798-0
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Fig. 1. Results and verification of CNV calling for the 93 rice accessions. a Phylogenetic tree of the 93 O. sativa accessions based on SNP markers, with two O. glumaepatula accessions (W1183 and W1187, purple branches) used as the outgroup. The O. sativa Xian and Geng groups are marked in yellow-green and blue, respectively. The red branches represent two tropical O. sativa accessions from Southeast Asia. b Number of deletions (red) and duplications (blue) in each accession compared with the Nipponbare RefSeq. c, d Depth distribution around GL7 (c) and the promoter of IPA1 (d). The red and blue bars show the duplicated and normal regions, respectively. Each bin represents a length of 5 bp. XF13 and XF75 were selected as negative controls. e, f PCR verification of the duplications around GL7 (e) and the promoter of IPA1 (f)
Fig. 2. The impact of copy number variation on gene expression. a–c Distributions of expression fold (duplicated relative to normal copy number) for the positively correlated genes (a), negatively correlated genes (b), and non-significantly correlated genes (c). CN1 denotes a copy number of 1, and so on. * and ** indicate significant differences at P < 0.05 and P < 0.01, respectively, determined by the Tukey HSD test in R. Outliers (outside μ ± 3σ) are not displayed. d Correlation between copy number and expression level of GL7 (LOC_Os07g41200); a TPM outlier from the CN1 group was discarded. The linear fit and its significance test were performed with the “trendline” function from the “basicTrendline” package in R. e Distributions of the increase rate for two statistics of the positively correlated genes: AddCN1 (adding one copy at a time) and DupCN1 (duplication compared with normal copy number). Values greater than 400% are not included in the figure. The data in the pink-shaded area account for more than 80% of each group. f The different effects of tandem duplications (TD) and non-tandem duplications (nonTD) on gene expression level. * and ** indicate significant differences at P < 0.05 and P < 0.01, respectively, determined by the Wilcoxon test in R. Outliers (outside μ ± 3σ) are not displayed
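The μ ± 3σ display filter and the expression fold mentioned in the caption can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names are hypothetical, the use of the population standard deviation is an assumption (the caption does not say which estimator was used), and the fold is assumed to be the TPM of a duplicated accession divided by the mean TPM of the CN1 accessions.

```python
from statistics import mean, pstdev


def drop_3sigma_outliers(values):
    """Keep only values within mean ± 3 standard deviations.

    Assumption: population SD (pstdev); the sample SD would be an
    equally plausible reading of the caption.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [v for v in values if abs(v - mu) <= 3 * sigma]


def expression_fold(tpm_duplicated, tpm_cn1_mean):
    """Expression fold of a duplicated gene relative to the CN1 average (assumed definition)."""
    return tpm_duplicated / tpm_cn1_mean


# Toy data (not from the paper): twenty ordinary values and one extreme value.
toy = [1.0] * 20 + [100.0]
kept = drop_3sigma_outliers(toy)
print(len(toy), len(kept))
```

Note that a 3σ filter can only remove a point when the group is reasonably large; with very few values no single point can lie more than 3σ from the mean.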
Fig. 3. The expression and evolution of duplicated genes. a Ks distribution of non-pseudogenetic duplicate pairs. The four Ks values (marked by red dotted lines) represent key events in the evolution of the genus Oryza, according to Stein et al. (2018). “Recent” denotes pairs whose Ks values are 0. b Composition of the pseudogene copies. About half of the pseudogene copies were indistinguishable, and the rest were dominated by offspring copies. c–e Differences in Ka (c), Ks (d), and Ka/Ks (e) among neo-functionalized (Neo-), subfunctionalized (Sub-), and undifferentiated (Non-) duplicated genes, functional gene-pseudogene pairs (Gene-Ψ), and pseudogene-pseudogene pairs (Ψ-Ψ). * and ** indicate significant differences at P < 0.05 and P < 0.01, respectively, determined by the Wilcoxon test in R. Outliers (outside μ ± 3σ) are not displayed. f, g Dosage sharing of major/minor (f) and parent/offspring (g) copies. The expression fold was normalized to the average TPM value of the corresponding normal gene (CN = 1). ** indicates a significant difference at P < 0.01 determined by the Tukey HSD test in R. Outliers (outside μ ± 3σ) are not displayed. h, i Proportions of major/minor copies (h) or differentiated copies (i) between parent and offspring copies
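The Ks thresholds in panel a correspond to the time boundaries listed in the table of six stages. A common way to relate the two is T = Ks / (2r), where r is the synonymous substitution rate per site per year. The sketch below assumes r = 6.5 × 10⁻⁹; this record does not state the rate, and the value is chosen here only because it reproduces the tabulated boundaries. The function name is hypothetical.

```python
# Assumed synonymous substitution rate (per site per year); not stated in this record.
ASSUMED_RATE = 6.5e-9


def ks_to_mya(ks, rate=ASSUMED_RATE):
    """Convert a Ks value to divergence time in million years ago: T = Ks / (2 * rate)."""
    return ks / (2 * rate) / 1e6


# Ks boundaries taken from the table of six stages.
for ks in (0.0072, 0.0313, 0.0879, 0.195):
    print(f"Ks = {ks}: about {ks_to_mya(ks):.2f} MYA")
```

Under this assumed rate the four boundaries come out at about 0.55, 2.41, 6.76, and 15 MYA, matching the Time (MYA) column.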
The statistics of non-pseudogenetic duplicate pairs at six stages
| Stages | Range of Ks | Time (MYA) | Duplicate pairs (a) | Non-differentiated pairs | Differentiated pairs | Rate of differentiated pairs (%) | Neo-pairs | Sub-pairs | Rate of neo-pairs in differentiated pairs (%) |
|---|---|---|---|---|---|---|---|---|---|
| I | = 0 | Recent | 1899 | 1891 | 8 | 0.42 | 0 | 8 | 0 |
| II | 0–0.0072 | 0–0.55 | 5 | 3 | 2 | 40 | 0 | 2 | 0 |
| III | 0.0072–0.0313 | 0.55–2.41 | 417 | 395 | 22 | 5.28 | 9 | 13 | 40.91 |
| IV | 0.0313–0.0879 | 2.41–6.76 | 1351 | 1181 | 170 | 12.58 | 103 | 67 | 60.59 |
| V | 0.0879–0.195 | 6.76–15 | 1369 | 1132 | 237 | 17.31 | 89 | 148 | 37.55 |
| VI | > 0.195 | > 15 | 297 | 271 | 26 | 8.75 | 25 | 1 | 96.15 |
| Total | – | – | 5338 | 4873 | 465 | 8.71 | 226 | 239 | 48.60 |
(a) A gene with a copy number of 3 yields three duplicate pairs. Genes with a CDS length < 150 bp were excluded
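The pair counting in the footnote and the two rate columns can be reproduced with a short Python sketch. It assumes that "duplicate pairs" means all unordered pairs of copies, which is consistent with three copies giving three pairs. The copy labels and function names are hypothetical; the counts are taken from the table.

```python
from itertools import combinations


def duplicate_pairs(copies):
    """All unordered pairs of gene copies (assumed meaning of 'duplicate pairs')."""
    return list(combinations(copies, 2))


def pct(part, whole):
    """Percentage rounded to two decimals, as in the table."""
    return round(100 * part / whole, 2)


# A gene with three copies (hypothetical labels) yields three pairs.
print(duplicate_pairs(["copy1", "copy2", "copy3"]))

# Counts per stage from the table: (duplicate pairs, differentiated pairs, neo-pairs).
stages = {
    "III": (417, 22, 9),
    "IV": (1351, 170, 103),
    "V": (1369, 237, 89),
    "VI": (297, 26, 25),
    "Total": (5338, 465, 226),
}
for name, (pairs, diff, neo) in stages.items():
    print(name, pct(diff, pairs), pct(neo, diff))
```

The rate of differentiated pairs is differentiated pairs divided by duplicate pairs, and the rate of neo-pairs is neo-pairs divided by differentiated pairs; both reproduce the tabulated percentages.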
Fig. 4. Model of the asymmetric evolution of duplicated genes. This model applies only to cases where the two copies are separated after duplication. The copy that moved to a new genomic position is considered the offspring copy, and the other the parent copy. After moving to a new genomic region, the offspring copy is no longer affected by the original regulatory network, and its expression may be more active. At the same time, because of functional redundancy, the functional constraint on the offspring copy is weaker than that on the original gene, so it more readily accumulates deleterious mutations and degenerates into a pseudogene. Conversely, owing to the high expression of the new copy and feedback regulation, the expression level of the parent copy is relatively low. The parent copies are more constrained because they remain in their original positions; as the offspring copies accumulate deleterious mutations that affect their function, the parent copies become even more functionally constrained to maintain the original function. Thus, the evolution of two separated copies of a gene is asymmetric in terms of expression and fate