Haiyue Zeng1,2,3, Xingyu Chen3, Hongbo Li4, Jun Zhang5, Zhaoyuan Wei1,2, Yi Wang1,2.
Abstract
Retroduplication variation (RDV), a type of retrocopy polymorphism, is considered to have essential biological significance, but its effect on gene function and species phenotype is still poorly understood. To address this, we analyzed the retrocopies and RDVs in 3,010 rice genomes. We calculated the RDV frequencies in the genomes of each rice population; detected the mutated, ancestral and expressed retrogenes in rice genomes; and analyzed the influence of their RDVs on rice phenotypic traits. In total, 73 RDVs were identified, and 14 RDVs in ancestral retrogenes significantly affect rice phenotypes. Our results indicate that RDV plays an important role in rice migration, domestication and evolution, and we propose RDV as a promising candidate marker for molecular breeding. To our knowledge, this is the first study on the relationship between retrogene function, expression, RDV and species phenotype.
Keywords: BP, Biological process; CC, Cellular component; FPKM, Fragments per kilobase per million mapped reads; GO, Gene ontology; Indel, Insertion-deletion; MF, Molecular function; MST, Monosaccharide transporter; ORF, Open reading frame; RDV, Retroduplication variation; Retrocopy; Retrogene; Retroposition; Rice
Year: 2021 PMID: 33510865 PMCID: PMC7811064 DOI: 10.1016/j.csbj.2020.12.046
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. (A) Formation of a retrocopy: the mRNA transcribed from the gene is reverse transcribed into cDNA and subsequently inserted into the genome. (B) Formation of an RDV: an insertion or deletion in a retrocopy.
Fig. 2. Summary of 74 rice retrocopies (Table S1). (A) Parental gene protein coverage within the retrocopy-parental alignment. (B) Percent identity of the retrocopy-parental gene alignment. (C) Ratio of functional retrogenes to retropseudogenes. (D) Ratio of retrocopies with and without a conserved ORF. (E) Ka/Ks distribution of functional retrogenes and retropseudogenes.
Selective pressure of rice retrocopies.
| Gene | Ka/Ks < 0.5 | Ka/Ks = 0.5 ~ 1.2 | Ka/Ks > 1.2 | Geometric mean |
|---|---|---|---|---|
| Functional retrogene | 22/49 (44.90%) | 16/49 (32.65%) | 11/49 (22.45%) | 0.3747 |
| Retropseudogene | 3/25 (12.00%) | 19/25 (76.00%) | 3/25 (12.00%) | 0.5168 |
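As a quick arithmetic check on the table above, the following minimal Python sketch recomputes the bin percentages from the reported counts and shows how a geometric mean of Ka/Ks ratios could be taken. The function names are illustrative, the handling of the bin boundaries (0.5 and 1.2 counted in the middle bin) is an assumption, and the ratios passed to `geometric_mean` are hypothetical, because the per-retrocopy Ka/Ks values are not listed here.

```python
from math import exp, log


def kaks_bin(value):
    """Assign one Ka/Ks ratio to a bin of the table.

    Assumption: the boundary values 0.5 and 1.2 fall in the middle bin;
    the source does not say how ties were handled.
    """
    if value < 0.5:
        return "Ka/Ks < 0.5"
    if value <= 1.2:
        return "Ka/Ks = 0.5 ~ 1.2"
    return "Ka/Ks > 1.2"


def percentages(counts):
    """Convert per-bin counts to percentages of the row total (2 decimals)."""
    total = sum(counts)
    return [round(100 * c / total, 2) for c in counts]


def geometric_mean(values):
    """Geometric mean of strictly positive ratios, computed via logarithms."""
    return exp(sum(log(v) for v in values) / len(values))


# Counts copied from the table: [< 0.5, 0.5 ~ 1.2, > 1.2]
functional_pct = percentages([22, 16, 11])    # functional retrogenes, n = 49
pseudogene_pct = percentages([3, 19, 3])      # retropseudogenes, n = 25

# Hypothetical ratios, only to show the call; not data from the study.
example_mean = geometric_mean([0.25, 1.0, 4.0])

print(functional_pct, pseudogene_pct, example_mean)
```

A geometric mean suits ratio data such as Ka/Ks because it averages on the log scale, so a ratio of 0.25 and a ratio of 4.0 balance each other.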
Fig. 3. (A) Origin time of retrocopy based on a molecular clock. (B) Retroposition in Tracheophyta. Left boxes indicate the number of retrocopies originating in a certain ancestral genome; right boxes reveal the number of retrocopies in the Oryza sativa japonica genome. All detected ancestral retrocopies are marked by red triangles. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 4. (A) Seventy-three RDVs in 35 retrogenes. (B) Frequencies of RDVD_osat_9_1 in different subpopulations. (C) Frequencies of RDVD_osat_9_1 in different geographic regions.
Fig. 5. (A) The Venn diagram shows the number of retrogenes that are expressed, ancestral and mutated. (B) GO enrichment analysis. BP (biological process), CC (cellular component) and MF (molecular function). Enriched GO terms were selected by Fisher’s exact test and FDR < 0.05.
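The selection rule named in the Fig. 5 legend (Fisher's exact test followed by an FDR cutoff of 0.05) can be illustrated with a small, dependency-free Python sketch. It is not the authors' pipeline: the one-sided (enrichment) form of the test, the use of the Benjamini-Hochberg procedure for the FDR, the function names and all counts below are assumptions made for illustration.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value for over-representation.

    k: study-set genes annotated with the GO term
    n: size of the study set
    K: background genes annotated with the GO term
    N: size of the background
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts per GO term: (k, n, K, N). Not data from the study.
terms = {
    "term_a": (8, 40, 100, 20000),
    "term_b": (1, 40, 500, 20000),
    "term_c": (3, 40, 60, 20000),
}
raw = [fisher_enrichment_p(*counts) for counts in terms.values()]
fdr = benjamini_hochberg(raw)
enriched = [name for name, q in zip(terms, fdr) if q < 0.05]
print(enriched)
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow in the hypergeometric tail; a production analysis would more likely rely on an established statistics library.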
Effects of three RDVs in three retrogenes (retro_osat_9, retro_osat_16, retro_osat_27) on the agronomic traits and grain size of rice from the 3,000 Rice Genomes Project based on statistical analysis. The genotype that was homozygous for deletion was DD, the genotype that was homozygous for insertion was II, and the genotype that was heterozygous was DI. These genotypes apply below.
| Phenotype (RDVD_osat_9_1) | II (LSM ± SE) | DI (LSM ± SE) | DD (LSM ± SE) | P value | nII, nDI, nDD |
|---|---|---|---|---|---|
| Culm length (cm) | | | | | 718,32,1007 |
| Culm number (count) | | | | | 718,32,1008 |
| Grain length (mm) | 8.61 ± 0.04 | 8.53 ± 0.26 | 8.62 ± 0.03 | 0.7766 | 827,18,1163 |
| Grain width (mm) | | | | | 827,18,1163 |
| Ligule length (mm) | | | | | 712,32,992 |
| Panicle length (cm) | 25.12 ± 0.17 | 25.28 ± 0.71 | 24.77 ± 0.10 | 0.0579 | 716,32,1007 |
| Seedling height (cm) | | | | | 710,32,994 |
| Thousand grain weight (g) | | | | | 782,48,1007 |
| Time to flowering (day) | | | | | 845,37,1213 |

| Phenotype (RDVI_osat_16) | DD (LSM ± SE) | DI (LSM ± SE) | II (LSM ± SE) | P value | nDD, nDI, nII |
|---|---|---|---|---|---|
| Culm length (cm) | 112.05 ± 0.80 | 120.17 ± 15.26 | 114.29 ± 1.45 | 0.322 | 1368,6,383 |
| Culm number (count) | | | | | 1368,6,384 |
| Grain length (mm) | 8.61 ± 0.03 | 8.30 ± 0.65 | 8.65 ± 0.05 | 0.6747 | 1535,5,468 |
| Grain width (mm) | | | | | 1535,5,468 |
| Ligule length (mm) | | | | | 1352,6,378 |
| Panicle length (cm) | 24.89 ± 0.11 | 25.13 ± 0.87 | 25.02 ± 0.17 | 0.9974 | 1366,6,383 |
| Seedling height (cm) | 38.34 ± 0.32 | 39.83 ± 1.66 | 39.60 ± 0.60 | 0.1955 | 1353,6,377 |
| Thousand grain weight (g) | | | | | 1408,12,417 |
| Time to flowering (day) | 99.71 ± 0.57 | 98.71 ± 9.11 | 102.90 ± 1.14 | 0.0732 | 1624,7,464 |

| Phenotype (RDVI_osat_27_1) | DD (LSM ± SE) | DI (LSM ± SE) | II (LSM ± SE) | P value | nDD, nDI, nII |
|---|---|---|---|---|---|
| Culm length (cm) | 112.32 ± 0.83 | 115.06 ± 6.62 | 113.08 ± 1.34 | 0.8511 | 1236,16,505 |
| Culm number (count) | | | | | 1236,16,506 |
| Grain length (mm) | 8.62 ± 0.03 | 8.51 ± 0.25 | 8.63 ± 0.04 | 0.7492 | 1351,12,645 |
| Grain width (mm) | | | | | 1351,12,645 |
| Ligule length (mm) | | | | | 1224,16,496 |
| Panicle length (cm) | 24.91 ± 0.12 | 25.11 ± 0.84 | 24.94 ± 0.15 | 0.952 | 1232,16,507 |
| Seedling height (cm) | 38.62 ± 0.34 | 40.19 ± 2.79 | 38.58 ± 0.51 | 0.8919 | 1224,16,496 |
| Thousand grain weight (g) | 25.27 ± 0.13 | 23.84 ± 0.93 | 24.81 ± 0.17 | 0.0864 | 1268,19,550 |
| Time to flowering (day) | | | | | 1445,18,632 |
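The genotype counts reported for each trait (the numbers of II, DI and DD accessions) are enough to derive insertion and deletion allele frequencies. The sketch below is a minimal illustration of that arithmetic under the usual diploid counting rule; it is not the authors' frequency pipeline, and the function name `allele_frequencies` is hypothetical. The example counts are those of the grain length row for RDVD_osat_9_1.

```python
def allele_frequencies(n_ii, n_di, n_dd):
    """Return (insertion, deletion) allele frequencies from genotype counts.

    Each II accession carries two insertion alleles, each DI accession
    carries one of each, and each DD accession carries two deletion alleles.
    """
    total_alleles = 2 * (n_ii + n_di + n_dd)
    if total_alleles == 0:
        raise ValueError("at least one genotyped accession is required")
    insertion = (2 * n_ii + n_di) / total_alleles
    return insertion, 1.0 - insertion


# Grain length row for RDVD_osat_9_1: nII = 827, nDI = 18, nDD = 1163
ins_freq, del_freq = allele_frequencies(827, 18, 1163)
print(f"insertion allele: {ins_freq:.4f}, deletion allele: {del_freq:.4f}")
```

Because the counts differ slightly from trait to trait (not every accession was phenotyped for every trait), frequencies computed this way vary a little between rows of the same RDV.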
Effects of three RDVs in the distinct loci of one retrogene (retro_osat_34) on the agronomic traits and grain size of rice from the 3,000 Rice Genomes Project based on statistical analysis.
| Phenotype (RDVD_osat_34_2) | II (LSM ± SE) | DI (LSM ± SE) | DD (LSM ± SE) | P value | nII, nDI, nDD |
|---|---|---|---|---|---|
| Culm length (cm) | | | | | 1186,27,544 |
| Culm number (count) | | | | | 1186,27,545 |
| Grain length (mm) | | | | | 1360,22,626 |
| Grain width (mm) | | | | | 1360,22,626 |
| Ligule length (mm) | | | | | 1173,27,536 |
| Panicle length (cm) | 25.00 ± 0.12 | 24.62 ± 0.58 | 24.78 ± 0.13 | 0.2194 | 1185,27,543 |
| Seedling height (cm) | | | | | 1173,27,536 |
| Thousand grain weight (g) | | | | | 1314,32,491 |
| Time to flowering (day) | 99.93 ± 0.59 | 98.41 ± 4.40 | 101.57 ± 1.00 | 0.6254 | 1414,34,647 |

| Phenotype (RDVD_osat_34_3) | II (LSM ± SE) | DI (LSM ± SE) | DD (LSM ± SE) | P value | nII, nDI, nDD |
|---|---|---|---|---|---|
| Culm length (cm) | | | | | 1189,31,537 |
| Culm number (count) | | | | | 1189,31,528 |
| Grain length (mm) | | | | | 1364,23,621 |
| Grain width (mm) | | | | | 1364,23,621 |
| Ligule length (mm) | | | | | 1177,31,528 |
| Panicle length (cm) | 24.97 ± 0.12 | 25.06 ± 0.63 | 24.80 ± 0.13 | 0.4736 | 1188,31,536 |
| Seedling height (cm) | | | | | 1177,31,528 |
| Thousand grain weight (g) | | | | | 1321,36,480 |
| Time to flowering (day) | 99.83 ± 0.59 | 100.24 ± 4.22 | 101.73 ± 1.01 | 0.7257 | 1421,38,636 |

| Phenotype (RDVD_osat_34_4) | II (LSM ± SE) | DI (LSM ± SE) | DD (LSM ± SE) | P value | nII, nDI, nDD |
|---|---|---|---|---|---|
| Culm length (cm) | | | | | 1507,6,244 |
| Culm number (count) | | | | | 1508,6,244 |
| Grain length (mm) | | | | | 1727,4,277 |
| Grain width (mm) | | | | | 1727,4,277 |
| Ligule length (mm) | | | | | 1490,5,241 |
| Panicle length (cm) | 24.93 ± 0.10 | 24.33 ± 1.23 | 24.91 ± 0.19 | 0.9053 | 1506,6,243 |
| Seedling height (cm) | | | | | 1489,5,242 |
| Thousand grain weight (g) | | | | | 1637,4,196 |
| Time to flowering (day) | 99.76 ± 0.53 | 110.71 ± 10.16 | 104.26 ± 1.59 | 0.0424 | 1801,7,287 |
Fig. 6. Expression pattern of retrogenes with their parental genes. (A) Heat map of the expression pattern of retrogenes. (B) Comparison of the expression patterns between the above four typical retrogenes and their parental genes.