Hiroaki Sakai, Hiroshi Mizuno, Yoshihiro Kawahara, Hironobu Wakimoto, Hiroshi Ikawa, Hiroyuki Kawahigashi, Hiroyuki Kanamori, Takashi Matsumoto, Takeshi Itoh, Brandon S Gaut.
Abstract
Gene duplication occurs by either DNA- or RNA-based processes; the latter duplicates single genes via retroposition of messenger RNA. The expression of a retroposed gene copy (retrocopy) is expected to be uncorrelated with its source gene because upstream promoter regions are usually not part of the retroposition process. In contrast, DNA-based duplication often encompasses both the coding and the intergenic (promoter) regions; hence, expression is often correlated, at least initially, between DNA-based duplicates. In this study, we identified 150 retrocopies in rice (Oryza sativa L. ssp. japonica), most of which represent ancient retroposition events. We measured their expression from high-throughput RNA sequencing (RNAseq) data generated from seven tissues. At least 66% of the retrocopies were expressed but at lower levels than their source genes. However, the tissue specificity of retrogenes was similar to their source genes, and expression between retrocopies and source genes was correlated across tissues. The level of correlation was similar between RNA- and DNA-based duplicates, and they decreased over time at statistically indistinguishable rates. We extended these observations to previously identified retrocopies in Arabidopsis thaliana, suggesting they may be general features of the process of retention of plant retrogenes.
Year: 2011 PMID: 22042334 PMCID: PMC3240961 DOI: 10.1093/gbe/evr111
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure: The distribution of the number of synonymous substitutions per site (dS) for the 150 retrocopies compared with their parental genes.
Figure: Examples of retrogenes with altered structures compared with their parental genes. (A) The retrocopy (bottom) has evolved a new 5′ UTR relative to its parental gene (top). (B) The retrocopy (bottom) evolved a new 3′ exon consisting of a part of protein coding region. (C) An example of an intronization event. There are two cDNA variants in the retroposed region (Os07t0693100-01 and Os07t0693100-02). Part of the Os07t0693100-01 transcript is missing in Os07t0693100-02, presumably due to an intronization event. In all figures, the blue boxes indicate annotated protein coding regions and the gray boxes indicate UTRs. Red histograms under the retrogene structures diagram the numbers of mRNAseq reads associated with the retrogenes.
Number of Short-Reads Derived by RNAseq and Summary of Mapping Results
| Tissue | Total No. of Short Reads | No. of Uniquely Mapped Reads | Total Nucleotide Length of the Mapped Reads (bp) | Fold Coverage of the Uniquely Mapped Reads against RAP-Annotated Regions |
| --- | --- | --- | --- | --- |
| Callus | 36,642,482 | 23,506,559 | 1,071,781,348 | 20.2 |
| Leaf | 33,537,231 | 19,334,815 | 886,753,671 | 16.7 |
| Panicle (before flowering) | 39,438,983 | 22,845,156 | 1,051,085,731 | 19.8 |
| Panicle (after flowering) | 46,071,816 | 28,541,187 | 1,392,167,801 | 26.2 |
| Root | 33,307,824 | 20,493,666 | 927,208,012 | 17.5 |
| Seed | 48,619,562 | 27,646,800 | 1,364,148,154 | 25.7 |
| Shoot | 40,442,971 | 24,143,731 | 1,135,190,328 | 21.4 |
| Total | 278,060,869 | 166,511,914 | 7,828,335,045 | 147.4 |
Figure: Diagram of gene expression counts for the parental gene (solid line) and the retrogene (dashed line) in seven tissues (CA = callus, LE = leaf, PBF = panicle before flowering, PAF = panicle after flowering, RO = root, SE = seed, SH = shoot). The diagrams beneath each graph represent the structure of the parental gene (top) and its retrocopy (bottom).
Figure: The relationship between the correlation in gene expression between pairs of genes, as represented by Y [=log((1 + R)/(1 − R))], and molecular divergence (dS). (A) RNA-based duplicates (Retrogenes). (B) DNA-based duplicates.
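The transformed correlation Y in this caption can be illustrated with a short sketch. It assumes R is a Pearson correlation computed across the seven tissues and that the logarithm is natural; the record states neither. The expression values and the function names `pearson_r` and `y_transform` are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def y_transform(r):
    """Y = log((1 + R) / (1 - R)); natural log is an assumption here."""
    if not -1.0 < r < 1.0:
        raise ValueError("R must lie strictly between -1 and 1")
    return math.log((1.0 + r) / (1.0 - r))


# Hypothetical expression levels in the seven tissues
# (CA, LE, PBF, PAF, RO, SE, SH); these are not data from the study.
parental = [12.0, 30.5, 8.2, 9.1, 22.4, 3.3, 27.8]
retrocopy = [2.1, 6.0, 1.5, 2.2, 3.9, 0.4, 5.1]

r = pearson_r(parental, retrocopy)
print(f"R = {r:.3f}, Y = {y_transform(r):.3f}")
```

With a natural logarithm, Y equals twice the Fisher z-transform of R. It maps correlations from the bounded interval (−1, 1) onto the whole real line, which suits plotting and regression against dS.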
Figure: Analyses of retrocopies from Arabidopsis thaliana. (A) The distribution of dS between 48 retrocopies and their parental genes. (B) The relationship between Y [=log((1 + R)/(1 − R))] and molecular divergence (dS) for 24 retroparent pairs with available expression data.
Comparison of the Number of Tissue Specific Genes in Each Tissue between Retrogenes and Parental Genes
| Tissue | Retrogene: No. of Tissue-Specific Genes | Retrogene: No. of Nonspecific Genes | Parental Gene: No. of Tissue-Specific Genes | Parental Gene: No. of Nonspecific Genes | P value |
| --- | --- | --- | --- | --- | --- |
| Callus | 15 | 72 | 17 | 66 | 0.60 |
| Leaf | 13 | 74 | 20 | 63 | 0.13 |
| Panicle 1 | 24 | 63 | 27 | 56 | 0.48 |
| Panicle 2 | 26 | 61 | 32 | 51 | 0.24 |
| Root | 34 | 53 | 26 | 57 | 0.29 |
| Seed | 13 | 74 | 12 | 71 | 0.93 |
| Shoot | 22 | 65 | 14 | 69 | 0.18 |
Tissue-specific genes are defined using Ueda's Akaike Information Criterion (AIC)-based method.
Panicle 1, panicle before flowering; panicle 2, panicle after flowering.
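The final column of the table above appears to hold P values comparing the split between tissue-specific and nonspecific genes in retrogenes and parental genes. The record does not name the test. As an assumption, the sketch below applies a Pearson chi-square test (1 degree of freedom, no continuity correction) to the 2×2 counts in one row; the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p) with 1 degree of freedom and no continuity
    correction. The choice of test is an assumption, not taken
    from the source.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with df = 1: P(X > chi2).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Leaf row of the table: retrogenes 13 specific / 74 nonspecific,
# parental genes 20 specific / 63 nonspecific.
chi2, p = chi_square_2x2(13, 74, 20, 63)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

For the leaf row this gives a P value close to the tabulated 0.13. A different test, such as Fisher's exact test, would give somewhat different numbers, so the match does not confirm the method.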