Lu Fang, Feng Cheng, Jian Wu, Xiaowu Wang.
Abstract
Whole genome duplication (WGD) and tandem duplication (TD) are both important modes of gene expansion. However, how WGD influences tandemly duplicated genes is not well studied. We used Brassica rapa, which has undergone an additional whole genome triplication (WGT) and shares a common ancestor with Arabidopsis thaliana, Arabidopsis lyrata, and Thellungiella parvula, to investigate the impact of genome triplication on tandem gene evolution. We identified 2,137, 1,569, 1,751, and 1,135 tandem gene arrays in B. rapa, A. thaliana, A. lyrata, and T. parvula, respectively. Among them, 414 conserved tandem arrays are shared by the three species without WGT and were therefore considered to have existed in the diploid ancestor of B. rapa. Thus, if every array had been retained after genome triplication, B. rapa would have 1,242 tandem arrays (3 × 414). We found that 400 of the 414 tandems had at least one syntenic ortholog in the B. rapa genome. Furthermore, 294 of these 400 maintained tandem arrays (more than one gene for each syntenic hit) in B. rapa. For the 294 tandem arrays, we obtained 426 copies of syntenic paralogous tandems in the triplicated genome of B. rapa. In this study, we demonstrated that tandem arrays in B. rapa were dramatically fractionated after WGT when compared either to non-tandem genes in the B. rapa genome or to the tandem arrays in closely related species that have not experienced a recent whole genome polyploidization event.
Keywords: Arabidopsis lyrata; Arabidopsis thaliana; Brassica rapa; Thellungiella parvula; tandem duplication; tandem gene evolution; whole genome duplication
Year: 2012 PMID: 23226149 PMCID: PMC3509317 DOI: 10.3389/fpls.2012.00261
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure A1. The phylogenetic tree shown above comes from a website (.
Figure 1. Distribution of tandemly repeated gene arrays in the genomes of A. thaliana, A. lyrata, B. rapa, and T. parvula. The number of genes in each tandem array mostly ranged from 2 to 6; data from tandem arrays with more than seven genes were combined. Tandemly repeated gene arrays were identified using the BLASTP program with a threshold of E < 10^-20. One unrelated gene among cluster members was tolerated. In both (A) and (B), the frequency of each array size is shown on the vertical axis, and the number of tandemly duplicated genes per array is shown on the horizontal axis. The histogram shows the number of clusters in the genome containing two to n similar gene units in tandem. (A) The distribution of gene number in all tandem arrays of A. thaliana, A. lyrata, B. rapa, and T. parvula. (B) The distribution of gene number in the shared syntenic tandem arrays among A. thaliana, A. lyrata, B. rapa, and T. parvula.
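The identification rule described in the Figure 1 caption (similar genes lying next to each other, with at most one unrelated gene tolerated between members) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `find_tandem_arrays`, the input layout (one chromosome's genes in positional order, plus a list of gene pairs whose BLASTP E-value already passed the threshold), and the use of union-find to merge neighbours are all hypothetical choices made for this sketch.

```python
def find_tandem_arrays(gene_order, similar_pairs, max_spacers=1):
    """Group neighbouring similar genes on one chromosome into tandem arrays.

    gene_order    -- gene IDs in chromosomal order (hypothetical input layout)
    similar_pairs -- pairs of gene IDs whose BLASTP hit passed the E-value
                     cutoff (assumed to be pre-filtered, e.g. E < 1e-20)
    max_spacers   -- number of unrelated genes tolerated between two members
    """
    similar = {frozenset(pair) for pair in similar_pairs}
    parent = list(range(len(gene_order)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    n = len(gene_order)
    for i in range(n):
        # Only compare genes separated by at most `max_spacers` other genes.
        for j in range(i + 1, min(i + max_spacers + 2, n)):
            if frozenset((gene_order[i], gene_order[j])) in similar:
                parent[find(j)] = find(i)

    groups = {}
    for i, gene in enumerate(gene_order):
        groups.setdefault(find(i), []).append(gene)
    # A tandem array needs at least two member genes.
    return [members for members in groups.values() if len(members) > 1]


# Toy data (invented for illustration only).
order = ["g1", "g2", "g3", "g4", "g5", "g6", "g7"]
pairs = [("g1", "g2"), ("g2", "g4"), ("g6", "g7"), ("g1", "g5")]
arrays = find_tandem_arrays(order, pairs)
print(arrays)
```

In the toy data, `g3` acts as the single tolerated spacer between `g2` and `g4`, while the `g1`/`g5` pair is ignored because three genes separate them.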
Figure 2. Venn diagram showing unique and shared tandem arrays between and among the three dicotyledonous species (A. thaliana, A. lyrata, and T. parvula).
The distribution of gene numbers in tandem arrays from the three sub-genomes of B. rapa.
| | LF | MF1 | MF2 | Total |
|---|---|---|---|---|
| 0 gene | 88 | 137 | 173 | 398 |
| 1 gene | 141 | 144 | 133 | 418 |
| >1 genes | 185 | 133 | 108 | 426 |
| Total | 414 | 414 | 414 | 1,242 |
“0 gene” refers to ancestral tandem arrays whose genes were all lost in the given sub-genome of B. rapa.
“1 gene” refers to ancestral tandem arrays that were fractionated to a single gene in the given sub-genome of B. rapa.
“>1 genes” refers to ancestral tandem arrays that were maintained as tandem arrays in the given sub-genome of B. rapa.
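As a quick arithmetic check on the table above, the per-sub-genome counts can be turned into fractions of the 414 ancestral arrays. The snippet below is a minimal sketch: the numbers are copied from the table, while the dictionary layout and the function name `retention_fractions` are hypothetical.

```python
# Counts copied from the table above: ancestral tandem arrays per B. rapa
# sub-genome, classified by how many genes survive in that sub-genome.
counts = {
    "LF":  {"0 gene": 88,  "1 gene": 141, ">1 genes": 185},
    "MF1": {"0 gene": 137, "1 gene": 144, ">1 genes": 133},
    "MF2": {"0 gene": 173, "1 gene": 133, ">1 genes": 108},
}


def retention_fractions(table):
    """Convert per-class counts into fractions of each sub-genome's total."""
    fractions = {}
    for subgenome, by_class in table.items():
        total = sum(by_class.values())
        fractions[subgenome] = {cls: n / total for cls, n in by_class.items()}
    return fractions


fractions = retention_fractions(counts)
maintained = sum(by_class[">1 genes"] for by_class in counts.values())
expected = 3 * 414  # every ancestral array retained in all three sub-genomes

for subgenome, by_class in fractions.items():
    print(subgenome, {cls: round(f, 3) for cls, f in by_class.items()})
print("maintained arrays:", maintained, "of", expected)
```

Each sub-genome column sums to 414, and the maintained arrays sum to the 426 reported in the abstract, with LF retaining the most arrays and MF2 the fewest.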
Statistical test between the number of syntenic tandem arrays and all syntenic genes among .
| Number of shared tandem arrays | Number of syntenic genes | P-value |
|---|---|---|
| 414 | 15,791 | 1.303e-06 |
| 426 | 22,630 | |
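A comparison of this kind can be sketched with a two-sided Fisher's exact test on a 2×2 table. The code below is an assumption-laden illustration: it assumes the four counts in the table above (414 / 15,791 and 426 / 22,630) form the cells of the contingency table directly, which the source does not spell out, and the function names are hypothetical. It is therefore not guaranteed to reproduce the reported P-value exactly. Only the Python standard library is used.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_total = log_choose(row1 + row2, col1)

    def log_prob(x):
        # Log-probability that the top-left cell equals x, margins fixed.
        return log_choose(row1, x) + log_choose(row2, col1 - x) - log_total

    observed = log_prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p = sum(
        exp(lp)
        for lp in (log_prob(x) for x in range(lowest, highest + 1))
        if lp <= observed + 1e-7  # small tolerance for floating-point ties
    )
    return min(1.0, p)


# Assumed layout: rows are the two compared genome sets, columns are
# tandem arrays and syntenic genes.
p_value = fisher_exact_two_sided(414, 15791, 426, 22630)
print(f"{p_value:.3e}")
```

Working in log space with `lgamma` keeps the binomial coefficients from overflowing at genome-scale counts.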
Fisher’s exact test between the number of shared syntenic tandem arrays and all syntenic non-tandem genes in .
| Number of shared tandem arrays | Number of syntenic genes | P-value |
|---|---|---|
| 658 | 18,388 | 3.47e-03 |
| 917 | 29,538 | |
| 603 | 18,125 | 1.35e-03 |
| 854 | 30,250 | |
| 524 | 17,303 | 1.85e-17 |
| 524 | 29,433 | |
Statistical test between the number of tandem arrays and the number of non-tandem genes in the three species, .
| | Number of tandem arrays | Number of genes | P-value |
|---|---|---|---|
| LF of B. rapa | 185 | 10,145 | 2.08e-05 |
| | 414 | 15,791 | |
| MF1 of B. rapa | 133 | 6,950 | 8.21e-04 |
| | 414 | 15,791 | |
| MF2 of B. rapa | 108 | 5,535 | 3.29e-03 |
| | 414 | 15,791 | |
Statistical test between the number of shared syntenic tandem arrays and all syntenic non-tandem genes in .
| Number of shared tandem arrays | Number of syntenic genes | P-value |
|---|---|---|
| 391 | 16,063 | 0.088 |
| 336 | 15,327 | |