Yuannian Jiao, Jim Leebens-Mack, Saravanaraj Ayyampalayam, John E Bowers, Michael R McKain, Joel McNeal, Megan Rolf, Daniel R Ruzicka, Eric Wafula, Norman J Wickett, Xiaolei Wu, Yong Zhang, Jun Wang, Yeting Zhang, Eric J Carpenter, Michael K Deyholos, Toni M Kutchan, Andre S Chanderbali, Pamela S Soltis, Dennis W Stevenson, Richard McCombie, J Chris Pires, Gane Ka-Shu Wong, Douglas E Soltis, Claude W Depamphilis.
Abstract
BACKGROUND: Although it is agreed that a major polyploidy event, gamma, occurred within the eudicots, the phylogenetic placement of the event remains unclear.
Year: 2012 PMID: 22280555 PMCID: PMC3334584 DOI: 10.1186/gb-2012-13-1-r3
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Schematic phylogenetic tree of flowering plants. BR1 to BR4 denote potential time points when the γ event may have occurred. BR1, monocots + eudicots duplication; BR2, eudicot-wide duplication; BR3, core eudicot-wide duplication; BR4, rosid-wide duplication.
Summary of datasets for eight sequenced plant genomes included in this study
| Species | Annotation version | Number of annotated genes |
|---|---|---|
| | TAIR version 9 | 27,379 |
| | ASGPB release | 25,536 |
| | BGI release | 21,635 |
| | JGI version 2.0 | 41,377 |
| | Phytozome version 1.0 | 55,787 |
| | Genoscope release | 30,434 |
| | RGAP release 6.1 | 56,979 |
| | JGI version 1.4 | 34,496 |
These eight genome sequences were used to construct orthogroups, which were then populated with additional unigenes of asterids, basal eudicots, non-grass monocots, and basal angiosperms. The number of annotated genes in each genome is indicated. ASGPB, Advanced Studies of Genomics, Proteomics and Bioinformatics; JGI, Joint Genome Institute; RGAP, Rice Genome Annotation Project; TAIR, The Arabidopsis Information Resource.
Summary of unigene sequences of asterids, basal eudicots, non-grass monocots, and basal angiosperms included in phylogenetic study
| Species | Lineage | Source | Number of reads/ESTs | Size of data | Assembly method(s) | Number of unigenes |
|---|---|---|---|---|---|---|
| | Asterid | NCBI-SRA | 209,745 | 89.7 Mb | MIRA | 22,881 |
| | Asterid | PPGP | 69,545,362 | 5.9 Gb | CLC | 104,904 |
| | Asterid | TIGR PTA | 93,279 | NA | Megablast-CAP3 | 44,662 |
| | Asterid | TIGR PTA | 219,485 | NA | Megablast-CAP3 | 81,072 |
| | Asterid | PlantGDB | 231,012 | NA | Vmatch-PaCE-CAP3 | 39,577 |
| | Basal eudicot | 1KP + SRA | 140,604,904 + 3,709,876 | 10.3 Gb + 1.3 Gb | MIRA-SOAPDenovo-CAP3 | 252,894 |
| | Basal eudicot | 1KP | 134,478,938 | 9.8 Gb | SOAPDenovo-CAP3 | 406,167 |
| | Basal eudicot | 1KP | 157,506,374 | 11.5 Gb | SOAPDenovo-CAP3 | 383,426 |
| | Basal eudicot | 1KP | 89,663,900 | 6.5 Gb | SOAPDenovo-CAP3 | 201,564 |
| | Basal eudicot | NCBI + SRA + 1KP | 14,381 + 559,470 + 133,422,402 | 6.8 Mb + 55 Mb + 9.7 Gb | MIRA-SOAPDenovo-CAP3 | 165,260 |
| | Basal eudicot | 1KP + NCBI | 144,520,360 + 1,692 | 10.5 Gb + 1 Mb | SOAPDenovo-CAP3 | 148,533 |
| | Basal eudicot | 1KP | 29,156,514 | 2.1 Gb | CLC-CAP3 | 46,024 |
| | Basal eudicot | 1KP | 20,139,210 | 1.5 Gb | CLC-CAP3 | 31,472 |
| | Basal eudicot | 1KP | 25,508,642 | 1.9 Gb | CLC-CAP3 | 42,373 |
| | Basal eudicot | PlantGDB | 85,040 | NA | Vmatch-PaCE-CAP3 | 19,615 |
| | Caryophyllid | PlantGDB | 27,553 | NA | Vmatch-PaCE-CAP3 | 11,317 |
| | Caryophyllid | PlantGDB | 25,883 | NA | Vmatch-PaCE-CAP3 | 18,009 |
| | Monocot | MonATOL + 1KP | 149,320 + 15,427,316 | 44.9 Mb + 1.1 Gb | MIRA-SOAPDenovo-CAP3 | 59,453 |
| | Monocot | MonATOL | 33,100,948 | 2.5 Gb | CLC | 68,489 |
| | Monocot | MonATOL | 59,505,714 | 4.5 Gb | CLC | 58,766 |
| | Monocot | MonATOL | 49,121,506 | 3.7 Gb | CLC | 63,269 |
| | Monocot | MonATOL | 70,733,124 | 5.7 Gb | CLC | 57,980 |
| | Magnoliid | AAGP | 2,336,819 | 683 Mb | MIRA | 132,532 |
| | Magnoliid | AAGP | 3,930,505 | 880 Mb | MIRA | 155,371 |
| | Magnoliid | AAGP | 2,327,654 | 543 Mb | MIRA | 137,923 |
| | Basal angiosperm | AAGP | 3,889,719 | 1.1 Gb | MIRA | 289,773 |
| | Basal angiosperm | AAGP | 2,943,273 | 776 Mb | MIRA | 208,394 |
1KP, 1000 Green Plant Transcriptome Project; AAGP, Ancestral Angiosperm Genome Project [44]; MonATOL, Monocot Tree of Life Project [42]; NA, not available; NCBI, National Center for Biotechnology Information; PPGP, Parasitic Plant Genome Project [65]; SRA, Sequence Read Archive; TIGR PTA, The Institute for Genomic Research Plant Transcript Assemblies [66].
Figure 2. Exemplar maximum likelihood phylogeny of Ortho 1202. RAxML topology of an orthogroup (Ortho 1202) indicating that the γ paralogs of Vitis were duplicated before the split of rosids and asterids and after the early radiation of eudicots. The scored bootstrap (BS) value for this duplication is over 80%, because nodes #1 and #2 (and/or #3) have BS > 80%. Legend: green star = core eudicot duplication; colored circles = recent independent duplications; numbers = bootstrap support values.
Figure 3. Exemplar maximum likelihood phylogeny of Ortho 1083. RAxML topology of an orthogroup (Ortho 1083) indicating that the γ paralogs of Vitis were duplicated before the split of rosids and asterids, and after the early radiation of eudicots. The scored bootstrap (BS) value for this duplication is over 50%, because node #1 has BS < 80%. Legend: green star = core eudicot duplication; colored circles = recent independent duplications; numbers = bootstrap support values.
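The scoring rule described in the captions of Figures 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `score_duplication` is hypothetical, the rule is read as "node #1 and at least one of nodes #2 and #3 must reach the threshold", and the thresholds are applied as ≥ 80 and ≥ 50 to match the summary table. The bootstrap values in the usage lines are invented for demonstration and are not taken from the figures.

```python
from typing import Optional


def score_duplication(
    bs_node1: float,
    bs_node2: Optional[float],
    bs_node3: Optional[float],
) -> Optional[int]:
    """Return the highest bootstrap class (80 or 50) a duplication reaches.

    Assumed reading of the captions: node #1 and at least one of
    nodes #2 / #3 must have bootstrap support at or above the threshold.
    Returns None when even the 50% class is not reached.
    """
    # Ignore a flanking node that is absent from the tree.
    flanking = [bs for bs in (bs_node2, bs_node3) if bs is not None]
    for threshold in (80, 50):
        if bs_node1 >= threshold and any(bs >= threshold for bs in flanking):
            return threshold
    return None


# Hypothetical bootstrap values, for illustration only.
print(score_duplication(95, 85, 40))  # all conditions met at the 80% level
print(score_duplication(70, 90, 90))  # node #1 below 80, so only the 50% class
```

Under this reading, a duplication whose node #1 falls below 80% is scored in the 50% class regardless of how well the flanking nodes are supported, which matches the explanation given for Ortho 1083.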
Phylogenetic timing of Vitis γ duplications inferred from orthogroup phylogenetic histories
| Ortho | BR1, BS ≥ 80 | BR1, BS ≥ 50 | BR2, BS ≥ 80 | BR2, BS ≥ 50 | BR3, BS ≥ 80 | BR3, BS ≥ 50 | BR4, BS ≥ 80 | BR4, BS ≥ 50 |
|---|---|---|---|---|---|---|---|---|
| Duplications | 0 | 7 | 19 | 70 | 80 | 168 | 4 | 6 |
| Percent | 0% | 2.8% | 18.3% | 27.9% | 77.7% | 67% | 4% | 2.3% |
BRx designations are illustrated in Figure 1. Bootstrap (BS) ≥80 and BS ≥50 are counts of nodes resolved with BS ≥80 or ≥50, respectively.
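The percentages in the table appear to be each branch's share of all duplications scored at a given bootstrap threshold. The following Python sketch recomputes them from the counts under that assumption; the names `COUNTS` and `branch_percentages` are illustrative and do not come from the source.

```python
# Duplication counts copied from the table, grouped by bootstrap threshold.
COUNTS = {
    "BS>=80": {"BR1": 0, "BR2": 19, "BR3": 80, "BR4": 4},
    "BS>=50": {"BR1": 7, "BR2": 70, "BR3": 168, "BR4": 6},
}


def branch_percentages(row: dict) -> dict:
    """Share of duplications per branch, in percent, rounded to one decimal."""
    total = sum(row.values())
    if total == 0:
        return {branch: 0.0 for branch in row}
    return {branch: round(100 * n / total, 1) for branch, n in row.items()}


for threshold, row in COUNTS.items():
    print(threshold, sum(row.values()), branch_percentages(row))
```

Recomputed values can differ from the table in the last digit (for example, BR2 at BS ≥ 80), which may reflect rounding or a slightly different denominator in the source.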
Figure 4. Age distribution of γ duplications. (a) The inferred duplication times for 161 γ duplication nodes that support core eudicot-wide duplication (BS ≥50%) were analyzed by EMMIX to determine whether these duplications occurred randomly over time or within some small timeframe. Each component is written as 'color/mean molecular timing/proportion' where 'color' is the component (curve) color and 'proportion' is the percentage of duplication nodes assigned to the identified component. There is one statistically significant component: green/117 (mya)/1. (b) Distribution of inferred γ duplication times from 66 orthogroups that support a eudicot-wide duplication with BS ≥50%. There is one statistically significant component: blue/133 (mya)/1. (c) Distribution of inferred γ duplication times from combination of (a) and (b) shows one significant component: purple/121 (mya)/1. (d-f) Corresponding distributions of inferred duplication times from orthogroups with BS ≥80%. One significant component in (d), green/116 (mya)/1; one in (e), blue/135 (mya)/1; and one in (f), purple/120 (mya)/1.
Figure 5. Methods for sequence alignment and estimation of Ks were as reported (Cui et al. 2006), but were here limited to paralogous gene pairs retained on syntenic blocks in the Vitis genome. Colored lines superimposed on the Ks distribution represent significant duplication components identified by a likelihood mixture model as in Figure 4 (Materials and methods). (a) Ks distribution of 168 Vitis pairs supporting core eudicot-wide duplication in phylogenetic analysis. One statistically significant component: green/1.03/1. (b) Ks distribution of 70 Vitis pairs showing eudicot-wide duplications on phylogenies. One significant component: blue/1.31/1. (c) Ks distribution of the combined Vitis pairs supporting core eudicot-wide (a) and eudicot-wide (b) duplications on phylogenies. Three significant components: black/0.3/0.01, green/1.02/0.70, blue/1.40/0.29. (d) Ks distribution of 2,191 paralogous pairs identified from syntenic block analysis. Four significant components: black/0.12/0.02, green/1.09/0.74, blue/1.85/0.22, yellow/2.7/0.02.
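The 'color/mean/proportion' notation used for mixture components in the captions of Figures 4 and 5 is easy to parse and sanity-check. The Python sketch below is a reading aid, not part of the authors' pipeline: the names `Component`, `parse_component` and `parse_components` are hypothetical, and the only input is the component strings quoted in the captions. It checks that the proportions reported for one panel sum to 1.

```python
from typing import List, NamedTuple


class Component(NamedTuple):
    color: str         # curve color in the figure
    mean: float        # component mean (mya in Figure 4, Ks in Figure 5)
    proportion: float  # fraction of observations assigned to the component


def parse_component(text: str) -> Component:
    """Parse one 'color/mean/proportion' string."""
    color, mean, proportion = (part.strip() for part in text.split("/"))
    # Figure 4 writes means as '117 (mya)'; keep only the leading number.
    return Component(color, float(mean.split()[0]), float(proportion))


def parse_components(text: str) -> List[Component]:
    """Parse a comma-separated list of components for one panel."""
    return [parse_component(item) for item in text.split(",")]


# Component strings quoted from the Figure 5 caption, panels c and d.
PANEL_C = "black/0.3/0.01, green/1.02/0.70, blue/1.40/0.29"
PANEL_D = "black/0.12/0.02, green/1.09/0.74, blue/1.85/0.22, yellow/2.7/0.02"

for label, panel in (("c", PANEL_C), ("d", PANEL_D)):
    comps = parse_components(panel)
    total = sum(c.proportion for c in comps)
    print(label, [c.color for c in comps], round(total, 2))
```

In both panels the dominant component is the green one, with a mean close to 1.0 and a proportion of 0.70 or more.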