Chung-Shien Wu, Ya-Nan Wang, Chi-Yao Hsu, Ching-Ping Lin, Shu-Miaw Chaw.
Abstract
The relationships among the five extant gymnosperm groups (gnetophytes, Pinaceae, non-Pinaceae conifers [cupressophytes], Ginkgo, and cycads) remain equivocal. To clarify this issue, we sequenced the chloroplast genomes (cpDNAs) from two cupressophytes, Cephalotaxus wilsoniana and Taiwania cryptomerioides, and 53 common chloroplast protein-coding genes from another three cupressophytes, Agathis dammara, Nageia nagi, and Sciadopitys verticillata, and a non-Cycadaceae cycad, Bowenia serrulata. Comparative analyses of 11 conifer cpDNAs revealed that Pinaceae and cupressophytes each lost a different copy of inverted repeats (IRs), which contrasts with the view that the same IR has been lost in all conifers. Based on our structural finding, the character of an IR loss no longer conflicts with the "gnepines" hypothesis (gnetophytes sister to Pinaceae). Chloroplast phylogenomic analyses of amino acid sequences recovered incongruent topologies using different tree-building methods; however, we demonstrated that high heterotachous genes (genes that have highly different rates in different lineages) contributed to the long-branch attraction (LBA) artifact, resulting in incongruence of phylogenomic estimates. Additionally, amino acid compositions appear more heterogeneous in high than in low heterotachous genes among the five gymnosperm groups. Removal of high heterotachous genes alleviated the LBA artifact and yielded congruent and robust tree topologies in which gnetophytes and Pinaceae formed a sister clade to cupressophytes (the gnepines hypothesis) and Ginkgo clustered with cycads. Adding more cupressophyte taxa could not improve the accuracy of chloroplast phylogenomics for the five gymnosperm groups. In contrast, removal of high heterotachous genes from data sets is simple and can increase confidence in evaluating the phylogeny of gymnosperms.
Year: 2011 PMID: 21933779 PMCID: PMC3219958 DOI: 10.1093/gbe/evr095
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fifty-Three Protein-Coding Genes for Reconstruction of Phylogenetic Trees
| Photosynthetic Electron Transport and Related Processes | Gene Expression | ||||||
| Photosystem II (psb) | Cytochrome b6/f Complex (pet) | Photosystem I (psa) | ATP Synthase (atp) | CO2 Fixation | RNA Polymerase (rpo) | Ribosome (rib) | Other |
Abbreviations of functional categories in Figure 2.
Function for assemblage of photosystem I complex (Naver et al. 2001; Ozawa et al. 2009).
Box plots illustrating the distribution of pairwise ML distances between Amborella and each of the 20 sampled gymnosperm species and differences in heterotachous levels among 11 functional categories of 53 genes. The ML distances were calculated under a GTR + I + Γ model. The HBGP (defined as the mean substitution rate of gnetophytes minus the mean substitution rate of Pinaceae) of each category is indicated. Horizontal lines within boxes denote medians.
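The HBGP definition in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats each species' ML distance to Amborella as a stand-in for its substitution rate, and the function name `hbgp`, the species-to-group mapping, and all numbers are hypothetical placeholders, not data or code from the study.

```python
from statistics import mean


def hbgp(distances, groups):
    """Mean rate of gnetophytes minus mean rate of Pinaceae.

    distances: species name -> ML distance to the outgroup (assumed rate proxy)
    groups:    species name -> gymnosperm group label
    """
    gnetophytes = [d for sp, d in distances.items() if groups.get(sp) == "gnetophytes"]
    pinaceae = [d for sp, d in distances.items() if groups.get(sp) == "Pinaceae"]
    if not gnetophytes or not pinaceae:
        raise ValueError("need at least one species in each of the two groups")
    return mean(gnetophytes) - mean(pinaceae)


# Hypothetical values for a single functional category (not from the paper).
groups = {
    "Gnetum": "gnetophytes",
    "Welwitschia": "gnetophytes",
    "Pinus": "Pinaceae",
    "Picea": "Pinaceae",
}
distances = {"Gnetum": 0.5, "Welwitschia": 0.75, "Pinus": 0.25, "Picea": 0.5}

value = hbgp(distances, groups)
```

Computed once per functional category, a larger value would indicate a larger rate gap between gnetophytes and Pinaceae in that category, which is how the caption uses HBGP to rank heterotachy levels.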
Comparisons of LSC-IR–adjoined regions among representative cpDNAs of four major gymnosperm groups revealed different IR copies retained in cupressophyte and Pinaceae cpDNAs.
Trees inferred from the low and high heterotachous data sets (L- and H-data set, respectively). (A) Trees inferred from the L-data set by use of the ML method with a GTR + CAT model, BI with an MBL + CAT + Γ model, and MP, respectively. The MP method generated a single most-parsimonious tree with consistency index (CI) = 0.65 and retention index (RI) = 0.69. The three methods yielded an identical topology, and only the ML tree is presented. Bootstrap values for ML and MP and posterior probability for BI are arranged in ML/BI/MP. (B–D) Trees based on the H-data set by use of the ML method with a GTR + CAT model, BI with an MBL + CAT + Γ model, and MP, respectively. A single most-parsimonious tree was obtained with CI = 0.69 and RI = 0.73. Support values estimated from 1,000 bootstrap replicates are shown along branches. Solid circles denote supports greater than 90%. Scales of branch lengths are indicated.
Comparisons of total branch lengths estimated from the L- and H-data sets in each monophyletic group of gymnosperms. In the H-data set, the substitution rate of Pinaceae appears to be slightly elevated. The total branch lengths were calculated from the ML trees shown in figure 3. Numbers above bars denote the values of branch lengths.
Comparisons of amino acid compositions among the five gymnosperm groups. The amino acid compositions (in percentage) appear less biased in the L-data set (A) than in the H-data set (B). The amino acid compositions of the five sampled angiosperms were used as outgroups. Circles along the diagonal line suggest that specific gymnosperms and angiosperms are similar in the compositions of amino acids, whereas circles deviating from the diagonal line indicate biased compositions between specific gymnosperms and angiosperms. Three amino acids with extreme biases are indicated.
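The comparison described in this caption (percentage composition of a group plotted against that of the outgroup) can be approximated as below. This is a minimal sketch, not the authors' procedure: the helper names `composition` and `deviation` and the toy sequences are hypothetical, and anything outside the 20 standard amino acids (gaps, ambiguity codes) is assumed to be ignored.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Percentage of each standard amino acid in a protein sequence.

    Characters outside the 20 standard residues are skipped (an assumption).
    """
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def deviation(group, outgroup):
    """Per-residue difference in percentage points (group minus outgroup).

    A value of zero corresponds to a point on the diagonal of the plot.
    """
    return {aa: group[aa] - outgroup[aa] for aa in AMINO_ACIDS}


# Toy concatenated sequences (hypothetical, not from the study).
gymnosperm = composition("AAGL-X")  # gap and 'X' are ignored
angiosperm = composition("AGGL")
diff = deviation(gymnosperm, angiosperm)
```

Residues with the largest absolute values in `diff` would be the ones lying farthest from the diagonal.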