Carolina Granados Mendoza1, Matthias Jost2, Eric Hágsater3, Susana Magallón1, Cássio van den Berg4, Emily Moriarty Lemmon5, Alan R Lemmon6, Gerardo A Salazar1, Stefan Wanke2.
Abstract
Universal angiosperm enrichment probe sets designed to enrich hundreds of putatively orthologous nuclear single-copy loci are increasingly being applied to infer phylogenetic relationships of different lineages of angiosperms at a range of evolutionary depths. Studies applying such probe sets have focused on testing the universality and performance of the target nuclear loci, but they have not taken advantage of off-target data from other genome compartments generated alongside the nuclear loci. Here we do so to infer phylogenetic relationships in the orchid genus Epidendrum and closely related genera of subtribe Laeliinae. Our aims are to: 1) test the technical viability of applying the plant anchored hybrid enrichment (AHE) method (Angiosperm v.1 probe kit) to our focal group, 2) mine plastid protein-coding genes from off-target reads, and 3) evaluate the performance of the target nuclear and off-target plastid loci in resolving and supporting phylogenetic relationships along a range of taxonomic depths. Phylogenetic relationships were inferred from the nuclear data set through coalescent summary and site-based methods, whereas plastid loci were analyzed in a concatenated partitioned matrix under maximum likelihood. The usefulness of target and flanking non-target nuclear regions and plastid loci was assessed through the estimation of their phylogenetic informativeness. Our study successfully applied the plant AHE probe kit to Epidendrum, supporting the universality of this kit in angiosperms. Moreover, it demonstrated the feasibility of mining plastome loci from off-target reads generated with the Angiosperm v.1 probe kit to obtain additional, uniparentally inherited sequence data at no extra sequencing cost. Our analyses detected some strongly supported incongruences between nuclear and plastid data sets at shallow divergences, an indication of potential lineage sorting, hybridization, or introgression events in the group. Lastly, we found that the per-site phylogenetic informativeness of the ycf1 plastid gene surpasses that of all other plastid genes and several nuclear loci, making it an excellent candidate for assessing phylogenetic relationships at medium to low taxonomic levels in orchids.
Keywords: Orchidaceae; anchored hybrid enrichment; coalescent methods; off-target data; phylogenomics; universal probe set
Year: 2020 PMID: 32063915 PMCID: PMC7000662 DOI: 10.3389/fpls.2019.01761
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Taxon sampling and voucher information, including collector and collection number (herbarium codes as in http://sweetgum.nybg.org/science/ih/).
| Function | Subtribe | Taxon | Voucher information | Lab # |
|---|---|---|---|---|
| Outgroup | Pleurothallidinae | | Soto Arenas, 3679 (AMO) | E100 |
| Outgroup | Laeliinae | | Soto Arenas, 6858 (AMO) | E106 |
| Outgroup | Laeliinae | | Jiménez Machorro, 2445 (AMO) | E094 |
| Outgroup | Laeliinae | | Brieger, 14440 (ESA) | E218 |
| Outgroup | Laeliinae | | van den Berg, 2997 (HUEFS) | E217 |
| Ingroup | Laeliinae | | Hágsater, 14559 (AMO) | E026 |
| Ingroup | Laeliinae | | Hágsater, 9468 (AMO) | E058 |
| Ingroup | Laeliinae | | Salazar Chávez, 7467 (AMO) | E432 |
| Ingroup | Laeliinae | | Salazar Chávez, 7566 (AMO) | E147 |
| Ingroup | Laeliinae | | Salazar Chávez, 7867 (AMO) | E145 |
| Ingroup | Laeliinae | | Soto Arenas, 9087 (AMO) | E063 |
| Ingroup | Laeliinae | | Jiménez Machorro, 2763-B (AMO) | E013 |
| Ingroup | Laeliinae | | Hágsater, 14573 (AMO) | E079 |
| Ingroup | Laeliinae | | Hágsater, 13559 (AMO) | E070 |
| Ingroup | Laeliinae | | Salazar Chávez, 7468 (AMO) | E008 |
| Ingroup | Laeliinae | | Hágsater, 13963 (AMO) | E065 |
| Ingroup | Laeliinae | | Hágsater, 13804 (AMO) | E017 |
| Ingroup | Laeliinae | | Hágsater, 14582 (AMO) | E042 |
| Ingroup | Laeliinae | | Hágsater, 14591 (AMO) | E015 |
| Ingroup | Laeliinae | | Hágsater, 14552 (AMO) | E048 |
| Ingroup | Laeliinae | | Salazar Chávez, 6723 (AMO) | E435 |
| Ingroup | Laeliinae | | Hágsater, 14587 (AMO) | E044 |
| Ingroup | Laeliinae | | Hágsater, 14558 (AMO) | E022 |
Figure 1. Attributes of retrieved nuclear loci. (A, B) Histograms showing length and number of species in the alignments, respectively. (C) Number of loci (yellow points and lines) and mean copies recovered (blue points and lines) per species. (D) Percentage of missing data per species, including bases called as N plus missing flanking regions of loci, in terms of number of base pairs (bp, yellow points and lines), and percentage of reads on target per species (blue points and lines).
Analyzed dataset characteristics.
| | Nuclear | Plastid | Combined |
|---|---|---|---|
| | 335 | 72 | 407 |
| | 163–1495 | 90–6990 | 163–6990 |
| | 223 (66.56) | 66 (91.66) | 289 (71) |
| | 194,841 | 63,421 | 258,262 |
| | 27,006 (13.86) | 2,610 (4.11) | 29,616 (11.46) |
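The figures in the dataset table above can be cross-checked with simple arithmetic. The sketch below is only a consistency check, not part of the authors' analysis. It rests on two assumptions, since the row labels are not preserved in this copy: that the Combined column is the sum of the Nuclear and Plastid columns for the count rows, and that each parenthesized value is a percentage of the count in an earlier row of the same column (truncated, not rounded, to two decimals). The row names `count_a` to `count_d` are hypothetical placeholders.

```python
# Values copied from the table; the row names are hypothetical placeholders
# because the original row labels are not preserved in this copy.
ROWS = {
    "count_a": (335, 72, 407),
    "count_b": (223, 66, 289),
    "count_c": (194_841, 63_421, 258_262),
    "count_d": (27_006, 2_610, 29_616),
}

# Assumption: (numerator row, denominator row, printed percentages expressed
# in hundredths of a percent, e.g. 66.56 % -> 6656).
PERCENT_CHECKS = [
    ("count_b", "count_a", (6656, 9166, 7100)),
    ("count_d", "count_c", (1386, 411, 1146)),
]


def combined_is_sum(row):
    """True if the Combined value equals Nuclear + Plastid."""
    nuclear, plastid, combined = row
    return nuclear + plastid == combined


def truncated_percent_hundredths(part, whole):
    """Percentage of `part` in `whole`, in hundredths of a percent, truncated."""
    return part * 10_000 // whole


def check_table():
    """Run both checks over every row and column; True if all of them pass."""
    ok = all(combined_is_sum(row) for row in ROWS.values())
    for num, den, printed in PERCENT_CHECKS:
        for part, whole, expected in zip(ROWS[num], ROWS[den], printed):
            ok = ok and truncated_percent_hundredths(part, whole) == expected
    return ok


if __name__ == "__main__":
    print("table is internally consistent:", check_table())
```

Under these assumptions every value in the table is mutually consistent, which suggests the percentages were truncated to two decimals.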
Figure 2. Attributes of retrieved plastid loci. (A, B) Histograms showing length and number of species in the alignments, respectively. (C) Missing data, including bases called as N plus missing flanking regions of loci, in terms of base pairs (bp, yellow points and lines) and number of loci (blue points and lines) per species. (D) Percentage of reads on target as a function of percentage of missing data, in terms of bp.
Figure 4. Comparison between topologies obtained from the analyses of the nuclear and plastid data sets. Continuous lines connecting names of terminals indicate congruence between topologies, whereas dotted lines indicate strongly supported (BS > 85 or LPP > 0.85) incongruences. Blue full circles at the internal nodes of the nuclear trees indicate clades absent in the plastid tree, and yellow full circles at the internal nodes of the plastid tree indicate clades absent in the nuclear trees. For ease of visualization, trees were converted to cladograms and nodes with BS < 85 or LPP < 0.85 were collapsed.
Figure 3. (A) Topology obtained in the nuclear SVDQuartets analysis with branch lengths optimized in RAxML and subsequently converted to ultrametric (see Materials and Methods section). Nodes denoted by an asterisk (*) received BS < 85%. (B) Net phylogenetic informativeness profiles of nuclear target (light to dark blue), nuclear non-target (light to dark green), and plastid (light to dark yellow) partitions. Yellow and black dashed curves correspond to the ycf1 and matK genes, respectively, discussed in the main text. The distribution of per-locus maximum net phylogenetic informativeness values and of the times at which these values were reached is shown with quantiles 2 and 3 to the right of and below the informativeness profiles, respectively. Whiskers denote maximum and minimum values. The time scale of the informativeness profiles matches that of the ultrametric tree in (A). Vertical dotted lines denote the divergence times at which target (blue) and non-target (green) nuclear and plastid (yellow) partitions were most informative.
Previous studies applying the plant anchored hybrid enrichment (AHE) method (Buddenhagen et al., 2016), sorted by divergence time between the focal group and the closest set of reference species used in the kit design.
| Study | Focal group (family, order) | Closest set of reference species (family, order) | Divergence time (Ma) | # retrieved loci* |
|---|---|---|---|---|
| | | | 40.3¹ | 448 |
| | | | 40.3¹ | 498 |
| | | | 68² | 527 |
| | Cariceae–Dulichieae–Scirpeae clade (Cyperaceae, Poales) | | 88² | 462 |
| | | | 109.1¹ | 493 |
| This study | | | 114.6¹ | 335 |
| | | | 117.4¹ | 498 |
| | | | 117.4¹ | 450 |
| | | | 139.4¹ | 233 |

¹ Family divergence times taken from Magallón et al. (2015).
² Oldest crown-family divergence time reported in http://www.mobot.org/MOBOT/research/APweb.
\* Including paralogs.