Marybel Soto Gomez, Lisa Pokorny, Michael B Kantar, Félix Forest, Ilia J Leitch, Barbara Gravendeel, Paul Wilkin, Sean W Graham, Juan Viruel.
Abstract
PREMISE: We developed a target enrichment panel for phylogenomic studies of Dioscorea, an economically important genus with incompletely resolved relationships.
Keywords: Dioscorea target capture kit; Hyb‐Seq; herbarium; monocot phylogenomics; ortholog identification; tuber crop wild relatives
Year: 2019 PMID: 31236313 PMCID: PMC6580989 DOI: 10.1002/aps3.11254
Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936
Capture efficiency for the 25 taxa enriched using Dioscorea‐specific baits, including DNA concentration of starting library (ng/μL) and designated enrichment pool.
| Species | Starting library concentration, ng/μL (pool no.) | Quality‐filtered paired reads | Enrichment efficiency (% reads on target) | No. of assembled target genes (%) | Recovered % of full target gene set length | Gene tree occupancy (%) | Total no. of bp retrieved by HybPiper | No. of bp included in analyses |
|---|---|---|---|---|---|---|---|---|
| | 78.6 (P4) | 576,773 | 57.4 | 256 (98.5) | 91.5 | 257 (97.3) | 404,256 | 308,524 |
| | 40.6 (P3) | 841,789 | 53.7 | 258 (99.2) | 93.5 | 260 (98.5) | 413,007 | 310,035 |
| | 51 (P4) | 2,622,322 | 56.3 | 259 (99.6) | 96.6 | 257 (97.3) | 426,513 | 305,152 |
| | 32.2 (P2) | 1,125,117 | 16.8 | 257 (98.8) | 83.8 | 259 (98.1) | 369,975 | 301,896 |
| | 46.2 (P3) | 2,829,147 | 47.1 | 253 (97.3) | 81 | 250 (94.7) | 357,693 | 288,868 |
| | 42.8 (P3) | 791,550 | 56.8 | 258 (99.2) | 92.1 | 258 (97.7) | 406,539 | 307,100 |
| | 57 (P4) | 469,374 | 42.1 | 256 (98.5) | 88.5 | 257 (97.3) | 390,870 | 307,249 |
| | 29.8 (P2) | 893,672 | 12.5 | 252 (96.9) | 78.9 | 252 (95.5) | 348,579 | 292,679 |
| | 11.1 (P1) | 875,971 | 15.2 | 257 (98.8) | 83.9 | 260 (98.5) | 370,539 | 301,947 |
| | 16.6 (P1) | 787,302 | 19.5 | 253 (97.3) | 83.5 | 254 (96.2) | 368,631 | 299,109 |
| | 24.2 (P2) | 990,417 | 8.9 | 252 (96.9) | 68.6 | 246 (93.2) | 302,829 | 253,766 |
| | 37 (P2) | 966,336 | 19.4 | 258 (99.2) | 88.4 | 261 (98.9) | 390,342 | 310,042 |
| | 58.4 (P4) | 721,396 | 35.1 | 255 (98.1) | 80.9 | 252 (95.5) | 357,378 | 295,031 |
| | 16.1 (P1) | 505,670 | 12.3 | 157 (60.4) | 38 | 151 (57.2) | 167,802 | 143,767 |
| | 24.2 (P2) | 823,029 | 11.9 | 245 (94.2) | 71.4 | 245 (92.8) | 315,117 | 274,679 |
| | 13.6 (P1) | 824,231 | 13.5 | 249 (95.8) | 76.3 | 251 (95.1) | 337,113 | 287,438 |
| | 77.2 (P4) | 338,083 | 56.2 | 259 (99.6) | 92.2 | 260 (98.5) | 407,328 | 311,701 |
| | 23.2 (P1) | 1,056,815 | 11.9 | 257 (98.8) | 88.5 | 256 (97) | 390,999 | 288,009 |
| | 80.4 (P4) | 341,113 | 47.3 | 257 (98.8) | 85.6 | 258 (97.7) | 377,931 | 305,668 |
| | 74.6 (P3) | 258,300 | 60.5 | 259 (99.6) | 94.8 | 259 (98.1) | 418,878 | 297,467 |
| | 29 (P2) | 1,489,369 | 5.4 | 244 (93.8) | 61.7 | 241 (91.3) | 272,439 | 245,398 |
| | 24 (P1) | 783,195 | 6.5 | 198 (76.2) | 45.5 | 195 (73.9) | 201,048 | 183,614 |
| | 39.4 (P3) | 1,000,845 | 60.7 | 259 (99.6) | 93.3 | 260 (98.5) | 412,221 | 311,542 |
| | 11.8 (P1) | 686,737 | 16.7 | 257 (98.8) | 81.4 | 255 (96.6) | 359,280 | 295,297 |
| | 52.2 (P3) | 389,519 | 45.4 | 151 (58.1) | 24.2 | 149 (56.4) | 107,082 | 97,020 |
Quality‐filtered paired reads: number of reads produced after quality trimming using Trimmomatic (see text for details).
No. of assembled target genes (%): percentage calculated based on the 260 low‐ to single‐copy nuclear genes targeted using baits.
Recovered % of full target gene set length: percentage calculated from the total number of on‐target nucleotides retrieved per sample, relative to the full length of the reference target gene set.
Gene tree occupancy (%): percentage calculated based on the 264 low‐ to single‐copy nuclear genes in the final data set.
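The two gene-count percentages described in the table notes above can be checked with simple arithmetic. The following is a minimal sketch, not the authors' code: the helper name `pct` and the one-decimal rounding are assumptions made for illustration, while the denominators (260 targeted genes, 264 genes in the final data set) and the example counts (256 and 257, from the first table row) come from the table and its notes.

```python
# Denominators stated in the table notes.
TARGETED_GENES = 260  # low- to single-copy nuclear genes targeted using baits
FINAL_GENES = 264     # low- to single-copy nuclear genes in the final data set


def pct(count, total):
    """Hypothetical helper: count as a percentage of total, to one decimal place."""
    return round(100 * count / total, 1)


# Counts taken from the first row of the capture-efficiency table.
assembled_pct = pct(256, TARGETED_GENES)  # "No. of assembled target genes (%)"
occupancy_pct = pct(257, FINAL_GENES)     # "Gene tree occupancy (%)"

print(assembled_pct, occupancy_pct)
```

Because the two columns use different denominators, a sample can show a higher gene tree occupancy count than its assembled-gene count without the percentages being inconsistent.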
Figure 1. Proportion of total reference gene length recovered per gene and taxon. Rows correspond to taxa, and columns correspond to gene targets. Target enrichment using the Dioscorea‐specific baits designed here was performed on 25 taxa. Gene recovery for nine species with transcriptome‐based data (not including the four transcriptomes used for bait design) is also shown (upper set of taxa in the figure). Recovered proportions of the reference gene are indicated by shading.
Target gene retrieval for nine taxa using transcriptome‐based data (excluding four transcriptomes used for bait design).
| Species | Quality‐filtered paired reads | Enrichment efficiency (% reads on target) | No. of assembled target genes (%) | Recovered % of full target gene set length | Gene tree occupancy (%) | Total no. of bp retrieved by HybPiper | No. of bp included in analyses |
|---|---|---|---|---|---|---|---|
| | 35,322,722 | 1.8 | 254 (97.7) | 95.3 | 254 (96.2) | 420,909 | 306,180 |
| | 59,713,857 | 1.8 | 257 (98.8) | 95.5 | 257 (97.3) | 421,914 | 308,395 |
| | 74,192,635 | 2.1 | 256 (98.5) | 92.9 | 257 (97.3) | 410,295 | 308,374 |
| | 61,649,891 | 2.2 | 257 (98.8) | 91.9 | 258 (97.7) | 405,759 | 307,352 |
| | 20,582,868 | 1.7 | 252 (96.9) | 87.5 | 252 (95.5) | 386,604 | 300,421 |
| | 60,869,766 | 2.3 | 258 (99.2) | 94.3 | 258 (97.7) | 416,262 | 309,712 |
| | 56,326,258 | 2.3 | 258 (99.2) | 95.1 | 258 (97.7) | 420,039 | 309,733 |
| | 32,346,683 | 1.7 | 254 (97.7) | 92.5 | 255 (96.6) | 408,300 | 305,784 |
| | 61,696,853 | 1.3 | 251 (96.5) | 87.2 | 250 (94.7) | 385,308 | 297,614 |
The “t” label for D. alata and D. communis indicates that they are transcriptome‐based sequences. Both taxa are represented by an additional individual in the phylogenomic analyses here: D. alata t1 (used for bait design and therefore not included in gene recovery analyses) and D. communis 2 (a DNA‐based, target‐enriched sample; see Table 1).
Quality‐filtered paired reads: number of reads produced after quality trimming using Trimmomatic (see text for details).
No. of assembled target genes (%): percentage calculated based on the 260 low‐ to single‐copy nuclear genes targeted using baits.
Recovered % of full target gene set length: percentage calculated from the total number of on‐target nucleotides retrieved per sample, relative to the full length of the reference target gene set.
Gene tree occupancy (%): percentage calculated based on the 264 low‐ to single‐copy nuclear genes in the final data set.
Figure 2. Phylogenetic relationships in Dioscorea inferred from coalescent‐based analyses of 264 genes recovered using target enrichment with the Dioscorea‐specific baits designed here. Values next to branches are local posterior probabilities (LPP) and multilocus bootstrap support (MLB), respectively; thick branches have 1.0 LPP and 100% bootstrap support. Lineages in red are major crops; blue‐labeled taxa (D. calcicola, D. nummularia) are previously identified crop wild relatives of the yam crops sampled here. Scale bar shows coalescent units for internal branches (not estimated by ASTRAL for terminal branches).