Roman Ufimov1,2, Vojtěch Zeisek3,4, Soňa Píšová1,3, William J Baker5, Tomáš Fér4, Marcela van Loo1, Christoph Dobeš1, Roswitha Schmickl3,4.
Abstract
PREMISE: Custom probe design for target enrichment in phylogenetics is tedious and often hinders broader phylogenetic synthesis. The universal angiosperm probe set Angiosperms353 may be the solution. Here, we test the relative performance of Angiosperms353 on the Rosaceae subtribe Malinae in comparison with custom probes that we specifically designed for this clade. We then address the impact of bioinformatically altering the performance of Angiosperms353 by replacing the original probe sequences with orthologs extracted from the Malus domestica genome.
Keywords: Angiosperms353; Malinae; customized probe set; target enrichment; universal probe set
Year: 2021 PMID: 34336405 PMCID: PMC8312748 DOI: 10.1002/aps3.11442
Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936
Overview of the probe sets for target enrichment and references for read mapping used in this study.
| Characteristics of the probe set/reference | Malinae481 | Angiosperms353 | Malinae‐optimized | bestHit‐modified |
|---|---|---|---|---|
| Bait set | Malinae481 | Angiosperms353 | Angiosperms353 | Angiosperms353 |
| Applicability | Customized | Universal | Customized | Customized |
| No. of loci | 481 if paralogous loci are treated as single loci; 713 if paralogous loci are represented as two loci | 353 | 353 | 353 |
| No. of sequence representatives per locus | 2–4 if paralogous loci are treated as single loci; 2 if paralogous loci are represented as two loci | 6–18 (mean 13.5) | 1 | 1 |
| Taxonomic affiliation of sequence representatives | | Selected angiosperms | | One “best matching” out of the up to 18 angiosperm representatives |
GDDH13 = Golden Delicious; HFTH1 = Hanfu.
Assembly performance for 25 species within the Malinae and the outgroup Prunus tenella using the different probe sets/references and HybPiper, given for the exonic data set. All values are averaged across the species within the Malinae for each probe set/reference.
| Probe set/reference | Malinae (25 species): enrichment efficiency in mapped reads (%) | Malinae: no. (%) of loci with zero data | Malinae: no. (%) of loci with ≥25% target length | Malinae: no. (%) of loci with ≥25% target length, ≥75% accessions presence | Malinae: no. (%) of loci with ≥50% target length, ≥50% accessions presence | Malinae: no. (%) of loci with ≥50% target length, ≥75% accessions presence | Malinae: no. (%) of loci with ≥75% target length, ≥50% accessions presence | Malinae: no. (%) of loci with ≥75% target length, ≥75% accessions presence | Outgroup (Prunus tenella): enrichment efficiency in mapped reads (%) | Outgroup: no. (%) of loci with zero data | Outgroup: no. (%) of loci with ≥25% target length | Outgroup: no. (%) of loci with ≥50% target length | Outgroup: no. (%) of loci with ≥75% target length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Malinae481 | 55.4% | 0 | 478 (99.4%) | 476 (99.0%) | 477 (99.2%) | 475 (98.8%) | 475 (98.8%) | 470 (97.7%) | 43.4% | 44 (9.1%) | 305 (63.4%) | 287 (59.7%) | 247 (51.4%) |
| Angiosperms353 (BWA) | 25.0% | 4 (1.1%) | 329 (93.2%) | 322 (91.2%) | 304 (86.1%) | 280 (79.3%) | 218 (61.8%) | 188 (53.3%) | 22.7% | 15 (4.2%) | 336 (95.2%) | 320 (90.7%) | 257 (72.8%) |
| Angiosperms353 (BLASTX) | 11.5% | 8 (2.3%) | 331 (93.8%) | 324 (91.8%) | 314 (89.0%) | 287 (81.3%) | 248 (70.3%) | 206 (58.4%) | 14.9% | 15 (4.3%) | 336 (95.2%) | 320 (90.7%) | 257 (72.8%) |
| Malinae‐optimized | 23.4% | 2 (0.6%) | 332 (94.1%) | 324 (91.8%) | 318 (90.1%) | 305 (86.4%) | 284 (80.5%) | 257 (72.8%) | 23.3% | 9 (2.5%) | 343 (97.2%) | 334 (94.6%) | 307 (87.0%) |
| bestHit‐modified | 12.4% | 74 (21.0%) | 205 (58.0%) | 185 (52.4%) | 161 (45.6%) | 132 (37.4%) | 98 (27.8%) | 81 (22.9%) | 12.6% | 116 (32.9%) | 222 (62.9%) | 188 (53.3%) | 127 (36.0%) |
BWA = Burrows–Wheeler aligner.
In the case of Angiosperms353, we compared HybPiper with the BWA option and the BLASTX option for read mapping.
“Target length” refers to the recovered length per target locus. In the case of Angiosperms353, with multiple sequence representatives of differing lengths for each locus, the average length was calculated.
“Accessions presence” refers to the proportion of accessions with sequence information for each target locus.
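The two definitions above can be combined into a simple locus filter. The following is a minimal sketch, not the authors' script: it assumes that an accession counts toward "accessions presence" only if its recovered length reaches the chosen fraction of the target length, and the function name, argument names, and toy data are hypothetical.

```python
from typing import Dict, List


def count_loci_passing(
    recovered: Dict[str, Dict[str, int]],  # locus -> accession -> recovered length (bp)
    target_len: Dict[str, float],          # locus -> target length (bp)
    accessions: List[str],
    min_length_frac: float,                # e.g. 0.25, 0.50, 0.75 of the target length
    min_presence_frac: float,              # e.g. 0.50 or 0.75 of the accessions
) -> int:
    """Count loci recovered at >= min_length_frac of the target length
    in >= min_presence_frac of the accessions (assumed interpretation)."""
    passing = 0
    for locus, tlen in target_len.items():
        per_accession = recovered.get(locus, {})
        n_ok = sum(
            1 for acc in accessions
            if per_accession.get(acc, 0) >= min_length_frac * tlen
        )
        if n_ok / len(accessions) >= min_presence_frac:
            passing += 1
    return passing


# Toy example with two hypothetical loci and four hypothetical accessions.
toy_targets = {"locusA": 1000, "locusB": 600}
toy_recovered = {
    "locusA": {"s1": 900, "s2": 800, "s3": 300, "s4": 0},
    "locusB": {"s1": 600, "s2": 500, "s3": 450, "s4": 460},
}
toy_accessions = ["s1", "s2", "s3", "s4"]
print(count_loci_passing(toy_recovered, toy_targets, toy_accessions, 0.75, 0.75))
```

For Angiosperms353, where each locus has several sequence representatives, `target_len` would hold the average representative length, as described in the footnote above.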
FIGURE 1. Heatmap of locus recovery using the different probe sets/references: Malinae481 (A), Angiosperms353 (B), Malinae‐optimized (C), bestHit‐modified (D). Each row represents a taxon, each column a locus. The color shading within each heatmap indicates the target length (i.e., recovered length per target locus).
Alignment characteristics for 25 species within the Malinae and the outgroup Prunus tenella using the different probe sets/references and HybPiper, given for the exonic, intronic, and supercontig data sets. Alignment length, proportion of variable sites per alignment, and proportion of parsimony‐informative sites per alignment are given as minimum (min), average (avg), and maximum (max) values.
| Data set | Probe set/reference | No. of loci | Average no. of taxa per locus | Alignment length (bp), min/max | Total alignment length (bp) | Proportion of variable sites per alignment [no. (%)], min/max | Proportion of parsimony‐informative sites per alignment [no. (%)], min/max | Total number of parsimony‐informative sites |
|---|---|---|---|---|---|---|---|---|
| Exons | Malinae481 | 481 | 25 | 159/2964 | 680,658 | 96 (9.7%)/945 (60.4%) | 1 (0.6%)/485 (23.7%) | 59,369 |
| Exons | Angiosperms353 | 344 | 23 | 90/2677 | 207,717 | 1 (0.5%)/586 (65.8%) | 0/171 (59.6%) | 13,561 |
| Exons | Malinae‐optimized | 346 | 24 | 69/2217 | 199,056 | 7 (2.3%)/496 (45.0%) | 0/190 (22.6%) | 12,151 |
| Exons | bestHit‐modified | 256 | 20 | 69/1263 | 107,851 | 1 (0.8%)/252 (56.0%) | 0/117 (18.2%) | 6331 |
| Introns | Malinae481 | 481 | 24 | 112/6789 | 364,669 | 50 (19.3%)/1954 (88.3%) | 11 (2.8%)/957 (81.4%) | 80,369 |
| Introns | Angiosperms353 | 344 | 22 | 292/5582 | 427,280 | 27 (6.6%)/2179 (85.7%) | 1 (0.2%)/912 (64.3%) | 71,108 |
| Introns | Malinae‐optimized | 347 | 24 | 376/4743 | 463,577 | 18 (4.8%)/1900 (78.2%) | 1 (0.2%)/804 (58.5%) | 75,111 |
| Introns | bestHit‐modified | 256 | 19 | 216/4041 | 254,030 | 19 (2.2%)/1687 (84.6%) | 1 (0.2%)/597 (60.3%) | 38,311 |
| Supercontigs | Malinae481 | 481 | 25 | 475/8353 | 1,039,356 | 201 (13.4%)/2051 (56.6%) | 31 (3.7%)/970 (33.9%) | 133,632 |
| Supercontigs | Angiosperms353 | 344 | 23 | 363/7759 | 636,398 | 35 (4.9%)/2360 (72.9%) | 2 (0.3%)/1142 (58.4%) | 82,339 |
| Supercontigs | Malinae‐optimized | 347 | 24 | 509/6837 | 659,722 | 24 (2.9%)/2376 (65.0%) | 1 (0.2%)/1029 (44.5%) | 84,987 |
| Supercontigs | bestHit‐modified | 257 | 20 | 425/5142 | 360,052 | 21 (2.2%)/1863 (79.8%) | 0/642 (56.9%) | 42,790 |
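For readers unfamiliar with the two per-alignment statistics in the table above, the sketch below shows how variable and parsimony-informative sites are conventionally counted. It is an illustration only, not the tool used in the study. A site is treated as variable if at least two different character states occur, and as parsimony-informative if at least two states each occur in at least two sequences. Treating `-`, `?`, and `N` as missing data and ignoring other IUPAC ambiguity codes are simplifying assumptions, and the function name is hypothetical.

```python
from collections import Counter
from typing import List, Tuple

MISSING = set("-?N")  # assumption: gaps and N are treated as missing data


def site_stats(alignment: List[str]) -> Tuple[int, int]:
    """Return (variable sites, parsimony-informative sites) for aligned sequences."""
    if not alignment:
        return (0, 0)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences in an alignment must have the same length")

    variable = 0
    informative = 0
    for column in zip(*alignment):
        states = Counter(
            base.upper() for base in column if base.upper() not in MISSING
        )
        if len(states) >= 2:
            variable += 1
            # Informative: at least two states, each seen in >= 2 sequences.
            if sum(1 for n in states.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


toy_alignment = ["ACGTA", "ACGTC", "ATGAC", "ATG-A"]
n_var, n_pis = site_stats(toy_alignment)
print(f"variable: {n_var}, parsimony-informative: {n_pis}")
```

Dividing either count by the alignment length gives the percentages reported in parentheses in the table.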
FIGURE 2. Scatter plot of alignment length vs. the number of parsimony‐informative sites for the exonic, intronic, and supercontig data sets using the different probe sets/references, excluding the bestHit‐modified reference. (A) Exons. (B) Introns. (C) Supercontigs.
FIGURE 3. Comparison of topology and gene tree (in)congruence of ASTRAL species trees. The Malinae‐optimized reference and Malinae481 were used to generate these trees. For each branch, the top number indicates the number of gene trees concordant with the species tree at that node, and the bottom number indicates the number of gene trees in conflict with that node. The pie charts present the proportion of gene trees that support that clade (blue), the proportion that support the main alternative topology for that clade (green), the proportion that support the remaining alternative topologies (red), and the proportion that inform (support or conflict) that clade with <50% bootstrap support (gray).
| Species | Ploidy level/ratio to the internal standard or * genome size (2C value, pg) | Type of material for target enrichment | Collection locality | Collection date | Collector/collection number | Voucher | Sequencing platform, read length (bp) [Angiosperms353/Malinae481] | NCBI SRA accession number (total number of reads) [Angiosperms353/Malinae481] |
|---|---|---|---|---|---|---|---|---|
| | | Herbarium | Greece, Crete, Iraklion province, Kato Asites, above St. George Gorgolaini monastery. Altitude 580–650 m a.s.l. | 20 August 1987 | K. I. Christensen, K. Bruhn Møller, A. Anagnostopoulos, and S. Diemar/ 634 | LE 01021044 | MiSeq, 250/ MiSeq, 250 | SRR12879573 (1,274,080)/ SRR12958388 (1,333,160) |
| | 2 1.73 ± 10* | Silica‐dried leaves | USA, Louisiana, Ouachita Parish, beside US165 S, 1.4 miles S of junction with I‐20 in Monroe, inside Richwood Corp. Limit. 32.476111°N, 92.083056°W | 16 August 2004 | C. Reid/ 5206 | TRT 00000027 | MiSeq, 250/ MiSeq, 250 | SRR12879572 (1,037,126)/ SRR12958387 (1,423,198) |
| | 2 1.82 ± 0.19* | Silica‐dried leaves | USA, Massachusetts, Suffolk Co., Boston, Jamaica Plain, Arnold Arboretum, Weld‐Walter Streets site. 42.303056°N, 71.124167°W | 20 June 2002 | T. A. Dickinson/ 2002‐07A | TRT 00000105 | MiSeq, 250/ MiSeq, 250 | SRR12879561 (1,238,568)/ SRR12958376 (1,299,918) |
| | | Herbarium | Russian Federation, Petropavlovsk‐Kamchatsky, Rybakov prospekt, near building 19a | 30 August 2018 | D. Gimelbrant/ s.n. | LE 01020830 | MiSeq, 250/ MiSeq, 250 | SRR12879554 (1,092,632)/ SRR12958369 (980,132) |
| | 2 0.200 | Herbarium | Russian Federation, Kemerovo Oblast, Kuzbass Botanical Garden, exposition ‘Ever‐blooming garden’. Provenance unknown | 14 September 2016 | V. Zagurskaya Iu. / 10 | LE 01020939 | MiSeq, 250/ MiSeq, 250 | SRR12879553 (1,323,144)/ SRR12958368 (1,212,004) |
| | 2 0.187 | Herbarium | Germany, Saarland, Schiffweiler, Heiligenwald, | 24 April 2018 | F.‐J. Weicherding/ 015/2018 | WFBVA not barcoded | MiSeq, 250/ MiSeq, 250 | SRR12879552 (1,176,534)/ SRR12958367 (1,155,916) |
| | 2 0.191 | Herbarium | Germany, Saarland, Saarbrücken, Güdingen, shrubland along abandoned railway track. Altitude 194 m a.s.l. 49.205548°N, 7.020184°E | 27 April 2018 | F.‐J. Weicherding/ 014/2018 | WFBVA not barcoded | MiSeq, 250/ MiSeq, 250 | SRR12879551 (811,336)/ SRR12958366 (1,378,220) |
| | 2 0.194 | Silica‐dried leaves | Republic of Crimea, Kirovskiy Rayon, vicinity of Stary Krym, tree thicket by Churuk‐su river. Altitude 290 m a.s.l. 45.02121°N, 35.09535°E | 2 September 2016 | A. Gnutikov and R. Ufimov/ 33.2 | LE 01020926 | MiSeq, 250/ MiSeq, 250 | SRR12879550 (1,037,132)/ SRR12958365 (1,300,050) |
| | 2 0.214 | Silica‐dried leaves | Republic of Korea, Ganghwa‐do, Incheon, Ganghwa‐gun, Hwado‐myeon, Sagi‐ri, near Mani‐san, along roadside. Altitude 54 m a.s.l. 37.61113°N, 126.45354°E | 21 September 2015 | R. Ufimov/ s.n. | LE 01020531 | NextSeq, 150/ MiSeq, 250 | SRR12879549 (2,646,870)/ SRR12958364 (842,976) |
| | 2 1.43 ± 0.12* | Silica‐dried leaves | Canada, Ontario, Bruce Co., Eastnor Twp., W Barrow Bay, E side Hwy 9 at S slope. Altitude 200 ft. 44.900000°N, 81.205556°W | 7 September 1986 | T. A. Dickinson/ D1378 | TRT 00012528 | MiSeq, 250/ MiSeq, 250 | SRR12879548 (1,073,782)/ SRR12958363 (1,501,078) |
| | | Herbarium | Greece, Arcadia province, Mt. Menalon, ski center above Kardaras. Altitude 1550–1700 m a.s.l. | 28 August 1987 | K. I. Christensen, K. Bruhn Møller, and A. Anagnostopoulos/ 1718 | LE 01020857 | MiSeq, 250/ MiSeq, 250 | SRR12879571 (1,198,452)/ SRR12958386 (1,277,062) |
| | 2 0.255 | Herbarium | Kazakhstan, Turkistan region, Sozak district, 7 km SW Taukent, Karatau Nature Reserve, NE slope of Mt. Bessaz, gorge of Itmuryn river. Altitude 1210 m a.s.l. 43.828153°N, 68.681692°E | 10 June 2018 | A. V. Grebenjuk/ 252ASM750 | LE 01020824 | MiSeq, 250/ MiSeq, 250 | SRR12879570 (944,870)/ SRR12958385 (1,205,794) |
| | 2 1.73 ± 0.13* | Silica‐dried leaves | USA, Alabama, Autauga Co. Jones Bluff, SSW Peace, woods and prairie openings S of dirt road (Autauga Co. Rd. 9) running E from Autauga Co. Rd. 15, S of AL14 between Burnsville and Mulberry, S slope. Altitude 200 ft a.s.l. 32.398889°N, 86.779444°W | 19 April 2003 | N. Talent, S. Nguyen, T. A. Dickinson, and R. W. Lance/ 2003‐22 | TRT 00021431 | MiSeq, 250/ MiSeq, 250 | SRR12879569 (1,284,680)/ SRR12958384 (1,696,294) |
| | 2 0.164 | Silica‐dried leaves | Costa Rica, San José province, Páramo district, Pérez Zeledón canton, Cerro de la Muerte, Los Quetzales National Park, along main access road to ICE towers. Altitude 3389 m a.s.l. 9.565147°N, 83.755917°W | 18 May 2016 | T. A. Dickinson and A. K. Dickinson/ 2016‐03 | LE 01020842 | MiSeq, 250/ MiSeq, 250 | SRR12879568 (1,305,802)/ SRR12958383 (1,172,096) |
| | 2 0.176 | Silica‐dried leaves | China, Yunnan, Lanping, Xue‐bang Shan, forest. Altitude 2500 m a.s.l. | 9 August 2015 | I. Illarionova, L. Wang, and T.‐J. Tong/ TM 1263 | LE 01020832 | MiSeq, 250/ MiSeq, 250 | SRR12879567 (1,237,802)/ SRR12958382 (1,355,392) |
| | 2 0.192 | Herbarium | Kyrgyzstan, Chuy region, Jayyl (Kalinin) district, Tian Shan, N side of Kyrgyz Ala‐Too Range, Kara‐Balta river, on way out of Sosnovka gorge. Altitude 1180 m a.s.l. 42.639217°N, 73.896808°E | 23 July 2018 | A. V. Grebenjuk/ 385ASM1217–385ASM1233 | LE not barcoded | MiSeq, 250/ MiSeq, 250 | SRR12879565 (1,344,534)/ SRR12958380 (1,620,270) |
| | 2 0.180 | Silica‐dried leaves | Russian Federation, Komarov Botanical Institute of the Russian Academy of Sciences, Arboretum, plot 126. Provenance: unknown. | 2 October 2018 | R. Ufimov/ 8 | LE 01020853 | MiSeq, 250/ MiSeq, 250 | SRR12879564 (1,453,448)/ SRR12958379 (1,658,026) |
| | 2 0.177 | Silica‐dried leaves | Russian Federation, Komarov Botanical Institute of the Russian Academy of Sciences, Arboretum, plot 122. Provenance: Japan, Toyama Prefecture, Arimine lake. Altitude 1170 m a.s.l. 36.470833°N, 137.428889°E | 2 October 2018 | R. Ufimov/ 3 | LE not barcoded | MiSeq, 250/ MiSeq, 250 | SRR12879566 (1,198,030)/ SRR12958381 (991,498) |
| | | Silica‐dried leaves | Republic of Korea, Jeju‐do, Jeju‐si, Aewol‐eup, Eoeum‐ri. Altitude 610 m a.s.l. 33.37611°N, 126.39333°E | 15 October 2018 | R. Ufimov and I. Tatanov/ 12‐3 | LE not barcoded | MiSeq, 250/ MiSeq, 250 | SRR12879563 (1,221,564)/ SRR12958378 (1,214,624) |
| | | Silica‐dried leaves | Republic of Korea, Jeju‐do, Jeju‐si, Aewol‐eup, Eoeum‐ri. Altitude 705 m a.s.l. 33.37861°N, 126.38917°E | 15 October 2018 | R. Ufimov and I. Tatanov/ 14‐15 | LE not barcoded | MiSeq, 250/ MiSeq, 250 | SRR12879562 (1,132,796)/ SRR12958377 (1,173,788) |
| | 2 0.173 | Silica‐dried leaves | Russian Federation, Stavropol Krai, Pyatigorsk, research station of Komarov Botanical Institute of the Russian Academy of Sciences, Perkalskiy Arboretum. Provenance: unknown | 27 October 2018 | Z. Dutova/ s.n. | LE 01020851 | MiSeq, 250/ MiSeq, 250 | SRR12879559 (1,261,066)/ SRR12958374 (1,556,426) |
| | 2 0.147 | Herbarium | Kazakhstan, Turkistan region, Akimat of Kentau, 10 km NNE Kentau, Karatau Nature Reserve, gorge of Byresik river, 1–1.5 km upstream river mouth. Altitude 775 m a.s.l. 43.601344°N, 68.602367°E | 27 May 2018 | A. V. Grebenjuk/ 192KAZ524‐527 | LE not barcoded | MiSeq, 250/ MiSeq, 250 | SRR12879558 (1,146,976)/ SRR12958373 (1,585,440) |
| | | Silica‐dried leaves | Russian Federation, Saint Petersburg, Krasnoselsky Rayon, Duderhof heights, Orekhovaya Gora, Nagorny park. 59.698476°N, 30.127742°E | 29 September 2018 | R. Ufimov/ 2 | LE 01020844 | MiSeq, 250/ MiSeq, 250 | SRR12879557 (1,195,478)/ SRR12958372 (1,016,402) |
| | | Silica‐dried buds | Russian Federation, Komarov Botanical Institute of the Russian Academy of Sciences, Arboretum, plot 131. Provenance: Sakhalin Oblast, wild origin | 6 December 2018 | R. Ufimov/ s.n. | Unvouchered | MiSeq, 250/ MiSeq, 250 | SRR12879556 (1,446,204)/ SRR12958371 (1,148,290) |
| | 2 0.192 | Herbarium | Kyrgyzstan, Chuy region, Jayyl (Kalinin) district, Suusamyr Aiyl Okmotu, Tian Shan, W edge of Jumgal‐Too Range, Kökömeren river, near confluence of Suusamyr and Zapadniy Karakol rivers. Altitude 1995 m a.s.l. 42.093169°N, 74.123264°E | 31 July 2018 | A. V. Grebenjuk/ 385ASM1217–385ASM1233 | LE not barcoded | MiSeq, 250/ MiSeq, 250 | SRR12879555 (1,179,900)/ SRR12958370 (1,034,408) |
| | 2 0.446 | Fresh buds | Czech Republic, Prague, Charles University, Botanical Garden of the Faculty of Science, Central European Flora section (calcareous vegetation), ACCID: 2007.02068. Provenance: unknown | 8 February 2019 | S. Píšová/ s.n. | Unvouchered | NextSeq, 150/ MiSeq, 250 | SRR12879560 (3,510,486)/ SRR12958375 (1,205,220) |
NCBI SRA = National Center for Biotechnology Information Sequence Read Archive.
Genome sizes (2C value, pg) taken from Talent and Dickinson (2005) are marked with an asterisk. Ploidy was estimated mainly from seed isolates. If seeds were not available, ploidy estimates were obtained from silica‐dried leaves; in a few cases, however, these were of insufficient quality for ploidy estimation.
| Probe set/reference | Malinae (25 species): enrichment efficiency in mapped reads (%) | Malinae: no. (%) of loci with zero data | Malinae: no. (%) of loci with ≥25% missing data | Malinae: no. (%) of loci with ≥25% missing data, ≥75% species presence | Malinae: no. (%) of loci with ≥50% missing data, ≥50% species presence | Malinae: no. (%) of loci with ≥50% missing data, ≥75% species presence | Malinae: no. (%) of loci with ≥75% missing data, ≥50% species presence | Malinae: no. (%) of loci with ≥75% missing data, ≥75% species presence | Outgroup (Prunus tenella): enrichment efficiency in mapped reads (%) | Outgroup: no. (%) of loci with zero data | Outgroup: no. (%) of loci with ≥25% missing data | Outgroup: no. (%) of loci with ≥50% missing data | Outgroup: no. (%) of loci with ≥75% missing data |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Malinae481 | 52.6% | 0 | 479 (99.6%) | 478 (99.4%) | 478 (99.4%) | 469 (97.5%) | 469 (97.5%) | 463 (96.3%) | 39.9% | 29 (6.0%) | 418 (86.9%) | 350 (72.8%) | 214 (44.5%) |
| Malinae‐optimized | 22.2% | 2 (0.5%) | 338 (95.8%) | 336 (95.2%) | 322 (91.2%) | 313 (88.7%) | 270 (76.5%) | 239 (67.7%) | 22.4% | 9 (2.5%) | 338 (95.8%) | 322 (91.2%) | 218 (61.8%) |
| bestHit‐modified | 11.3% | 62 (17.6%) | 148 (41.9%) | 132 (37.4%) | 66 (18.7%) | 58 (16.4%) | 32 (9.1%) | 27 (7.6%) | 11.9% | 92 (26.1%) | 161 (45.6%) | 69 (19.5%) | 28 (7.9%) |
Raw reads were processed using HybPhyloMaker version 1.6.4 (Fér and Schmickl, 2018) with the following parameters. Reads were quality‐trimmed using Trimmomatic version 0.32 (LEADING:20 TRAILING:20 SLIDINGWINDOW:5:20 MINLEN:36). Duplicate reads were removed using FastUniq version 1.1 (Xu et al., 2012). The reads were then mapped with BWA to a “pseudoreference” comprising the exonic probe sequences, with a stretch of 800 Ns separating consecutive exons. Three pseudoreferences were built from Malinae481, in addition to the Malinae‐optimized and bestHit‐modified reference sequences. Because HybPhyloMaker allows only one sequence representative per locus, for the paralogous loci of Malinae481 we used one of the two copies, taken from the Malus sequence representatives. The 51% majority consensus sequences were generated with a minimum read depth of 8× using Kindel version 0.1.4 (Constantinides and Robertson, 2017). Consensus sequences were matched to the probe sequences using BLAT (Kent, 2002) with 80% minimum sequence similarity. To compare the assembly performance between the different reference sequences for read mapping and between HybPhyloMaker and HybPiper, six combinations of filter criteria were applied: ≤75%, ≤50%, and ≤25% of missing data per accession in each alignment, combined with ≥50% and ≥75% of accessions with sequence information for each alignment.
“Missing data” refers to the proportion of missing data per accession in each alignment.
“Species presence” refers to the proportion of accessions with sequence information for each alignment.
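Two steps of the workflow described above lend themselves to a short illustration: joining exons into a pseudoreference with 800-N spacers, and applying the missing-data/species-presence filter. The sketch below is not HybPhyloMaker code. The function names are hypothetical, and the way the two filter criteria are combined (an accession is retained if its missing data do not exceed the threshold, and the locus is kept if enough accessions are retained) is an assumed reading of the text.

```python
from typing import Dict, List

SPACER = "N" * 800  # spacer length stated in the text


def build_pseudoreference(exons: List[str]) -> str:
    """Concatenate exon sequences, separating consecutive exons by 800 Ns."""
    return SPACER.join(exons)


def missing_fraction(seq: str) -> float:
    """Fraction of alignment positions that are gaps or undetermined bases."""
    if not seq:
        return 1.0
    missing = sum(1 for c in seq.upper() if c in "-?N")
    return missing / len(seq)


def locus_passes(
    alignment: Dict[str, str],   # accession -> aligned sequence for one locus
    all_accessions: List[str],
    max_missing: float,          # e.g. 0.75, 0.50, or 0.25
    min_presence: float,         # e.g. 0.50 or 0.75
) -> bool:
    """Keep a locus if enough accessions have sufficiently complete sequences."""
    retained = [
        acc for acc in all_accessions
        if acc in alignment and missing_fraction(alignment[acc]) <= max_missing
    ]
    return len(retained) / len(all_accessions) >= min_presence


# Toy usage with hypothetical sequences.
pseudoref = build_pseudoreference(["ACGT", "GGCC"])
print(len(pseudoref))  # 4 + 800 + 4 bases
toy_locus = {"s1": "ACGT", "s2": "AC--", "s3": "----"}
print(locus_passes(toy_locus, ["s1", "s2", "s3", "s4"], 0.50, 0.50))
```

Running `locus_passes` over all loci for each of the six threshold combinations would give counts comparable in form to those in the table above.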
| Species | Malinae481: no. (%) of loci | Angiosperms353: no. (%) of loci | Malinae‐optimized: no. (%) of loci | bestHit‐modified: no. (%) of loci |
|---|---|---|---|---|
| | 168 (47.6%) | 72 (20.4%) | 92 (26.1%) | 34 (9.6%) |
| | 174 (49.3%) | 86 (24.4%) | 110 (31.2%) | 32 (9.1%) |
| | 171 (48.4%) | 82 (23.2%) | 103 (29.2%) | 32 (9.1%) |
| | 168 (47.6%) | 81 (23.0%) | 96 (27.2%) | 28 (7.9%) |
| | 172 (48.7%) | 72 (20.4%) | 97 (27.5%) | 29 (8.2%) |
| | 167 (47.3%) | 81 (23.0%) | 105 (29.8%) | 31 (8.8%) |
| | 179 (50.7%) | 82 (23.2%) | 97 (27.5%) | 34 (9.6%) |
| | 177 (50.1%) | 84 (23.8%) | 109 (30.9%) | 31 (8.8%) |
| | 164 (46.5%) | 53 (15.0%) | 81 (23.0%) | 25 (7.1%) |
| | 176 (49.9%) | 88 (24.9%) | 106 (30.0%) | 35 (9.9%) |
| | 172 (48.7%) | 83 (23.5%) | 105 (29.8%) | 38 (10.8%) |
| | 172 (48.7%) | 80 (22.7%) | 93 (26.4%) | 35 (9.9%) |
| | 174 (49.3%) | 81 (23.0%) | 104 (29.5%) | 33 (9.4%) |
| | 182 (51.6%) | 98 (27.8%) | 132 (37.4%) | 45 (12.8%) |
| | 198 (56.1%) | 98 (27.8%) | 128 (36.3%) | 42 (11.9%) |
| | 192 (54.4%) | 97 (27.5%) | 121 (34.3%) | 44 (12.5%) |
| | 185 (52.4%) | 86 (24.4%) | 121 (34.3%) | 41 (11.6%) |
| | 188 (53.3%) | 81 (23.0%) | 117 (33.1%) | 35 (9.9%) |
| | 183 (51.8%) | 94 (26.6%) | 126 (35.7%) | 34 (9.6%) |
| | 190 (53.8%) | 83 (23.5%) | 109 (30.9%) | 36 (10.2%) |
| | 195 (55.2%) | 85 (24.1%) | 112 (31.7%) | 38 (10.8%) |
| | 196 (55.5%) | 92 (26.1%) | 119 (33.7%) | 38 (10.8%) |
| | 191 (54.1%) | 109 (30.9%) | 139 (39.4%) | 44 (12.5%) |
| | 199 (56.4%) | 96 (27.2%) | 125 (35.4%) | 42 (11.9%) |
| | 196 (55.5%) | 94 (26.6%) | 120 (34.0%) | 35 (9.9%) |
| | 11 (3.1%) | 5 (1.4%) | 2 (0.6%) | 0 |
| Probe sets | Average mapped reads (%) | Missing data (%) |
|---|---|---|
| Angiosperms353 | 1.9 | 19.8 |
| Malinae481 | 2.6 | 18.1 |
| Combined | 2.3 | 0.6 |