Lee A Ripma1, Michael G Simpson1, Kristen Hasenstab-Lehman2.
Abstract
PREMISE OF THE STUDY: As systematists grapple with how to best harness the power of next-generation sequencing (NGS), a deluge of review papers, methods, and analytical tools makes choosing the right method difficult. Oreocarya (Boraginaceae), a genus of 63 species, is a good example of a group lacking both species-level resolution and genomic resources. The use of Geneious removes bioinformatic barriers and makes NGS genome skimming accessible to even the least tech-savvy systematists.
Keywords: Amsinckiinae; Geneious; Oreocarya; genome skimming; next-generation sequencing (NGS); phylogenomics
Year: 2014 PMID: 25506521 PMCID: PMC4259456 DOI: 10.3732/apps.1400062
Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936
Voucher information for collections used in this study.
| Taxon (population identifier) | Accession no. | Collection no. | Geographic coordinates | GenBank accession no. (nrDNA) | GenBank accession no. (cpDNA) |
| | SDSU 20050 | Simpson 3665 | 32.94626, −116.29714 | KM213400 | KP096536 |
| | SDSU 19537 | Simpson 3142 | 34.50633, −119.87112 | KM213423 | KP096534 |
| | SDSU 20124 | Ripma 377 | 43.4617, −113.56226 | KM213409 | KP096524 |
| | SDSU 20343 | Kelley 1951 | 46.23503, −115.65855 | KM213413 | KP096521 |
| | SDSU 18956 | Guilliams 602 | 31.79778, −110.8091 | KM213422 | KP096527 |
| | SDSU 20113 | Ripma 379 | 44.54885, −120.33336 | KM213405 | KP096525 |
| | SDSU 20116 | Ripma 390 | 38.61707, −119.83705 | KM213411 | KP096532 |
| | SDSU 20030 | Ripma 307 | 37.3857, −118.1805 | KM213424 | KP096526 |
| | SDSU 20036 | Ripma 306 | 37.2635, −118.15706 | KM213407 | KP096537 |
| | SDSU 20029 | Ripma 303 | 37.74431, −119.02917 | KM213404 | KP096530 |
| | SDSU 20086 | Ripma 374 | 45.66983, −112.81633 | KM213410 | KP096523 |
| | SDSU 20004 | Ripma 312 | 36.4505, −118.163 | KM213408 | KP096528 |
| | SDSU 20055 | Ripma 301 | 37.62715, −119.03075 | KM213402 | KP096540 |
| | SDSU 20094 | Ripma 399 | 37.91395, −119.03845 | KM213417 | KP096542 |
| | SDSU 20079 | Ripma 363 | 37.41913, −118.7518 | KM213419 | KP096541 |
| | SDSU 20098 | Ripma 395 | 38.2858, −119.64189 | KM213406 | KP096543 |
| | SDSU 20123 | Ripma 370 | 43.90865, −117.62695 | KM213420 | KP096522 |
| | SDSU 20242 | Kelley 1466 | 35.34968, −111.74575 | KM213418 | KP096531 |
| | SDSU 20210 | Kelley 1169 | 48.4816, −113.34305 | KM213415 | KP096535 |
| | SDSU 20232 | Kelley 928 | 41.31715, −122.48227 | KM213416 | KP096545 |
| | SDSU 20107 | Ripma 384 | 42.95651, −122.04713 | KM213412 | KP096539 |
| | SDSU 20110 | Ripma 389 | 41.44734, −120.24316 | KM213421 | KP096544 |
| | SDSU 20024 | Ripma 308 | 37.49136, −118.18608 | KM213414 | KP096529 |
| | SDSU 20117 | Ripma 371 | 40.92065, −106.31195 | KM213401 | KP096538 |
| | UC 1965571 | Kelley 1967 | 34.3060, −117.46565 | KM213403 | KP096533 |
mtDNA data available from the Dryad Digital Repository (http://doi.org/10.5061/dryad.50536; Ripma et al., 2014).
Taxa sampled in this study, preservation type, raw and post–quality control read numbers, sequencing depth, and library content for each genome region.
| Taxon (population identifier) | Accession no. | Preservation type | Raw read no. | Read no. post-QC | Reads retained (%) | nrDNA depth | % nrDNA | cpDNA depth | % cpDNA | mtDNA depth | % mtDNA |
| | SDSU 20050 | H | 3,515,019 | 2,366,721 | 67.33 | 396.6 | 1.10 | 246.2 | 12.99 | 28.5 | 4.62 |
| | SDSU 19537 | H | 2,553,590 | 2,219,773 | 86.93 | 233.6 | 0.68 | 109.2 | 6.08 | 12.9 | 2.12 |
| | SDSU 20124 | S | 3,634,668 | 2,922,047 | 80.39 | 530.3 | 1.31 | 227.7 | 10.27 | 31.7 | 4.46 |
| | SDSU 20343 | H | 2,271,537 | 2,037,136 | 89.68 | 535.5 | 1.69 | 67.4 | 4.05 | 10.8 | 1.90 |
| | SDSU 18956 | H | 3,281,361 | 2,627,734 | 80.10 | 334.3 | 0.92 | 85.3 | 4.21 | 18.1 | 2.75 |
| | SDSU 20113 | S | 5,549,018 | 4,520,946 | 81.50 | 639.9 | 1.01 | 205.2 | 5.96 | 40.8 | 3.72 |
| | SDSU 20116 | S | 3,662,963 | 2,902,007 | 79.20 | 434.5 | 1.06 | 199.8 | 9.05 | 56.1 | 8.04 |
| | SDSU 20030 | S | 9,697,938 | 7,593,640 | 78.30 | 1129.1 | 1.08 | 329.6 | 5.74 | 50 | 2.74 |
| | SDSU 20036 | S | 5,339,723 | 4,250,549 | 79.60 | 661.3 | 1.12 | 245.1 | 7.60 | 50.4 | 4.92 |
| | SDSU 20029 | S | 4,420,140 | 3,463,693 | 78.40 | 609.4 | 1.26 | 173.7 | 6.57 | 28.3 | 3.33 |
| | SDSU 20086 | S | 4,256,094 | 3,304,770 | 77.60 | 548.1 | 1.18 | 239.9 | 9.57 | 36.2 | 4.52 |
| | SDSU 20004 | S | 9,197,464 | 6,866,170 | 74.70 | 1135.8 | 1.20 | 496 | 9.64 | 102.8 | 6.36 |
| | SDSU 20055 | S | 1,760,164 | 1,423,937 | 80.90 | 338.8 | 1.69 | 142 | 13.02 | 17.8 | 4.99 |
| | SDSU 20094 | S | 5,054,148 | 4,050,809 | 80.10 | 446.9 | 0.79 | 131.6 | 4.24 | 35.9 | 3.64 |
| | SDSU 20079 | S | 5,452,126 | 4,490,708 | 82.40 | 450.7 | 0.72 | 156.5 | 4.56 | 41.9 | 3.85 |
| | SDSU 20098 | S | 3,020,887 | 2,427,679 | 80.40 | 336.1 | 0.98 | 187.1 | 10.11 | 32.8 | 5.56 |
| | SDSU 20123 | S | 6,139,787 | 5,000,894 | 81.50 | 586.3 | 0.84 | 219.3 | 5.76 | 52.7 | 4.38 |
| | SDSU 20242 | H | 5,280,735 | 4,093,537 | 77.50 | 558.3 | 0.95 | 95.9 | 3.05 | 65.1 | 6.66 |
| | SDSU 20210 | H | 4,256,120 | 3,495,577 | 82.10 | 546 | 1.11 | 149.6 | 5.60 | 29.3 | 3.45 |
| | SDSU 20232 | H | 5,112,245 | 3,974,139 | 77.70 | 529.6 | 0.99 | 180.5 | 5.95 | 63.6 | 6.72 |
| | SDSU 20107 | S | 2,774,358 | 2,224,590 | 80.20 | 346.8 | 1.11 | 94.9 | 5.55 | 17.3 | 3.10 |
| | SDSU 20110 | S | 3,096,293 | 2,463,930 | 79.60 | 291.9 | 0.84 | 202.5 | 10.80 | 42.5 | 7.13 |
| | SDSU 20024 | S | 9,309,461 | 7,259,178 | 78.00 | 1563.3 | 1.56 | 402.8 | 7.38 | 75.8 | 4.40 |
| | SDSU 20117 | S | 3,001,782 | 2,342,067 | 78.02 | 406.1 | 1.14 | 169.1 | 8.97 | 37.8 | 6.21 |
| | UC 1965571 | H | 2,151,733 | 1,741,444 | 80.90 | 326.2 | 1.35 | 145.6 | 10.92 | 17.6 | 4.02 |
| Maximum | | | 9,697,938 | 7,593,640 | 89.68 | 1,563.30 | 1.69 | 496 | 13.02 | 102.8 | 8.04 |
| Minimum | | | 1,760,164 | 1,423,937 | 67.33 | 233.6 | 0.68 | 67.4 | 3.05 | 10.8 | 1.90 |
| Mean (±SE) | | | 4,551,574 (±435,943) | 3,602,547 (±333,732) | 79.72 (±0.79) | 556.6 (±60.6) | 1.11 (±0.05) | 196.1 (±19.6) | 7.51 (±0.57) | 39.9 (±4.3) | 4.54 (±0.32) |
| IQR | | | 2,318,836 | 1,883,828 | 2.88 | 239.5 | 0.25 | 85.7 | 4.04 | 22.1 | 2.10 |
Note: H = herbarium sheet; IQR = interquartile range; QC = quality control; S = silica gel dried; SE = standard error.
Run 2, 1/53 samples, read length 101 bp; all remaining taxa from Run 1, 1/38 samples, read length 100 bp.
Data are not normally distributed, but SE is presented to give a general sense of variation around the mean; see also IQR.
Fig. 1. Illustrated workflow for generating the nuclear ribosomal cistron using Geneious.
Fig. 2. Illustrated workflow for generating chloroplast DNA using Geneious.
Fig. 3. Illustrated workflow for generating mitochondrial DNA exons using Geneious.
Final aligned sequence length excluding gaps and ambiguities, variable characters, and parsimony informative characters for the nuclear ribosomal DNA, chloroplast DNA, and mitochondrial DNA.
| Genome region | Aligned sequence length | All taxa: variable characters | All taxa: PICs | All taxa: % PICs | |||
| Nuclear ribosomal DNA | 5866 | 320 | 150 | 2.56 | 68 | 19 | 0.32 |
| Chloroplast DNA | 115,745 | 4101 | 556 | 0.48 | 586 | 104 | 0.09 |
| Mitochondrial exons | 2661 | 1049 | 505 | 18.98 | 919 | 407 | 15.30 |
| Total | 124,272 | 5470 | 1211 | 0.97 | 1573 | 530 | 0.43 |
Note: PICs = Parsimony informative characters.
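The "% PICs" column can be checked directly from the other columns. The following is a minimal sketch, assuming % PICs is simply the number of PICs divided by the aligned sequence length; the function name `percent_pics` and the dictionary layout are illustrative and not taken from the study. It uses only the "All taxa" values printed in the table.

```python
def percent_pics(pics: int, aligned_length: int) -> float:
    """Parsimony informative characters as a percentage of aligned length.

    Assumption: % PICs = 100 * PICs / aligned sequence length.
    """
    return 100.0 * pics / aligned_length


# (aligned sequence length, PICs across all taxa), copied from the table.
ALL_TAXA = {
    "Nuclear ribosomal DNA": (5866, 150),
    "Chloroplast DNA": (115_745, 556),
    "Mitochondrial exons": (2661, 505),
}

# The "Total" row is the column-wise sum over the three genome regions.
total_length = sum(length for length, _ in ALL_TAXA.values())
total_pics = sum(pics for _, pics in ALL_TAXA.values())

for region, (length, pics) in ALL_TAXA.items():
    print(f"{region}: {percent_pics(pics, length):.2f}% PICs")
print(f"Total: {percent_pics(total_pics, total_length):.2f}% PICs")
```

Under this assumption the per-region percentages and the totals (124,272 aligned positions, 1211 PICs) match the printed row values after rounding to two decimals.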
Multiplexing calculations for the mean and minimum CF and PTG values from Run 1 and Run 2 when the genomic target is the plastome with one copy of the inverted repeat region and a sequencing depth of 30×.
| Parameters | Equation abbreviation | Run 1 mean | Run 1 minimum | Run 2 mean | Run 2 minimum |
| Read length (bp) | | 100 | 100 | 101 | 101 |
| Total reads generated in a lane | | 156,000,000 | 156,000,000 | 170,464,908 | 170,464,908 |
| Lane capacity (nucleotides) | LC | 15,600,000,000 | 15,600,000,000 | 17,046,490,800 | 17,046,490,800 |
| Reads passing quality filters | CF | 0.7960 | 0.7470 | 0.8049 | 0.6733 |
| Proportion mapping to genomic target | PTG | 0.0741 | 0.0305 | 0.0802 | 0.0405 |
| Coverage depth desired | CD | 30 | 30 | 30 | 30 |
| Target genome size | TaG | 124,868 | 124,868 | 124,868 | 124,868 |
| Multiplexing possible | ML | 245.5 | 94.7 | 293.8 | 124.2 |
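The table does not print the equation itself, so the sketch below assumes the relationship implied by its parameters: the usable on-target nucleotides in a lane (LC × CF × PTG) divided by the nucleotides needed per sample (CD × TaG). The function name `multiplexing_level` and the `RUNS` dictionary are illustrative. Lane capacity is taken as printed in the table rather than recomputed from read count and read length.

```python
def multiplexing_level(lc: float, cf: float, ptg: float, cd: float, tag: float) -> float:
    """Estimated number of samples that can share one lane.

    Assumed form, inferred from the table's parameters:
        ML = (LC * CF * PTG) / (CD * TaG)
    lc  -- lane capacity in nucleotides
    cf  -- fraction of reads passing quality filters
    ptg -- proportion of reads mapping to the genomic target
    cd  -- desired coverage depth
    tag -- target genome size in bp
    """
    usable_target_nt = lc * cf * ptg   # on-target nucleotides per lane
    needed_per_sample_nt = cd * tag    # nucleotides required per sample
    return usable_target_nt / needed_per_sample_nt


COVERAGE_DEPTH = 30      # CD, from the table
PLASTOME_SIZE = 124_868  # TaG, plastome with one inverted repeat copy

# (LC, CF, PTG) per column, copied from the table.
RUNS = {
    "Run 1 mean": (15_600_000_000, 0.7960, 0.0741),
    "Run 1 minimum": (15_600_000_000, 0.7470, 0.0305),
    "Run 2 mean": (17_046_490_800, 0.8049, 0.0802),
    "Run 2 minimum": (17_046_490_800, 0.6733, 0.0405),
}

for label, (lc, cf, ptg) in RUNS.items():
    ml = multiplexing_level(lc, cf, ptg, COVERAGE_DEPTH, PLASTOME_SIZE)
    print(f"{label}: about {ml:.1f} samples per lane")
```

With the rounded CF and PTG values shown in the table, this form gives results within a fraction of a sample of the printed ML row, which suggests the small differences come from rounding of the inputs. Changing `COVERAGE_DEPTH` or `PLASTOME_SIZE` shows how the estimate scales for a different target or depth.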
Fig. 4. The results of maximum likelihood RAxML phylogenetic analyses for 19 Oreocarya and six outgroups obtained from the nuclear ribosomal DNA (panel A), the chloroplast DNA (panel B), all mitochondrial genes (panel C), all data concatenated (panel D), and a STAR species tree (panel E). The support values from 10,000 bootstrap replicates are displayed. Cladograms are shown so relationships are visible, with inset phylograms to show branch lengths.