Morgan R Gostel, Jose D Zúñiga, W John Kress, Vicki A Funk, Caroline Puente-Lelievre.
Abstract
DNA barcoding is a valuable tool for species identification, with broad applications in traditional taxonomy, ecology, forensics, food analysis, and environmental science. We introduce Microfluidic Enrichment Barcoding (MEBarcoding), a cost-effective method for high-throughput plant DNA barcoding. MEBarcoding uses the Fluidigm Access Array to simultaneously amplify targeted regions for 48 DNA samples and hundreds of PCR primer pairs (producing up to 23,040 PCR products) during a single thermal cycling protocol. As a proof of concept, we developed a microfluidic PCR workflow using the Fluidigm Access Array and Illumina MiSeq. We tested 96 samples for each of the four primary DNA barcode loci in plants: rbcL, matK, trnH-psbA, and ITS. This workflow was used to build a reference library for 78 families and 96 genera from all major plant lineages, many of which are currently lacking in public databases. Our results show that this technique is an efficient alternative to traditional PCR and Sanger sequencing for generating large numbers of plant DNA barcodes and building more comprehensive barcode databases.
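The upper bound of 23,040 PCR products is a simple product of samples and primer pairs. The sketch below reproduces the arithmetic under one possible decomposition: 48 primer inlets, each loaded with a pool of 10 primer pairs. That split is an assumption made for illustration; the abstract itself states only 48 samples, "hundreds" of primer pairs, and the 23,040 total.

```python
# Back-of-the-envelope check of the "up to 23,040 PCR products" figure.
SAMPLES = 48            # DNA samples per array (stated in the abstract)
PRIMER_INLETS = 48      # assumption: one primer pool per inlet
PAIRS_PER_INLET = 10    # assumption: 10 primer pairs multiplexed per pool


def max_pcr_products(samples: int, inlets: int, pairs_per_inlet: int) -> int:
    """Every sample meets every primer pair once, so the counts multiply."""
    return samples * inlets * pairs_per_inlet


total = max_pcr_products(SAMPLES, PRIMER_INLETS, PAIRS_PER_INLET)
pairs_per_sample = total // SAMPLES

print(f"max PCR products per run: {total}")
print(f"primer pairs per sample:  {pairs_per_sample}")
```

Whatever the actual layout, the stated total implies 23,040 / 48 = 480 primer pairs per sample, which is consistent with "hundreds of PCR primer pairs".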
Year: 2020 PMID: 32457375 PMCID: PMC7250904 DOI: 10.1038/s41598-020-64919-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
PCR Primer sets used in this study, with references.
| Target-Specific Sequence (Forward) | Target Name | Reference |
|---|---|---|
| ATGTCACCACAAACAGAGACTAAAGC | rbcLa-F | Kress & Erickson, 2007 |
| GTAAAATCAAGTCCACCRCG | rbcLa-R | Kress |
| TCGCATGTACCTGCAGTAGC | rbcL-724r | Cowan |
| ATGTCACCACAAACAGAAAC | rbcL-1f | Fay |
| ATGTCACCAAAAACAGAGACT | rbcL_3R-Gym | Wang |
| GGACATACGCAATGCTTTAG | rbcL_2F-Gym | Wang |
| GTTATGCATGAACGTAATGCTC | psbA3_f | Sang |
| CGCGCATGGTGGATTCACAATCC | trnHf_05 | Tate & Simpson, 2003 |
| TCAYCCGGARATTTTGGTTCG | matKGym_F1A | Li |
| CGTACAGTACTTTTGTGTTTACGAG | matK3F_KIM f | Ki-Joong Kim, unpubl. |
| ACCCAGTCCATCTGGAAATCTTGGTTC | matK1R_KIM-r | Ki-Joong Kim, unpubl. |
| TAATTTACGATCAATTCATTC | matK-xF | Saslis-Lagoudakis |
| ACAAGAAAGTCGAAGTAT | matK-MALP | Dunning & Savolainen, 2010 |
| CGATCTATTCATTCAATATTTC | matK-390F | Cuénoud |
| ACCCAGTCCATCTGGAAATCTTGGTTC | matK-1RKIM-f | Ki-Joong Kim, unpubl. |
| TCTAGCACACGAAAGTCGAAGT | matK-1326R | Cuénoud |
| CCTTATCATTTAGAGGAAGGAG | ITS5a-fwd | Stanford |
| TCCTCCGCTTATTGATATGC | ITS4 | White |
| GACGCTTCTCCAGACTACAAT | ITS2-S3R | Chen |
| ATGCGATACTTGGTGTGAAT | ITS2-S2F | Chen et al., 2010 |
| GCAATTCACACCAAGTATCGC | ITS-C | Blattner, 1999 |
| GGAAGGAGAAGTCGTAACAAGG | ITS-A | Blattner, 1999 |
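Some primers in this table contain IUPAC degeneracy codes (for example, R in rbcLa-R, and Y and R in matKGym_F1A), so each of them stands for a small set of concrete oligonucleotides. As a minimal sketch, the helper below enumerates those sequences. The function name `expand_degenerate` is hypothetical and is not part of the published workflow.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer.

    Spaces are ignored so that primers written in spaced groups still work.
    """
    cleaned = primer.replace(" ", "").upper()
    choices = [IUPAC[base] for base in cleaned]
    return ["".join(combo) for combo in product(*choices)]


# rbcLa-R has one R (A or G), so it encodes two sequences.
for seq in expand_degenerate("GTAAAATCAAGTCCACCRCG"):
    print(seq)

# matKGym_F1A has one Y and one R, so it encodes 2 x 2 = 4 sequences.
print(len(expand_degenerate("TCAYCCGGARATTTTGGTTCG")))
```

An unknown character raises `KeyError`, which is a deliberate choice here so that typos in a primer list surface immediately.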
Comparison of the number of barcode sequences in the Barcode of Life Data System (BOLD, boldsystems.org) for major lineages of life on Earth with an estimated number of species >10,000.
| Taxonomic rank | Estimated # barcode sequences† | Estimated # spp. | Estimated % species with barcodes |
|---|---|---|---|
| Animalia: Annelida | 4,633 | 17,388‡ | 26.6% |
| Animalia: Arthropoda | 228,051 | 1,257,040‡ | 18.14% |
| Animalia: Chordata | 36,552 | 49,693‡ | 73.56% |
| Animalia: Cnidaria | 2,674 | 10,203‡ | 26.21% |
| Animalia: Mollusca | 15,557 | 80,000‡ | 19.45% |
| Animalia: Nematoda | 1,493 | 25,033‡ | 5.96% |
| Animalia: Platyhelminthes | 681 | 29,487‡ | 2.31% |
| Fungi | 29,168 | 140,000‡ | 20.83% |
| Plantae: Magnoliophyta | 65,340 | 352,000§ | 18.56% |
| Plantae: Bryophyta | 1,870 | 20,000§ | 9.35% |
| Plantae: Lycopodiophyta & Pteridophyta | 3,983 | 13,000§ | 30.64% |
| Plantae: Pinophyta | 775 | 1,000§ | 77.5% |
†Barcode of Life Data Systems, boldsystems.org, accessed 4 June 2019.
‡The Catalogue of Life, www.catalogueoflife.org, accessed 4 June 2019.
§The Plant List, www.theplantlist.org, accessed 4 June 2019.
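The last column of the table is the ratio of the second column to the third, expressed as a percentage. The sketch below recomputes it for three rows as a sanity check; the function name `percent_with_barcodes` is hypothetical, and the calculation assumes, as the table appears to, that each counted barcode sequence corresponds to a distinct species.

```python
def percent_with_barcodes(n_barcodes: int, n_species: int) -> float:
    """Share of estimated species represented by a barcode, in percent."""
    return round(100 * n_barcodes / n_species, 2)


# (estimated barcode sequences, estimated species), copied from the table
rows = {
    "Animalia: Chordata": (36_552, 49_693),
    "Plantae: Magnoliophyta": (65_340, 352_000),
    "Plantae: Pinophyta": (775, 1_000),
}

for lineage, (n_barcodes, n_species) in rows.items():
    print(f"{lineage}: {percent_with_barcodes(n_barcodes, n_species)}%")
```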
Figure 1. Diagram of the microfluidic PCR workflow for MEBarcoding.
Figure 2. Comparison of costs for the plant barcode sequencing approaches discussed in this study. Costs are estimated for a large laboratory equipped with automated instruments for DNA extraction (Autogen) and a full-time technician. For all approaches except Genome Skimming and the Juno, time estimates are drawn from the direct experience of the authors, and expenses reflect actual expenses from the budget used in this study, including a full-time molecular technician at $15.00/hour. Multiple amplicon high-throughput sequencing (HTS) refers to methods that combine traditional PCR with HTS (e.g., [21,24]). Genome skimming time and cost estimates are based on the lowest current market rates for library preparation from kits and core facilities, and on personal communication with genomics core facility lab technicians.
Figure 3. A boxplot showing the average number of plant DNA barcode loci recovered in this study, categorized by higher classification for each major lineage of land plants.
Comparison of PCR amplification and sequencing success rate from the traditional PCR and Sanger Sequencing approach and the MEBarcoding results from this study.
| Marker | # successful sequences from MEBarcoding | # successful sequences from Sanger sequencing | Average pairwise identity between MEBarcoding & Sanger |
|---|---|---|---|
| | 91/96 | 94 | 99.59% |
| | 58 | 71 | 99.3% |
| | 83 | 88 | 97.56% |
| ITS | 64 | 67 | 95.57% |
When available, sequences from both approaches were compared and a pairwise identity score is provided.
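The record does not describe how the pairwise identity scores were computed. As an illustration only, the sketch below shows one common definition: the percentage of matching positions between two pre-aligned sequences, skipping columns where either sequence has a gap. The function name, the gap handling, and the toy sequences are all assumptions, not the authors' method or data.

```python
def pairwise_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned, equal-length sequences.

    Columns where either sequence has a gap are excluded (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * matches / compared


# Toy example: 9 of 10 aligned positions agree.
print(pairwise_identity("ATGTCACCAC", "ATGTCACCAT"))  # 90.0
```

Averaging this score over all samples that have both a MEBarcoding and a Sanger sequence for a given marker would give a per-marker figure comparable in form to the last column of the table.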
Figure 4. A bar plot showing the relative number of sequencing reads generated for each of the 96 plant species sampled in this study. Blue and red bars correspond to the number of reads from raw and cleaned sequence data, respectively.
Figure 5. Boxplots showing the average number of sequence reads per sample from MEBarcoding (microfluidic PCR) for each marker employed in this study, organized by locus.