Ryan A Folk (1), Heather R Kates (2), Raphael LaFrance (2), Douglas E Soltis (2,3,4,5), Pamela S Soltis (2,4,5), Robert P Guralnick (2,5)
1. Department of Biological Sciences, Mississippi State University, Mississippi State, Mississippi, USA.
2. Florida Museum of Natural History, University of Florida, Gainesville, Florida, USA.
3. Department of Biology, University of Florida, Gainesville, Florida, USA.
4. Genetics Institute, University of Florida, Gainesville, Florida, USA.
5. Biodiversity Institute, University of Florida, Gainesville, Florida, USA.
Abstract
PREMISE: Large phylogenetic data sets have often been restricted to small numbers of loci from GenBank, and a vetted sampling-to-sequencing phylogenomic protocol scaling to thousands of species is not yet available. Here, we report a high-throughput collections-based approach that empowers researchers to explore more branches of the tree of life with numerous loci.
METHODS: We developed an integrated Specimen-to-Laboratory Information Management System (SLIMS), connecting sampling and wet lab efforts with progress tracking at each stage. Using unique identifiers encoded in QR codes and a taxonomic database, a research team can sample herbarium specimens, efficiently record the sampling event, and capture specimen images. After sampling in herbaria, images are uploaded to a citizen science platform for metadata generation, and tissue samples are moved through a simple, high-throughput, plate-based herbarium DNA extraction and sequencing protocol.
RESULTS: We applied this sampling-to-sequencing workflow to ~15,000 species, producing for the first time a data set with ~50% taxonomic representation of the "nitrogen-fixing clade" of angiosperms.
DISCUSSION: The approach we present is appropriate at any taxonomic scale and is extensible to other collection types. The widespread use of large-scale sampling strategies repositions herbaria as accessible but largely untapped resources for broad taxonomic sampling with thousands of species.
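The tracking idea described under METHODS (a unique identifier per sample, with progress recorded at each stage) can be illustrated with a minimal sketch. This is not the authors' SLIMS implementation: the `SampleRecord` class, the `Stage` names, the `progress` helper, and the example taxa are all hypothetical, chosen only for illustration. The identifier is a random UUID string of the kind that could be encoded in a QR code; QR generation itself is omitted because it would need a third-party library.

```python
import uuid
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    # Hypothetical stage names; the abstract only states that progress
    # is tracked at each stage of the workflow.
    SAMPLED = 1
    IMAGED = 2
    EXTRACTED = 3
    SEQUENCED = 4


@dataclass
class SampleRecord:
    taxon: str
    # Random UUID string; this is the text a QR code label would carry.
    sample_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    stage: Stage = Stage.SAMPLED

    def advance(self) -> Stage:
        """Move the sample to the next stage and return the new stage."""
        if self.stage is Stage.SEQUENCED:
            raise ValueError("sample is already at the final stage")
        self.stage = Stage(self.stage + 1)
        return self.stage


def progress(records):
    """Count how many samples currently sit at each stage."""
    counts = {stage.name: 0 for stage in Stage}
    for record in records:
        counts[record.stage.name] += 1
    return counts


if __name__ == "__main__":
    # Illustrative taxa only, not data from the study.
    samples = [SampleRecord("Glycine max"), SampleRecord("Alnus rubra")]
    samples[0].advance()
    print(progress(samples))
```

Keying every record on one identifier assigned at sampling time is what lets the herbarium step and the wet lab step share a single progress view in this sketch; a real system would persist the records in a database rather than in memory.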