
DNA barcode trnH-psbA is a promising candidate for efficient identification of forage legumes and grasses.

Miguel Loera-Sánchez, Bruno Studer, Roland Kölliker.

Abstract

OBJECTIVE: Grasslands are widespread ecosystems that fulfil many functions. Plant species richness (PSR) is known to have beneficial effects on such functions, and monitoring PSR is crucial for tracking the effects of land use and agricultural management on these ecosystems. Unfortunately, traditional morphology-based methods are labor-intensive and cannot be adapted for high-throughput assessments. DNA barcoding could help increase the throughput of PSR assessments in grasslands. In this proof-of-concept work, we aimed to determine which of three plant DNA barcodes (rbcLa, matK and trnH-psbA) best discriminates 16 key grass and legume species common in temperate sub-alpine grasslands.
RESULTS: Barcode trnH-psbA had a 100% correct assignment rate (CAR) in the five analyzed legumes, followed by rbcLa (93.3%) and matK (55.6%). Barcode trnH-psbA had a 100% CAR in the grasses Cynosurus cristatus, Dactylis glomerata and Trisetum flavescens. However, the closely related Festuca, Lolium and Poa species were not always correctly identified, which led to an overall CAR in grasses of 66.7%, 50.0% and 46.4% for trnH-psbA, matK and rbcLa, respectively. Barcode trnH-psbA is thus the most promising candidate for PSR assessments in permanent grasslands and could greatly support plant biodiversity monitoring on a larger scale.

Keywords:  DNA barcoding; Forages; Grasslands; Species richness

Year:  2020        PMID: 31952556      PMCID: PMC6969398          DOI: 10.1186/s13104-020-4897-5

Source DB:  PubMed          Journal:  BMC Res Notes        ISSN: 1756-0500


Introduction

Grasslands are some of the most widespread ecosystems on Earth, covering two-fifths of its land surface [1]. They provide roughage for ruminant livestock production and many other environmental services related to carbon sequestration, water flow regulation and soil stabilization [2, 3]. Plant species richness (PSR) is a component of biodiversity with major effects on the ecosystem functioning of grasslands. In experimental grassland plant communities, high levels of PSR stabilize yields and confer tolerance against environmental stressors [4]. Similar effects have been observed in semi-natural grasslands, which are composed of a limited number of species and are an important component of sustainable livestock production [5]. Assessing PSR is thus crucial for tracking its changes and effects on ecosystem services. However, such assessments have traditionally relied on morphology-based surveys that are labor-intensive and require trained taxonomists, limiting their use for surveying PSR over large scales and long time periods [3]. Furthermore, grasses and legumes (the two plant families of major economic relevance in temperate grasslands) can be taxonomically assessed with highest precision only when certain distinctive morphological characters are on display (e.g., flowering bodies and leaves). Even then, some grass and legume species are difficult to distinguish from closely related species. A standardized, precise, high-throughput solution for PSR surveys in grasslands is therefore desirable for large-scale assessments of changes in PSR. DNA barcoding is a methodology that has been successfully applied for standardizing and increasing the throughput of PSR surveys in ecological studies [6, 7]. DNA barcodes are organellar or nuclear loci that show a high degree of species-level conservation [8, 9]. By comparing newly sequenced DNA barcodes to reference databases, it is possible to assign an unknown biological sample to its correct taxon.
An international effort is currently in place to maintain a well-curated, public reference database of DNA barcodes (The Barcode Of Life Datasystems database, BOLD [10]). In animals, the DNA barcode of choice is the mitochondrial COI gene, which can reproducibly differentiate most of the major animal phyla [8]. In plants, in contrast, there is no single DNA barcode with comparable success [11]. Most plant DNA barcodes are located in the chloroplast genome, either within coding sequences (such as rbcLa and matK) or in intergenic regions (such as trnH-psbA) [11, 12], although some nuclear loci have also been used as DNA barcodes, e.g., the internal transcribed spacer of the ribosomal DNA (ITS) [13]. More than one barcode per individual plant is typically sequenced and used for taxonomical assignments [11, 12]. However, sequencing more than one DNA barcode per plant may not be technically feasible in higher-throughput settings, particularly when analyzing mixed-species samples. The aim of the present study was to determine the best DNA barcode sequences for forage species by screening the BOLD database for promising candidates and sequencing three DNA barcodes (rbcLa, matK and trnH-psbA) from multiple cultivars of 16 forage plant species that are common in sub-alpine grasslands.

Main text

Methods

Plant material and DNA extraction

Seeds of 2–3 cultivars of 16 forage species (Alopecurus pratensis L., Arrhenatherum elatius L., Cynosurus cristatus L., Dactylis glomerata L., Festuca pratensis Huds., F. rubra L., Lolium perenne L., L. multiflorum Lam., Lotus corniculatus L., Medicago sativa L., Onobrychis viciifolia Scop., Phleum pratense L., Poa pratensis L., Trifolium pratense L., T. repens L. and Trisetum flavescens L.), kindly provided by Agroscope, Zurich, Switzerland, were used for the study (Table 1). Seeds were germinated and transferred into pot trays (77 wells, 50 cm × 32 cm, with compost as substrate). The species selected are predominant components of sub-alpine grasslands and hold great potential for multifunctional, species-rich agriculture [14, 15]. Plants were grown for 3 weeks, after which DNA was extracted from three plants per species. For grasses, three leaf fragments of ~ 1 cm were harvested, and for legumes, three young leaflets. The plant material was freeze-dried for 48 h and pulverized in a QIAGEN TissueLyser II (QIAGEN, Hilden, Germany). DNA was extracted using the NucleoSpin® II kit (Macherey–Nagel, Düren, Germany) and its integrity was visually inspected by agarose gel electrophoresis (1% w/v). DNA purity and concentration were determined with a NanoDrop™ spectrophotometer (ThermoFisher Scientific, Waltham, MA, USA).
Table 1

Size of barcode sequences obtained for 48 plants from 16 different species of forage grasses and legumes

Barcode columns give the sequence size in bp [number of n’s in sequence].

| BOLD Process ID | Species | Cultivar | rbcLa | matK | trnH-psbA |
|---|---|---|---|---|---|
| SWFRG013-19 | Alopecurus pratensis | ‘Alko’ (Saatszucht Steinach, DE) | 0 (a) | 838 [0] | 0 (a) |
| SWFRG014-19 | Alopecurus pratensis | ‘Alopex’ (Agroscope, CH) | 0 (a) | 663 [0] | 0 (a) |
| SWFRG029-19 | Alopecurus pratensis | ‘Alopex’ (Agroscope, CH) | 552 [0] | 185 [0] | 569 [0] |
| SWFRG015-19 | Arrhenatherum elatius | ‘Arone’ (Saatszucht Steinach, DE) | 549 [0] | 436 [0] | 512 [0] |
| SWFRG016-19 | Arrhenatherum elatius | ‘Median’ (DLF Životice, CZ) | 550 [0] | 610 [0] | 0 (a) |
| SWFRG031-19 | Arrhenatherum elatius | ‘Median’ (DLF Životice, CZ) | 0 (a) | 825 [0] | 0 (a) |
| SWFRG030-19 | Cynosurus cristatus | ‘Cresta’ (Agroscope, CH) | 541 [0] | 513 [0] | 466 [0] |
| SWFRG045-19 | Cynosurus cristatus | ‘Lena’ (HBLF, AT) | 585 [0] | 531 [0] | 564 [0] |
| SWFRG046-19 | Cynosurus cristatus | ‘Rožnovská’ (OSEVA PRO, CZ) | 529 [0] | 870 [0] | 569 [0] |
| SWFRG001-19 | Dactylis glomerata | ‘Barexcel’ (Barenbrug, NL) | 577 [0] | 640 [0] | 519 [0] |
| SWFRG002-19 | Dactylis glomerata | ‘Brennus’ (R2n, FR) | 546 [0] | 865 [13] | 576 [0] |
| SWFRG017-19 | Dactylis glomerata | ‘Reda’ (Agroscope, CH) | 534 [0] | 866 [13] | 561 [0] |
| SWFRG003-19 | Festuca pratensis | ‘Cosmolit’ (Saatszucht Steinach, DE) | 547 [0] | 579 [0] | 558 [0] |
| SWFRG004-19 | Festuca pratensis | ‘Paradisia’ (Agroscope, CH) | 549 [0] | 888 [7] | 566 [0] |
| SWFRG019-19 | Festuca pratensis | ‘Pradel’ (Agroscope, CH) | 552 [0] | 590 [0] | 553 [0] |
| SWFRG007-19 | Festuca rubra | ‘Echo’ (DLF-Trifolium, DK) | 559 [0] | 886 [3] | 274 [21] |
| SWFRG008-19 | Festuca rubra | ‘Pran Solas’ (Schweizer, CH) | 588 [0] | 869 [0] | 570 [0] |
| SWFRG023-19 | Festuca rubra | ‘Roland’ (Saatszucht Steinach, DE) | 558 [0] | 543 [0] | 594 [0] |
| SWFRG005-19 | Lolium multiflorum | ‘Axis’ (Agroscope, CH) | 581 [0] | 874 [0] | 551 [0] |
| SWFRG006-19 | Lolium multiflorum | ‘Caribu’ (Agroscope, CH) | 577 [0] | 884 [8] | 567 [7] |
| SWFRG021-19 | Lolium multiflorum | ‘Zebra’ (Agroscope, CH) | 571 [0] | 586 [0] | 551 [0] |
| SWFRG009-19 | Lolium perenne | ‘Arara’ (Agroscope, CH) | 547 [0] | 883 [3] | 539 [5] |
| SWFRG010-19 | Lolium perenne | ‘Arvella’ (Agroscope, CH) | 582 [0] | 481 [0] | 567 [15] |
| SWFRG025-19 | Lolium perenne | ‘Lipresso’ (Euro Grass, DE) | 488 [0] | 835 [0] | 614 [0] |
| SWFRG024-19 | Lotus corniculatus | ‘Lotar’ (OSEVA UNI, SK) | 544 [0] | 399 [0] | 294 [0] |
| SWFRG039-19 | Lotus corniculatus | ‘Lotar’ (OSEVA UNI, SK) | 548 [0] | 509 [0] | 414 [0] |
| SWFRG040-19 | Lotus corniculatus | ‘Polom’ (CVRV, VÚRV, CZ) | 502 [0] | 702 [0] | 412 [0] |
| SWFRG022-19 | Medicago sativa | ‘Artemis’ (Barenbrug, NL) | 580 [0] | 410 [0] | 268 [14] |
| SWFRG037-19 | Medicago sativa | ‘Catera’ (Saatszucht Steinach, DE) | 526 [0] | 435 [0] | 438 [3] |
| SWFRG038-19 | Medicago sativa | ‘Sanditi’ (Barenbrug, NL) | 548 [0] | 432 [0] | 445 [2] |
| SWFRG028-19 | Onobrychis viciifolia | ‘Perdix’ (Agroscope, CH) | 550 [0] | 576 [0] | 289 [0] |
| SWFRG043-19 | Onobrychis viciifolia | ‘Perly’ (Agroscope, CH) | 582 [0] | 627 [0] | 284 [0] |
| SWFRG044-19 | Onobrychis viciifolia | ‘Višňovský’ (Agrogen, CZ) | 543 [0] | 694 [10] | 287 [0] |
| SWFRG032-19 | Phleum pratense | ‘Anjo’ (ILVO, BE) | 540 [0] | 0 (a) | 0 (a) |
| SWFRG047-19 | Phleum pratense | ‘Tiller’ (DLF-Trifolium, DK) | 576 [0] | 527 [5] | 584 [0] |
| SWFRG048-19 | Phleum pratense | ‘Toro’ (CRA-FLC, IT) | 0 (a) | 516 [1] | 0 (a) |
| SWFRG011-19 | Poa pratensis | ‘Likollo’ (DSV, DE) | 470 [0] | 865 [0] | 540 [0] |
| SWFRG012-19 | Poa pratensis | ‘Nixe’ (Saatszucht Steinach, DE) | 571 [0] | 868 [0] | 576 [0] |
| SWFRG027-19 | Poa pratensis | ‘Tommy’ (DLF-Trifolium, DK) | 0 (a) | 489 [0] | 0 (a) |
| SWFRG020-19 | Trifolium pratense | ‘Bonus’ (Selgen, CZ) | 549 [0] | 0 (a) | 410 [0] |
| SWFRG035-19 | Trifolium pratense | ‘Diplomat’ (DSV, DE) | 564 [0] | 556 [0] | 485 [0] |
| SWFRG036-19 | Trifolium pratense | ‘Pavo’ (Agroscope, CH) | 514 [0] | 430 [0] | 496 [0] |
| SWFRG026-19 | Trifolium repens | ‘Beaumont’ (CW 090; Barenbrug, NL) | 550 [0] | 419 [0] | 448 [0] |
| SWFRG041-19 | Trifolium repens | ‘Bombus’ (Agroscope, CH) | 579 [0] | 481 [0] | 471 [0] |
| SWFRG042-19 | Trifolium repens | ‘Hebe’ (Svalöf-Weibull, SE) | 571 [0] | 444 [0] | 447 [0] |
| SWFRG018-19 | Trisetum flavescens | ‘Gunther’ (HBLFA, AT) | 504 [0] | 859 [6] | 570 [0] |
| SWFRG033-19 | Trisetum flavescens | ‘Gunther’ (HBLFA, AT) | 575 [0] | 586 [4] | 571 [0] |
| SWFRG034-19 | Trisetum flavescens | ‘Trisett51’ (Saatszucht Steinach, DE) | 558 [0] | 887 [4] | 568 [0] |
| Total sequences | | | 43 (89.58%) | 46 (95.83%) | 41 (85.42%) |

(a) Repeatedly unsuccessful PCR

DNA barcode amplification and sequencing

The BOLD database was screened for DNA barcode sequences of the selected species and close relatives; barcodes rbcLa, matK and trnH-psbA were selected as candidates because they had the most sequences available. These DNA barcodes are located in the chloroplast genome and are not known to have paralogs that can interfere with taxonomic assignments, as is the case for some nuclear loci such as ITS [13]. Primer sequences for the three barcodes were obtained from BOLD [10] and were optimized for amplification in the target plant families (Additional file 1: Table S1). Each PCR reaction consisted of 15 ng of template DNA, 1× flexi buffer (Promega, Madison, WI, USA), 2 mM MgCl2, 200 µM dNTPs, each primer at 0.4 µM, 0.75 units of GoTaq® G2 Flexi DNA Polymerase (Promega, Madison, WI, USA) and water to a final volume of 30 µL. For rbcLa, PCR conditions were 5 min at 94 °C, followed by 33 cycles of 40 s at 94 °C, 1 min at 55 °C and 40 s at 72 °C, and a final extension step of 10 min at 72 °C. For matK and trnH-psbA, PCR conditions were 5 min at 94 °C, followed by 50 cycles of 40 s at 94 °C, 1 min at 54 °C and 40 s at 72 °C, and a final extension step of 10 min at 72 °C. The integrity of the amplicons was visually inspected by agarose gel electrophoresis (1% w/v). Amplicons were purified in a MultiScreen PCR96 filter plate (Merck, Darmstadt, Germany). Sequencing reactions were prepared with 1× BigDye™ Terminator 3.1 Reaction Mix (ThermoFisher Scientific, Waltham, MA, USA), 1× BigDye™ 3.1 Sequencing Buffer, forward or reverse primer at 0.16 µM and 800 ng of purified amplicon to a final volume of 5 µL. The same primers used for PCR were used for sequencing. Capillary electrophoresis was performed on a 3130 ABI (ThermoFisher Scientific, Waltham, MA, USA). The resulting traces were quality filtered and merged using GAP4 [16] with the default settings.
All traces and sequences were uploaded to BOLD v4 (project code: SWFRG; http://www.boldsystems.org/index.php/Public_SearchTerms).
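
As a rough planning aid, the two thermocycling programs described above can be written down as data and their programmed hold time summed. This is only a sketch: the class and field names are hypothetical, and ramp times between temperatures are ignored, so real run times will be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """Hold times in seconds as stated in the Methods; ramp times are ignored."""
    initial_denaturation_s: int
    cycles: int
    denaturation_s: int
    annealing_s: int
    annealing_temp_c: int
    extension_s: int
    final_extension_s: int

    def total_hold_seconds(self) -> int:
        # One cycle = denaturation + annealing + extension.
        per_cycle = self.denaturation_s + self.annealing_s + self.extension_s
        return (self.initial_denaturation_s
                + self.cycles * per_cycle
                + self.final_extension_s)


# rbcLa: 5 min 94 °C; 33 x (40 s 94 °C, 1 min 55 °C, 40 s 72 °C); 10 min 72 °C
RBCLA = PcrProgram(300, 33, 40, 60, 55, 40, 600)
# matK and trnH-psbA: same steps, but 50 cycles and annealing at 54 °C
MATK_TRNH_PSBA = PcrProgram(300, 50, 40, 60, 54, 40, 600)

for name, program in (("rbcLa", RBCLA), ("matK / trnH-psbA", MATK_TRNH_PSBA)):
    minutes = program.total_hold_seconds() / 60
    print(f"{name}: {minutes:.1f} min of programmed hold time")
```

The 50-cycle program used for matK and trnH-psbA holds for about 40 minutes longer than the 33-cycle rbcLa program.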

Taxonomical assignments

Sequences of matK, rbcLa and trnH-psbA were downloaded from BOLD v4 on May 23, 2019 [10]. Only sequences from the Poaceae and Fabaceae families with no contaminants and longer than 200 bp were included. In total, 6232 rbcLa, 11,971 matK and 1236 trnH-psbA sequences were present in the downloaded fasta files, which also include the plants from the BOLD project SWFRG (Additional file 1: Table S2). The taxonomical identifiers of the BOLD fasta files were reformatted to remove spaces and rearrange their informative fields in a consistent manner (fasta_name_reformat.py script from https://github.com/mloera/forage-barcoding). Each barcode-specific fasta file was then used to make a blast database, and the SWFRG sequences were queried against their corresponding database with blastn using the option -outfmt 6 (i.e., tabular format). The resulting blast output tables were parsed with the blastn_matcher.R script from the above-mentioned GitHub repository. The script removes self-hits and corrects some misspellings in the taxonomy of queries and hits. It then compares the taxonomy of the queries and hits at the species and genus level. A “match” was called when the taxonomy of a query sequence was equal to the taxonomy of the highest scoring hit or hits (Additional file 1: Table S3). A “taxonomical assignment rate” for each barcode was then calculated as the ratio between the sum of its correct taxonomical assignments and the total number of query sequences.
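
The match-calling logic described above can be sketched in a few lines of Python. This is not the blastn_matcher.R script from the repository. It is an illustrative re-implementation under stated assumptions: sequence IDs follow a hypothetical `<record>|<Genus>_<species>` layout, the "highest scoring" hits are taken to be those with the maximum bit score, and a query counts as correct only if every tied top hit carries its taxon. The column order is the default of blastn's tabular output (`-outfmt 6`); the toy rows are invented.

```python
import csv
import io
from collections import defaultdict

# Default column order of blastn "-outfmt 6" output.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def species_of(seq_id: str) -> str:
    """Assumed (hypothetical) ID layout: '<record>|<Genus>_<species>'."""
    return seq_id.split("|")[1]


def genus_of(seq_id: str) -> str:
    return species_of(seq_id).split("_")[0]


def top_hits(blast_table: str) -> dict:
    """Return, per query, the subject IDs that share the highest bit score."""
    hits = defaultdict(list)
    reader = csv.DictReader(io.StringIO(blast_table),
                            fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        if row["qseqid"] == row["sseqid"]:
            continue  # drop self-hits
        hits[row["qseqid"]].append((float(row["bitscore"]), row["sseqid"]))
    best = {}
    for query, scored in hits.items():
        top = max(score for score, _ in scored)
        best[query] = [subject for score, subject in scored if score == top]
    return best


def assignment_rate(best: dict, n_queries: int, level=species_of) -> float:
    """Percentage of queries whose taxon equals that of every top-scoring hit."""
    correct = sum(
        all(level(subject) == level(query) for subject in subjects)
        for query, subjects in best.items()
    )
    return 100.0 * correct / n_queries


def row(query: str, subject: str, bitscore: int) -> str:
    # Only qseqid, sseqid and bitscore matter here; other fields are placeholders.
    return "\t".join([query, subject, "99.0", "500", "5", "0",
                      "1", "500", "1", "500", "0.0", str(bitscore)])


# Invented toy data: three queries against a small reference set.
toy_table = "\n".join([
    row("Q1|Lolium_perenne", "Q1|Lolium_perenne", 1000),      # self-hit
    row("Q1|Lolium_perenne", "R1|Lolium_perenne", 990),
    row("Q1|Lolium_perenne", "R2|Lolium_multiflorum", 950),
    row("Q2|Festuca_pratensis", "R3|Festuca_rubra", 900),     # tied top hits
    row("Q2|Festuca_pratensis", "R4|Festuca_pratensis", 900),
    row("Q3|Poa_pratensis", "R5|Lolium_perenne", 800),
])

best = top_hits(toy_table)
species_car = assignment_rate(best, n_queries=3, level=species_of)
genus_car = assignment_rate(best, n_queries=3, level=genus_of)
print(f"species-level: {species_car:.1f}%  genus-level: {genus_car:.1f}%")
```

Passing the total number of query sequences explicitly means that queries without any hit still count in the denominator, in line with the definition of the assignment rate given above. How ties between top-scoring hits of different species are resolved is a design choice of this sketch, not something stated in the Methods.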

Results and discussion

PCR and sequencing results

The primer sequences of trnH-psbA and matK were adapted to allow for amplification within the target species, while the primer sequences of rbcLa did not need any modification (Additional file 1: Table S1). From the 48 processed specimens, 130 sequences were obtained (46 for matK, 43 for rbcLa and 41 for trnH-psbA) after repeating and optimizing failed amplifications. The size of the sequences ranged from 470 to 588 bp for rbcLa, 185 to 888 bp for matK and 268 to 614 bp for trnH-psbA (Table 1). Barcode trnH-psbA had a 100% correct assignment rate (CAR) in legumes, followed by rbcLa (93.3%) and matK (57.1%; Table 2). The highest CAR for grasses was 65.4% with trnH-psbA, followed by matK (48.4%) and rbcLa (46.4%). Overall, genus-level CARs were 69.8%, 73.3% and 90.2% for rbcLa, matK and trnH-psbA, respectively. Legumes also had the highest assignment rate at the genus level (100% correct assignments for all barcodes; Table 2), while correct assignments for grass genera were 53.6%, 61.3% and 84.6% for barcodes rbcLa, matK and trnH-psbA, respectively.
Table 2

Species- and genus-level assignment success by barcode

| Barcode | Species-level: Overall (%) | Species-level: Grasses (%) | Species-level: Legumes (%) | Genus-level: Overall (%) | Genus-level: Grasses (%) | Genus-level: Legumes (%) |
|---|---|---|---|---|---|---|
| rbcLa | 62.8 | 46.4 | 93.3 | 69.8 | 53.6 | 100.0 |
| matK | 51.1 | 48.4 | 57.1 | 73.3 | 61.3 | 100.0 |
| trnH-psbA | 78.0 | 65.4 | 100.0 | 90.2 | 84.6 | 100.0 |

The low CARs for grass DNA barcodes could be due to various factors. Some grass species, such as Poa spp., are notoriously hard to discriminate morphologically and their phylogeny is subject to controversy [17, 18]. This could have resulted in misidentified reference sequences. Another factor is the high genetic similarity between some grass taxa. For example, the genetic similarity of some species of the Festuca-Lolium complex is reported to be > 90%, as calculated from transcriptomic data of orthologous genes [19]. This may result in a higher proportion of incorrect taxonomic assignments for such grass species [20]. Barcode trnH-psbA is a good candidate for large-scale DNA barcoding of forage legumes and some grasses, such as C. cristatus, D. glomerata and T. flavescens (Table 3). However, further work is needed to produce reference sequences for more forage species and cultivars. Overall, our results provide the basic tools to implement DNA barcoding in forage species (i.e., family-specific primer pairs and a standard bioinformatic workflow for taxonomic assignments) and can help in choosing an appropriate DNA barcode for high-throughput applications. Such high-throughput applications could greatly enhance the biodiversity-monitoring protocols that are used to study the ecology of grasslands, their dynamics and their interplay with agriculture.
Table 3

Species-level taxonomic assignment success by family, query species and barcode sequence

| Family | Query species | matK | rbcLa | trnH-psbA |
|---|---|---|---|---|
| Poaceae | Alopecurus pratensis | 0/3 | 0/1 | 0/1 |
| Poaceae | Arrhenatherum elatius | 1/3 | 0/2 | 1/1 |
| Poaceae | Cynosurus cristatus | 3/3 | 2/3 | 3/3 |
| Poaceae | Dactylis glomerata | 3/3 | 2/3 | 3/3 |
| Poaceae | Festuca pratensis | 2/3 | 0/3 | 1/3 |
| Poaceae | Festuca rubra | 1/3 | 3/3 | 2/3 |
| Poaceae | Lolium multiflorum | 0/3 | 2/3 | 1/3 |
| Poaceae | Lolium perenne | 2/3 | 0/3 | 2/3 |
| Poaceae | Phleum pratense | 1/2 | 1/2 | 1/1 |
| Poaceae | Poa pratensis | 0/3 | 1/2 | 0/2 |
| Poaceae | Trisetum flavescens | 2/3 | 2/3 | 3/3 |
| Fabaceae | Lotus corniculatus | 1/3 | 3/3 | 3/3 |
| Fabaceae | Medicago sativa | 2/3 | 3/3 | 3/3 |
| Fabaceae | Onobrychis viciifolia | 2/3 | 3/3 | 3/3 |
| Fabaceae | Trifolium pratense | 2/2 | 2/3 | 3/3 |
| Fabaceae | Trifolium repens | 1/3 | 3/3 | 3/3 |

Values are correct assignments/query sequences; entries in which all queries were correctly assigned (e.g., 3/3) indicate 100% taxonomic assignment success.
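
As a worked check of how the per-species counts in Table 3 aggregate into the species-level CARs of Table 2, the short sketch below sums the trnH-psbA counts by family. The counts are copied from Table 3; the variable and function names are illustrative only.

```python
# (correct assignments, query sequences) for trnH-psbA, copied from Table 3.
TRNH_PSBA = {
    "Poaceae": {
        "Alopecurus pratensis": (0, 1),
        "Arrhenatherum elatius": (1, 1),
        "Cynosurus cristatus": (3, 3),
        "Dactylis glomerata": (3, 3),
        "Festuca pratensis": (1, 3),
        "Festuca rubra": (2, 3),
        "Lolium multiflorum": (1, 3),
        "Lolium perenne": (2, 3),
        "Phleum pratense": (1, 1),
        "Poa pratensis": (0, 2),
        "Trisetum flavescens": (3, 3),
    },
    "Fabaceae": {
        "Lotus corniculatus": (3, 3),
        "Medicago sativa": (3, 3),
        "Onobrychis viciifolia": (3, 3),
        "Trifolium pratense": (3, 3),
        "Trifolium repens": (3, 3),
    },
}


def car(counts) -> float:
    """Correct assignment rate (%) = correct assignments / query sequences."""
    counts = list(counts)
    correct = sum(c for c, _ in counts)
    total = sum(t for _, t in counts)
    return 100.0 * correct / total


grasses = car(TRNH_PSBA["Poaceae"].values())
legumes = car(TRNH_PSBA["Fabaceae"].values())
overall = car(list(TRNH_PSBA["Poaceae"].values())
              + list(TRNH_PSBA["Fabaceae"].values()))

print(f"grasses {grasses:.1f}%, legumes {legumes:.1f}%, overall {overall:.1f}%")
# grasses 65.4%, legumes 100.0%, overall 78.0%
```

These values (17/26, 15/15 and 32/41) agree with the trnH-psbA row of Table 2.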

Limitations

This is exploratory work focused on the most common forage plant species from sub-alpine temperate grasslands; further work is needed to address other forage species from different kinds of grasslands. As a proof of concept, three specimens per species were analyzed.

Additional file 1: Table S1. PCR primers used in this study. Table S2. Overview of the Barcoding of Life Datasystems (BOLD, [10]) reference barcode sequences used for taxonomical assignments. Table S3. Highest scoring blastn hits for the plant specimens of the BOLD project “SWFRG”.
  12 in total

1.  Biological identifications through DNA barcodes.

Authors:  Paul D N Hebert; Alina Cywinska; Shelley L Ball; Jeremy R deWaard
Journal:  Proc Biol Sci       Date:  2003-02-07       Impact factor: 5.349

2.  Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species.

Authors:  Paul D N Hebert; Sujeevan Ratnasingham; Jeremy R deWaard
Journal:  Proc Biol Sci       Date:  2003-08-07       Impact factor: 5.349

3.  A DNA barcode for land plants.

Authors: 
Journal:  Proc Natl Acad Sci U S A       Date:  2009-07-30       Impact factor: 11.205

Review 4.  DNA barcodes for ecology, evolution, and conservation.

Authors:  W John Kress; Carlos García-Robledo; Maria Uriarte; David L Erickson
Journal:  Trends Ecol Evol       Date:  2014-11-19       Impact factor: 17.712

5.  Genome relationships in polyploid Poa pratensis and other Poa species inferred from phylogenetic analysis of nuclear and chloroplast DNA sequences.

Authors:  Jason T Patterson; Steven R Larson; Paul G Johnson
Journal:  Genome       Date:  2005-02       Impact factor: 2.166

6.  DNA barcoding: error rates based on comprehensive sampling.

Authors:  Christopher P Meyer; Gustav Paulay
Journal:  PLoS Biol       Date:  2005-11-29       Impact factor: 8.029

7.  Comparative transcriptome analysis within the Lolium/Festuca species complex reveals high sequence conservation.

Authors:  Adrian Czaban; Sapna Sharma; Stephen L Byrne; Manuel Spannagl; Klaus F X Mayer; Torben Asp
Journal:  BMC Genomics       Date:  2015-03-28       Impact factor: 3.969

8.  Global evidence of positive biodiversity effects on spatial ecosystem stability in natural grasslands.

Authors:  Yongfan Wang; Marc W Cadotte; Yuxin Chen; Lauchlan H Fraser; Yuhua Zhang; Fengmin Huang; Shan Luo; Nayun Shi; Michel Loreau
Journal:  Nat Commun       Date:  2019-07-19       Impact factor: 14.919

9.  bold: The Barcode of Life Data System (http://www.barcodinglife.org).

Authors:  Sujeevan Ratnasingham; Paul D N Hebert
Journal:  Mol Ecol Notes       Date:  2007-05-01

10.  Potential of legume-based grassland-livestock systems in Europe: a review.

Authors:  A Lüscher; I Mueller-Harvey; J F Soussana; R M Rees; J L Peyraud
Journal:  Grass Forage Sci       Date:  2014-04-16       Impact factor: 2.630
