Pierre Taberlet, Eric Coissac, François Pompanon, Ludovic Gielly, Christian Miquel, Alice Valentini, Thierry Vermat, Gérard Corthier, Christian Brochmann, Eske Willerslev.
Abstract
DNA barcoding should provide rapid, accurate and automatable species identifications by using a standardized DNA region as a tag. Based on sequences available in GenBank and sequences produced for this study, we evaluated the resolution power of the whole chloroplast trnL (UAA) intron (254-767 bp) and of a shorter fragment of this intron (the P6 loop, 10-143 bp) amplified with highly conserved primers. The main limitation of the whole trnL intron for DNA barcoding remains its relatively low resolution (67.3% of the species from GenBank unambiguously identified). The resolution of the P6 loop is lower (19.5% identified) but remains higher than those of existing alternative systems. The resolution is much higher in specific contexts such as species originating from a single ecosystem, or commonly eaten plants. Despite the relatively low resolution, the whole trnL intron and its P6 loop have many advantages: the primers are highly conserved, and the amplification system is very robust. The P6 loop can even be amplified when using highly degraded DNA from processed food or from permafrost samples, and has the potential to be extensively used in the food industry, in forensic science, in diet analyses based on feces and in ancient DNA studies.
Year: 2006 PMID: 17169982 PMCID: PMC1807943 DOI: 10.1093/nar/gkl938
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Position of the primers c, d, g and h on the chloroplast trnL (UAA) gene. The P6 loop amplified with primers g and h is indicated in green.
Sequences of the two universal primer pairs amplifying the trnL (UAA) intron
| Name | Code | Sequence 5′–3′ |
|---|---|---|
| c | A49325 | CGAAATCGGTAGACGCTACG |
| d | B49863 | GGGGATAGAGGGACTTGAAC |
| g | A49425 | GGGCAATCCTGAGCCAA |
| h | B49466 | CCATTGAGTCTCTGCACCTATC |
Length of the amplified fragment with primers c–d in tobacco: 456 bp. Length of the amplified fragment with primers g–h in tobacco: 40 bp. The code denotes the 3′-most base pairs in the published tobacco cpDNA sequence (23). Primers c and d are from Taberlet et al. (19). Primers g and h were designed for this study (French patent no. 2 876 378; April 14, 2006).
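The fragment lengths quoted above are measured between the priming sites. As a rough illustration of how such a fragment could be located in silico, the sketch below searches a template for primer g and for the reverse complement of primer h, and returns what lies between them. The function names, the exact-match search and the synthetic template are assumptions made for this example; they are not the authors' pipeline, and real data would need mismatch-tolerant matching.

```python
# Primer sequences g and h as listed in the primer table (5'->3').
PRIMER_G = "GGGCAATCCTGAGCCAA"
PRIMER_H = "CCATTGAGTCTCTGCACCTATC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def extract_amplicon(template, forward=PRIMER_G, reverse=PRIMER_H):
    """Return the region between the two priming sites, primers excluded.

    Hypothetical helper: exact matching on the given strand only.
    Returns None if either priming site is not found.
    """
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    start += len(forward)
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we look for its reverse complement downstream of primer g.
    end = template.find(reverse_complement(reverse), start)
    if end == -1:
        return None
    return template[start:end]


# Synthetic template (an assumption for illustration): arbitrary flanks
# around the potato P6 loop sequence reported in the food-plant table.
potato_p6 = "ATCCTGTTTTCTGAAAACAAACAAAGGTTCAGAAAAAAAG"
template = "TTTT" + PRIMER_G + potato_p6 + reverse_complement(PRIMER_H) + "AAAA"

amplicon = extract_amplicon(template)
print(amplicon, len(amplicon))
```

Exact string matching keeps the sketch short; it would miss templates whose priming sites carry the sequence variants summarized in the priming-site table.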
Figure 2. Positions of the primers c and d on the secondary structure of the trnL (UAA) exon (A) and of the primers g and h on the secondary structure of the trnL (UAA) intron (B) for Nymphaea odorata [modified from Ref. (33)]. Highly conserved elements of the catalytic core (P, Q, R1, R2 and S) are located in grey boxes. The P6 loop, amplified with primers g and h, is identified by green letters. The 3′ ends of each of the four primers c, d, g and h are marked out by an arrow and their positions are identified by red letters.
Sequence variation of the priming sites for primers c, d, g and h
Only variants at a frequency higher than 0.005 are indicated. A total of 1014 and 14 145 GenBank entries were used for the primer pairs c–d and g–h, respectively. %: percentage of sequence variants found in GenBank. Species: Example of species corresponding to the sequence variant. Acc. no.: accession number in GenBank.
Percentages of species, genera and families identified using the chloroplast trnL (UAA) intron and the P6 loop of this intron, and comparison with another primer pair
| cpDNA gene and dataset | Length variation (bp) (a) | No. of species/genera/families analyzed (b) | Species (%) | Genus (%) | Family (%) |
|---|---|---|---|---|---|
| Chloroplast trnL (UAA) intron | 254–767 | 706/366/119 | 67.28 | 86.34 | 100.00 |
| Chloroplast trnL (UAA) intron | 355–653 | 103/47/24 | 85.44 | 100.00 | 100.00 |
| P6 loop of trnL (UAA) intron | 10–143 | 11 404/4225/310 | 19.48 | 41.40 | 79.35 |
| P6 loop of trnL (UAA) intron | 22–83 | 106/48/25 | 47.17 | 89.58 | 100.00 |
| P6 loop of trnL (UAA) intron | 22–65 | 72/64/37 | 77.78 | 87.50 | 100.00 |
| P6 loop of trnL (UAA) intron | 10–127 | 1524/1525/244 | 24.02 | 59.48 | 90.57 |
| | 91–98 | 1524/1525/244 | 15.09 | 37.51 | 68.03 |
Note that these estimates were made by taking into account genera with more than two species for the species identification, families with more than two genera for genus identification, and orders with more than two families for family identification.
(a) Length in base pairs, excluding primers.
(b) Excluding families with a single genus, genera with a single species and species alone in a genus, except for the food dataset.
(c) Based on species in common between the g–h and the h1aF–h2aR datasets.
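To make the percentages in the table easier to interpret, here is a minimal sketch of one way a resolution figure could be computed at each taxonomic level. It assumes that a taxon counts as identified when none of its sequences also occurs in another taxon of the same level; this definition, the function name, the record layout and the toy records are all assumptions for illustration, and the filters described in the note and footnotes above are deliberately left out.

```python
from collections import defaultdict


def percent_identified(records, level):
    """Percentage of taxa at `level` whose sequences are not shared with
    any other taxon of the same level (assumed definition, see lead-in).

    `records` is a list of dicts with keys "species", "genus", "family"
    and "seq"; `level` is one of the first three keys.
    """
    if not records:
        raise ValueError("no records to analyse")

    # Map each distinct sequence to the set of taxa that carry it.
    taxa_by_seq = defaultdict(set)
    for rec in records:
        taxa_by_seq[rec["seq"]].add(rec[level])

    all_taxa = {rec[level] for rec in records}
    ambiguous = set()
    for taxa in taxa_by_seq.values():
        if len(taxa) > 1:  # sequence shared between several taxa
            ambiguous |= taxa

    return 100.0 * (len(all_taxa) - len(ambiguous)) / len(all_taxa)


# Hypothetical toy dataset: names and sequences are invented.
records = [
    {"species": "sp_a1", "genus": "A", "family": "X", "seq": "AAAA"},
    {"species": "sp_a2", "genus": "A", "family": "X", "seq": "AAAA"},
    {"species": "sp_b1", "genus": "B", "family": "X", "seq": "AAAT"},
    {"species": "sp_b2", "genus": "B", "family": "X", "seq": "AACT"},
    {"species": "sp_c1", "genus": "C", "family": "Y", "seq": "CCCC"},
    {"species": "sp_d1", "genus": "D", "family": "X", "seq": "AAAT"},
]

species_pct = percent_identified(records, "species")
genus_pct = percent_identified(records, "genus")
family_pct = percent_identified(records, "family")
print(round(species_pct, 2), genus_pct, family_pct)
```

In this toy dataset the resolution rises from species to genus to family, the same direction as in every row of the table, because sequences shared within a genus or family stop counting as ambiguous at the higher level.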
Example of P6 loop [trnL (UAA)] sequences of commonly eaten plant species amplified with primers g and h
| Common name | Scientific name | P6 loop sequence amplified with primers g and h | Acc. no. |
|---|---|---|---|
| Cacao | | ATCCTATTATTTTATTATTTTACGAAACTAAACAAAGGTTCAGCAAGCGAGAATAATAAAAAAAG | EF010969 |
| Beet | | CTCCTTTTTTCAAAAGAAAAAAAATAAGGATTCCGAAAACAAGAATAAAAAAAAAG | EF010967 |
| Sugarcane | | ATCCCCTTTTTTGAAAAAACAAGTGGTTCTCAAACTAGAACCCAAAGGAAAAG | AY116253 |
| Wheat | | ATCCGTGTTTTGAGAAAACAAGGGGTTCTCGAACTAGAATACAAAGGAAAAG | AB042240 |
| Rye | | ATCCGTGTTTTGAGAAAACAAGGGGTTCTCGAACTAGAATACAAAGGAAAAG | AF519162 |
| Rice | | ATCCATGTTTTGAGAAAACAAGCGGTTCTCGAACTAGAACCCAAAGGAAAAG | X15901 |
| Millet | | ATCCCTTTTTTGAAAAAACAAGTGGTTCTCAAACTAGAACCCAAAGGAAAAG | AY142738 |
| Strawberry | | ATCCCGTTTTATGAAAACAAACAAGGGTTTCAGAAAGCGAGAATAAATAAAG | EF010971 |
| Apricot | | ATCCTGTTTTATTAAAACAAACAAGGGTTTCATAAACCGAGAATAAAAAAG | EF010968 |
| Sour cherry | | ATCCTGTTTTATTAAAACAAACAAGGGTTTCATAAACCGAGAATAAAAAAG | EF010970 |
| Maize | | ATCCCTTTTTTGAAAAACAAGTGGTTCTCAAACTAGAACCCAAAGGAAAAG | NC_001666 |
| Garden pea | | ATCCTTCTTTCTGAAAACAAATAAAAGTTCAGAAAGTGAAAATCAAAAAAG | EF010972 |
| Common bean | | ATCCCGTTTTCTGAAAAAAAGAAAAATTCAGAAAGTGATAATAAAAAAGG | AY077945 |
| Johnson grass | | ATCCACTTTTTTCAAAAAAGTGGTTCTCAAACTAGAACCCAAAGGAAAAG | AY116244 |
| Lettuce | | ATCACGTTTTCCGAAAACAAACAACGGTTCAGAAAGCGAAAATCAAAAAG | U82042 |
| Sunflower | | ATCACGTTTTCCGAAAACAAACAAAGGTTCAGAAAGCGAAAATAAAAAAG | U82038 |
| Wild oat | | ATCCGTGTTTTGAGAGGGGGGTTCTCGAACTAGAATACAAAGGAAAAG | X75695 |
| Barley | | ATCCGTGTTTTGAGAAGGGATTCTCGAACTAGAATACAAAGGAAAAG | X74574 |
| Potato | | ATCCTGTTTTCTGAAAACAAACAAAGGTTCAGAAAAAAAG | EF010973 |
| Tomato | | ATCCTGTTTTCTGAAAACAAACCAAGGTTCAGAAAAAAAG | AY098703 |
| Egg plant | | ATCCTGTTTTCTCAAAACAAACAAAGGTTCAGAAAAAAAG | AY266240 |
| Radish | | ATCCTGAGTTACGCGAACAAACCAGAGTTTAGAAAGCGG | AF451576 |
| Cabbage | | ATCCTGGGTTACGCGAACAAAACAGAGTTTAGAAAGCGG | AF451574 |
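Some entries in this table carry identical P6 loop sequences and therefore cannot be told apart with this marker alone. The sketch below groups a subset of the rows by sequence to show how such cases could be detected. The sequences are copied from the table; the variable and function names are assumptions for illustration.

```python
from collections import defaultdict

# Subset of the table above: common name -> P6 loop sequence.
p6 = {
    "Wheat": "ATCCGTGTTTTGAGAAAACAAGGGGTTCTCGAACTAGAATACAAAGGAAAAG",
    "Rye": "ATCCGTGTTTTGAGAAAACAAGGGGTTCTCGAACTAGAATACAAAGGAAAAG",
    "Rice": "ATCCATGTTTTGAGAAAACAAGCGGTTCTCGAACTAGAACCCAAAGGAAAAG",
    "Apricot": "ATCCTGTTTTATTAAAACAAACAAGGGTTTCATAAACCGAGAATAAAAAAG",
    "Sour cherry": "ATCCTGTTTTATTAAAACAAACAAGGGTTTCATAAACCGAGAATAAAAAAG",
    "Potato": "ATCCTGTTTTCTGAAAACAAACAAAGGTTCAGAAAAAAAG",
    "Tomato": "ATCCTGTTTTCTGAAAACAAACCAAGGTTCAGAAAAAAAG",
    "Egg plant": "ATCCTGTTTTCTCAAAACAAACAAAGGTTCAGAAAAAAAG",
}


def group_by_sequence(seqs):
    """Map each distinct sequence to the list of names that carry it."""
    groups = defaultdict(list)
    for name, seq in seqs.items():
        groups[seq].append(name)
    return groups


groups = group_by_sequence(p6)

# Names that share a sequence with at least one other entry.
shared = sorted(sorted(names) for names in groups.values() if len(names) > 1)
# Names whose sequence is unique within this subset.
unique = sorted(names[0] for names in groups.values() if len(names) == 1)

print("Indistinguishable groups:", shared)
print("Unambiguous within this subset:", unique)
```

Within this subset, wheat and rye share one sequence and apricot and sour cherry share another, whereas potato, tomato and egg plant are separated by single substitutions.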
Figure 3. Example of multi-peak profiles obtained after capillary electrophoresis of the fluorescent PCR products obtained using the g and h primers. (A) Permafrost sample drilled from Main River Ice Bluff (N.E. Siberia, 64.06N, 171.11E), between 21 050 and 25 440 years old (uncalibrated 14C years, based on AMS dating of plant macrofossils from the section); g fluorescent primer; each peak represents at least one arctic plant species. (B) Human feces sample; h fluorescent primer; three of the four main peaks have been identified after cloning and sequencing: peak 1, unidentified; peak 2, banana (Musa acuminata); peak 3, lettuce (Lactuca sativa); and peak 4, cacao (Theobroma cacao).
Sequences obtained after cloning the PCR product from the lyophilized potage
| Sequence obtained 5′–3′ | Species | Number of clones |
|---|---|---|
| ATCTTTATTTTTTGAAAAACAAGGGTTTAAAAAAGAGAATAAAAAAG | Leek | 19 |
| ATCCTGTTTTCTGAAAACAAACAAAGGTTCAGAAAAAAAG | Potato | 3 |
| ATCTTTCTTTTTTGAAAAACAAGGGTTTAAAAAAGAGAATAAAAAAG | Onion | 1 |
Note that onion and leek belong to the same genus Allium, and that their sequences differ by a single substitution.
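The single substitution mentioned in the note can be checked by comparing the two sequences position by position. The sketch below does this with a simple mismatch count; the function name is an assumption, and the approach only applies to sequences of equal length, since no alignment is attempted.

```python
def substitutions(a, b):
    """List (1-based position, base in a, base in b) for every mismatch.

    Only valid for equal-length sequences; no alignment is performed.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Sequences copied from the clone table above.
leek = "ATCTTTATTTTTTGAAAAACAAGGGTTTAAAAAAGAGAATAAAAAAG"
onion = "ATCTTTCTTTTTTGAAAAACAAGGGTTTAAAAAAGAGAATAAAAAAG"

diffs = substitutions(leek, onion)
print(diffs)
```

The comparison yields one mismatch, at position 7 (A in leek, C in onion). Sequences of different lengths, such as potato against leek, would need a proper alignment before counting differences.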