Jeremy R deWaard, Sujeevan Ratnasingham, Evgeny V Zakharov, Alex V Borisenko, Dirk Steinke, Angela C Telfer, Kate H J Perez, Jayme E Sones, Monica R Young, Valerie Levesque-Beaudin, Crystal N Sobel, Arusyak Abrahamyan, Kyrylo Bessonov, Gergin Blagoev, Stephanie L deWaard, Chris Ho, Natalia V Ivanova, Kara K S Layton, Liuqiong Lu, Ramya Manjunath, Jaclyn T A McKeown, Megan A Milton, Renee Miskie, Norm Monkhouse, Suresh Naik, Nadya Nikolova, Mikko Pentinsaari, Sean W J Prosser, Adriana E Radulovici, Claudia Steinke, Connor P Warne, Paul D N Hebert.
Abstract
The reliable taxonomic identification of organisms through DNA sequence data requires a well-parameterized library of curated reference sequences. However, it is estimated that just 15% of described animal species are represented in public sequence repositories. To begin to address this deficiency, we provide DNA barcodes for 1,500,003 animal specimens collected from 23 terrestrial and aquatic ecozones at sites across Canada, a nation that comprises 7% of the planet's land surface. In total, 14 phyla, 43 classes, 163 orders, 1,123 families, 6,186 genera, and 64,264 Barcode Index Numbers (BINs; a proxy for species) are represented. Species-level taxonomy was available for 38% of the specimens, but higher proportions were assigned to a genus (69.5%) and a family (99.9%). Voucher specimens and DNA extracts are archived at the Centre for Biodiversity Genomics where they are available for further research. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, and the Global Genome Biodiversity Network Data Portal.
Year: 2019 PMID: 31811161 PMCID: PMC6897906 DOI: 10.1038/s41597-019-0320-2
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 Overview of the study design to create and maintain the curated reference DNA barcode library for Canadian invertebrates.
Fig. 2 Geographic coverage of the Canadian specimen data release. (a) The sampling density for the complete dataset of 1,500,003 specimen records. (b) The sampling intensity for the 47 National Parks, National Park Reserves, and National Urban Parks of Canada. Numbers correspond to Parks in Online-only Table 1.
Fig. 3 Breakdown of the Canadian specimen data release by sampling program. N = number of records.
Summary data and BOLD datasets for each park or program in the ‘National Parks’ and ‘Other Localities’ subsets. Park numbers correspond to those in Fig. 2.
| No. | National Park (+BOLD dataset) | Province/Territory | Park Area (km²) | Sampling Year | Sampling Type | No. of Sampling Events | No. of Specimens |
|---|---|---|---|---|---|---|---|
| 1 | Vuntut National Park DS-BBVNP1 | YT | 4,345 | 2014 | Malaise | 2 | 2,007 |
| 2 | Ivvavik National Park DS-BBINP1 | YT | 10,168 | 2014 | Malaise | 8 | 5,217 |
| 3 | Kluane National Park Reserve DS-BBKLNP1 | YT | 22,013 | 2014 | Malaise/SS/General | 56 | 49,337 |
| 4 | Gwaii Haanas National Park Reserve DS-BBGHNP1 | BC | 1,470 | 2014 | Malaise | 11 | 854 |
| 5 | Nááts’ihch’oh National Park Reserve N/A | NT | 4,850 | N/A | N/A | N/A | N/A |
| 6 | Nahanni National Park Reserve DS-BBNNP1 | NT | 30,050 | 2014 | Malaise | 7 | 17,464 |
| 7 | Pacific Rim National Park Reserve DATASET-BBPRNP1 | BC | 511 | 2012, 2014 | Malaise/SS/General | 252 | 14,463 |
| 8 | Gulf Islands National Park Reserve DS-BBGINP1 | BC | 33 | 2012, 2014 | Malaise/SS/General | 90 | 30,005 |
| 9 | Tuktut Nogait National Park DS-BBTNNP1 | NT | 18,100 | 2014 | Malaise | 2 | 10,971 |
| 10 | Aulavik National Park DS-BBALNP1 | NT | 12,200 | 2014 | Malaise | 2 | 2,225 |
| 11 | Mount Revelstoke National Park DATASET-BBMRNP1 | BC | 260 | 2012, 2014 | Malaise/General | 126 | 18,857 |
| 12 | Jasper National Park DATASET-BBJNP1 | AB | 10,878 | 2012 | Malaise/SS/General | 483 | 48,405 |
| 13 | Glacier National Park DATASET-BBGCNP1 | BC | 1,349 | 2012, 2014 | Malaise/SS/General | 131 | 23,617 |
| 14 | Yoho National Park DATASET-BBYNP1 | BC | 1,313 | 2014 | Malaise | 173 | 14,143 |
| 15 | Kootenay National Park DATASET-BBKTNP1 | BC | 1,406 | 2014 | Malaise/SS/General | 171 | 15,622 |
| 16 | Banff National Park DATASET-BBBNP1 | AB | 6,641 | 2012, 2014 | Malaise/SS/General | 303 | 45,842 |
| 17 | Waterton Lakes National Park DATASET-BBWLNP1 | AB | 505 | 2012 | Malaise/SS/General | 445 | 53,095 |
| 18 | Wood Buffalo National Park DS-BBWBNP1 | AB/NT | 44,807 | 2012 | Malaise/General | 26 | 8,137 |
| 19 | Elk Island National Park DATASET-BBEINP1 | AB | 194 | 2012 | Malaise/SS/General | 267 | 48,640 |
| 20 | Grasslands National Park DATASET-BBGNP1 | SK | 907 | 2012, 2014 | Malaise/SS/General | 99 | 39,703 |
| 21 | Prince Albert National Park DATASET-BBPANP1 | SK | 3,874 | 2012 | Malaise/SS/General | 395 | 41,142 |
| 22 | Qausuittuq National Park N/A | NU | 11,000 | N/A | N/A | N/A | N/A |
| 23 | Riding Mountain National Park DATASET-BBRMNP1 | MB | 2,969 | 2012 | Malaise | 312 | 18,195 |
| 24 | Wapusk National Park DS-BBWNP1 | MB | 11,475 | 2014 | Malaise | 8 | 20,793 |
| 25 | Ukkusiksalik National Park N/A | NU | 20,500 | N/A | N/A | N/A | N/A |
| 26 | Pukaskwa National Park DATASET-BBPNP1 | ON | 1,878 | 2013 | Malaise | 188 | 22,715 |
| 27 | Point Pelee National Park DATASET-BBPPNP1 | ON | 15 | 2012, 2014 | Malaise/SS/General | 582 | 32,288 |
| 28 | Bruce Peninsula National Park DATASET-BBBPNP1 | ON | 154 | 2012, 2014 | Malaise/SS/General | 126 | 8,675 |
| 29 | Sirmilik National Park DS-BBSNP1 | NU | 22,200 | 2014 | Malaise | 2 | 1,096 |
| 30 | Georgian Bay Islands National Park DS-BBGBNP1 | ON | 13.5 | 2013, 2014 | Malaise/SS/General | 16 | 20,920 |
| 31 | Rouge National Urban Park DS-BBRNUP1 | ON | 79 | 2013, 2014 | Malaise/SS/General | 352 | 51,162 |
| 32 | Thousand Islands National Park DS-BBTINP1 | ON | 24.4 | 2012, 2014 | Malaise/SS/General | 31 | 36,165 |
| 33 | La Mauricie National Park DS-BBLMNP1 | QC | 536 | 2013 | Malaise | 19 | 20,433 |
| 34 | Quttinirpaaq National Park DS-BBQNP1 | NU | 37,775 | 2014 | Malaise | 10 | 9,065 |
| 35 | Auyuittuq National Park DS-BBAYNP1 | NU | 19,089 | 2014 | Malaise | 1 | 715 |
| 36 | Kejimkujik National Park DATASET-BBKJNP1 | NS | 404 | 2013 | Malaise/SS/General | 209 | 32,506 |
| 37 | Fundy National Park DATASET-BBFNP1 | NB | 207 | 2013 | Malaise/SS/General | 169 | 28,815 |
| 38 | Kouchibouguac National Park DATASET-BBKCNP1 | NB | 238 | 2013 | Malaise/SS/General | 183 | 21,420 |
| 39 | Forillon National Park DS-BBFONP1 | QC | 244 | 2013 | Malaise | 21 | 27,319 |
| 40 | Torngat Mountains National Park DS-BBTMNP1 | NL | 9,700 | 2013, 2014 | Malaise | 9 | 19,880 |
| 41 | Mingan Archipelago National Park Reserve DS-BBMANP1 | QC | 151 | 2013 | Malaise | 17 | 16,978 |
| 42 | Prince Edward Island National Park DS-BBPEINP1 | PE | 22 | 2013 | Malaise/SS/General | 69 | 24,443 |
| 43 | Cape Breton Highlands National Park DATASET-BBCBNP1 | NS | 949 | 2013 | Malaise/SS/General | 273 | 27,103 |
| 44 | Sable Island National Park Reserve DS-BBSINP1 | NS | 30 | 2014 | Malaise | 173 | 13,020 |
| 45 | Mealy Mountains National Park Reserve N/A | NL | 10,700 | N/A | N/A | N/A | N/A |
| 46 | Gros Morne National Park DATASET-BBGMNP1 | NL | 1,805 | 2013 | Malaise/SS/General | 177 | 40,234 |
| 47 | Terra Nova National Park DATASET-BBTNNP1 | NL | 400 | 2013 | Malaise/SS/General | 194 | 18,484 |
| — | Global Malaise Canada DS-GMPC1, DS-GMPC2 | CAN | N/A | 2012–2017 | Malaise | 151 | 166,835 |
| — | School Malaise Trap Program DS-SMTPC | CAN | N/A | 2013–2017 | Malaise | 407 | 93,378 |
| — | ATBIs and bioblitzes DS-ATBIB | ON/MB | N/A | 2006–2017 | Malaise/SS/General | 18,453 | 83,277 |
| — | Other collections DS-OLOCC1, DS-OLOCC2 | CAN | N/A | 2006–2017 | Malaise/SS/General | 28,829 | 154,343 |
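As a quick consistency check on the table above, the sketch below sums the four 'Other Localities' programs, subtracts that from the 1,500,003 records quoted in the abstract, and computes specimens per km² for three parks. The variable names and the choice of parks are illustrative assumptions; only the numbers come from the table and abstract.

```python
# Specimen counts for the four 'Other Localities' programs (copied from the table above).
other_localities = {
    "Global Malaise Canada": 166_835,
    "School Malaise Trap Program": 93_378,
    "ATBIs and bioblitzes": 83_277,
    "Other collections": 154_343,
}

TOTAL_RECORDS = 1_500_003  # total specimen records stated in the abstract

other_total = sum(other_localities.values())
national_parks_total = TOTAL_RECORDS - other_total

# (park area in km^2, number of specimens) for three parks picked for illustration.
parks = {
    "Point Pelee": (15, 32_288),
    "Kluane": (22_013, 49_337),
    "Wood Buffalo": (44_807, 8_137),
}

# Specimens per km^2: a crude density that ignores trap placement and sampling effort.
density = {name: specimens / area for name, (area, specimens) in parks.items()}

print(f"Other Localities: {other_total:,}")
print(f"National Parks:   {national_parks_total:,}")
for name, value in density.items():
    print(f"{name}: {value:.2f} specimens per km^2")
```

The two subset totals computed this way (497,833 and 1,002,170) match the subset totals reported in the summary table by major taxon. The density figures are only a rough indicator, since park area says nothing about how many traps were deployed or for how long.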
Summary data by major taxon represented within the dataset.
| Subset | Taxon | Specimens | BINs | No. of Families | No. of Named Species | Specimens to Family (%) | Specimens to Species (%) |
|---|---|---|---|---|---|---|---|
| National Parks | Total | 1,002,170 | 49,501 | 818 | 10,563 | 100 | 35 |
| National Parks | Araneae | 26,440 | 1,136 | 34 | 808 | 100 | 100 |
| National Parks | Acari | 34,776 | 3,449 | 155 | 102 | 99 | 5 |
| National Parks | Collembola | 30,723 | 781 | 16 | 54 | 100 | 45 |
| National Parks | Coleoptera | 37,013 | 2,872 | 89 | 1,828 | 100 | 78 |
| National Parks | Diptera | 616,492 | 23,330 | 101 | 2,593 | 100 | 28 |
| National Parks | Hemiptera | 47,233 | 1,809 | 59 | 907 | 100 | 58 |
| National Parks | Hymenoptera | 129,469 | 11,372 | 62 | 1,173 | 100 | 17 |
| National Parks | Lepidoptera | 49,967 | 3,003 | 71 | 2,233 | 100 | 82 |
| National Parks | Other insects | 24,964 | 1,195 | 95 | 644 | 100 | 59 |
| National Parks | Other invertebrates | 5,093 | 554 | 137 | 221 | 99 | 70 |
| Other Localities | Total | 497,833 | 36,094 | 1,038 | 10,548 | 100 | 44 |
| Other Localities | Araneae | 14,414 | 972 | 34 | 788 | 100 | 99 |
| Other Localities | Acari | 22,221 | 3,127 | 160 | 153 | 99 | 16 |
| Other Localities | Collembola | 14,094 | 562 | 18 | 78 | 100 | 61 |
| Other Localities | Coleoptera | 18,551 | 2,109 | 84 | 1,433 | 100 | 77 |
| Other Localities | Diptera | 258,723 | 13,442 | 100 | 2,289 | 100 | 35 |
| Other Localities | Hemiptera | 25,672 | 1,553 | 63 | 819 | 100 | 58 |
| Other Localities | Hymenoptera | 73,992 | 8,605 | 59 | 1,207 | 100 | 22 |
| Other Localities | Lepidoptera | 41,289 | 2,943 | 68 | 2,402 | 100 | 87 |
| Other Localities | Other insects | 19,452 | 1,242 | 110 | 685 | 100 | 76 |
| Other Localities | Other invertebrates | 9,425 | 1,539 | 342 | 694 | 98 | 63 |
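The figures in this table can be cross-checked with a little arithmetic. The sketch below, with illustrative variable names, confirms that the per-taxon specimen counts of the larger subset add up to its total, computes the Diptera share, and estimates how many BINs the two subsets have in common. The overlap estimate rests on an assumption not stated in the source: that every one of the 64,264 BINs in the release occurs in at least one of the two subsets, so that inclusion-exclusion applies.

```python
# Per-taxon specimen counts for the subset with 1,002,170 records (from the table above).
specimens_by_taxon = {
    "Araneae": 26_440,
    "Acari": 34_776,
    "Collembola": 30_723,
    "Coleoptera": 37_013,
    "Diptera": 616_492,
    "Hemiptera": 47_233,
    "Hymenoptera": 129_469,
    "Lepidoptera": 49_967,
    "Other insects": 24_964,
    "Other invertebrates": 5_093,
}
subset_total = 1_002_170

# The taxon rows should account for every specimen in the subset.
taxon_sum = sum(specimens_by_taxon.values())

# Fraction of the subset's specimens that are Diptera.
diptera_share = specimens_by_taxon["Diptera"] / subset_total

# BIN totals for the two subsets and for the whole release (abstract).
bins_first_subset = 49_501
bins_second_subset = 36_094
bins_whole_release = 64_264

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|.
# ASSUMPTION: the union of the two subsets covers all BINs in the release.
shared_bins = bins_first_subset + bins_second_subset - bins_whole_release

print(f"Taxon rows sum to subset total: {taxon_sum == subset_total}")
print(f"Diptera share of specimens: {diptera_share:.1%}")
print(f"BINs shared by both subsets (under the stated assumption): {shared_bins:,}")
```

Specimen counts can be summed across subsets because each specimen belongs to exactly one subset. BIN counts cannot, because the same BIN may be collected in both, which is why the two subset totals add up to more than 64,264.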
Fig. 4 Specimen and BIN summaries for the two subsets in this data release. (a) ‘National Parks’ subset. Numbers correspond to those in Fig. 2 ranked by BIN count, and (b) ‘Other Localities’ subset. Numbers in parentheses correspond to Parks in Fig. 2. ‘Unique BINs’ refer to BINs collected only in that specific national park or collecting program.
| Field | Value |
|---|---|
| Measurement(s) | DNA • digital imaging |
| Technology Type(s) | taxonomic diversity assessment by targeted gene survey • digital curation |
| Factor Type(s) | species • geographic location • habitat |
| Sample Characteristic - Organism | Metazoa |
| Sample Characteristic - Environment | terrestrial biome • freshwater biome • marine biome |
| Sample Characteristic - Location | Canada |