Mykle L Hoban, Jonathan Whitney, Allen G Collins, Christopher Meyer, Katherine R Murphy, Abigail J Reft, Katherine E Bemis.
Abstract
DNA barcoding is critical to conservation and biodiversity research, yet public reference databases are incomplete. Existing barcode databases are biased toward cytochrome oxidase subunit I (COI) and frequently lack associated voucher specimens or geospatial metadata, which can hinder reliable species assignments. The emergence of metabarcoding approaches such as environmental DNA (eDNA) has necessitated multiple-marker techniques combined with barcode reference databases backed by voucher specimens. Reference barcodes have traditionally been generated by Sanger sequencing; however, sequencing multiple markers is costly for large numbers of specimens, requires multiple separate PCR reactions, and limits the resulting sequences to targeted regions. High-throughput sequencing techniques such as genome skimming enable assembly of complete mitogenomes, which contain the most commonly used barcoding loci (e.g., COI, 12S, 16S), as well as nuclear ribosomal repeat regions (e.g., ITS1&2, 18S). We evaluated the feasibility of genome skimming to generate barcode reference databases for marine fishes by assembling complete mitogenomes and nuclear ribosomal repeats. We tested genome skimming across a taxonomically diverse selection of 12 marine fish species from the collections of the National Museum of Natural History, Smithsonian Institution. We generated two sequencing libraries per species to test the impact of shearing method (enzymatic or mechanical), extraction method (kit-based or automated), and input DNA concentration. We produced complete mitogenomes for all non-chondrichthyans (11/12 species) and assembled nuclear ribosomal repeats (18S-ITS1-5.8S-ITS2-28S) for all taxa. The quality and completeness of mitogenome assemblies were not affected by shearing method, extraction method, or input DNA concentration.
Our results reaffirm that genome skimming is an efficient and (at scale) cost-effective method for generating all mitochondrial and common nuclear DNA barcoding loci for multiple species simultaneously. The approach has great potential to scale for future projects and to help complete barcode reference databases for marine fishes.
©2022 Hoban et al.
Keywords: Collections; DNA barcoding; Fishes; Genome skimming; Metabarcoding; Mitochondrial genomes; Museum
Year: 2022 PMID: 35959477 PMCID: PMC9359134 DOI: 10.7717/peerj.13790
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 3.061
Figure 1. Species included in this MiSeq-based pilot study.
(A) Gymnura altavela, Spiny Butterfly Ray, length unknown. (B) Gymnothorax fimbriatus, Fimbriated Moray, USNM 395396, 850 mm TL. (C) Gymnothorax undulatus, Undulated Moray, USNM 442319, 132 mm TL. (D) Saurida nebulosa, Clouded Lizardfish, USNM 442473, 56.2 mm SL. (E) Brosme brosme, Cusk, length unknown. (F) Myripristis vittata, Whitetip Soldierfish, USNM 411102, 120.1 mm SL. (G) Neoniphon sammara, Sammara Squirrelfish, USNM 442483, 130 mm SL. (H) Tylosurus crocodilus, Houndfish, USNM 442362, 13.6 mm SL. (I) Scomberoides lysan, Doublespotted Queenfish, USNM 442297, 22.3 mm SL. (J) Forcipiger flavissimus, Longnose Butterflyfish, USNM 411089, 129.1 mm SL. (K) Ostracion whitleyi, Whitley’s Boxfish, USNM 411029, 81.2 mm SL. (L) Canthigaster amboinensis, Ambon Toby, USNM 442417, 64 mm SL. All photographs except A and E are the individuals for which we sequenced the mitogenome. Photographs A and E by Donald D. Flescher, NOAA; photographs B, F, J, and K by Jeff Williams, NMNH; and photographs C, D, G, H, I, and L by Diane Pitassy, NMNH.
Summary of species and museum specimens included in this study.
Species in this and subsequent tables are arranged alphabetically by taxonomic order, family, and scientific name, with the chondrichthyan presented separately.
| Species | Order | Family | Extraction method | Genome size estimate (Gb) | USNM catalog no. | Collection date |
|---|---|---|---|---|---|---|
| Gymnura altavela | Myliobatiformes | Gymnuridae | AutoGen | 1.80 | 433343 | 11 Sep. 2006 |
| Gymnothorax fimbriatus | Anguilliformes | Muraenidae | BioSprint | 2.31 | 395396 | 15 Oct. 2008 |
| Gymnothorax undulatus | Anguilliformes | Muraenidae | AutoGen | 2.31 | 442319 | 26 May 2017 |
| Saurida nebulosa | Aulopiformes | Synodontidae | AutoGen | 1.53 | 442473 | 1 Jun. 2017 |
| Tylosurus crocodilus | Beloniformes | Belonidae | AutoGen | 1.00 | 442362 | 28 May 2017 |
| Myripristis vittata | Beryciformes | Holocentridae | BioSprint | 0.90 | 411102 | 16 Oct. 2008 |
| Neoniphon sammara | Beryciformes | Holocentridae | AutoGen | 0.80 | 442483 | 31 May 2017 |
| Brosme brosme | Gadiformes | Lotidae | AutoGen | 0.41 | 433199 | 20 Apr. 2008 |
| Scomberoides lysan | Perciformes | Carangidae | AutoGen | 0.73 | 442297 | 25 May 2017 |
| Forcipiger flavissimus | Perciformes | Chaetodontidae | BioSprint | 0.72 | 411089 | 17 Oct. 2008 |
| Ostracion whitleyi | Tetraodontiformes | Ostraciidae | BioSprint | 0.98 | 411029 | 15 Oct. 2008 |
| Canthigaster amboinensis | Tetraodontiformes | Tetraodontidae | AutoGen | 0.41 | 442417 | 30 May 2017 |
Notes.
Genome size estimates were available for this exact species on NCBI and/or genomesize.com.
Genome size estimates were calculated based on an average of available congeners or confamilials on NCBI and/or genomesize.com.
Genome size estimate for this species was based on an average of members of Batoidea available on NCBI and/or genomesize.com.
GenBank accession numbers for assembled mitogenomes and ribosomal repeat regions.
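| Species | Mitogenome length (bp) |
|---|---|
| Gymnura altavela | 19,022 |
| Gymnothorax fimbriatus | 16,567 |
| Gymnothorax undulatus | 16,566 |
| Saurida nebulosa | 16,717 |
| Tylosurus crocodilus | 16,533 |
| Myripristis vittata | 16,520 |
| Neoniphon sammara | 16,743 |
| Brosme brosme | 16,483 |
| Scomberoides lysan | 16,767 |
| Forcipiger flavissimus | 16,600 |
| Ostracion whitleyi | 16,461 |
| Canthigaster amboinensis | 16,444 |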
Notes.
Based on nearly-complete mitogenome assembly.
Library quantification and sequencing results; values shown are for both shearing methods (mechanical; enzymatic).
| Species | Input DNA concentration | Library fragment size (bp) | Library concentration | Total reads | Estimated nuclear genome coverage (×) | Mitochondrial reads | Mitochondrial reads (%) | Mitogenome coverage (×) |
|---|---|---|---|---|---|---|---|---|
| Gymnura altavela | 170 | 318; 326 | 2.50; 1.98 | 2,193,690; 2,224,022 | 0.18; 0.19 | 201; 1,141 | 0.01; 0.05 | 1.6; 8.9 |
| Gymnothorax fimbriatus | 78 | 353; 370 | 0.498; 1.31 | 1,522,912; 1,809,632 | 0.10; 0.12 | 2,336; 2,647 | 0.15; 0.15 | 20.7; 23.0 |
| Gymnothorax undulatus | 51 | 356; 379 | 0.984; 2.82 | 2,146,906; 5,168,856 | 0.14; 0.34 | 984; 2,245 | 0.05; 0.04 | 8.7; 19.5 |
| Saurida nebulosa | 27.6 | 353; 391 | 0.382; 1.87 | 2,120,606; 3,174,282 | 0.21; 0.31 | 5,290; 5,603 | 0.25; 0.18 | 47.1; 48.7 |
| Tylosurus crocodilus | 4.6 | 380; 390 | 0.156; 0.27 | 463,424; 2,451,640 | 0.07; 0.37 | 1,065; 5,507 | 0.23; 0.22 | 9.4; 48.6 |
| Myripristis vittata | 25.1 | 337; 354 | 0.352; 1.42 | 1,290,468; 2,342,102 | 0.21; 0.39 | 754; 1,615 | 0.06; 0.07 | 6.7; 13.8 |
| Neoniphon sammara | 17.1 | 352; 375 | 0.286; 0.876 | 2,276,566; 4,265,046 | 0.43; 0.80 | 2,169; 3,957 | 0.10; 0.09 | 19.3; 34.6 |
| Brosme brosme | 41 | 334; 392 | 0.366; 1.79 | 1,027,598; 1,635,836 | 0.37; 0.69 | 3,321; 5,148 | 0.32; 0.31 | 29.4; 45.1 |
| Scomberoides lysan | 33.9 | 340; 378 | 0.344; 1.30 | 2,621,818; 4,818,598 | 0.54; 0.99 | 7,249; 12,324 | 0.28; 0.26 | 64.2; 107.9 |
| Forcipiger flavissimus | 109 | 351; 378 | 1.06; 2.96 | 1,993,702; 2,116,356 | 0.41; 0.44 | 1,193; 1,311 | 0.06; 0.06 | 10.5; 11.1 |
| Ostracion whitleyi | 86.5 | 340; 371 | 1.32; 3.34 | 2,054,668; 2,473,712 | 0.31; 0.38 | 2,369; 3,069 | 0.12; 0.12 | 20.596; 27.089 |
| Canthigaster amboinensis | 19.1 | 331; 371 | 0.224; 0.678 | 1,880,384; 2,868,978 | 0.68; 1.04 | 6,070; 8,672 | 0.32; 0.30 | 53.132; 76.469 |
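The arithmetic behind two of the sequencing columns can be sketched in a few lines of Python. This is an illustrative reading of the table, not the authors' pipeline. It assumes the large paired values are total read counts, the smaller counts are mitochondrial reads, the genome size estimate in the specimen summary table is in Gb, and reads are 150 bp long (a read length not stated in this record). The function names are hypothetical. The example uses the first row of the table (201; 1,141 mitochondrial reads out of 2,193,690; 2,224,022 total) and the first genome size estimate (1.80).

```python
def percent_mito_reads(mito_reads: int, total_reads: int) -> float:
    """Share of all sequenced reads that are mitochondrial, in percent."""
    return 100.0 * mito_reads / total_reads


def nuclear_coverage(total_reads: int, genome_size_gb: float,
                     read_length_bp: int = 150) -> float:
    """Approximate nuclear genome coverage: sequenced bases / genome size.

    read_length_bp=150 is an assumption, not a value from the source.
    """
    return total_reads * read_length_bp / (genome_size_gb * 1e9)


# First table row, one entry per shearing method (mechanical; enzymatic).
libraries = {
    "mechanical": {"total_reads": 2_193_690, "mito_reads": 201},
    "enzymatic": {"total_reads": 2_224_022, "mito_reads": 1_141},
}
GENOME_SIZE_GB = 1.80  # assumed to be in Gb

for method, lib in libraries.items():
    pct = percent_mito_reads(lib["mito_reads"], lib["total_reads"])
    cov = nuclear_coverage(lib["total_reads"], GENOME_SIZE_GB)
    print(f"{method}: {pct:.2f}% mitochondrial reads, ~{cov:.2f}x nuclear coverage")
```

Under these assumptions the sketch gives 0.01% and 0.05% mitochondrial reads and roughly 0.18× and 0.19× nuclear coverage, which agrees with the tabulated values for that row. It illustrates why genome skimming works: even at well under 1× nuclear coverage, the high copy number of the mitogenome yields enough reads to assemble it.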
Figure 2. Assembled and annotated mitogenome of Canthigaster amboinensis, Ambon Toby, USNM 442417, 64 mm SL.
Photograph by Diane Pitassy, NMNH.
Figure 3. Results of phylogenetic analysis of 52 fish mitogenomes.
Tree shown is the result of the Bayesian analysis confirming that 12 focal taxa (shown in red) are correctly placed among confamilials in corresponding families. Node support values <95% are shown for nodes at family- and genus-level splits. Bayesian posterior probability is as labeled, and ML bootstrap support is indicated by the color of the node symbols. Unlabeled family- and genus-level nodes had 100% posterior probability and bootstrap support.