Randolph Z B Quek, Sudhanshi S Jain, Mei Lin Neo, Greg W Rouse, Danwei Huang.
Abstract
Despite the ecological and economic significance of stony corals (Scleractinia), a robust understanding of their phylogeny remains elusive due to patchy taxonomic and genetic sampling, as well as the limited availability of informative markers. To increase the number of genetic loci available for phylogenomic analyses in Scleractinia, we designed 15,919 DNA enrichment baits targeting 605 orthogroups (mean 565 ± SD 366 bp) over 1,139 exon regions. A further 236 and 62 barcoding baits were designed for COI and histone H3 genes respectively for quality and contamination checks. Hybrid capture using these baits was performed on 18 coral species spanning the presently understood scleractinian phylogeny, with two corallimorpharians as outgroup. On average, 74% of all loci targeted were successfully captured for each species. Barcoding baits were matched unambiguously to their respective samples and revealed low levels of cross-contamination in accordance with expectation. We put the data through a series of stringent filtering steps to ensure only scleractinian and phylogenetically informative loci were retained, and the final probe set comprised 13,479 baits, targeting 452 loci (mean 531 ± SD 307 bp) across 865 exon regions. Maximum likelihood, Bayesian and species tree analyses recovered maximally supported, topologically congruent trees consistent with previous phylogenomic reconstructions. The phylogenomic method presented here allows for consistent capture of orthologous loci among divergent coral taxa, facilitating the pooling of data from different studies and increasing the phylogenetic sampling of scleractinians in the future.
Keywords: coral reef; exon; genome sampling; hybrid capture; multilocus data; phylogenomics
Year: 2020 PMID: 32077619 PMCID: PMC7468246 DOI: 10.1111/1755-0998.13150
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 7.090
Reference genomes used for putatively scleractinian transcript identification and bait design
| Species | Access | Reference |
|---|---|---|
| | | |
| | | Shinzato et al. ( |
| | | ReFuGe |
| | | Ying et al. ( |
| | | Ying et al. ( |
| | | Ying et al. ( |
| | GCF_002042975.1 | |
| | | Cunning et al. ( |
| | | Ying et al. ( |
| | | Voolstra et al. ( |
| | | |
| | | Shoguchi et al. ( |
| | | Liu et al. ( |
| | | Shoguchi et al. ( |
| | | Liu et al. ( |
| | | Aranda et al. ( |
| | | Shoguchi et al. ( |
Reefgenomics.org: Liew, Aranda, and Voolstra (2016).
Summary statistics of loci assembled per sample for both exons‐only and exons + supercontigs data sets
| Species | Number of loci (#/%) | Locus length range (bp) (exons‐only/supercontigs + exons) | Mean locus length (± SD) (bp) |
|---|---|---|---|
| | | | |
| | 386/85.40 | 93–1,923/156–4,356 | 443 ± 223/847 ± 597 |
| | 401/88.71 | 87–1,920/117–4,414 | 444 ± 228/802 ± 555 |
| | 381/84.29 | 93–1,611/126–4,694 | 432 ± 194/853 ± 594 |
| | 393/86.95 | 93–1,395/99–4,479 | 439 ± 199/860 ± 608 |
| | 349/77.21 | 93–1,209/126–3,690 | 425 ± 194/834 ± 560 |
| | 389/86.06 | 93–1,572/117–3,619 | 435 ± 203/794 ± 520 |
| | 323/71.46 | 81–1,878/135–4,247 | 425 ± 236/798 ± 569 |
| | 383/84.73 | 90–1,674/192–3,950 | 443 ± 215/834 ± 583 |
| | 354/78.32 | 93–1,383/147–5,192 | 437 ± 209/823 ± 564 |
| | 258/57.08 | 123–1,587/153–3,531 | 473 ± 234/833 ± 523 |
| | | | |
| | 263/58.19 | 111–1,938/144–5,286 | 490 ± 238/1,005 ± 714 |
| | 219/48.45 | 72–1,413/189–2,818 | 424 ± 209/774 ± 507 |
| | 343/75.88 | 90–2,895/105–6,837 | 513 ± 322/998 ± 765 |
| | 343/75.88 | 99–2,697/105–8,843 | 521 ± 310/1,053 ± 871 |
| | 277/61.28 | 54–1,521/114–5,123 | 430 ± 214/693 ± 498 |
| | 325/71.90 | 66–2,889/147–4,947 | 475 ± 288/899 ± 659 |
| | 311/68.81 | 54–2,208/63–6,595 | 461 ± 259/875 ± 676 |
| | 329/72.79 | 96–3,798/111–5,537 | 463 ± 312/934 ± 693 |
| | | | |
| | 43/9.51 | 183–717/183–1,773 | 375 ± 144/541 ± 352 |
| | 47/10.40 | 138–759/138–1,291 | 356 ± 123/448 ± 226 |
Percentage of loci is based on total number of loci (n = 452). A supercontig includes both exon and intron regions in a sequence.
Samples used in identity checks with COI and histone H3 barcodes.
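The percentages in the table above follow from the footnote's denominator of 452 loci. A minimal sketch of that arithmetic, assuming simple rounding to two decimal places (the function name `percent_captured` is illustrative and not taken from the paper):

```python
TOTAL_LOCI = 452  # loci in the final probe set, per the table footnote


def percent_captured(n_loci: int, total: int = TOTAL_LOCI) -> float:
    """Return the share of target loci recovered, as a percentage (2 d.p.)."""
    if not 0 <= n_loci <= total:
        raise ValueError("n_loci must lie between 0 and the total number of loci")
    return round(100.0 * n_loci / total, 2)


# Three counts taken from the table: highest-yield row, a mid-range row,
# and one of the two low-yield rows at the bottom.
for n in (386, 258, 43):
    print(n, percent_captured(n))
```

The same function applies to any other row by passing its locus count.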
FIGURE 1. Coverage of loci captured by target‐enrichment baits as determined by HybPiper post‐filtering (blue = absent, red = present)
Concatenated matrix statistics for both exons‐only and exons + supercontigs data sets
| Data set | Missing data (%) | Concatenated matrix length (bp) | Mean locus length (± SD) (bp) | Locus length range (bp) | Parsimony informative sites (#/%) |
|---|---|---|---|---|---|
| Exons‐only | 30.86 | 201,137 | 456 ± 233 | 6–2,133 | 68,997/34.30 |
| Exons + supercontigs | 32.43 | 287,749 | 636 ± 351 | 54–2,491 | 119,095/41.39 |
Missing data percentages as defined in Quek and Huang (2019).
A supercontig includes both exon and intron regions in a sequence.
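The "parsimony informative sites" column can be illustrated with the standard definition: an alignment column is parsimony informative when at least two character states each occur in at least two sequences. The sketch below is not the authors' pipeline; the treatment of `-`, `?` and `N` as missing data and the toy alignment are assumptions made for illustration.

```python
from collections import Counter

MISSING = set("-?N")  # assumption: gaps and ambiguous bases are ignored


def is_parsimony_informative(column) -> bool:
    """True if at least two states each occur in at least two sequences."""
    states = (c.upper() for c in column)
    counts = Counter(s for s in states if s not in MISSING)
    return sum(1 for k in counts.values() if k >= 2) >= 2


def count_informative_sites(alignment) -> int:
    """Count parsimony-informative columns in a list of aligned sequences."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all sequences must have the same aligned length")
    return sum(is_parsimony_informative(col) for col in zip(*alignment))


# Hypothetical four-taxon toy alignment (not data from the study).
toy = ["ACGTA", "ACGTC", "ATGAC", "ATG-A"]
n_informative = count_informative_sites(toy)
print(n_informative, "of", len(toy[0]), "sites are parsimony informative")

# The percentage reported for the exons-only matrix follows from its counts.
exons_only_pct = round(100 * 68_997 / 201_137, 2)
print(exons_only_pct)
```

In the toy alignment only the second and fifth columns qualify, since the fourth has a single state that occurs twice.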
FIGURE 2. Maximum likelihood phylogeny of Scleractinia for exons‐only data set (minimum taxon occupancy of three scleractinian taxa per locus; 30.86% missing data; 452 loci over 865 exon regions; 201,137 bp) with Rhodactis as outgroup. All nodes have maximum bootstrap values and posterior probabilities.
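The caption's "minimum taxon occupancy of three scleractinian taxa per locus" describes a filter that can be sketched as follows. This is an illustrative reading, not the authors' code: the function name, the data layout (a mapping from locus to the taxa in which it was recovered) and all taxon labels are hypothetical.

```python
from typing import Dict, Iterable, Set


def filter_by_occupancy(
    locus_taxa: Dict[str, Iterable[str]],
    ingroup: Set[str],
    min_taxa: int = 3,
) -> Dict[str, Iterable[str]]:
    """Keep loci recovered in at least `min_taxa` ingroup taxa.

    Outgroup taxa do not count towards the threshold.
    """
    return {
        locus: taxa
        for locus, taxa in locus_taxa.items()
        if len(set(taxa) & ingroup) >= min_taxa
    }


# Hypothetical toy data: four ingroup corals and one outgroup sample.
ingroup = {"coral_A", "coral_B", "coral_C", "coral_D"}
recovered = {
    "locus_1": ["coral_A", "coral_B", "coral_C", "outgroup_1"],
    "locus_2": ["coral_A", "coral_B", "outgroup_1"],  # only two ingroup taxa
    "locus_3": ["coral_B", "coral_C", "coral_D"],
}

kept = filter_by_occupancy(recovered, ingroup)
print(sorted(kept))
```

Here `locus_2` is dropped because the outgroup sample does not count towards the three-taxon minimum.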