Jack P Hruska, Joseph D Manthey.
Abstract
The northern flicker, Colaptes auratus, is a widely distributed North American woodpecker and a long-standing focal species for the study of ecology, behavior, phenotypic differentiation, and hybridization. We present here a highly contiguous de novo genome assembly of C. auratus, the first such assembly for the species and the first published chromosome-level assembly for woodpeckers (Picidae). The assembly was generated using a combination of short-read Chromium 10× and long-read PacBio sequencing, and further scaffolded with chromatin conformation capture (Hi-C) reads. The resulting genome assembly is 1.378 Gb in size, with a scaffold N50 of 11 and a scaffold L50 of 43.948 Mb. This assembly contains 87.4-91.7% of genes present across four sets of universal single-copy orthologs found in tetrapods and birds. We annotated the assembly both for genes and repetitive content, identifying 18,745 genes and a prevalence of ∼28.0% repetitive elements. Lastly, we used fourfold degenerate sites from neutrally evolving genes to estimate a mutation rate for C. auratus of 4.007 × 10⁻⁹ substitutions/site/year, about 1.5 times faster than an earlier mutation rate estimate for the family. The highly contiguous assembly and annotations we report will serve as a resource for future studies on the genomics of C. auratus and comparative evolution of woodpeckers.
Keywords: Colaptes auratus; Hi-C; PacBio; genome assembly; woodpeckers
Year: 2021 PMID: 33561233 PMCID: PMC8022726 DOI: 10.1093/g3journal/jkaa026
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Characteristics of the Caur_TTU_1.0 assembly. (A) Hi-C scaffolding contact map. Relative contact between contigs is indicated by the intensity of red. Blue squares indicate scaffold boundaries. (B) Synteny map of Caur_TTU_1.0 (right; orange) scaffolds to Gallus gallus chromosomes (left; blue). (C) Proportions of CDS (top panel) and repetitive elements (bottom panel) across 100-kbp sliding nonoverlapping windows of the chicken-aligned Caur_TTU_1.0 scaffolds. Lines indicate mean values across 10 sliding nonoverlapping windows.
Genome assembly metrics calculated using BBMap
| Statistic | Caur_TTU_1.0 |
|---|---|
| | 2,369 / 9,565 |
| | 117.313 Mbp / 15.844 Mbp |
| Total assembly size | 1.378 Gbp |
| Scaffold N50 | 11 |
| | 33 / 4,370 |
| | 43.948 Mbp / 826.96 Kbp |
| | 14.604 Mbp / 50.09 Kbp |
| | 44.93 |
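The abstract reports a scaffold N50 of 11 (a count) and a scaffold L50 of 43.948 Mb (a length). This labeling appears to follow the convention of BBMap's statistics output; many other tools use the two names the other way around. The sketch below is not the authors' code. It computes both quantities from a list of sequence lengths so the two conventions can be told apart; the function name `half_assembly_point` and the toy lengths are hypothetical.

```python
from typing import List, Tuple


def half_assembly_point(lengths: List[int]) -> Tuple[int, int]:
    """Return (length, count) at the point where the longest sequences
    first cover at least half of the total assembly length.

    length: size of the shortest sequence in that covering set
            (called L50 in the abstract, N50 in many other tools).
    count:  number of sequences in that covering set
            (called N50 in the abstract, L50 in many other tools).
    """
    if not lengths:
        raise ValueError("lengths must not be empty")

    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: cumulative sum always reaches half")


# Toy scaffold lengths (hypothetical, not taken from Caur_TTU_1.0).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
length_at_half, count_at_half = half_assembly_point(toy_lengths)
print(f"length at 50%: {length_at_half}, count at 50%: {count_at_half}")
```

Read this way, the reported values would mean that the 11 longest scaffolds cover at least half of the 1.378 Gbp assembly, and the shortest of those 11 is 43.948 Mbp long.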
BUSCO output using tetrapoda_odb9, tetrapoda_odb10, aves_odb9, and aves_odb10 databases
| | tetrapoda_odb9 | tetrapoda_odb10 | aves_odb9 | aves_odb10 |
|---|---|---|---|---|
| Complete | 3,623 (91.7%) | 4,670 (87.9%) | 4,416 (89.9%) | 7,294 (87.4%) |
| Complete and single-copy | 3,594 (91.0%) | 4,617 (86.9%) | 4,342 (88.3%) | 7,224 (86.6%) |
| Complete and duplicated | 29 (0.7%) | 53 (1.0%) | 74 (1.5%) | 70 (0.8%) |
| Fragmented | 147 (3.7%) | 124 (2.3%) | 227 (4.6%) | 219 (2.6%) |
| Missing | 180 (4.6%) | 516 (9.8%) | 272 (5.6%) | 825 (10.0%) |
| Total BUSCO groups searched | 3,950 | 5,310 | 4,915 | 8,338 |
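The counts in the BUSCO table are internally consistent, and a short check makes the relationships explicit. The sketch below assumes the standard BUSCO breakdown (complete = single-copy + duplicated; total = complete + fragmented + missing) and uses the counts from the table; the dictionary layout and the function name `summarize` are hypothetical.

```python
from typing import Dict, Tuple

# Counts copied from the table above, one entry per BUSCO database.
busco_counts: Dict[str, Dict[str, int]] = {
    "tetrapoda_odb9": {"single": 3594, "duplicated": 29, "fragmented": 147, "missing": 180, "total": 3950},
    "tetrapoda_odb10": {"single": 4617, "duplicated": 53, "fragmented": 124, "missing": 516, "total": 5310},
    "aves_odb9": {"single": 4342, "duplicated": 74, "fragmented": 227, "missing": 272, "total": 4915},
    "aves_odb10": {"single": 7224, "duplicated": 70, "fragmented": 219, "missing": 825, "total": 8338},
}


def summarize(counts: Dict[str, int]) -> Tuple[int, float]:
    """Return (complete count, complete percentage) after checking
    that the categories add up to the stated total."""
    complete = counts["single"] + counts["duplicated"]
    searched = complete + counts["fragmented"] + counts["missing"]
    if searched != counts["total"]:
        raise ValueError(f"categories sum to {searched}, expected {counts['total']}")
    return complete, 100 * complete / counts["total"]


for database, counts in busco_counts.items():
    complete, percent = summarize(counts)
    print(f"{database}: {complete} complete ({percent:.2f}%)")
```

The recomputed percentages fall within the 87.4-91.7% range given in the abstract, although for some databases they differ from the tabulated value by about 0.1 percentage point, presumably because of how the percentages were rounded.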
Figure 2. Caur_TTU_1.0 divergence landscape of TE classes. Relative abundance and age of each class are shown.