Jonas Oppenheimer, Benjamin D Rosen, Michael P Heaton, Brian L Vander Ley, Wade R Shafer, Fred T Schuetze, Brad Stroud, Larry A Kuehn, Jennifer C McClure, Jennifer P Barfield, Harvey D Blackburn, Theodore S Kalbfleisch, Derek M Bickhart, Kimberly M Davenport, Kristen L Kuhn, Richard E Green, Beth Shapiro, Timothy P L Smith.
Abstract
Bison are an icon of the American West and an ecologically, commercially, and culturally important species. Despite numbering in the hundreds of thousands today, conservation concerns remain for the species, including the impact on genetic diversity of a severe bottleneck around the turn of the 20th century and genetic introgression from domestic cattle. Genetic diversity and admixture are best evaluated at genome-wide scale, for which a high-quality reference is necessary. Here, we use trio binning of long reads from a bison-Simmental cattle (Bos taurus taurus) male F1 hybrid to sequence and assemble the genome of the American plains bison (Bison bison bison). The male haplotype genome is chromosome-scale, with a total length of 2.65 Gb across 775 scaffolds (839 contigs) and a scaffold N50 of 87.8 Mb. Our bison genome is ~13× more contiguous overall and ~3400× more contiguous at the contig level than the current bison reference genome. The bison genome sequence presented here (ARS-UCSC_bison1.0) will enable new research into the evolutionary history of this iconic megafauna species and provide a new tool for the management of bison populations in federal and commercial herds. © The American Genetic Association. 2021.
Keywords: Genome resources; bovine; interspecies hybrid; nanopore sequencing; trio binning
Year: 2021 PMID: 33595645 PMCID: PMC8006816 DOI: 10.1093/jhered/esab003
Source DB: PubMed Journal: J Hered ISSN: 0022-1503 Impact factor: 2.645
Figure 1. Schematic showing trio binning and assembly process.
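As a rough illustration of the trio binning idea named in the abstract and in Figure 1, the toy sketch below assigns each long read from an F1 hybrid to one parental haplotype by counting k-mers that occur in only one parent's reads. This is a simplified illustration, not the pipeline used for this assembly: the function names, the labels `"bison"` and `"cattle"`, the toy sequences, and `k = 4` are all assumptions made for the example. Production tools typically also handle sequencing errors and differences in the sizes of the two parent-specific k-mer sets.

```python
def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(reads_a, reads_b, k):
    """Return (k-mers seen only in parent A, k-mers seen only in parent B)."""
    a = set().union(*(kmers(r, k) for r in reads_a))
    b = set().union(*(kmers(r, k) for r in reads_b))
    return a - b, b - a


def bin_read(read, a_only, b_only, k, labels=("bison", "cattle")):
    """Assign a read to the parent whose private k-mers it shares more of."""
    ks = kmers(read, k)
    a_hits = len(ks & a_only)
    b_hits = len(ks & b_only)
    if a_hits > b_hits:
        return labels[0]
    if b_hits > a_hits:
        return labels[1]
    return "unassigned"  # no evidence, or a tie between the parents


# Toy parental short reads (hypothetical sequences, not real data).
bison_reads = ["ACGTACGTTT"]
cattle_reads = ["ACGTACGAAA"]
K = 4

bison_only, cattle_only = parent_specific_kmers(bison_reads, cattle_reads, K)

# Toy F1 long reads, each binned independently.
for read in ["TACGTTT", "TACGAAA", "ACGTAC"]:
    print(read, bin_read(read, bison_only, cattle_only, K))
```

Once reads are binned this way, each parental bin can be assembled on its own, which is why a hybrid of two divergent parents yields a haplotype-resolved assembly.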
List of programs used for the assembly
| Purpose | Program | Version |
|---|---|---|
| Assembly | | |
| | jellyfish | 1.1.11 |
| Heterozygosity estimation | GenomeScope | 1 |
| Read trimming | Trimmomatic | 0.38 |
| | meryl | 1 |
| Read binning, error correction, read trimming | Canu | 1.8 |
| Unitigging | Canu | 1.9 |
| Scaffolding and polishing | | |
| Contig polishing | Nanopolish | 0.11.1 |
| Remove low-coverage, duplicated contigs | purge_dups | 1.0.1 |
| Long read, genome–genome alignment | minimap2 | 2.16 |
| Aligning short reads to genome | bwa | 0.7.17 |
| Scaffolding | Salsa | 2.2 |
| Visualizing genome–genome alignment | D-Genies | 1.2.0 |
| SAM/BAM file manipulation | samtools | 1.9 |
| Estimate Hi-C library quality | hi_qc | Downloaded 29 June 2019 |
| Generate Hi-C contact matrix | PretextMap | 0.1 |
| Visualize Hi-C contact matrix | PretextView | 0.01 |
| Fasta manipulation | CombineFasta | 0.0.16 |
| Variant calling | freebayes | 1.3.1-1-g5eb71a3-dirty |
| Evaluation | | |
| | Merqury | 1 |
| Identify conserved orthologs | BUSCO | v4 |
| | Merfin | Downloaded October 2020 |
| Read mapping statistics | Lumpy-sv | 0.3.0 |
| Alignment feature response curve | FRC_align | 1.0.0 |
| VCF/BCF file manipulation | bcftools | 1.9 |
| Variant calling | paftools.js | (minimap2 v2.16) |
| Annotation | | |
| Genome annotation liftover | Liftoff | 1.5.1 |
Figure 2. Ideogram of bison genome assembly karyotype, showing placement of contigs within chromosomes as alternating colors (such that color alternates at gaps). Chromosomes shown entirely in black represent those contained within single contigs.
Assembly statistics for final assembly, ARS-UCSC_bison1.0, and current bison reference, Bison_UMD1.0
| Statistic | ARS-UCSC_bison1.0 | Bison_UMD1.0 |
|---|---|---|
| Genome size (Gb) | 2.65 | 2.83 |
| Contig number | 839 | 470,415 |
| Scaffold number | 775 | 128,431 |
| Contig N50 | 68.5 Mb | 20.0 kb |
| Scaffold N50 | 87.8 Mb | 6.87 Mb |
| Scaffold L50 | 11 | 124 |
| Gaps (in chromosomes) | 64 (49) | 341,984 (NA) |
| Largest contig | 136.1 Mb | 203.8 kb |
| BUSCO complete (%) | 88.6 | 85.6 |
| BUSCO duplicated (%) | 1.0 | 0.9 |
| BUSCO fragmented (%) | 2.4 | 3.9 |
| BUSCO missing (%) | 8.0 | 9.6 |
| | 38.88 | 32.21 |
| | 91.35% | 93.0% |
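The N50 and L50 rows in the statistics above follow the standard definitions: sort sequences from longest to shortest, accumulate lengths until at least half of the total assembly length is covered, then report the length of the last sequence added (N50) and the number of sequences needed (L50). The sketch below implements that definition on hypothetical toy lengths and then reproduces the fold improvements quoted in the abstract from the N50 values in the table. The function name `n50_l50` and the toy lengths are illustrative assumptions, not taken from the paper.

```python
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly: total 290, half 145, reached at the second sequence.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50_l50(toy_lengths))

# Fold improvement in contiguity, using the N50 values reported in the table.
scaffold_fold = 87.8e6 / 6.87e6  # scaffold N50: 87.8 Mb vs 6.87 Mb
contig_fold = 68.5e6 / 20.0e3    # contig N50: 68.5 Mb vs 20.0 kb
print(f"scaffold: {scaffold_fold:.1f}x, contig: {contig_fold:.0f}x")
```

The two ratios round to the approximately 13× and 3400× figures given in the abstract.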