Kevin R Bredemeyer1,2, Andrew J Harris1,2, Gang Li3, Le Zhao3, Nicole M Foley1, Melody Roelke-Parker4, Stephen J O'Brien5,6, Leslie A Lyons7, Wesley C Warren8, William J Murphy1,2.
Abstract
In addition to including one of the most popular companion animals, species from the cat family Felidae serve as a powerful system for genetic analysis of inherited and infectious disease, as well as for the study of phenotypic evolution and speciation. Previous diploid-based genome assemblies for the domestic cat have served as the primary reference for genomic studies within the cat family. However, these versions suffered from poor resolution of complex and highly repetitive regions, with substantial amounts of unplaced sequence that is polymorphic or copy number variable. We sequenced the genome of a female F1 Bengal hybrid cat, the offspring of a domestic cat (Felis catus) x Asian leopard cat (Prionailurus bengalensis) cross, with PacBio long sequence reads and used Illumina sequence reads from the parents to phase >99.9% of the reads into the 2 species' haplotypes. De novo assembly of the phased reads produced highly continuous haploid genome assemblies for the domestic cat and Asian leopard cat, with contig N50 statistics exceeding 83 Mb for both genomes. Whole-genome alignments reveal the Felis and Prionailurus genomes are colinear, and the cytogenetic differences between the homologous F1 and E4 chromosomes represent a case of centromere repositioning in the absence of a chromosomal inversion. Both assemblies offer significant improvements over the previous domestic cat reference genome, with a 100% increase in contiguity and the capture of the vast majority of chromosome arms in 1 or 2 large contigs. We further demonstrated that comparably accurate F1 haplotype phasing can be achieved with members of the same species when one or both parents of the trio are not available. These novel genome resources will empower studies of feline precision medicine, adaptation, and speciation. © The American Genetic Association. 2020.
Keywords: Felidae; PacBio; genome; interspecies hybrid; trio-binning
Year: 2021 PMID: 33305796 PMCID: PMC8006817 DOI: 10.1093/jhered/esaa057
Source DB: PubMed Journal: J Hered ISSN: 0022-1503 Impact factor: 2.645
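The trio-binning step summarized in the abstract, in which parental short reads are used to sort the F1 long reads by haplotype, can be illustrated with a toy k-mer classifier. This is only a minimal sketch of the general idea and not the TrioCanu implementation: the function names, the tie handling, and the tiny example sequences are all hypothetical, and real tools also deal with reverse complements, sequencing-error k-mers, and score normalization.

```python
from typing import Iterable, Iterator, Set, Tuple


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of seq, upper-cased."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def parent_specific_kmers(
    maternal_reads: Iterable[str], paternal_reads: Iterable[str], k: int
) -> Tuple[Set[str], Set[str]]:
    """Return (maternal-only, paternal-only) k-mer sets from parental reads."""
    mat = {km for read in maternal_reads for km in kmers(read, k)}
    pat = {km for read in paternal_reads for km in kmers(read, k)}
    return mat - pat, pat - mat


def bin_read(read: str, mat_only: Set[str], pat_only: Set[str], k: int) -> str:
    """Assign a long read to the parent whose private k-mers it carries more of.

    Hypothetical simplification: raw hit counts are compared and ties
    (including reads with no informative k-mers) are left unassigned.
    """
    mat_hits = sum(km in mat_only for km in kmers(read, k))
    pat_hits = sum(km in pat_only for km in kmers(read, k))
    if mat_hits > pat_hits:
        return "maternal"
    if pat_hits > mat_hits:
        return "paternal"
    return "unassigned"


# Toy example: the two "parents" differ at a single base.
mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], k=4)
labels = [bin_read(r, mat_only, pat_only, k=4) for r in ("CGTACGT", "CGTTCGT", "ACGT")]
```

In an interspecies cross such as the one described here, the two parental k-mer sets differ far more than in a within-species trio, which is why nearly all reads carry informative k-mers.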
Assembly pipeline and software usage
| Step | Software | Version |
|---|---|---|
| Assembly and Polishing | ||
| Haplotype Binning | Canu | v1.8 |
| | NextDenovo | v2.2-beta.0 |
| Contig Polishing | NextPolish | v1.3.0 |
| Benchmarking | ||
| Basic Assembly Stats | QUAST | v5.0.2 |
| Assembly Completeness | BUSCO | v4.0.6 |
| Dotplot Generation | Nucmer | v4.0.0beta2 |
| Dotplot Visualization | Dot | n/a |
| Scaffolding | ||
| Hi-C Read Haplotyping | | 0.2.0 |
| Hi-C Mapping for SALSA | | n/a |
| Hi-C Scaffolding | SALSA2 | v2.2 |
| Ref-Based Scaffolding | RagTag | v1.0.1 |
| Hi-C Contact Map Generation | Juicer | v1.5.7 |
| Manual Assembly Inspection | Juicebox Assembly Tools | v1.11.08 |
| Annotation | ||
| Repeat Assessment | RepeatMasker | v4.0.9 |
| Structural Variant Analysis | Assemblytics | v1.2.1 |
| Annotation Liftover | Liftoff | v1.4.2 |
Software citations are listed in the text.
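The contiguity figures reported for these assemblies rely on the N50 statistic. As a reminder of what that number means, the sketch below computes N50 under its conventional definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name is hypothetical, and this is not a claim about how QUAST computes it internally.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example with five contigs totalling 20 bp.
example = n50([2, 3, 4, 5, 6])
```

Because N50 depends only on the length distribution, a small number of very long contigs can raise it substantially, which is consistent with chromosome arms being captured in 1 or 2 contigs.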
Assembly statistics and benchmarks
| Species | Domestic cat (2n = 38) | Asian leopard cat (2n = 38) |
|---|---|---|
| Read Count | 6,342,174 | 6,519,732 |
| Base Count (bp) | 109,251,556,255 | 112,023,028,516 |
| Subread N50 (bp) | 25,541 | 25,585 |
| Contig Assembly | ||
| Total Contigs | 123 | 132 |
| Largest Contig (bp) | 205,171,639 | 240,846,738 |
| Ungapped Assembly Length (bp) | 2,422,283,418 | 2,435,689,660 |
| N50 (bp) | 83,875,697 | 83,696,501 |
| BUSCO (mammalia_odb10) | ||
| Single-Copy | 8,563 | 8,589 |
| Duplicated | 20 | 21 |
| Complete | 8,583 | 8,610 |
| Percent Complete | 93.03% | 93.32% |
| Fragmented | 166 | 153 |
| Missing | 477 | 463 |
| Percent Present (Comp+Frag) | 94.83% | 94.98% |
| Scaffold Assembly Stats | ||
| Total Scaffolds | 71 | 83 |
| Primary Assembly Length (bp) | 2,422,299,418 | 2,435,702,060 |
| Total Gaps | 60 | 56 |
| N50 Scaffold (bp) | 147,603,332 | 148,587,958 |
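The BUSCO percentages in the table follow directly from the listed counts. The short check below recomputes them, assuming only that the total number of searched genes is the sum of the complete, fragmented, and missing categories (9,226 in both columns). The function name and dictionary keys are hypothetical.

```python
from typing import Dict, Union


def busco_summary(
    single: int, duplicated: int, fragmented: int, missing: int
) -> Dict[str, Union[int, float]]:
    """Derive BUSCO totals and percentages from the four raw categories."""
    complete = single + duplicated
    total = complete + fragmented + missing
    return {
        "complete": complete,
        "total": total,
        "pct_complete": round(100 * complete / total, 2),
        "pct_present": round(100 * (complete + fragmented) / total, 2),
    }


# Counts copied from the table above.
domestic_cat = busco_summary(single=8563, duplicated=20, fragmented=166, missing=477)
leopard_cat = busco_summary(single=8589, duplicated=21, fragmented=153, missing=463)
```

Both columns reproduce the tabulated values: 93.03% and 93.32% complete, and 94.83% and 94.98% present when fragmented genes are included.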
Figure 1. Alignment of domestic cat and Asian leopard cat single haplotype assembly contigs to felCat9. All ideograms are based on the domestic cat (Cho et al. 1997; Davis et al. 2009) except for the modified F1 to E4 chromosome unique to the species of the genera Prionailurus, Acinonyx, and Puma (Graphodatsky et al. 2020). G-banding is represented by dark bars and centromeres by red bars. Domestic cat contigs are depicted as orange bars above the ideogram, and Asian leopard cat contigs are depicted as blue bars below the ideogram.
Figure 2. Read count distribution of single-replacement crosses and Chromosome A1 p-distance plots for both the domestic and Asian leopard cat reference sequences. (a) p-distance traces for the biological parents and test sample short read data from both species mapped to the single haplotype domestic cat genome assembly. The Asian leopard cat samples show a clear separation from the traces of the domestic cat samples, which lie close to 0. (b) p-distance traces for the biological parents and test sample short read data from both species mapped to the single haplotype Asian leopard cat genome assembly. In contrast to the p-distance traces for the domestic cat assembly, the Asian leopard cat traces lie close to 0, while the divergent domestic cat sample traces lie well above, indicating uniformly elevated divergence from the Asian leopard cat assembly. The consistent separation of reads from the 2 species in both (a) and (b) demonstrates that TrioCanu has properly phased the F1-hybrid long-read data into their appropriate parental haplotypes. (c) Read count distributions of single-replacement crosses post-haplotype phasing. (*) = Biological parents. See Supplementary Tables 1 and 2 for individual sample IDs.
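The p-distance traces in Figure 2 plot the proportion of differing sites along the chromosome. As a rough illustration of that quantity only, the sketch below computes p-distance in non-overlapping windows between two already-aligned sequences of equal length. The function names, the window size, and the choice to skip non-ACGT positions are assumptions made for this example; the read mapping and consensus steps that the authors used to produce the traces are not described in this section and are not reproduced here.

```python
from typing import List, Optional


def p_distance(a: str, b: str) -> Optional[float]:
    """Proportion of compared sites that differ between two aligned sequences.

    Positions where either sequence is not A, C, G, or T (gaps, N) are
    skipped. Returns None when no position can be compared.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    return diffs / compared if compared else None


def windowed_p_distance(ref: str, sample: str, window: int) -> List[Optional[float]]:
    """p-distance for consecutive non-overlapping windows along an alignment."""
    if len(ref) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        p_distance(ref[i:i + window], sample[i:i + window])
        for i in range(0, len(ref), window)
    ]


# Toy example: the second window differs at 2 of 4 sites.
trace = windowed_p_distance("AAAACCCC", "AAAACCGG", window=4)
```

A sample from the same species as the reference would be expected to give values near 0 in every window, while a sample from the other species would give uniformly higher values, which is the pattern described in panels (a) and (b).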