Tianming Lan, Haimeng Li, Shangchen Yang, Minhui Shi, Lei Han, Sunil Kumar Sahu, Yaxian Lu, Jiangang Wang, Mengchao Zhou, Hui Liu, Junxuan Huang, Qing Wang, Yixin Zhu, Li Wang, Yanchun Xu, Chuyu Lin, Huan Liu, Zhijun Hou.
Abstract
The raccoon dog (Nyctereutes procyonoides) is an invasive canid species native to East Asia with several distinct characteristics. Here, we report a chromosome-scale genome of the raccoon dog with high contiguity, completeness, and accuracy. The intact taste receptor genes, expanded gene families, and positively selected genes related to digestion, absorption, foraging, and detoxification likely support the omnivory of raccoon dogs. Several positively selected genes and raccoon dog-specific mutations in TDRD6 and ZP3 genes may explain their high reproductivity. Enriched GO terms in energy metabolism and positively selected immune genes were speculated to be closely related to the diverse immune system of raccoon dogs. In addition, we found that several expanded gene families and positively selected genes related to lipid metabolism and insulin resistance may contribute to winter sleep of the raccoon dog. This high-quality genome provides a valuable resource for understanding the evolutionary characteristics of this species.
Keywords: Animals; Genetics; Genomics; Zoology
Year: 2022 PMID: 36185367 PMCID: PMC9523411 DOI: 10.1016/j.isci.2022.105117
Source DB: PubMed Journal: iScience ISSN: 2589-0042
Figure 1. The distribution of native (green) and introduced/invasive (pink) areas of raccoon dogs, the landscape of the raccoon dog genome, and the chromosome-scale synteny analysis of the raccoon dog and domestic dog
(A) The map describes the current distribution of raccoon dogs; here, we show both the native and introduced areas (https://www.cabi.org/isc/datasheet/72656).
(B) The genomic landscape of the raccoon dog genome. A: The 27 chromosomes of the raccoon dog genome; B: population level genetic diversity (π) calculated by a 500 kb window; C: SNP density across the genome (500 kb window); D: sequencing depth (X) calculated by 500 kb window; E: GC content (%); F: gene density calculated by 500 kb window.
(C) The chromosome-scale synteny analysis between the raccoon dog genome and the domestic dog genome, which was visualized using RectChr v1.27 (https://github.com/BGI-shenzhen/RectChr).
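Several tracks in panel (B) are summaries computed over fixed, non-overlapping windows. As an illustration only, the sketch below computes GC content per window from a plain sequence string. The function name `gc_by_window` and the toy sequence are hypothetical, and the window size is left as a parameter because the caption does not state one for the GC track (the other tracks use 500 kb).

```python
def gc_by_window(sequence, window):
    """Yield (start, gc_percent) for consecutive non-overlapping windows.

    Only A, C, G and T are counted as called bases; a window with no
    called bases (for example an all-N gap) yields None for its GC value.
    """
    seq = sequence.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        called = sum(chunk.count(base) for base in "ACGT")
        if called == 0:
            yield start, None
            continue
        gc = chunk.count("G") + chunk.count("C")
        yield start, 100.0 * gc / called


# Toy sequence with a 4 bp window; a real run would use e.g. window=500_000.
for start, gc in gc_by_window("GGCCAATTNNNN", 4):
    print(start, gc)
```

The same windowing loop could be reused for SNP density or gene density by swapping the per-window statistic.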
Statistics for the sequencing data, genome assembly, and annotation of the raccoon dog genome
| Item | Category | Number |
|---|---|---|
| Sequencing data | PacBio (Gb) | 385.94 |
| | WGS (Gb) | 175.52 |
| | Hi-C (Gb) | 203.52 |
| | RNA-seq (5 organs) (Gb) | 101.70 |
| Assembly | Estimated genome size (Gb) | 3.21 |
| | Assembled genome size (Gb) | 2.38 |
| | Contig N50 (Mb) | 41.87 |
| | Scaffold N50 (Mb) | 83.70 |
| | Longest scaffold (Mb) | 177.96 |
| Annotation | GC content (%) | 41.33 |
| | Repeat sequences (%) | 35.11 |
| | Number of protein-coding genes | 20,000 |
| | Number of functionally annotated genes | 19,973 |
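The contig and scaffold N50 values in the table follow the usual definition: the length L such that sequences of length at least L together cover half or more of the assembly. A minimal sketch of that calculation is given here; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the sequence that crosses that point is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths in Mb (illustrative values, not from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```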
Figure 2. Identification of the X chromosome and Y-linked scaffold
(A) Synteny analysis of genes on the X chromosome between raccoon dog and domestic dog. Red lines indicate the genes in the dog genome mapped to the positive strand of the raccoon dog genome, and blue lines indicate the genes in the dog genome mapped to the negative strand of the raccoon dog genome.
(B) Synteny analysis of genes on the Y chromosome between raccoon dog and domestic dog. The red line and blue line are the same as those in (A).
(C) The ratio of sequencing depth between female and male individuals in a 500 bp window across the X chromosome.
(D) The ratio of the sequencing depth of each chromosome-scale scaffold between female and male individuals. The red dot represents the X chromosome, the green dot represents the Y-linked scaffold, and the blue dots are scaffolds from autosomes. The expected ratio is 1:1; the X chromosome is expected to have a higher ratio and the Y chromosome a lower ratio.
(E) The ratio of sequencing depth between female and male individuals in a 500 bp window across the Y-linked scaffold.
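The depth-ratio logic behind panels (C) to (E) can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: each sample's depth is normalised by a per-sample reference depth (for example its autosomal mean), and the cutoffs 1.5 and 0.5 are hypothetical thresholds chosen only to separate ratios near 2 (X-like), near 1 (autosome-like), and near 0 (Y-like).

```python
def depth_ratio(female_depth, male_depth, female_ref, male_ref):
    """Female-to-male depth ratio after per-sample normalisation.

    female_ref and male_ref are each sample's reference depth (assumed
    here to be the autosomal mean), so autosomal regions give about 1.
    """
    female_norm = female_depth / female_ref
    male_norm = male_depth / male_ref
    if male_norm == 0:
        return float("inf")
    return female_norm / male_norm


def classify_scaffold(ratio, x_cutoff=1.5, y_cutoff=0.5):
    """Label a scaffold from its ratio; the cutoffs are assumptions."""
    if ratio >= x_cutoff:
        return "X-like"
    if ratio <= y_cutoff:
        return "Y-like"
    return "autosome-like"


# Toy depths (illustrative only): (female depth, male depth) per scaffold.
toy = {"scaffold_a": (30.0, 30.0), "scaffold_b": (30.0, 15.0), "scaffold_c": (0.5, 14.0)}
for name, (f_depth, m_depth) in toy.items():
    ratio = depth_ratio(f_depth, m_depth, female_ref=30.0, male_ref=30.0)
    print(name, round(ratio, 2), classify_scaffold(ratio))
```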
Figure 3. Comparative genomics analysis and enrichment analysis of expanded gene families in the raccoon dog genome
(A) The phylogenetic relationship of 18 species and the estimated divergence time. Numbers on the branch of the phylogenetic tree represent the significantly expanded (blue) and contracted (red) gene families.
(B–D) Clusters of significantly overrepresented GO terms for biological process (B), cellular component (C), and molecular function (D), produced by REVIGO for expanded gene families in the raccoon dog genome. Semantically similar GO terms are clustered together.
(E) Significantly enriched KEGG pathways in the raccoon dog genome compared with the other 17 species. Blue: pathways related to the omnivorous diet. Orange: pathways related to energy metabolism. Purple: pathways related to immunity.
Figure 4. The possible genetic basis for detoxification and high reproduction in the raccoon dog
(A) The glutathione-mediated detoxification pathway. The ellipse in red represents enzymes encoded by expanded gene families.
(B) The phylogenetic tree of the GST gene family, constructed by the maximum likelihood method. Blue: domestic dog; Red: raccoon dog; Green: Arctic fox; Purple: red fox. The GSTP1 genes in the raccoon dog genome are clearly expanded.
(C) Raccoon dog-specific amino acid changes in the TDRD6 gene. N793D was found in the Tudor_SF superfamily.
(D) Raccoon dog-specific amino acid changes in the ZP3 gene. Two substitutions were found to be in the ZP domain region.
(E) Three-dimensional view of the ZP3 protein, highlighting raccoon dog-specific amino acid changes. The zoomed-in pink amino acids are the raccoon dog-specific amino acids, and the green amino acids are those predicted from the dog. The bar plot shows the residual volume of amino acids.
Figure 5. Population history and genome-wide heterozygosity
(A) The genomic heterozygosity of the raccoon dog and 17 other species. The genomic heterozygosity of 17 species was collected from published data.
(B) The population history of the raccoon dog inferred by PSMC with 100 bootstraps. The red line represents the estimated effective population size (Ne), and the 100 thin red lines represent the PSMC estimates for 100 sequences randomly resampled from the original sequence. Tsurf: atmospheric surface air temperature relative to the present. The mutation rate (μ) and generation interval (g) used here were 1.0 × 10⁻⁸ and 3 years, respectively.
(C) The recent population history of the raccoon dog inferred by MSMC2 with four individuals. LGM: Last Glacial Maximum. We used the same μ and g as used in the PSMC analysis.
(D) The recent population history of the raccoon dog inferred by SMC++ with 38 individuals. We used the same μ and g as used in the PSMC analysis.
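PSMC reports time and population size in scaled units, so expressing them in years and Ne, as in panel (B), requires μ and g. A minimal sketch of that conversion is given here, using the conventional PSMC scaling θ0 = 4·N0·μ·s, where s is the bin size. The function name `scale_psmc`, the 100 bp bin size, the reading of μ as per site per generation, and the toy inputs are assumptions; only μ = 1.0 × 10⁻⁸ and g = 3 years come from the caption.

```python
def scale_psmc(theta0, scaled_times, scaled_sizes,
               mu=1.0e-8, gen_years=3.0, bin_size=100):
    """Convert PSMC scaled output to years and effective population size.

    theta0       : per-bin theta reported by PSMC (theta0 = 4 * N0 * mu * s)
    scaled_times : t_k values, in units of 2 * N0 generations
    scaled_sizes : lambda_k values, in units of N0
    mu           : mutation rate (assumed per site per generation)
    gen_years    : generation interval in years
    bin_size     : s, bases per PSMC bin (assumed 100)
    """
    n0 = theta0 / (4.0 * mu * bin_size)
    years = [2.0 * n0 * t * gen_years for t in scaled_times]
    ne = [n0 * lam for lam in scaled_sizes]
    return n0, years, ne


# Toy scaled values (illustrative only, not results from the paper).
n0, years, ne = scale_psmc(theta0=0.004,
                           scaled_times=[0.1, 0.5, 1.0],
                           scaled_sizes=[2.0, 1.0, 3.0])
for y, n in zip(years, ne):
    print(f"{y:.0f} years ago: Ne ~ {n:.0f}")
```

Because both axes scale with 1/μ, a different assumed mutation rate shifts the whole curve rather than changing its shape.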
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| TRIzol reagent | Invitrogen, USA | Cat#15596-026 |
| Monarch HMW DNA Extraction Kit | NEB, Ipswich, England | Cat#T3060L |
| SMRTbell Template Prep Kit 1.0 | Pacific Biosciences, CA, USA | Cat#100-259-100 |
| Qiagen Blood & Cell Culture DNA Mini Kit | Qiagen, United States | Cat#13323 |
| | This paper | CNSA: CNP0002053 |
| | Ensembl | ARS-UCD1.2 |
| | Ensembl | ASM966005v1 |
| | NCBI | GCF_003709585.1_Aci_jub_2 |
| | NCBI | GCF_003584765.1_ASM358476v1 |
| | Ensembl | Sscrofa11.1 |
| | Ensembl | PanLeo1.0 |
| | Ensembl | PanPar1.0 |
| | NCBI | GCF_018345385.1_ASM1834538v1 |
| | Ensembl | UrsMar_1.0 |
| | Ensembl | ASM325472v1 |
| | Ensembl | PanTig1.0 |
| | Ensembl | VulVul2.2 |
| | Ensembl | GRCh38 |
| | Ensembl | Felis_catus_9.0 |
| | Ensembl | EquCab3.0 |
| | Ensembl | CanFam3.1 |
| | Ensembl | OryCun2.0 |
| Canu (v2.0) | | |
| NextPolish (v1.3.1) | | |
| Purge_dups (v1.2.5) | | |
| LRScaf (v1.1.8) | | |
| BWA (v0.7.17) | | |
| Juicer (v1.5) | | |
| 3d-DNA (v190716) | | |
| BUSCO (v5.2.2) | | |
| LTR finder (v1.0.6) | | |
| MITE-hunter (v4.07) | | |
| RepeatModeler2 (v2.0.1) | | |
| RepeatMasker (v4.0.5) | | |
| Tandem Repeats Finder (v4.07) | | |
| GlimmerHMM (v3.0.1) | | |
| Augustus (v3.0.3) | | |
| SNAP (v11/29/2013) | | |
| Trimmomatic (v0.30) | | |
| Trinity (v2.13.2) | | |
| PASA (v2.0.2) | | |
| Blastall (v2.2.26) | | |
| GeneWise (v2.4.1) | | |
| MAKER pipeline (v3.01.03) | | |
| tRNAscan-SE (v2.0.9) | | |
| INFERNAL (v1.1.1) | | |
| InterProScan (v5.52-86.0) | | |
| MAFFT (v7.310) | | |
| PAL2NAL (v14) | | |
| Trimal (v1.4.1) | | |
| IQTREE (v1.6.12) | | |
| Treefam (v1.4) | | |
| CAFE (v4.2.1) | | |
| PAML (v4.8) | | |
| Picard (v2.1.1) | N/A | |
| Sentieon (v202010.01) | | |
| VCFtools (v4.1) | | |
| PSMC (v0.6.5) | | |
| MSMC2 (v2.1.1) | | |
| SMC++ (v1.15.4) | | |
| Raw and analyzed data | This paper | CNSA: CNP0002053 |