Lei Han1,2, Tianming Lan3,4, Desheng Li5, Haimeng Li3,6, Linhua Deng5, Zhiwei Peng1, Shaowen He7, Yanqiang Zhou1, Ruobing Han1, Lingling Li1, Yaxian Lu1, Haorong Lu8,9, Qing Wang10, Shangchen Yang11, Yixin Zhu10, Yunting Huang8,9, Xiaofang Cheng12, Jieyao Yu8,9, Yulong Wang1, Heting Sun13, Hongliang Chai1, Huanming Yang3, Xun Xu3,8, Michael Lisby4, Quan Liu1,14, Karsten Kristiansen4,15, Huan Liu3,4, Zhijun Hou1,2.
Abstract
Helminth diseases have long been a threat to the health of humans and animals. Roundworms are important organisms for studying parasitic mechanisms, disease transmission and prevention. The study of parasites in the giant panda is important for understanding how roundworms adapt to the host. Here, we report a high-quality chromosome-scale genome of Baylisascaris schroederi with a genome size of 253.60 Mb and 19,262 predicted protein-coding genes. We found that gene families related to epidermal chitin synthesis and environmental information processes in the roundworm genome have expanded significantly. Furthermore, we identified unique genes involved in essential amino acid metabolism in the B. schroederi genome, inferred to be essential for adaptation to the giant panda-specific diet. In addition, under different deworming pressures, we found that four resistance-related genes (glc-1, nrf-6, bre-4 and ced-7) were under strong positive selection in a captive population. Finally, 23 known drug targets and 47 potential drug target proteins were identified. The genome provides a unique reference for inferring the early evolution of roundworms and their adaptation to the host. Population genetic analysis and drug sensitivity prediction reveal the impact of deworming history on population genetic structure, which is important for disease prevention.
Keywords: Baylisascaris schroederi; adaptation; anthelmintics; genetic diversity; roundworms
Year: 2021 PMID: 34549895 PMCID: PMC9298223 DOI: 10.1111/1755-0998.13504
Source DB: PubMed Journal: Mol Ecol Resour ISSN: 1755-098X Impact factor: 8.678
FIGURE 1. The phylogenetic relationships among six nematodes, and the genomic characteristics and synteny of B. schroederi. (a) Genomic characterization of the B. schroederi genome. The figure shows the gene number, repeat content, GC content, sequencing coverage and scaffolds from the centre to the edge. (b) Synteny of B. schroederi with P. univalens and T. canis at the gene level. Different colours represent different synteny blocks. (c) UpSet plot showing the intersection of gene family expansions in nematodes. Each row represents a nematode. Black circles and vertical lines between the rows represent the intersection of expanded families between species. The bar plot indicates the total gene family count in each intersection. (d) Time‐calibrated maximum likelihood phylogenetic tree of six nematodes. The estimated divergence times are shown at the bifurcations. The numbers below the nodes represent the number of gene families significantly expanded, maintained, and contracted, respectively
Summary of the features of the B. schroederi genome
| Description | | | | | |
|---|---|---|---|---|---|
| Genome size (bp) | 253,610,985 | 281,639,769 | 265,545,801 | 253,353,821 | 317,115,901 |
| Number of scaffolds; contigs | 75; 536 | 2778; 15,567 (>1000 bp) | 31,538; 40,509 | 1274 | 22,857; 51,969 |
| Average length of scaffolds; contigs (bp) | 3,381,480; 436,058 | – | 8420; 6506 | 198,864; 6918 | 13,874; 5747 |
| Gap length (bp; % of genome) | 261,088 (0.10%) | 13,257,555 (4.70%) | 1,980,846 (0.7%) | 966,507 (0.38%) | 18,474,532 (5.82%) |
| N50 of scaffolds; contigs (bp) | 12,324,682; 1,221,088 | 888,870; 42,126 | 290,558; 46,632 | 1,825,986; 20,376 | 375,067; 16,980 |
| N90 of scaffolds; contigs (bp) | 6,864,504; 192,063 | 104,281; 7439 | 48,674; 10,466 | 204,976; 2991 | 66,363; 3511 |
| Genome GC content (%) | 37.30 | 37.26 | 37.85 | 39.07 | 39.95 |
| Repetitive sequences (%) | 9.53 | – | 4.4 | – | 13.5 |
The published genome information of B. schroederi (not released) (Hu et al., 2020).
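The N50 and N90 rows in the table follow the usual assembly-contiguity definition: the sequence length at which the longest scaffolds (or contigs), taken in descending order, first cover 50% or 90% of the total assembly. A minimal sketch of that calculation is given below; the function name `nx` and the toy lengths are illustrative assumptions, not values or code from the study.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx statistic for a list of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    they cover `fraction` of the total length; the length of the sequence
    that crosses that threshold is returned (N50 for 0.5, N90 for 0.9).
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length


# Hypothetical scaffold lengths, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 15]
toy_n50 = nx(toy_lengths, 0.5)
toy_n90 = nx(toy_lengths, 0.9)
print(toy_n50, toy_n90)
```

Applied to real data, the input would be the scaffold or contig lengths read from an assembly FASTA index.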
FIGURE 2. The expansion and contraction of roundworm gene families. (a) Significant increases and decreases in roundworm gene families. The solid circle and the solid triangle represent the top KEGG pathways that are enriched in the expanded gene families of Ascariasis compared with C. elegans or M. hapla, respectively. The open circle and the open triangle represent the top KEGG pathways that are enriched in the contracted gene families of Ascariasis compared with C. elegans or M. hapla, respectively. (b) GO function enrichment and gene copy number of the significantly expanded gene families in roundworms. (c) The proportion of GO functional genes in the gene families with significant expansion (or contraction) in roundworms relative to the total number of expanded (or contracted) genes. Red asterisks indicate the p‐values of Sidak's multiple comparisons tests of gene expansion and contraction in comparison with C. elegans or M. hapla (one asterisk represents 10⁻¹)
FIGURE 3. Expansion and contraction of B. schroederi gene families compared with three roundworms (A. suum, P. univalens and T. canis). (a) Enrichment of KEGG pathways in some significantly expanded gene families of B. schroederi. The proportion represents the ratio of the number of expanded genes located in the pathway (target genes) to all genes in the pathway (background genes). (b) REVIGO clusters of significantly overrepresented GO terms for significantly expanded gene families in B. schroederi. The position of the bubbles is based on the semantic similarity of GO terms. (c) Enrichment of KEGG pathways in B. schroederi's unique gene families. (d) Heatmap showing the gene families of B. schroederi that are significantly expanded or contracted (p < .01). The x‐axis represents the four roundworms of Ascariasis, whereas the y‐axis represents the families
FIGURE 4. Life history of B. schroederi and the effect of actin genes on muscle contraction. (a) Life history of B. schroederi. L1 and L2 represent in vitro developmental stages, and L2 larvae enter the host body after developing into the infective stage. L3 and L4 represent the stages in which the larvae migrate through internal organs. Stage L5 larvae return to the small intestine and develop into adult worms through sexual maturation. (b) Schematic diagram of an anatomical cross‐section of B. schroederi. (c) Multiple signaling pathways are involved in actin polymerization; genes in red are positively selected genes (PSGs). The diagram shows significant expansion of three key gene families involved in actin polymerization
FIGURE 5. Demographic history of B. schroederi reconstructed from the reference and population resequencing genomes. (a) The red and purple lines represent the estimated effective population size of B. schroederi and its host, respectively. The 100 grey curves for B. schroederi and the host represent the PSMC estimates for 100 sequences randomly resampled from the original sequence. The generation times (g) of B. schroederi and the giant panda were 0.17 and 12 years, respectively. The neutral mutation rates per generation (µ) of B. schroederi and the giant panda were 0.9 × 10⁻⁸ and 1.3 × 10⁻⁸, respectively. The black line shows the mass accumulation rate (MAR) of Chinese loess. (b) Longitudinal change in the effective population size of the B. schroederi populations. The effective population sizes (Ne) were estimated using the MSMC2 method. QLI, Qinling population; SC, Sichuan population
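The generation time (g) and mutation rate (µ) in the Figure 5 caption are the two quantities needed to turn raw PSMC output into years and effective population sizes. The sketch below follows the standard PSMC scaling convention (N0 = θ0 / (4µs), time = 2·N0·t_k generations, Ne = λ_k·N0, with bin size s = 100 bp). It is an assumption that the authors scaled their curves exactly this way, and the θ0, t_k and λ_k values used here are hypothetical.

```python
def scale_psmc(t_k, lambda_k, theta0, mu, gen_time, bin_size=100):
    """Convert one PSMC interval to (years before present, Ne).

    t_k      : scaled time of the interval, in units of 2*N0 generations
    lambda_k : relative population size of the interval
    theta0   : scaled mutation rate per bin reported by PSMC
    mu       : neutral mutation rate per site per generation
    gen_time : generation time in years
    bin_size : bases per PSMC bin (100 is the usual default)
    """
    n0 = theta0 / (4.0 * mu * bin_size)   # reference population size
    years = 2.0 * n0 * t_k * gen_time     # generations -> years
    ne = lambda_k * n0                    # absolute effective size
    return years, ne


# g and mu for B. schroederi are taken from the caption;
# theta0, t_k and lambda_k are made-up illustration values.
years, ne = scale_psmc(t_k=1.0, lambda_k=2.0, theta0=0.036,
                       mu=0.9e-8, gen_time=0.17)
print(round(years), round(ne))
```

Because the parasite and host differ so much in generation time (0.17 versus 12 years), the same scaled time t_k maps to very different calendar ages in the two curves.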
FIGURE 6. Population structure and relationships of the Sichuan (SC) population in comparison with the Qinling (QLI) population. (a) The geographic distribution of the sampling locations for the QLI and SC populations. (b) Principal component analysis (PCA) of the two populations. (c) A maximum likelihood (ML) phylogenetic tree with 100 bootstrap replicates constructed using whole‐genome SNP information. We used P. univalens as the outgroup. (d) Population structure of the SC and QLI populations (K from 2 to 5). The y‐axis quantifies the proportion of each individual's genome derived from the inferred ancestral populations, and the x‐axis shows the different individuals
FIGURE 7. Analysis of natural selection in captive populations. (a) Genomic regions with selective sweep signals in the captive (SC) and wild (QLI) B. schroederi populations. Distribution of the ln ratio (θπ,wild(QLI)/θπ,captive(SC)) and FST for 50‐kb windows with 10‐kb steps. Red dots represent windows that meet the criteria for selected regions (corresponding to Z test p < .005, where FST > 0.21 and ln ratio > 0.34). (b) Plot of iHS showing loci under positive selection in the captive (SC) population. SNPs with |iHS| ≥ iHSm (3.89, top 1%) are shown above the dashed horizontal line. Nucleotide diversity around the glc‐1, nrf‐6, ABC transporter ced‐7 and bre‐4 loci, calculated in 10‐kb sliding windows, is displayed above the genes. The decay of haplotype homozygosity around a focal marker is displayed on the right side of the figure. The furcation structures represent the complete information contained in the concept of extended haplotype homozygosity (EHH) (Sabeti et al., 2002). The root (focal marker) is indicated by a vertical dashed line. The thickness of the lines corresponds to the number of scaffolds sharing a haplotype. (c) XP‐EHH from each SNP core showing the same nucleotide between the subject and the comparison target, also transformed to p‐values and plotted on a logarithmic scale
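The window scan in panel (a) of Figure 7 can be pictured as two steps: tile each scaffold with 50‐kb windows advanced in 10‐kb steps, then keep windows whose FST and ln(θπ,wild/θπ,captive) both exceed the stated cutoffs. The sketch below is a simplified illustration, not the authors' pipeline: it applies the caption's cutoffs (0.21 and 0.34) as fixed thresholds rather than deriving them from a Z test, and the function names and diversity values are hypothetical.

```python
import math


def sliding_windows(seq_length, window=50_000, step=10_000):
    """Yield (start, end) for full-size windows along one scaffold."""
    last_start = max(seq_length - window, 0)
    for start in range(0, last_start + 1, step):
        yield start, min(start + window, seq_length)


def is_candidate(pi_wild, pi_captive, fst, fst_min=0.21, ln_ratio_min=0.34):
    """Flag a window as a sweep candidate in the captive population.

    A window qualifies when differentiation is high (FST above fst_min)
    and diversity is reduced in the captive sample relative to the wild
    one (ln(pi_wild / pi_captive) above ln_ratio_min).
    """
    if pi_wild <= 0 or pi_captive <= 0:
        return False  # the log ratio is undefined without diversity in both
    return fst > fst_min and math.log(pi_wild / pi_captive) > ln_ratio_min


# Hypothetical 100-kb scaffold and per-window statistics.
windows = list(sliding_windows(100_000))
print(len(windows), windows[0], windows[-1])
print(is_candidate(pi_wild=0.002, pi_captive=0.001, fst=0.30))
```

Overlapping windows smooth the signal along the scaffold, so adjacent candidate windows are usually merged into one region before genes are looked up.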
FIGURE 8. The position of known and potential drug target genes on superscaffolds. Different colours indicate different known drugs, and black indicates potential drug targets. The chemical structural formulas of 23 known drugs are drawn. The circles following the potential drug targets represent the six criteria, with a solid red circle indicating that a criterion is met and a hollow circle indicating that it is not. The six criteria were: (1) Similarity to known ChEMBL drug targets with a highly conserved alignment (>80%). (2) Lack of human homologues. (3) Association with a lethal, L3 arrest, flaccid, molt defect or sterile phenotype. (4) A predicted metabolic chokepoint. (5) A predicted excretory/secretory protein (EP). (6) The protein has a structure in the PDBe. Genes encoding potential drug target proteins on each superscaffold and their corresponding scores are marked (black)
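The six criteria listed in the Figure 8 caption lend themselves to a simple checklist score per candidate protein. The sketch below assumes one point per criterion met. The caption does not state how the authors weighted the criteria, so this scoring rule, the field names and the example protein are all hypothetical.

```python
# One key per criterion in the caption; the names are illustrative.
CRITERIA = (
    "chembl_similarity",     # (1) >80% conserved alignment to a ChEMBL target
    "no_human_homologue",    # (2) lacks human homologues
    "essential_phenotype",   # (3) lethal, L3 arrest, flaccid, molt defect or sterile
    "chokepoint",            # (4) predicted metabolic chokepoint
    "excretory_secretory",   # (5) predicted excretory/secretory protein
    "pdbe_structure",        # (6) structure available in the PDBe
)


def score_target(flags):
    """Count how many of the six criteria a candidate satisfies.

    `flags` maps criterion names to booleans; missing keys count as
    not satisfied. Equal weighting is an assumption of this sketch.
    """
    return sum(1 for name in CRITERIA if flags.get(name, False))


# Hypothetical candidate protein.
candidate = {
    "chembl_similarity": True,
    "no_human_homologue": True,
    "chokepoint": True,
}
print(score_target(candidate))
```

Ranking candidates by such a score gives a quick shortlist, although in practice some criteria (for example, absence of a human homologue) may deserve more weight than others.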