Xiaolei Ding1,2,3, Yunfei Guo3,4, Jianren Ye1,2, Xiaoqin Wu1,2, Sixi Lin1,2, Fengmao Chen1,2, Lihua Zhu1,2, Lin Huang1,2, Xiaofeng Song5, Yi Zhang1,2, Ling Dai1,2, Xiaotong Xi1,2, Jinsi Huang1,2, Kai Wang3,4,6,7, Ben Fan1,2, De-Wei Li8.
Abstract
BACKGROUND: Bursaphelenchus xylophilus, the pinewood nematode, kills millions of pine trees worldwide every year, and causes enormous economic and ecological losses. Despite extensive research on population variation, there is little understanding of the population-wide variation spectrum in China.
Keywords: Bursaphelenchus xylophilus; SNPs; chromosome-level assembly; epidemic tracking; population structure
Year: 2021 PMID: 34839581 PMCID: PMC9300093 DOI: 10.1002/ps.6738
Source DB: PubMed Journal: Pest Manag Sci ISSN: 1526-498X Impact factor: 4.462
Figure 1. Summary of genome assembly and annotation results for AH1 and Japan (JP). (a) De novo BioNano assembly and chromosome contact matrix of the AH1 assembly. The scaffold number is shown in the top left corner. The BioNano scaffolding assembly is in green, while PacBio‐based contigs are in blue. (b) Dot‐plot of the AH1 assembly against JP. (c) Gene length distribution of the two assemblies. (d) Gene ontology enrichment of Bursaphelenchus xylophilus novel genes on AH1 (coverage <50%, identity >85%).
Comparison between the AH1 and Japan (JP) de novo assemblies on Bursaphelenchus xylophilus
| Strain | Ka4C1* (JP) | AMA3 (AH1) |
|---|---|---|
| Accession | GCA_000231135.1 | N/A |
| Sequencing platform | Illumina | PacBio SMRT |
| Scaffolding platform | N/A | BioNano Irys Chip, Hi‐C |
| PacBio long‐read mapping rate | 77.26% | 81.16% |
| Size (bp) | 74 561 461 | 77 273 647 |
| Number of contigs | 10 432 | 129 |
| Longest contig (bp) | 200 348 | 11 860 753 |
| Contig N50 (bp) | 18 150 | 5 825 809 |
| Median contig length (bp) | 2459 | 33 909 |
| Number of CHR | N/A | 6 |
| Longest CHR (bp) | N/A | 12 450 865 |
| CHR N50 (bp) | N/A | 12 382 933 |
| Consensus accuracy | N/A | 99.16% |
| Complete BUSCOs | 727 (74.03%) | 770 (78.41%) |
| Alignments per long read | 3.16 | 1.37 |
| Exons | 80 655 | 79 734 |
| Genes | 12 568 | 12 197 |
| Gene density (genes/Mb) | 150.33 | 158.19 |
| Average gene size (bp) | 1451.13 | 1453.79 |
| Average exon size (bp) | 225.12 | 222.39 |
Contigs downloaded from NCBI; scaffolds downloaded from WormBase.
Measured against JP.
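The contig N50 and CHR N50 rows in the table above use the standard N50 statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly. A minimal sketch of that calculation follows. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the sequence that crosses that point is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy contig lengths (hypothetical, not from either assembly); total = 54.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Applied to the real contig lengths of each assembly, the same function would be expected to reproduce the N50 rows of the table, provided the lengths are counted in base pairs.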
Figure 2. Evaluations of AH1 and Japan (JP) assembly based on the same RNA‐Seq data. (a) Transcript length distributions generated from RNA‐Seq data using AH1 and JP assemblies. (b) RNA‐Seq based sequence similarity analysis between AH1 (query) and JP (reference) transcripts using BLAST. The x axis indicates query sequence coverage rate. (c) PCR validation of 10 novel transcripts identified in the AH1 assembly only. Lane M, DNA 1000 marker; lanes 1–10 indicate 10 novel transcripts, see Table S6 for details. (d) PCR validation of six exon‐skipping events identified in the AH1 assembly. Lane M, DNA 1000 marker; lanes 1–6 indicate six exon‐skipping events, see Table S7 for details.
Figure 3. Overview of all SNPs found in 181 Bursaphelenchus xylophilus strains. (a) Distribution of minor allele frequencies. (b) Locations of all SNPs in genes. (c) Homozygosity, missing SNP, SNP count, transition and transversion distributions of the SNPs found in 181 strains. (d) Box plots of SNP genotypes in 181 strains. (e) SNP counts distribution among 181 strains.
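Panel (a) of Figure 3 reports minor allele frequencies. As a rough illustration of what that quantity is, the sketch below computes it for one SNP from per-sample genotype calls. The diploid 0/1/2 alternate-allele coding, the use of `None` for missing calls, and the function name are assumptions made for this example; the source does not describe how the authors encoded genotypes.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one biallelic SNP.

    `genotypes` holds the alternate-allele count per diploid sample
    (0, 1 or 2), with None marking a missing call (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this SNP")
    # Each diploid sample contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever allele is rarer.
    return min(alt_freq, 1 - alt_freq)


# Hypothetical calls for five samples, one of them missing.
print(minor_allele_frequency([0, 1, 2, None, 0]))
```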
General statistics of identified Bursaphelenchus xylophilus SNPs from different localities
| Localities | Max. SNPs | Min. SNPs | Mean # of SNPs | Mean # of homozygotes | Mean # of missing SNPs | Mean # of private SNPs | Mean # of transitions | Mean # of transversions |
|---|---|---|---|---|---|---|---|---|
| All | 3 582 068 | 3207 | 455 754 | 266 880 | 542 777 | 20 852 | 254 579 | 205 283 |
| AH | 366 651 | 60 374 | 235 453 | 132 461 | 514 392 | 168 | 123 758 | 114 527 |
| CQ | 721 468 | 80 982 | 310 675 | 99 959 | 254 007 | 285 | 169 054 | 146 460 |
| FJ | 482 391 | 45 248 | 186 645 | 101 044 | 538 885 | 353 | 99 889 | 89 740 |
| GD | 1 635 962 | 78 245 | 1 079 572 | 748 667 | 558 571 | 2748 | 619 309 | 465 212 |
| GX | 1 057 731 | 67 137 | 562 434 | 40 250 | 526 319 | 512 | 313 534 | 252 513 |
| GZ | 173 487 | 53 962 | 105 157 | 53 743 | 575 461 | 97 | 58 272 | 47 602 |
| HB | 457 945 | 123 881 | 266 438 | 137 711 | 474 623 | 418 | 140 753 | 130 031 |
| HEN | 426 875 | 264 140 | 363 367 | 215 066 | 315 153 | 383 | 190 737 | 180 544 |
| HN | 1 104 349 | 59 855 | 435 275 | 274 241 | 553 499 | 440 | 240 784 | 197 461 |
| JS | 342 897 | 29 015 | 166 309 | 117 667 | 691 097 | 211 | 86 225 | 81 611 |
| JX | 374 490 | 61 856 | 175 007 | 112 946 | 583 442 | 112 | 93 274 | 83 121 |
| SC | 3 101 663 | 71 990 | 558 622 | 385 098 | 684 350 | 189 559 | 319 074 | 247 130 |
| SD | 443 330 | 50 149 | 224 630 | 77 558 | 625 420 | 293 | 116 049 | 113 535 |
| SX | 393 143 | 43 939 | 277 210 | 148 247 | 355 590 | 249 | 148 906 | 131 058 |
| USA | 3 582 068 | 3 582 068 | 3 582 068 | 1 095 820 | 249 350 | 1 290 757 | 2 106 722 | 1 527 444 |
| YN | 292 635 | 292 635 | 292 635 | 213 359 | 363 485 | 32 | 155 039 | 140 586 |
| ZJ | 1 224 018 | 71 775 | 417 282 | 190 182 | 545 389 | 3140 | 231 354 | 189 767 |
Province acronyms denote the B. xylophilus sampling areas, e.g., AH for Anhui, CQ for Chongqing, FJ for Fujian, GD for Guangdong, GX for Guangxi, GZ for Guizhou, HB for Hubei, HEN for Henan, HN for Hunan, JS for Jiangsu, SC for Sichuan, SD for Shandong, SX for Shaanxi, USA for the United States, YN for Yunnan, ZJ for Zhejiang.
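The last two columns of the table above can be combined into a transition/transversion (Ts/Tv) ratio per locality. The sketch below does this for three rows whose values are copied from the table. The ratio is not reported in the source, so this is only an illustration of how the columns relate, and the helper name `ts_tv_ratio` is an assumption.

```python
# Mean transition and transversion counts copied from the table above.
mean_counts = {
    "All": (254_579, 205_283),
    "GD": (619_309, 465_212),
    "USA": (2_106_722, 1_527_444),
}


def ts_tv_ratio(transitions, transversions):
    """Return the transition/transversion ratio for one locality."""
    if transversions == 0:
        raise ValueError("transversion count must be non-zero")
    return transitions / transversions


ratios = {loc: ts_tv_ratio(*counts) for loc, counts in mean_counts.items()}
for loc, ratio in ratios.items():
    print(f"{loc}: Ts/Tv = {ratio:.2f}")
```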
Figure 4. Population structures of Bursaphelenchus xylophilus in China. (a) Geographical locations (green dots) of B. xylophilus strains collected in China. Different temperature zones are highlighted with gradient colors and the arrows represent possible migration events. (b) PCA results of 181 strains based on 3137 SNP markers. (c) Hierarchical clustering results of all 181 strains. (d) Population splitting tree of 181 strains based on sampling provinces. Samples from the same temperature zones are shown in circles of different colors. (e) Introgression analysis revealed possible B. xylophilus migration routes in China.
Figure 5. Genome‐wide identification of geographically associated SNPs and experimental validations. (a) Manhattan plot for SNPs highly associated with different population structures. Blue lines indicate the genome‐wide suggestive line = −log10(1e‐05); red lines indicate the genome‐wide significance line = −log10(1.6e‐09); green boxes indicate SNPs on GPCR genes. (b) Hierarchical tree graphs of over‐represented GO terms for genes affected by geographically associated SNPs. Boxes in the graphs represent GO terms and significant terms (adjusted P value ≤0.05) are highlighted. (c) Screenshots for pyro‐sequencing validation of the candidate SNP (Contig007:3426052) in Pop1. (d) Screenshots for pyro‐sequencing validation of the candidate SNP (Contig001:790035) in Pop4.
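The two horizontal lines in the Manhattan plot sit at −log10 of the thresholds quoted in the caption. A small sketch of that conversion, and of how a P value would fall relative to the two lines, is given below. The threshold values come from the caption, while the function names and the three-way labels are assumptions made for illustration.

```python
import math

SUGGESTIVE_P = 1e-05     # suggestive threshold quoted in the caption
SIGNIFICANT_P = 1.6e-09  # genome-wide significance threshold from the caption


def neg_log10(p_value):
    """Convert a P value to the -log10 scale used on a Manhattan plot."""
    if not 0 < p_value <= 1:
        raise ValueError("P value must be in (0, 1]")
    return -math.log10(p_value)


def classify(p_value):
    """Label a P value relative to the two lines (labels are illustrative)."""
    if p_value <= SIGNIFICANT_P:
        return "significant"
    if p_value <= SUGGESTIVE_P:
        return "suggestive"
    return "not significant"


print(f"suggestive line:  {neg_log10(SUGGESTIVE_P):.2f}")
print(f"significant line: {neg_log10(SIGNIFICANT_P):.2f}")
```

On this scale the suggestive line lies at 5 and the significance line at roughly 8.8, so a SNP must plot above the upper line to count as genome-wide significant.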
Figure 6. Distribution of distances of true vs predicted coordinates in 240 iterations of bootstrapping.
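The caption does not state how the distance between true and predicted coordinates was measured. One common choice for latitude/longitude pairs is the great-circle (haversine) distance, sketched below purely as an assumption. The function name, the Earth radius constant, and the example coordinates are illustrative and do not come from the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Hypothetical true and predicted sampling coordinates (lat, lon).
true_site = (32.0, 118.8)
predicted_site = (31.2, 121.5)
print(f"{haversine_km(*true_site, *predicted_site):.1f} km")
```

Collecting this distance over each bootstrap iteration would give a distribution of the kind the figure summarizes, under the stated assumption about the metric.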