Jin Luo1,2, Qiaoyun Ren1, Wenge Liu1, Xiangrui Li2, Mingxin Song3, Guiquan Guan1, Jianxun Luo1, Guangyuan Liu1.
Abstract
Ticks are important vectors that facilitate the transmission of a broad range of micropathogens to vertebrates, including humans. Because of their role in disease transmission, it has become increasingly important to identify and characterize the micropathogen profiles of tick populations. The objective of the present study was to survey the micropathogens of ticks by third-generation metagenomic sequencing using the PacBio Sequel platform. Approximately 46.481 Gbp of raw micropathogen sequence data were obtained from samples from four different regions of Heilongjiang Province, China. The clean consensus sequences were compared with host sequences and filtered at 90% similarity. Most of the identified genomes represent previously unsequenced strains. The draft genomes contain an average of 397,746 proteins predicted to be associated with micropathogens, over 30% of which do not have an adequate match in public databases. In these data, Anaplasma phagocytophilum and Coxiella burnetii were detected in all samples, while Borrelia burgdorferi was detected only in Ixodes persulcatus ticks from G1 samples. Viruses are a key component of micropathogen populations. In the present study, Simian foamy virus, Pustyn virus and Crimean-Congo haemorrhagic fever orthonairovirus were detected in different samples, and unknown viruses accounted for at least 10-30% of the viral community in every sample. Deep metagenomic shotgun sequencing has emerged as a powerful tool to investigate the composition and function of complex microbial communities. Thus, our dataset substantially improves the coverage of tick micropathogen genomes in public databases and represents a valuable resource for micropathogen discovery and for studies of tick-borne diseases.
Keywords: Metagenomic; Microbial communities; Micropathogens; Third-generation sequencing; Ticks
Year: 2021 PMID: 34258218 PMCID: PMC8253887 DOI: 10.1016/j.ijppaw.2021.06.003
Source DB: PubMed Journal: Int J Parasitol Parasites Wildl ISSN: 2213-2244 Impact factor: 2.674
Fig. 1. The genomic DNA library was generated according to the PacBio Sequel sample preparation instructions.
Sequencing data statistics for each sample.
| Samples | Raw Bases (Gbp) | Raw CCS | Low quality reads | Clean CCS | Host Removal | Bases (Mbp) | N50 Length (bp) |
|---|---|---|---|---|---|---|---|
| G1 | 14.876 | 525,604 | 142,425 | 383,179 | 348,745 | 494.500 | 1417 |
| G2 | 9.844 | 380,783 | 185,783 | 195,000 | 153,240 | 368.792 | 2406 |
| G3 | 10.798 | 457,749 | 246,077 | 211,672 | 145,837 | 359.253 | 2463 |
| G4 | 10.963 | 442,502 | 225,154 | 217,348 | 162,561 | 405.766 | 2496 |
| Average | 11.620 | 451,659 | 199,860 | 251,799 | 202,596 | 407.078 | 2196 |
Note: Samples: sample name. Raw Bases: total bases of the raw subreads (Gbp). Raw CCS: number of circular consensus sequences (CCS) obtained after consensus analysis and preliminary quality control of the subreads. Low quality reads: raw CCS discarded by further quality control (Raw CCS minus Clean CCS). Clean CCS: number of CCS remaining after a further series of quality controls on the raw CCS. Host Removal: number of clean CCS remaining after host-sequence filtering; this is the final sequence set used in subsequent analyses.
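The columns in the table above can be cross-checked with a few lines of Python. The sketch below is illustrative only: `n50` implements the standard N50 definition (the study's read lengths are not available here, so it is shown on toy values), and `retained_fraction` is a hypothetical helper that uses the read counts copied from the table.

```python
def n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Read counts copied from the table above, in the order:
# Raw CCS, Low quality reads, Clean CCS, Host Removal.
COUNTS = {
    "G1": (525_604, 142_425, 383_179, 348_745),
    "G2": (380_783, 185_783, 195_000, 153_240),
    "G3": (457_749, 246_077, 211_672, 145_837),
    "G4": (442_502, 225_154, 217_348, 162_561),
}


def retained_fraction(sample):
    """Fraction of raw CCS reads that survive quality control and host filtering."""
    raw, low_quality, clean, after_host = COUNTS[sample]
    # In this table, Clean CCS equals Raw CCS minus the low-quality reads.
    assert raw - low_quality == clean
    return after_host / raw


if __name__ == "__main__":
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # toy read lengths, not study data
    for name in COUNTS:
        print(name, f"{retained_fraction(name):.3f}")
```

Running the script prints one retained fraction per sample, which makes the difference between G1 and the other three samples easy to see.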
Oligonucleotides used as primers for PCR analysis of micropathogens.
| Micropathogens | Primer names | Primer sequence (5′–3′) | References |
|---|---|---|---|
| Simian foamy virus | SFV | 5′-CCTGGATGCAGAGTTGGATC-3′ | Reid |
| | SFV | 5′-CACGAATTTCCTGTAAAAAGA-3′ | |
| Crimean-Congo haemorrhagic fever orthonairovirus | CCHFVF | 5′-TGGACACCTTCACAAACTC-3′ | |
| | CCHFV536R | 5′-GACAAATTCCCTGCACCA-3′ | |
| Coxiella burnetii | CbUF | 5′-AAGGATCCAATTAACCGTTGTAGTT-3′ | |
| | CbUR1042 | 5′-CGGAATTCTCACTCTTTCCTATGTT-3′ | |
| Borrelia burgdorferi | BBUF | 5′-CACGACTTTCTTCGCCTTAAAGC-3′ | Maggi |
| | BBUR | 5′-GTTAAGCTCTTATTCGCTGATGGTA-3′ | |
| Anaplasma phagocytophilum | APH | 5′-ATGAATTACAGAGAATTGCTTGTAGG-3′ | |
| | APH | 5′-TTAATTGAAAGCAAATCTTGCTCCTATG-3′ | |
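As a quick sanity check on the primer table, basic properties such as length and GC content can be computed directly from the sequences. This is a minimal sketch, not part of the published method: the sequences are copied from the table with internal spaces removed, and `gc_fraction` and `reverse_complement` are illustrative helper names.

```python
# (primer name, sequence 5'->3') pairs copied from the table above.
PRIMERS = [
    ("SFV", "CCTGGATGCAGAGTTGGATC"),
    ("SFV", "CACGAATTTCCTGTAAAAAGA"),
    ("CCHFVF", "TGGACACCTTCACAAACTC"),
    ("CCHFV536R", "GACAAATTCCCTGCACCA"),
    ("CbUF", "AAGGATCCAATTAACCGTTGTAGTT"),
    ("CbUR1042", "CGGAATTCTCACTCTTTCCTATGTT"),
    ("BBUF", "CACGACTTTCTTCGCCTTAAAGC"),
    ("BBUR", "GTTAAGCTCTTATTCGCTGATGGTA"),
    ("APH", "ATGAATTACAGAGAATTGCTTGTAGG"),
    ("APH", "TTAATTGAAAGCAAATCTTGCTCCTATG"),
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _validated(seq):
    """Upper-case the sequence and reject anything other than A, C, G, T."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError(f"not a plain DNA sequence: {seq!r}")
    return seq


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = _validated(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement, e.g. to locate a reverse primer on the forward strand."""
    return _validated(seq).translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS:
        print(f"{name:10s} len={len(seq):2d} GC={gc_fraction(seq):.2f}")
```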
Fig. 2. Summary of the tag length distribution following the sequencing of genes from the four samples.
Gene prediction statistics of each sample.
| Samples | Total Number | Total Length (bp) | Max Length (bp) | Min Length (bp) |
|---|---|---|---|---|
| G1 | 513,897 | 56,853,886 | 4371 | 60 |
| G2 | 360,245 | 42,224,994 | 3726 | 60 |
| G3 | 333,896 | 36,617,219 | 4212 | 60 |
| G4 | 382,946 | 43,895,857 | 4851 | 60 |
| Average | 397,746 | 44,897,989 | 4290 | 60 |
Note: MetaGeneMark was used to predict genes directly from the CCS reads, without prior assembly, to avoid introducing assembly errors.
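The gene prediction table also allows a derived quantity that is not listed there: the mean predicted gene length per sample (total length divided by total number). The sketch below uses only values copied from the table; the helper names are illustrative.

```python
# (Total Number, Total Length in bp) per sample, copied from the table above.
GENES = {
    "G1": (513_897, 56_853_886),
    "G2": (360_245, 42_224_994),
    "G3": (333_896, 36_617_219),
    "G4": (382_946, 43_895_857),
}


def mean_gene_length(sample):
    """Mean predicted gene length in bp for one sample."""
    count, total_bp = GENES[sample]
    return total_bp / count


def column_mean(index):
    """Mean of one column across samples (0 = Total Number, 1 = Total Length)."""
    values = [row[index] for row in GENES.values()]
    return sum(values) / len(values)


if __name__ == "__main__":
    for name in GENES:
        print(name, f"{mean_gene_length(name):.1f} bp")
    # These two values can be compared with the "Average" row of the table.
    print(column_mean(0), column_mean(1))
```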
Functional classification of the eggNOG annotation results of the four samples.
| Functional Category | Description | Gene number, Group 1 (G1) | Gene number, Group 2 (G2) | Gene number, Group 3 (G3) | Gene number, Group 4 (G4) |
|---|---|---|---|---|---|
| A | RNA processing and modification | 24 | 24 | 15 | 21 |
| B | Chromatin structure and dynamics | 320 | 443 | 238 | 545 |
| C | Energy production and conversion | 964 | 2047 | 795 | 618 |
| D | Cell cycle control, cell division, chromosome partitioning | 208 | 377 | 175 | 163 |
| E | Amino acid transport and metabolism | 898 | 3288 | 930 | 643 |
| F | Nucleotide transport and metabolism | 536 | 975 | 343 | 409 |
| G | Carbohydrate transport and metabolism | 609 | 1815 | 615 | 450 |
| H | Coenzyme transport and metabolism | 691 | 1195 | 310 | 466 |
| I | Lipid transport and metabolism | 590 | 1251 | 510 | 444 |
| J | Translation, ribosomal structure and biogenesis | 1408 | 2082 | 829 | 949 |
| K | Transcription | 617 | 2047 | 701 | 462 |
| L | Replication, recombination and repair | 18547 | 14150 | 9432 | 16255 |
| M | Cell wall/membrane/envelope biogenesis | 751 | 2047 | 642 | 435 |
| N | Cell motility | 8 | 324 | 100 | 19 |
| O | Posttranslational modification, protein turnover, chaperones | 1158 | 1484 | 793 | 832 |
| P | Inorganic ion transport and metabolism | 336 | 1961 | 670 | 337 |
| Q | Secondary metabolites biosynthesis, transport and catabolism | 208 | 791 | 245 | 211 |
| R | General function prediction only | 0 | 0 | 0 | 0 |
| S | Function unknown | 42726 | 30115 | 21232 | 26982 |
| T | Signal transduction mechanisms | 407 | 1811 | 656 | 385 |
| U | Intracellular trafficking, secretion, and vesicular transport | 654 | 1051 | 501 | 538 |
| V | Defence mechanisms | 215 | 588 | 257 | 170 |
| W | Extracellular structures | 0 | 5 | 9 | 0 |
| Y | Nuclear structure | 0 | 1 | 0 | 0 |
| Z | Cytoskeleton | 115 | 69 | 87 | 88 |
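To compare samples, the category counts above can be converted into shares of each sample's column total. This is a minimal sketch under one assumption: the column total counts category assignments, which may differ from the number of annotated genes if a gene can fall into more than one category. `category_share` is an illustrative helper name.

```python
SAMPLES = ("G1", "G2", "G3", "G4")

# Gene numbers per functional category (G1, G2, G3, G4), copied from the table above.
EGGNOG = {
    "A": (24, 24, 15, 21),
    "B": (320, 443, 238, 545),
    "C": (964, 2047, 795, 618),
    "D": (208, 377, 175, 163),
    "E": (898, 3288, 930, 643),
    "F": (536, 975, 343, 409),
    "G": (609, 1815, 615, 450),
    "H": (691, 1195, 310, 466),
    "I": (590, 1251, 510, 444),
    "J": (1408, 2082, 829, 949),
    "K": (617, 2047, 701, 462),
    "L": (18547, 14150, 9432, 16255),
    "M": (751, 2047, 642, 435),
    "N": (8, 324, 100, 19),
    "O": (1158, 1484, 793, 832),
    "P": (336, 1961, 670, 337),
    "Q": (208, 791, 245, 211),
    "R": (0, 0, 0, 0),
    "S": (42726, 30115, 21232, 26982),
    "T": (407, 1811, 656, 385),
    "U": (654, 1051, 501, 538),
    "V": (215, 588, 257, 170),
    "W": (0, 5, 9, 0),
    "Y": (0, 1, 0, 0),
    "Z": (115, 69, 87, 88),
}


def category_share(category, sample):
    """Share of one category among all category assignments in a sample."""
    col = SAMPLES.index(sample)
    total = sum(row[col] for row in EGGNOG.values())
    return EGGNOG[category][col] / total


if __name__ == "__main__":
    for sample in SAMPLES:
        unknown = category_share("S", sample)  # S: Function unknown
        repair = category_share("L", sample)   # L: Replication, recombination and repair
        print(sample, f"S={unknown:.2f}", f"L={repair:.2f}")
```

Categories S (function unknown) and L (replication, recombination and repair) are printed because they hold the largest counts in every column of the table.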
Fig. 3. Venn diagram based on the eggNOG database. Note: The corresponding functional categories and non-supervised orthologous group (NOG) numbers were obtained from the eggNOG database.
Fig. 4. Venn diagram based on the KEGG Orthology database. Note: DIAMOND was used to compare gene sequences with the KEGG database, and the corresponding metabolic pathway information and KEGG orthology results of the genes were obtained from the KEGG database.
Fig. 5. Heat maps of community composition based on genera. Note: X-axis, template name; Y-axis, genus. The darker the blue colour is, the higher the enrichment of the genus in the sample.
Fig. 6. Bar chart of the relative abundances of genera.
Fig. 7. Population distribution of micropathogens in different samples. Note: “%” represents the proportion of micropathogen in the total community from each sample.
Fig. 8. Bar chart of the relative abundances of bacteria.
Fig. 9. Pie chart of the distribution of viral abundances in different samples. Note: “%” represents the proportion of the virus in the total community from each sample, and the different colours represent different viruses. A: The primary micropathogens analysed in G1. B: The primary micropathogens analysed in G2. C: The main micropathogens analysed in G3. D: The main micropathogens analysed in G4.
Fig. 10. Phylogenetic analysis of the isolated bacteria/viruses. Reference oligonucleotide sequences were selected by BLAST searches of the NCBI nt database. (A) Subtrees of the experimental sequences from the Borrelia burgdorferi 16S rRNA gene. (B) Subtrees of the experimental sequences from the Coxiella burnetii 16S rRNA gene. (C) Subtrees of the experimental sequences from the Anaplasma phagocytophilum MSP4 gene. (D) Subtrees of the experimental sequences from the Simian foamy virus pathogen. (E) Subtrees of the experimental sequences from the Crimean-Congo haemorrhagic fever orthonairovirus segment-S gene.