Minique Hilda de Castro, Daniel de Klerk, Ronel Pienaar, D Jasper G Rees, Ben J Mans.
Abstract
BACKGROUND: Ticks secrete a diverse mixture of secretory proteins into the host to evade its immune response and facilitate blood-feeding, making secretory proteins attractive targets for the production of recombinant anti-tick vaccines. The largely neglected tick species, Rhipicephalus zambeziensis, is an efficient vector of Theileria parva in southern Africa but its available sequence information is limited. Next generation sequencing has advanced sequence availability for ticks in recent years and has assisted the characterisation of secretory proteins. This study focused on the de novo assembly and annotation of the salivary gland transcriptome of R. zambeziensis and the temporal expression of secretory protein transcripts in female and male ticks, before the onset of feeding and during early and late feeding.
Keywords: De novo transcriptome assembly; Differential expression; Next generation sequencing; Rhipicephalus zambeziensis; Secretory proteins; Sialotranscriptomics; Tick salivary glands
Year: 2017 PMID: 28797301 PMCID: PMC5553602 DOI: 10.1186/s13071-017-2312-4
Source DB: PubMed Journal: Parasit Vectors ISSN: 1756-3305 Impact factor: 3.876
Summary of the R. zambeziensis transcriptome assembly and annotation statistics
| Transcriptome statistics | Value<sup>a</sup> |
|---|---|
| Transcriptome assembly statistics | |
| Total number of transcripts | 23,631 |
| Number of transcripts > 500 bp | 19,903 |
| Number of transcripts > 1 kb | 13,330 |
| Number of transcripts > 10 kb | 80 |
| Shortest transcript length (bp) | 201 |
| Longest transcript length (bp) | 17,108 |
| Mean length of transcripts (bp) | 1793.5 |
| Median length of transcripts (bp) | 1193 |
| Transcript N50 (bp) | 2807 |
| Total bases in assembly (Mb) | 42.4 |
| Ambiguous base calls (Ns) | 0 |
| GC content (%) | 49 |
| Number of non-redundant predicted proteins | 13,584 |
| Transcriptome annotation statistics<sup>b</sup> | |
| BLASTx against NR | 12,756 (54.0%) |
| BLASTx against UniProtKB/TrEMBL | 16,451 (69.6%) |
| BLASTx against UniProtKB/Swiss-Prot | 10,572 (44.7%) |
| BLASTx against TSA-NR | 16,711 (70.7%) |
| BLASTx against | 11,804 (50.0%) |
| BLASTx against EuKaryotic Orthologous Groups (KOG) | 9620 (40.7%) |
| BLASTx against AcariDB | 18,245 (77.2%) |
| Assigned with Gene Ontology (GO) terms<sup>c</sup> | 11,360 (48.1%) |
| Assigned with Enzyme Commission (EC) numbers<sup>c</sup> | 3493 (14.8%) |
| Assigned with KEGG orthology (KO) identifiers<sup>d</sup> | 4869 (20.6%) |
| Annotated in at least one database | 18,311 (77.5%) |
<sup>a</sup> Value indicating either the number of transcripts, proteins or bases, the transcript length or percentage, as indicated in the table
<sup>b</sup> Number (and %) of transcripts annotated based on significant matches (E-value < E-05) against databases as detailed in the Methods section
<sup>c</sup> GO terms and EC numbers assigned with Blast2GO
<sup>d</sup> From the KEGG (Kyoto Encyclopedia of Genes and Genomes) Automatic Annotation Server (KAAS) using the I. scapularis genome
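Two of the statistics in the table above follow directly from simple definitions, and a short sketch may help when reading them. The code below is only an illustration, not the pipeline used in the study: the function names `n50` and `percent_annotated` and the toy length list are hypothetical. The N50 is taken here in its usual sense (the length of the transcript at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly), and the annotation percentages are assumed to be computed against the 23,631 assembled transcripts.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript lengths (bp)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first transcript that pushes the running total to at least
        # half of all bases defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def percent_annotated(hits, total_transcripts):
    """Percentage of transcripts with a significant database match."""
    return round(100 * hits / total_transcripts, 1)


# Hypothetical toy assembly, not data from the study.
print(n50([1000, 2000, 3000, 4000]))      # 3000

# Counts taken from the table: NR hits and total transcripts.
print(percent_annotated(12756, 23631))    # 54.0
```

With the table's own counts, the same calculation gives 77.5% for the 18,311 transcripts annotated in at least one database.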
Fig. 1 Comparisons of the predicted R. zambeziensis proteins to proteins of two closely related Rhipicephalus species. (a) Length distribution of the predicted proteins of R. zambeziensis, R. appendiculatus and R. pulchellus. The number of proteins is indicated based on a protein length sliding window of 20 amino acids (aa), showing a maximum length of 1000 aa. (b) Pfam domain comparison of the three Rhipicephalus species. Datasets used: 13,584 predicted proteins from the assembled R. zambeziensis transcriptome, 12,761 R. appendiculatus proteins [37] and 11,227 R. pulchellus proteins [34]. Blue represents R. zambeziensis; red, R. appendiculatus; and green, R. pulchellus.
Fig. 2 Classification and expression analysis in the R. zambeziensis transcriptome. (a) Proportions of predicted protein-coding (dark blue) and predicted non-protein-coding (light blue) transcripts and their contribution to total expression. (b) Proportions of protein numbers (blue) and expression contribution (red) of the four predicted protein classes to the total protein-coding fraction of the transcriptome. Expression was estimated by transcripts per million (TPM).
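For readers unfamiliar with the TPM unit used throughout, the following is a minimal sketch of the standard definition: read counts are first divided by transcript length, and the resulting rates are rescaled so that they sum to one million. The function name `tpm` and the toy counts and lengths are hypothetical, and the record does not state which quantification tool produced the values reported here.

```python
def tpm(counts, lengths):
    """Convert per-transcript read counts to transcripts per million.

    counts  -- reads assigned to each transcript
    lengths -- transcript lengths in bp, in the same order
    """
    # Length-normalised rate: reads per base for each transcript.
    rates = [count / length for count, length in zip(counts, lengths)]
    total_rate = sum(rates)
    if total_rate == 0:
        raise ValueError("at least one transcript must have reads")
    # Rescale so that the values across all transcripts sum to 1e6.
    return [rate / total_rate * 1e6 for rate in rates]


# Hypothetical toy example: the third transcript has three times the reads
# of the first but is three times longer, so both get the same TPM.
values = tpm(counts=[100, 200, 300], lengths=[1000, 1000, 3000])
print(values)
```

Because TPM values always sum to one million within a sample, the expression proportions shown in the figure can be read as shares of that fixed total.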
Characterisation of the secretory protein families in the R. zambeziensis transcriptome
| Secretory protein family | Number of family members | Proportion of family members (%) | Average TPM<sup>a</sup> value | Sum of the family expression (TPM<sup>a</sup>) | Proportion of family expression (%) |
|---|---|---|---|---|---|
| Lipocalin | 588 | 22.90 | 39.88 | 23,452.37 | 5.62 |
| Bovine pancreatic trypsin inhibitor | 307 | 11.95 | 34.07 | 10,458.51 | 2.51 |
| Reprolysin | 213 | 8.29 | 19.36 | 4123.57 | 0.99 |
| TIL domain | 177 | 6.89 | 38.05 | 6735.14 | 1.61 |
| Glycine rich superfamily | 161 | 6.27 | 1842.35 | 296,618.84 | 71.07 |
| Basic tail secreted protein | 111 | 4.32 | 43.04 | 4777.46 | 1.14 |
| 8.9 kDa family | 91 | 3.54 | 74.27 | 6758.57 | 1.62 |
| Digestive system (including Serine proteases) | 87 | 3.39 | 9.57 | 832.96 | 0.20 |
| Mucin | 62 | 2.41 | 47.14 | 2922.86 | 0.70 |
| 28 kDa Metastriate family | 56 | 2.18 | 31.67 | 1773.45 | 0.42 |
| Evasin | 55 | 2.14 | 33.33 | 1832.92 | 0.44 |
| Secretory - unknown function | 46 | 1.79 | 455.62 | 20,958.56 | 5.02 |
| Folding, sorting and degradation (including Cathepsins) | 46 | 1.79 | 88.22 | 4058.19 | 0.97 |
| Cystatin | 45 | 1.75 | 17.27 | 777.09 | 0.19 |
| Gluzincin | 43 | 1.67 | 7.34 | 315.45 | 0.08 |
| Serpin | 33 | 1.29 | 6.30 | 207.96 | 0.05 |
| Ixodegrin B | 32 | 1.25 | 55.31 | 1770.02 | 0.42 |
| One of each family | 28 | 1.09 | 3.71 | 103.68 | 0.02 |
| Carboxypeptidase inhibitor | 27 | 1.05 | 16.01 | 432.37 | 0.10 |
| Chitin-binding proteins | 26 | 1.01 | 20.30 | 527.89 | 0.13 |
| 5′-Nucleotidase | 25 | 0.97 | 6.56 | 163.87 | 0.04 |
| Transport and catabolism | 24 | 0.93 | 35.23 | 845.47 | 0.20 |
| 24 kDa family | 21 | 0.82 | 24.99 | 524.87 | 0.13 |
| 7DB family | 19 | 0.74 | 11.34 | 215.53 | 0.05 |
| DA-P36 family | 19 | 0.74 | 10.15 | 192.90 | 0.05 |
| Defensin | 19 | 0.74 | 211.69 | 4022.20 | 0.96 |
| ML domain | 17 | 0.66 | 417.39 | 7095.58 | 1.70 |
| Antigen 5 family | 14 | 0.55 | 57.55 | 805.64 | 0.19 |
| Microplusin | 14 | 0.55 | 47.61 | 666.59 | 0.16 |
| 8 kDa Amblyomma family | 13 | 0.51 | 24.52 | 318.79 | 0.08 |
| Sphingomyelinase | 11 | 0.43 | 2.93 | 32.22 | 0.01 |
| Glycan biosynthesis and metabolism | 10 | 0.39 | 9.36 | 93.63 | 0.02 |
| Lipid metabolism | 10 | 0.39 | 5.88 | 58.76 | 0.01 |
| Transcription | 10 | 0.39 | 3.75 | 37.52 | 0.01 |
| Translation | 8 | 0.31 | 10.62 | 84.92 | 0.02 |
| Serine/threonine protein kinase | 8 | 0.31 | 6.59 | 52.70 | 0.01 |
| Carbohydrate metabolism | 8 | 0.31 | 3.53 | 28.21 | 0.01 |
| Thyropin | 7 | 0.27 | 22.67 | 158.66 | 0.04 |
| Fibrinogen-related domain | 7 | 0.27 | 20.84 | 145.85 | 0.03 |
| Glutathione metabolism | 7 | 0.27 | 16.63 | 116.41 | 0.03 |
| Metalloprotease | 7 | 0.27 | 8.28 | 57.93 | 0.01 |
| Dermacentor 9 kDa expansion | 6 | 0.23 | 11.81 | 70.86 | 0.02 |
| Replication and repair | 6 | 0.23 | 5.56 | 33.37 | 0.01 |
| Immunoglobulin G binding protein A | 5 | 0.19 | 989.88 | 4949.38 | 1.19 |
| Phospholipase A2 | 5 | 0.19 | 9.54 | 47.69 | 0.01 |
| Kazal/vWf domain | 4 | 0.16 | 4.78 | 19.11 | 0.005 |
| Hirudin | 3 | 0.12 | 108.72 | 326.17 | 0.08 |
| SALP15/Ixostatin | 3 | 0.12 | 35.45 | 106.35 | 0.03 |
| 14 kDa family | 3 | 0.12 | 9.75 | 29.25 | 0.01 |
| Kazal domain | 3 | 0.12 | 9.28 | 27.85 | 0.01 |
| Signal transduction | 3 | 0.12 | 2.23 | 6.68 | 0.002 |
| Madanin | 2 | 0.08 | 153.53 | 307.05 | 0.07 |
| Energy metabolism | 2 | 0.08 | 13.13 | 26.25 | 0.01 |
| CDIV | 2 | 0.08 | 9.44 | 18.88 | 0.005 |
| EF hand domain | 2 | 0.08 | 5.33 | 10.65 | 0.003 |
| Histamine release factor | 1 | 0.04 | 6057.33 | 6057.33 | 1.45 |
| Fatty acid-binding protein | 1 | 0.04 | 70.19 | 70.19 | 0.02 |
| Kazal/SPARC domain | 1 | 0.04 | 70.11 | 70.11 | 0.02 |
| Immune system | 1 | 0.04 | 21.85 | 21.85 | 0.01 |
| Cysteine rich hydrophobic domain 2 | 1 | 0.04 | 10.83 | 10.83 | 0.003 |
| 26 kDa family | 1 | 0.04 | 5.49 | 5.49 | 0.001 |
| Proline rich | 1 | 0.04 | 5.09 | 5.09 | 0.001 |
| Cysteine rich | 1 | 0.04 | 2.23 | 2.23 | 0.001 |
<sup>a</sup> TPM (transcripts per million) values were used to estimate expression
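The derived columns of the table above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the average TPM is the family's summed TPM divided by its number of members, and it introduces a hypothetical helper, `expression_to_member_ratio`, that compares a family's share of expression with its share of members. The input numbers are copied from three rows of the table.

```python
# (members, member share %, summed TPM, expression share %) from the table.
families = {
    "Lipocalin": (588, 22.90, 23452.37, 5.62),
    "Bovine pancreatic trypsin inhibitor": (307, 11.95, 10458.51, 2.51),
    "Glycine rich superfamily": (161, 6.27, 296618.84, 71.07),
}


def average_tpm(members, tpm_sum):
    """Mean expression per family member (assumed: sum / member count)."""
    return round(tpm_sum / members, 2)


def expression_to_member_ratio(member_share, expression_share):
    """Hypothetical summary: values above 1 mean the family contributes
    more to expression than its member count alone would suggest."""
    return round(expression_share / member_share, 2)


for name, (members, member_share, tpm_sum, expr_share) in families.items():
    print(name,
          average_tpm(members, tpm_sum),
          expression_to_member_ratio(member_share, expr_share))
```

Under these assumptions the averages reproduce the table (39.88, 34.07 and 1842.35). The ratio makes the contrast explicit: the glycine rich superfamily accounts for roughly eleven times more expression than its share of members, whereas the lipocalins, the largest family by member count, account for about a quarter of theirs.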
Fig. 3 Expression proportions of the R. zambeziensis secretory protein families during feeding. The proportions of the highest contributing secretory protein families of female (a) and male (b) ticks at different feeding time points are indicated. Expression levels were measured by transcripts per million (TPM). Colour key representing the protein families is indicated. Expression values can be found in Additional file 1: Table S3.
Fig. 4 Overview of differential expression in the R. zambeziensis sialotranscriptome. (a) Classification of differentially expressed transcripts into different protein or transcript classes. (b) Proportion of differential expression observed within each protein or transcript class. Differential expression analyses were performed using the edgeR software package (with the parameters: fixed dispersion = 0.4, fold change > 4 and FDR P < 0.01).
Fig. 5 Differentially expressed transcripts of secretory protein families in R. zambeziensis. The numbers of up- (red) and downregulated (green) secretory protein transcripts, after pairwise comparisons between different feeding time points, are represented. Pairwise comparisons are shown for female: (a) day 0 vs day 3, (c) day 3 vs day 5 and (e) day 0 vs day 5; and male ticks: (b) day 0 vs day 3, (d) day 3 vs day 5 and (f) day 0 vs day 5. The pairwise comparisons are represented as a progression of feeding and show how transcript expression changed from the earlier to the later time point. The edgeR software package (fixed dispersion = 0.4, fold change > 4 and FDR P < 0.01) was used for differential expression.
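The thresholds quoted in the differential expression captions (fold change > 4 and FDR P < 0.01) can be illustrated with a small filtering sketch. This is not edgeR and does not reproduce its statistical model or the fixed dispersion of 0.4; it only shows, under stated assumptions, how such cut-offs classify transcripts once fold changes and raw P values are available. The Benjamini-Hochberg procedure is assumed for the FDR, a fold change is treated as later/earlier time point, and all function names and input values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_differential(fold_changes, p_values, min_fold=4.0, max_fdr=0.01):
    """Label each transcript 'up', 'down' or 'ns' (not significant).

    fold_changes must be positive ratios (later / earlier time point).
    """
    threshold = math.log2(min_fold)
    calls = []
    for fold, fdr in zip(fold_changes, benjamini_hochberg(p_values)):
        log2_fold = math.log2(fold)
        if fdr < max_fdr and abs(log2_fold) > threshold:
            calls.append("up" if log2_fold > 0 else "down")
        else:
            calls.append("ns")
    return calls


# Hypothetical values, not data from the study.
print(call_differential(
    fold_changes=[8.0, 0.1, 3.0, 16.0],
    p_values=[0.0001, 0.0002, 0.00001, 0.5],
))  # ['up', 'down', 'ns', 'ns']
```

In this toy input the third transcript fails the fold change cut-off despite a very small P value, and the fourth fails the FDR cut-off despite a large fold change, which is why both criteria are applied together.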