An insight into the salivary gland and fat body transcriptome of Panstrongylus lignarius (Hemiptera: Heteroptera), the main vector of Chagas disease in Peru.

Jessica C Nevoa, Maria T Mendes, Marcos V da Silva, Siomar C Soares, Carlo J F Oliveira, José M C Ribeiro.

Abstract

Triatomines are hematophagous arthropod vectors of Trypanosoma cruzi, the causative agent of Chagas disease. Panstrongylus lignarius, also known as Panstrongylus herreri, is considered one of the most versatile triatomines because it parasitizes different hosts, is found in different habitats and countries, displays sylvatic, peridomestic and domestic behavior, and is a very important vector of Chagas disease, especially in Peru. Molecules produced and secreted by the salivary glands and fat body are considered to have important adaptive value for triatomines because, among other functions, they subvert the host hemostatic, inflammatory and immune systems and detoxify or protect the insects against environmental aggressors. In this context, identifying the molecules produced by these tissues is highly valuable for understanding the ability of this species to adapt and transmit pathogens. Here, we use high-throughput sequencing techniques to assemble and describe the coding sequences resulting from the transcriptome of the fat body and salivary glands of P. lignarius. The final assembly of both transcriptomes together resulted in a total of 11,507 coding sequences (CDS), which were mapped from a total of 164,676,091 reads. The CDS were subdivided according to their 10-fold overexpression in the salivary glands (513 CDS) or the fat body (2,073 CDS). Among the protein families found in the salivary glands, lipocalins were the most abundant. Other protein families that are ubiquitous in other sialomes were also present in P. lignarius, including serine protease inhibitors, apyrase and antigen-5. The fat body-specific transcriptome showed proteins related to the metabolic function of this organ. Remarkably, nearly 20% of all reads mapped to transcripts coded by Triatoma virus. The data presented in this study improve the understanding of triatomine salivary gland and fat body function and reveal important molecules used in the interplay between vectors and vertebrate hosts.

Year:  2018        PMID: 29462134      PMCID: PMC5834209          DOI: 10.1371/journal.pntd.0006243

Source DB:  PubMed          Journal:  PLoS Negl Trop Dis        ISSN: 1935-2727


Introduction

Panstrongylus lignarius, also known as Panstrongylus herreri (WALKER, 1873) [1], is a triatomine species found in tropical and subtropical forests of South American countries including Peru, Ecuador, Colombia, Guyana, Suriname, Venezuela and Brazil [2, 3]. This species presents sylvatic behavior in the Amazon basin [4, 5] and peridomestic and domestic behavior in Peru [6]. Concerning its medical importance, this species is strongly synanthropic and is considered the major vector of Chagas disease in Peru [6, 7]. Among the triatomines of the genus Panstrongylus, P. lignarius is notable because, among other characteristics, it is capable of parasitizing different species of animals including marsupials, rabbits, spiny rats, anteaters, bats, chickens, toucans and pigeons [3]. In the Peruvian Amazon, a considerable proportion of the triatomines of this species are naturally infected with Trypanosoma cruzi (62.4%), and among those with identified food sources, 18.2% had fed on human blood [8]. Among the mechanisms related to triatomine adaptation, it has been suggested that saliva, which is inoculated during hematophagy, is crucial for the parasitism process and pathogen transmission. Indeed, the saliva of hematophagous arthropods, including triatomines, contains molecules that inhibit different host defense mechanisms, including platelet aggregation, inflammation, vasoconstriction, blood coagulation, and immune responses, and this inhibition has been demonstrated to facilitate hematophagy and the transmission of disease-causing agents [9]. In addition to saliva, molecules produced by the fat body of hematophagous arthropods have a substantial role in the detoxification of heme from blood, in developmental regulation, and in the production of antimicrobial peptides and immunity [10-12].
Beyond these functions, the fat body is a multifunctional organ that has a pivotal role in nutrient and energy storage, in the synthesis of biomolecules and in overall metabolism [13]. It stores energy reserves that are important for metamorphosis, egg maturation and reproduction, and for surviving long starvation periods. The fat body synthesizes and releases peptides, carbohydrates and lipids according to metabolic needs and hormonal regulation [14]. Proteomic studies have demonstrated that P. lignarius presents a large number of bioactive molecules, and that these molecules show high interspecific functional diversity when compared to those of Triatoma lecticularia and Rhodnius prolixus [15]. It has also recently been described that salivary molecules of P. lignarius, when compared to the saliva of triatomines of the genera Triatoma, Meccus and Rhodnius, have a remarkable ability to modulate dendritic cells and facilitate their invasion by T. cruzi [16]. The isolation and characterization of bioactive molecules in different tissues of blood-feeding insects have grown significantly in recent years, mainly due to high-throughput sequencing techniques associated with bioinformatic tools. Database searches reveal genomes and sialomes of hematophagous arthropods such as ticks, mosquitoes and triatomines [17-24]. Here, we use high-throughput sequencing techniques to assemble and describe the coding sequences derived from a transcriptome of the salivary glands and fat body of P. lignarius.

Material and methods

Ethics statement

The experiments were approved by the Institutional Animal Care and Use Committee—CEUA (protocol numbers 220 and 320).

Insects

P. lignarius was obtained from the insectary of the Universidade Federal do Triângulo Mineiro, Uberaba, Minas Gerais, Brazil. The colonies were maintained in cylindrical containers and fed weekly on chickens. Fed adults, 7 females and 7 males, were used to collect salivary glands (SG) and fat body (FB). One couple was dissected every other day for 14 days. The SG and FB were stored in 200 μl and 400 μl of RNAlater (Qiagen, Valencia, CA), respectively, at 4°C for 48 hours and then maintained at -80°C until the day of shipping. The samples from the 14 days were pooled and either used for qRT-PCR or sent lyophilized to the NIH Intramural Sequencing Center (5625 Fishers Lane, Rockville, MD 20852).

Sequencing

All procedures, including RNA extraction, library construction and sequencing, were performed as previously described [23], with modifications. Briefly, RNA from each sample was isolated using the Micro FastTrack-mRNA isolation kit (Invitrogen, Grand Island, NY) according to the manufacturer’s protocol. Following isolation, RNA integrity was checked using the Bioanalyzer instrument (Agilent Technologies, Santa Clara, CA). The construction of mRNA libraries and the sequencing were done at the NIH Intramural Sequencing Center. The cDNA was fragmented using a Covaris E210 (Covaris, Woburn, MA), and the SG and FB libraries were constructed separately using the TruSeq RNA sample prep kit, v. 2 (Illumina Inc., San Diego, CA). Both libraries were amplified using eight cycles to minimize the risk of over-amplification. Sequencing of the SG and FB libraries was performed on a HiSeq 2000 (Illumina) with v. 3 flow cells and sequencing reagents. A paired-end protocol was used.

Bioinformatics

Raw data were processed using RTA 1.12.4.2 and CASAVA 1.8.2. The reads were trimmed of low-quality regions, and only those with an average Illumina quality of 20 or more were used. Afterwards, they were assembled using the ABySS software (Genome Sciences Centre, Vancouver, BC, Canada) [25, 26]. The SOAPdenovo-Trans assembler [27] was also used because ABySS may misassemble highly expressed transcripts. The assemblies were then joined using BLAST and the Cap3 assembler [28]. All coding sequences (CDS) from SG and FB were selected based on similarity to known proteins or on the presence of a signal peptide, using an automated pipeline [29]. The CDS and their respective protein sequences were placed in a hyperlinked Excel spreadsheet [30]. Software from the Center for Biological Sequence Analysis (Technical University of Denmark, Lyngby, Denmark) was used to predict signal peptides, transmembrane domains, furin cleavage sites, and mucin-type glycosylation [29, 31–33]. Blastn [34] with a word size of 25 was used to map the reads to the contigs. The resulting contigs and RPKM values were also mapped to the Excel spreadsheets available as supplemental S1 and S2 Spreadsheets. Differential expression of the reads mapping to contigs between the two libraries was tested using the χ2 test. Relative expression of the transcripts within each transcriptome was evaluated using the “expression index”, which is the number of reads mapped to a particular CDS divided by the largest number of reads mapped to a single CDS. The automated annotation of the proteins was based on matches to various databases, including Gene Ontology, Pfam, Swissprot, KOG, SMART, Refseq-invertebrates and Hemiptera[organism] protein sequences obtained from GenBank. The manual annotation was performed as detailed in [28].
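The read-count measures named in this section can be illustrated with a minimal Python sketch. It assumes the standard RPKM definition, the expression index as defined in the text, and a 2x2 Pearson χ2 test comparing reads mapped to one CDS against all other reads in each library. The exact contingency table used by the pipeline is not stated, so that layout is an assumption, and all counts below are hypothetical.

```python
def rpkm(reads: int, cds_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of CDS per million mapped reads (standard definition)."""
    return reads * 1e9 / (cds_length_bp * total_mapped_reads)


def expression_index(read_counts: dict[str, int]) -> dict[str, float]:
    """Reads of each CDS divided by the largest read count of any single CDS."""
    top = max(read_counts.values())
    return {cds: n / top for cds, n in read_counts.items()}


def chi_square_2x2(reads_a: int, total_a: int, reads_b: int, total_b: int) -> float:
    """Pearson chi-square statistic (1 d.f.) for one CDS in two libraries.

    Assumed table layout: rows are libraries A and B, columns are reads
    mapped to the CDS and all remaining reads of that library.
    """
    a, b = reads_a, total_a - reads_a
    c, d = reads_b, total_b - reads_b
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


# Hypothetical read counts for three CDS in one library (not data from the study).
sg_counts = {"cds_A": 5000, "cds_B": 250, "cds_C": 10000}
index = expression_index(sg_counts)
print(index["cds_A"], index["cds_C"])

# Hypothetical CDS of 1,000 bp with 5,000 reads in a library of 10 million reads.
print(rpkm(5000, 1000, 10_000_000))

# Hypothetical comparison: 30 of 100 reads in library A versus 10 of 100 in library B.
print(chi_square_2x2(30, 100, 10, 100))
```

The expression index is bounded between 0 and 1 within a library, so it ranks transcripts inside one transcriptome, whereas RPKM also corrects for CDS length and library size.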

Phylogenetic analysis

Evolutionary analyses were conducted in MEGA6 [35]. The evolutionary history of selected protein sequences was inferred using the Neighbor-Joining method. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches [36]. The trees were drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Poisson correction method [37] and are in units of the number of amino acid substitutions per site. The rate variation among sites was modeled with a gamma distribution (shape parameter = 1). The sequences are labeled with the first 3 letters of the genus name followed by the first 3 letters of the species name and their GenBank accession code.
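As an illustration of the distance measure, the sketch below computes textbook Poisson-corrected distances between two aligned protein sequences, with and without gamma-distributed rates. It assumes that MEGA6 follows these standard formulas, d = -ln(1 - p) and d = a[(1 - p)^(-1/a) - 1] for gamma shape a, and that gapped or ambiguous sites are dropped per sequence pair, as the figure legends state. The two sequences are hypothetical.

```python
import math

# Characters treated as gaps or ambiguous residues (an assumption for this sketch).
SKIPPED = {"-", "X", "?"}


def p_distance(seq1: str, seq2: str) -> float:
    """Proportion of differing residues, ignoring sites that are gapped in either sequence."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must come from the same alignment")
    compared = differing = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in SKIPPED or b in SKIPPED:
            continue  # pairwise deletion of ambiguous positions
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def poisson_distance(p: float) -> float:
    """Poisson correction with uniform rates among sites."""
    return -math.log(1.0 - p)


def poisson_gamma_distance(p: float, shape: float = 1.0) -> float:
    """Poisson correction with gamma-distributed rates among sites."""
    return shape * ((1.0 - p) ** (-1.0 / shape) - 1.0)


# Hypothetical aligned fragments: 7 comparable sites, 2 differences.
p = p_distance("MKT-LLVA", "MKSALLIA")
print(round(p, 4), round(poisson_distance(p), 4), round(poisson_gamma_distance(p), 4))
```

With a shape parameter of 1 the gamma distance reduces to p / (1 - p), which grows faster than the uniform-rate correction as sequences diverge.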

qRT-PCR

Tissue expression of sixty randomly chosen genes was evaluated by qRT-PCR. Briefly, RNA from salivary glands and fat body was extracted using the SV Total RNA Isolation System (Promega, USA) according to the manufacturer’s recommendations. The cDNA was prepared using a High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, USA) according to the manufacturer’s recommendations, and samples were then frozen at −20°C until analysis. Gene expression was evaluated using a Sybr Green Master Mix (Roche, USA) and specific primers (forward and reverse) as described in S1 Spreadsheet. Ultrapure DNA/RNA-free water was used as the negative control. Relative gene expression was determined by the comparative ΔΔCT method using PhSigP-51408_FR4_55–276 as the reference gene (similar expression in salivary gland and fat body).
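The comparative ΔΔCT calculation can be sketched as follows. This is a minimal illustration that assumes roughly 100% amplification efficiency, as the standard method does, and takes the fat body as the calibrator tissue. The text does not say which tissue served as calibrator, and the CT values are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_reference_sample: float,
                        ct_target_calibrator: float, ct_reference_calibrator: float) -> float:
    """Fold change of a target gene in the sample tissue relative to the calibrator tissue."""
    # Normalize the target gene to the reference gene within each tissue.
    delta_sample = ct_target_sample - ct_reference_sample
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the two tissues; each CT unit corresponds to one doubling.
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** -delta_delta


# Hypothetical CT values: salivary gland as sample, fat body as calibrator.
fold = relative_expression(ct_target_sample=20.0, ct_reference_sample=18.0,
                           ct_target_calibrator=26.0, ct_reference_calibrator=18.0)
print(f"{fold:.0f}-fold higher in the salivary gland")
```

A reference gene with similar expression in both tissues, as described for the one used here, keeps the normalization step from introducing a tissue bias of its own.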

Results and discussion

General description of salivary gland and fat body transcriptome

The final assembly of both transcriptomes generated a total of 11,507 CDS, which were mapped from a total of 164,676,091 reads. The 11,507 CDS were subdivided according to their putative function (Table 1 and S1 Spreadsheet) as Housekeeping, Secreted, Viral, Transposons and Unknown. The housekeeping (H) class had 5,460 CDS, corresponding to 47% of the total. The putative secreted (S) class had 2,943 CDS, or 25% of all CDS. Remarkably, 20% of the reads mapped to transcripts coding for putative viral proteins, particularly Triatoma virus proteins. Transposable elements (TE) accounted for only 3% of the CDS and 0.82% of the reads. Approximately 7% of the reads, corresponding to 2,734 CDS, were not classified and were placed in the unknown (U) class (Table 1).
Table 1

General classification of all coding sequences (CDS) from the combined salivary glands and fat body transcriptome of P. lignarius.

Class           Number of CDS   % of total   Number of reads   % of total
Housekeeping    5,460           47.45        80,227,208        48.72
Secreted        2,943           25.58        38,247,116        23.23
Viral           12              0.10         33,602,364        20.41
Transposons     358             3.11         1,343,408         0.82
Unknown         2,734           23.76        11,255,995        6.84
Total           11,507          100          164,676,091       100
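The percentages in Table 1 follow directly from the counts, and recomputing them makes the contrast between CDS share and read share explicit, most visibly for the viral class. The short sketch below uses only the counts given in Table 1; the variable names are illustrative.

```python
# (number of CDS, number of reads) per class, copied from Table 1.
table1 = {
    "Housekeeping": (5460, 80_227_208),
    "Secreted": (2943, 38_247_116),
    "Viral": (12, 33_602_364),
    "Transposons": (358, 1_343_408),
    "Unknown": (2734, 11_255_995),
}

total_cds = sum(cds for cds, _ in table1.values())
total_reads = sum(reads for _, reads in table1.values())

for name, (cds, reads) in table1.items():
    pct_cds = 100 * cds / total_cds
    pct_reads = 100 * reads / total_reads
    # Reads per CDS shows how deeply each class is sampled on average.
    print(f"{name:<13} {pct_cds:6.2f}% of CDS  {pct_reads:6.2f}% of reads  "
          f"{reads / cds:12,.0f} reads per CDS")
```

The viral class contributes 12 CDS but about a fifth of all reads, which is why it stands out in the text despite its small share of the assembled sequences.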

Housekeeping (H) genes of the combined transcriptome

The 5,460 CDS classified as housekeeping genes were divided into 21 subgroups according to their putative functions (S1 Spreadsheet and Table 2). Among these subgroups, the category “signal transduction” had the highest expression, with 11% of the mapped reads from class H, followed by “storage” and “protein synthesis machinery”.
Table 2

Classification of all coding sequences (CDS) with putative housekeeping function extracted from the transcriptome of P. lignarius.

Class                                 Number of Contigs   % of total   Number of Reads   % of total
Signal transduction                   1,009               18.48        9,118,215         11.37
Storage                               18                  0.33         7,216,235         8.99
Protein synthesis machinery           280                 5.13         6,379,967         7.95
Nuclear regulation                    315                 5.77         6,211,075         7.74
Transcription machinery               595                 10.90        6,207,164         7.74
Transporters/storage                  458                 8.39         5,503,602         6.86
Protein modification machinery        311                 5.70         4,996,742         6.23
Cytoskeletal                          329                 6.03         4,788,334         5.97
Metabolism, lipid                     282                 5.16         4,584,580         5.71
Extracellular matrix/cell adhesion    199                 3.64         3,512,502         4.38
Proteasome machinery                  242                 4.43         3,112,153         3.88
Metabolism, energy                    190                 3.48         3,011,966         3.75
Metabolism, amino acid                116                 2.12         2,696,206         3.36
Metabolism, carbohydrate              191                 3.50         2,537,516         3.16
Oxidant metabolism/detoxification     144                 2.64         2,378,367         2.96
Protein export machinery              331                 6.06         2,360,314         2.94
Immunity                              105                 1.92         2,124,090         2.65
Metabolism, nucleotide                103                 1.89         1,199,974         1.50
Metabolism, intermediate              60                  1.10         978,596           1.22
Transcription factor                  148                 2.71         946,838           1.18
Nuclear export                        34                  0.62         362,772           0.45
Total                                 5,460               100          80,227,208        100
Of the 11,507 CDS of the combined transcriptome, 513 were overexpressed 10-fold or more in the salivary glands, 2,073 were similarly overexpressed in the fat body, and 8,921 CDS were not particularly overexpressed in either organ (Fig 1). We proceed by analyzing these enriched subsets, as they represent what is possibly specific to each tissue type.
Fig 1

Venn diagram of SG and FB transcriptomes of P. lignarius.

SG10x, 10-fold overexpressed transcripts in the salivary gland compared to fat body; FB10x, 10-fold overexpressed transcripts in the fat body compared to the salivary gland; and, Not Enriched, transcripts expressed in both SG and FB with less than a 10-fold variation.
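The three-way split shown in Fig 1 amounts to a fold-change threshold applied to each CDS. The sketch below is one way to express it, assuming the comparison is made on normalized expression values (such as FPKM) and that a CDS with zero expression in one tissue counts as enriched in the other. Neither detail is specified in the text, and the example values are hypothetical.

```python
from collections import Counter


def classify_enrichment(expr_sg: float, expr_fb: float, fold: float = 10.0) -> str:
    """Assign a CDS to SG10x, FB10x or Not enriched from its expression in each tissue."""
    # Multiplying instead of dividing avoids a division by zero for absent transcripts.
    if expr_sg > 0 and expr_sg >= fold * expr_fb:
        return "SG10x"
    if expr_fb > 0 and expr_fb >= fold * expr_sg:
        return "FB10x"
    return "Not enriched"


# Hypothetical (SG, FB) expression values for four CDS.
example = {
    "cds_1": (2005.45, 6.66),
    "cds_2": (4.0, 80.0),
    "cds_3": (30.0, 12.0),
    "cds_4": (100.0, 10.0),  # exactly 10-fold counts as enriched
}
labels = {name: classify_enrichment(sg, fb) for name, (sg, fb) in example.items()}
print(Counter(labels.values()))
```

Applied to the full assembly, a rule of this kind would yield the three counts reported above, which sum to the 11,507 CDS of the combined transcriptome.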

The enriched salivary gland transcriptome of Panstrongylus lignarius

A total of 513 CDS were at least 10-fold overexpressed in the salivary glands when compared to the fat body transcriptome (hereafter referred to as the SG-enriched transcriptome) (S2 Spreadsheet and Table 3). The majority of these were associated with secreted products, as expected for the SG.
Table 3

Classification of coding sequences (CDS) that are at least 10 x overexpressed in the salivary glands transcriptome as compared to the fat body transcriptome.

SG overexpressed 10x   Average FPKM SG   N     % of total   Average FPKM FB   Ratio FPKM
Secreted               2,005.45          328   64.13        6.66              301.07
Housekeeping           221.60            136   26.51        4.28              51.74
Unknown                830.20            44    8.58         3.95              210.38
Transposons            6.37              4     0.78         0.31              20.58
Total or average                         513   100                            145.94
The SG enriched transcriptome contains 136 transcripts attributed to the housekeeping class, as further detailed in Table 4.
Table 4

Classification of coding sequences (CDS) 10X overexpressed in the salivary transcriptome with putative housekeeping function.

Subclass                                No. of CDS   No. of reads   % Total
Detoxification                          17           673,883        34.84
Cytoskeletal                            4            267,216        13.84
Lipid metabolism                        18           203,530        10.54
Protein synthesis machinery             4            185,847        9.62
Unknown conserved                       15           143,472        7.42
Transporters                            33           140,626        7.28
Signal transduction                     24           89,506         4.63
Glycosyl transferase                    1            76,195         3.95
15-hydroxyprostaglandin dehydrogenase   4            65,330         3.38
Transcription machinery                 6            52,714         2.72
Nuclear Regulation                      2            23,056         1.19
Salivary ubiquitin                      1            3,581          0.18
Transcription Factors                   3            2,156          0.11
Protein export and modification         2            2,114          0.11
Intermediate metabolism                 1            1,181          0.06
Energy metabolism                       1            644            0.03
Total                                   136          1,931,051      100
Transcripts belonging to the detoxification class are the most abundant of the SG-overexpressed CDS of the housekeeping class. Of these 17 transcripts, 12 are members of the cytochrome P450 family. Additionally, 4 CDS match 15-hydroxyprostaglandin dehydrogenase, a finding similar to those of previous transcriptomes of Triatoma [18, 38, 39] and Panstrongylus megistus [40]. In a previous review, it was stated that this combination of transcripts suggested a role of triatomine salivary glands in the manufacture of eicosanoids [9]. However, a search for prostaglandins in the saliva of triatomines was negative [16]. Since these enzymes are associated with prostaglandin catabolism, it is here speculated that prostaglandins may function as salivary secretagogues and that the enzyme is associated with agonist detoxification. Alternatively, non-prostaglandin eicosanoids may be produced by the SG.

Transcripts coding for putative secreted proteins in Panstrongylus lignarius salivary glands

Lipocalins and small-molecule-binding proteins with a JH binding motif together accounted for over 40% of the reads mapped to secreted transcripts that are overexpressed in the SG transcriptome (Table 5).
Table 5

Classification of coding sequences with putative secretory function extracted from sialotranscriptome of P. lignarius.

Subclass                                    No. of CDS   No. of reads   % Total
Lipocalins                                  88           6,280,411      29.09
JH binding protein                          7            2,629,387      12.18
Kazal-type peptides                         18           2,343,412      10.85
Glycine rich proteins                       17           1,951,574      9.04
Salivary proteases                          33           1,829,373      8.47
Other secreted proteins                     94           1,580,689      7.32
Conserved insect family 15                  4            1,349,810      6.25
Conserved insect family 12                  17           1,000,854      4.64
Immunity related                            8            983,020        4.55
Inositol-1,4,5-triphosphate 5-phosphatase   4            414,855        1.92
Amylase/Maltase                             4            302,186        1.40
Antigen-5                                   3            245,873        1.14
Hemiptera specific family 225               5            171,874        0.80
Mucin                                       3            155,075        0.72
Serpin                                      2            86,723         0.40
Apyrase                                     2            83,292         0.39
Hemiptera specific family 210               10           51,196         0.24
Transferrin                                 1            35,869         0.17
Endonuclease                                1            35,862         0.17
Ribonuclease                                1            28,532         0.13
Lipase                                      3            12,443         0.06
Phosphatase                                 1            11,308         0.05
Odorant binding protein                     2            5,412          0.03
Total                                       328          21,589,030     100.00
Lipocalins are widely distributed in vertebrates, invertebrates, plants and bacteria [41, 42] and are one of the main classes of proteins in the salivary glands of ticks and triatomines [43, 44]. Lipocalins possess a conserved three-dimensional structure and include an extensive group of extracellular proteins that generally bind small hydrophobic molecules, extracellular ligands and other proteins. Triatomine lipocalins were shown to have vasodilator, anticoagulant and antiplatelet activities [45]. Some of these functions, such as the anticlotting activities of triabin [46] or nitrophorin 2 [47, 48], are exerted by interactions of the lipocalin with a clotting cascade protein, while other functions relate to their strong binding to agonists of hemostasis or inflammation, namely their kratagonist function (from the Greek “kratos” = to seize) [49]. Recently, a salivary lipocalin of Rhodnius prolixus was shown to antagonize cysteinyl leukotrienes [50]. The assembly of the P. lignarius transcriptome revealed 78 contigs coding for full-length lipocalins, all at least 10 times overexpressed in the SG when compared to the FB, and averaging over 3,000-fold overexpression. The protein sequences of these 78 gene products were aligned with 252 other triatomine lipocalins, producing a dendrogram in which at least 15 clades with strong bootstrap support are observed (Fig 2). Eight P. lignarius sequences are found within the Pal-Tri-Dip clade, which includes the platelet aggregation inhibitors pallidipin [51, 52], triplatin [53, 54] and dipetalodipin [55] from the Triatoma and Dipetalogaster genera. Triplatin and dipetalodipin were shown to be kratagonists of eicosanoids, and the same mechanism possibly occurs in pallidipin. Triafestins, found in T. infestans, are inhibitors of the activation of the kinin system [56]. The two characterized triafestin sequences fall within a clade with strong bootstrap support, which also contains four P. lignarius sequences.
The clade containing the Rhodnius platelet aggregation inhibitor (RPAI) [57, 58], as well as the Rhodnius prolixus leukotriene binding protein (LBP) [50], is Rhodnius specific and thus contains no Panstrongylus sequences. It has, however, a sister clade with low bootstrap support containing several Triatoma and Panstrongylus sequences, including two from P. lignarius. The clade named triabin contains the thrombin inhibitor from T. infestans [46, 59] and several other sequences, including two from P. lignarius. A large clade contains the salivary antigen procalin [60], of unknown physiological function. The BABP clade, which is sister to the uniquely Rhodnius nitrophorin clade, contains the Rhodnius prolixus Biogenic Amine Binding Protein [61, 62], a protein with anti-platelet and vasodilatory activities. Ten P. lignarius sequences populate this clade. This analysis may help to design experiments with recombinant triatomine proteins aimed at determining their functions.
Fig 2

Phylogram of the lipocalin family proteins from P. lignarius and their best matches.

The optimal tree with the sum of branch length = 123.81570372 is shown. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The bar at the center of the graph indicates a value of 0.5. The analysis involved 330 amino acid sequences. All ambiguous positions were removed for each sequence pair. There were a total of 423 positions in the final dataset. The values near the branches represent the percentage of bootstrap support. Values below 50% are not shown. For more details, see Material and Methods.

Apyrases are enzymes that hydrolyze ATP and ADP to AMP and orthophosphate. They are abundant in the saliva of blood-feeding arthropods, presumably because they destroy these agonists of platelet aggregation and inflammation [63]. In mosquitoes [64] and triatomine bugs [65], but not in Rhodnius, salivary apyrases belong to the 5’-nucleotidase family, while in Cimex [66] and sand flies [67], and probably in Rhodnius [68], they belong to the Cimex family of apyrases. In P. lignarius, two apyrase-like proteins of the 5’-nucleotidase family are highly expressed in the SG, with a total of about 83,000 reads (Table 5). A third member of the family was additionally identified, but it is not particularly enriched in either the FB or the SG transcriptome. This third member has a glycosylphosphatidylinositol (GPI) anchor as predicted by the big-PI Predictor site [69], while the two overexpressed proteins do not, indicating that they are secreted and not membrane bound. This is in accordance with the postulated evolution of secreted salivary apyrases, which included gene duplication of an ancestral, membrane-bound product followed by loss of the GPI anchor [64]. The phylogram of the three P. lignarius members of the apyrase/5’-nucleotidase family, together with their best matches by Blastp to GenBank proteins, displays robust clades for various insect orders or families, with each P. lignarius protein sharing a robust clade with other triatomine proteins, indicative of their long evolutionary history (Fig 3). Telling is the absence of Rhodnius proteins, supporting the monophyletic status of this genus with relation to its evolution to blood feeding [70].
Fig 3

Phylogram of the apyrase/5’-nucleotidase family of proteins from P. lignarius and their best matches.

The optimal tree with the sum of branch length = 15.40595903 is shown. The values near the branches represent the percentage of bootstrap support. Values below 75% are not shown. The analysis involved 41 amino acid sequences. All ambiguous positions were removed for each sequence pair. There were a total of 689 positions in the final dataset. For more details, see Material and Methods.

The inositol phosphate 5-phosphatase (IPP) enzymes were predicted in 4 CDS. Previously, the Rhodnius homolog was shown to act both on soluble inositol phosphates and on phosphoinositide substrates [71]. These phospholipids are involved in several cellular processes related to signal transduction, secretion and cytoskeletal structure. Although IPP was found to be produced by Rhodnius prolixus salivary glands in 2006 [71], its function in this organism is still unknown. These studies suggested that IPP ejected in saliva acts by decreasing the concentration of PI(4,5)P2 and PI(3,4,5)P3, which are present in the plasma membrane of cells and platelets, causing changes in the cytoskeletal architecture [72]; however, how the enzyme could enter the cell to perform this function is still a puzzle.

Among other enzymes overexpressed in the salivary glands, we highlight a cathepsin D, which is normally a housekeeping enzyme associated with lysosomes but is over 1,000-fold overexpressed in the SG and has a signal peptide indicative of secretion, suggesting that this protein may have a role in blood meal acquisition by interfering with host hemostasis. Serine proteases were also found; these have been regularly detected in triatomine sialomes [9]. The function of these proteases in feeding is unknown, but could be related to fibrin hydrolysis, as occurs with salivary serine proteases from a tabanid [73]. Somewhat surprising is the finding of amylases and maltase overexpressed in the SG. These enzymes are usually found in mosquito sialomes, and are associated with the sugar-feeding mode of these organisms [63]. Is it possible that P. lignarius feeds on plants? Recently it was shown that R. prolixus feeds on plants [74], and perhaps this behavior is more widespread.

Serine protease inhibitory proteins were found in the salivary gland transcriptome. These may play a role in inhibiting the coagulation cascade or the activation of the complement system. These proteins are subdivided according to their domain, such as the Kazal domain and serpins [75]. The Kazal family was the second most abundant in P. lignarius SG, corresponding to ~10% of the total reads. Its members consist of molecules with single or multiple domains with a shared conserved motif and a distinct pattern of cysteine distribution. Several proteins from the Kazal family have already been described in vertebrates and invertebrates, including triatomines. Kazal domain-containing peptides are typical inhibitors of serine proteases. Indeed, the two-Kazal-domain protein Rhodnin, isolated from the crop of R. prolixus, was shown to inhibit thrombin [76]; similarly, dipetalogastin was isolated from D. maximus [77] and brasiliensin from T. brasiliensis [78] guts. Infestins, with up to seven Kazal domains, were isolated from T. infestans midgut [79-82] and shown to inhibit thrombin, neutrophil elastase and Factor XIIa. The salivary gland transcriptome of P. lignarius discloses a peptide containing seven Kazal domains, encoded by Ph-59126, that is the most expressed member of this peptide class. On the other hand, a KGD-containing Kazal peptide from tabanids named vasotab was shown to have vasodilatory activity, in addition to anti-platelet activity [83]. Phylogenetic analysis of P. lignarius Kazal domain-containing peptides aligned with other related triatomine proteins indicates the complexity of this family (Fig 4). A clade containing the triatomine intestinal serine protease inhibitors (named Infestin in Fig 4) contains the seven-Kazal peptide from P. lignarius mentioned above, indicating that P. lignarius may have co-opted this peptide family for salivary expression. Four other robust clades, plus one Rhodnius-specific clade, indicate the diversity of these peptides in triatomines. Clade IV includes JAW15592.1, JAW15851.1 and JAW15336.1, which have weak similarity to vasotab; however, the KGD motif of vasotab associated with anti-platelet function is not found in P. lignarius.
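Checking a set of peptides for the KGD tripeptide is a simple string scan, sketched below. The two sequences are invented toy strings used only to exercise the function; they are not vasotab or P. lignarius sequences, and the function name is illustrative.

```python
def find_motif(sequence: str, motif: str = "KGD") -> list[int]:
    """Return the 1-based start position of every occurrence of the motif."""
    seq = sequence.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        # Resume one residue later so overlapping occurrences are also reported.
        start = seq.find(motif, start + 1)
    return positions


# Hypothetical toy peptides, one with and one without the motif.
peptides = {
    "toy_with_motif": "CPRIYMPVCGSDGKGDTYSNEC",
    "toy_without_motif": "CPKNLKPVCGTDGSTYSNEC",
}
hits = {name: find_motif(seq) for name, seq in peptides.items()}
for name, positions in hits.items():
    print(name, positions if positions else "no KGD motif")
```

A scan of this kind only reports whether the tripeptide is present; it says nothing about whether it is exposed in a position compatible with anti-platelet activity.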
Fig 4

Phylogram of the Kazal-domain containing family of proteins from P. lignarius and their best matches.

The optimal tree with the sum of branch length = 14.99541793 is shown. The values near the branches represent the percentage of bootstrap support. Values below 50% are not shown. The analysis involved 43 amino acid sequences. All ambiguous positions were removed for each sequence pair. There were a total of 969 positions in the final dataset. For more details, see Material and Methods. Robust clades are named I-IV (where no member has been functionally characterized); Infestin and Rhodnius indicate clades containing previously functionally characterized proteins.

The serpin domain-containing proteins, or serpin-like proteins, are also found in the saliva of arthropods, where they affect hemostasis, including platelet adhesion, fibrinolysis and coagulation, thus facilitating blood feeding. Two salivary-enriched serpins were found in the P. lignarius transcriptome.

The Antigen-5 protein family is a group of proteins belonging to the cysteine-rich secretory proteins (CRISP) superfamily, which have been identified through sialotranscriptomes in the saliva of different hematophagous insects, such as mosquitoes [84, 85], phlebotomines [86–88], and triatomines [17, 18, 21, 22, 24, 89–91]. Although this family is frequently found in the saliva of hematophagous arthropods, its function is mostly unknown; it has also been described as part of the toxin repertoire of snake venom [92]. An antigen-5 protein from Dipetalogaster maxima was shown to inhibit platelet aggregation induced by low doses of collagen and to have superoxide dismutase activity [93]. The current transcriptome of P. lignarius identified three members of this family that are highly expressed in the SG.

Immunity-related proteins and peptides are broadly found in organisms ranging from mammals to plants and hematophagous arthropods. In the saliva of these insects and ticks, peptides such as lysozyme and defensins may help control microbial growth in the ingested blood and, in the vertebrate host, may prevent microbial infections at the bite site [94]. Pathogen pattern recognition proteins such as lectins were also found in this group; these proteins may play a role in modulating host immunity.

Only two transcripts coding for members of the Odorant/Pheromone-Binding Family (OBP) were found in the enriched SG transcriptome, corroborating the findings from other studies with Rhodnius, Triatoma, Panstrongylus and Cimex; among these, they were more abundant in Cimex. The role of these proteins in blood feeding is unknown [9].

We also identified 7 CDS putatively coding for Juvenile hormone binding proteins (JH binding proteins) with very long sequences and representing ~12% of the total reads. These proteins have already been identified in the SG of Rhodnius neglectus [91], Anopheles culicifacies [95], Anopheles gambiae [96] and Aedes aegypti [97]; however, no CDS coding for JH binding proteins was found in the sialotranscriptome of P. megistus [24]. JH in adult insects is responsible for controlling the reproductive maturation and inducing vitellogenesis, as JH and vitellogenin are negatively correlated. Additionally, JH carrier and odorant binding proteins have a pivotal function in the regulation of feeding behavior of hematophagous arthropods [98]. Despite these more conventional functions, it is possible that these gene products have been recruited to a salivary function, acting as kratagonists of lipidic mediators of hemostasis.

The sialome of P. lignarius revealed at least two protein family expansions that appear exclusive to Hemiptera, and three protein family expansions that appear exclusive to insects. Some of the transcripts are highly expressed, such as Ph-55919, with an RPKM of 14,784. Their functions are unknown.
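For readers unfamiliar with the RPKM unit quoted above, the following is a minimal sketch of the standard definition (reads per kilobase of transcript per million mapped reads). The function name and the example numbers are hypothetical and chosen only to show the arithmetic; they are not values from this transcriptome, and the normalization actually applied is the one described in Material and Methods.

```python
def rpkm(mapped_reads: int, cds_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (length in bp * total mapped reads in the library)
    """
    if cds_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("CDS length and library size must be positive")
    return mapped_reads * 1e9 / (cds_length_bp * total_mapped_reads)


# Hypothetical inputs (not taken from the paper): a 500 bp CDS with
# 50,000 mapped reads in a library of 20 million mapped reads.
example = rpkm(mapped_reads=50_000, cds_length_bp=500, total_mapped_reads=20_000_000)
print(f"RPKM = {example:.1f}")
```

Because the value is normalized by both transcript length and library size, it allows expression to be compared between transcripts of different lengths and between libraries of different depths.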

Fat body

Few fat body transcriptome analyses have been published so far, such as those of Bactrocera dorsalis [12], Melipona scutellaris [99], and Aedes aegypti [100], and none concerns triatomines. The evaluation of the entire transcriptome from the fat body of P. lignarius (Table 6) showed that the most prevalent subclass was secreted proteins, with ~20.5% of the total reads. This agrees with one of the main functions of this tissue: the synthesis of peptides with distinct functions, many of them destined for the hemolymph compartment. After the secreted proteins, the subclass with the next highest FB expression was storage, with ~8% of the reads, also associated with secreted hemolymph proteins. In the 10x overexpressed subset (Table 7), the most prevalent classes of molecules were also secreted proteins and storage, followed by proteins with unknown function and cytoskeletal proteins.
Table 6

Classification of all coding sequences (CDS) from the fat body transcriptome of P. lignarius.

| Subclass | No. of CDS | No. of reads | % Total |
|---|---|---|---|
| Secreted protein | 2,664 | 16,900,616 | 20.40 |
| Storage | 17 | 6,882,176 | 8.23 |
| Signal transduction | 985 | 5,648,555 | 6.82 |
| Transcription machinery | 590 | 4,808,076 | 5.80 |
| Viral product | 11 | 4,200,886 | 5.07 |
| Protein synthesis machinery | 276 | 4,173,641 | 5.04 |
| Transporters and channels | 425 | 3,879,540 | 4.68 |
| Unknown product | 1,559 | 3,429,273 | 4.14 |
| Cytoskeletal proteins | 325 | 3,399,663 | 4.10 |
| Unknown conserved | 881 | 3,356,864 | 4.05 |
| Nuclear regulation | 313 | 2,975,729 | 3.60 |
| Protein modification | 295 | 2,844,094 | 3.43 |
| Proteasome machinery | 241 | 2,378,803 | 2.87 |
| Lipid metabolism | 260 | 2,260,869 | 2.73 |
| Amino acid metabolism | 115 | 2,250,741 | 2.71 |
| Energy metabolism | 188 | 2,075,093 | 2.50 |
| Carbohydrate metabolism | 185 | 1,698,819 | 2.05 |
| Protein export | 329 | 1,392,878 | 1.68 |
| Extracellular matrix | 182 | 1,298,620 | 1.56 |
| Unknown conserved membrane protein | 235 | 1,029,230 | 1.24 |
| Transposable element | 353 | 1,020,635 | 1.23 |
| Nucleotide metabolism | 102 | 979,483 | 1.18 |
| Immunity | 97 | 922,667 | 1.11 |
| Intermediary metabolism | 59 | 789,404 | 0.95 |
| Detoxification | 72 | 762,671 | 0.92 |
| Transcription factor | 145 | 677,034 | 0.81 |
| Oxidant metabolism/Detoxification | 56 | 496,585 | 0.60 |
| Nuclear export | 34 | 287,566 | 0.34 |
| Total | 10,994 | 82,820,211 | 100 |
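The "% Total" column of Table 6 can be recomputed from the read counts. The sketch below does this for four rows copied from the table. It assumes that the percentage is the subclass read count divided by the table total and truncated (not rounded) to two decimals, which is what these rows suggest; the function and variable names are hypothetical.

```python
# Read counts for four subclasses, copied from Table 6.
TOTAL_READS = 82_820_211
reads_by_subclass = {
    "Secreted protein": 16_900_616,
    "Signal transduction": 5_648_555,
    "Viral product": 4_200_886,
    "Nuclear export": 287_566,
}


def truncated_percent(reads: int, total: int = TOTAL_READS) -> float:
    """Percent of total reads, truncated to two decimals.

    Integer arithmetic is used so that the truncation is exact
    (assumption: the table truncates rather than rounds).
    """
    return (reads * 10_000 // total) / 100


for name, reads in reads_by_subclass.items():
    print(f"{name}: {truncated_percent(reads):.2f}%")
```

The same calculation applies to Table 7 with its own total of 37,144,425 reads.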
Table 7

Coding sequences (CDS) 10X overexpressed in fat body compared to salivary glands from transcriptome of P. lignarius.

| Subclass | No. of CDS | No. of reads | % Total |
|---|---|---|---|
| Secreted protein | 611 | 11,982,880 | 32.26 |
| Storage | 9 | 6,803,966 | 18.32 |
| Unknown product | 387 | 2,081,905 | 5.60 |
| Cytoskeletal proteins | 112 | 1,794,794 | 4.83 |
| Transcription machinery | 45 | 1,696,142 | 4.57 |
| Transporters and channels | 89 | 1,609,127 | 4.33 |
| Signal transduction | 168 | 1,242,131 | 3.34 |
| Amino acid metabolism | 35 | 1,065,122 | 2.87 |
| Unknown conserved | 156 | 1,000,574 | 2.69 |
| Carbohydrate metabolism | 30 | 826,799 | 2.23 |
| Energy metabolism | 33 | 799,257 | 2.15 |
| Proteasome machinery | 29 | 717,203 | 1.93 |
| Extracellular matrix | 52 | 681,489 | 1.83 |
| Nucleotide metabolism | 11 | 677,461 | 1.82 |
| Protein modification | 65 | 561,469 | 1.51 |
| Transposable element | 18 | 494,232 | 1.33 |
| Detoxification | 9 | 490,735 | 1.32 |
| Unknown conserved membrane protein | 31 | 459,387 | 1.24 |
| Lipid metabolism | 43 | 450,227 | 1.21 |
| Protein synthesis machinery | 16 | 390,148 | 1.05 |
| Immunity | 31 | 346,241 | 0.93 |
| Nuclear regulation | 22 | 283,827 | 0.76 |
| Intermediary metabolism | 6 | 230,441 | 0.62 |
| Oxidant metabolism/Detoxification | 14 | 195,373 | 0.53 |
| Protein export | 32 | 136,685 | 0.37 |
| Transcription factor | 17 | 73,442 | 0.20 |
| Nuclear export | 2 | 53,368 | 0.14 |
| Total | 2,073 | 37,144,425 | 100 |
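The "10X overexpressed" subsets can be illustrated with a simple fold-change filter. This is only a sketch under stated assumptions: it compares normalized expression values between tissues and adds a pseudocount to avoid division by zero. The class, the function, the pseudocount and the example values are all hypothetical; the exact criterion used for Table 7 (including any statistical test) is the one given in Material and Methods.

```python
from dataclasses import dataclass


@dataclass
class CdsExpression:
    name: str   # hypothetical identifier
    sg: float   # normalized expression in salivary gland
    fb: float   # normalized expression in fat body


def classify(cds: CdsExpression, fold: float = 10.0, pseudocount: float = 1.0) -> str:
    """Label a CDS by its FB:SG expression ratio.

    The pseudocount is an assumption of this sketch, added so that a CDS
    with zero expression in one tissue still yields a finite ratio.
    """
    ratio = (cds.fb + pseudocount) / (cds.sg + pseudocount)
    if ratio >= fold:
        return "FB-overexpressed"
    if ratio <= 1.0 / fold:
        return "SG-overexpressed"
    return "similar"


# Hypothetical expression values, for illustration only.
examples = [
    CdsExpression("cds_a", sg=4.0, fb=499.0),
    CdsExpression("cds_b", sg=199.0, fb=9.0),
    CdsExpression("cds_c", sg=50.0, fb=60.0),
]
for cds in examples:
    print(cds.name, classify(cds))
```

A symmetric threshold (ratio of at least 10 in one direction, at most 1/10 in the other) keeps the two overexpressed sets disjoint and leaves everything else in the "similar" group.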
Vitellogenin, the precursor of vitellin, is used in the formation and maturation of insect oocytes. It is produced exclusively in the fat body of insects and then processed and secreted into the hemolymph [101]. The transcriptome of P. lignarius disclosed JAW07678.1, a 1,324 aa long protein with N-terminal Vitellogenin-N and carboxy-terminal VWD motifs, typical of insect vitellogenins. Alignment of this protein sequence with its best matches from the NCBI database allows identification of the insect order-specific clades Hemiptera, Hymenoptera and Diptera, including the Diptera sub-orders Brachycera and Nematocera, the Heteroptera sub-order within the Hemiptera and, within Heteroptera, the Reduviidae family (Fig 5).
Fig 5

Phylogram of the vitellogenin protein of P. lignarius and their best matches.

The optimal tree with the sum of branch length = 11.41384103 is shown. The values near the branches represent the percentage of bootstrap support. Values below 50% are not shown. The analysis involved 45 amino acid sequences. All ambiguous positions were removed for each sequence pair. There were a total of 1713 positions in the final dataset. For more details, see Material and Methods. The Panstrongylus lignarius sequence is shown with a red marker.

Transferrins are glycoproteins found in different animals, such as mammals, marsupials and fish, and in more than 34 species of invertebrates, including R. prolixus [102]. These proteins may function in insect defense mechanisms [103]. In insects, transferrins are synthesized and stored in the fat body for later secretion into the hemolymph, where they participate in iron uptake and distribution together with ferritin [104]. A transferrin was identified in the fat body transcriptome of P. lignarius. This 656 aa long protein was aligned with its best matches from the NCBI database to produce the phylogram shown in Fig 6, where robust clades of several insect orders are observed.
Fig 6

Phylogram of the transferrin protein of P. lignarius and their best matches.

The optimal tree with the sum of branch length = 10.61177438 is shown. The values near the branches represent the percentage of bootstrap support. Values below 50% are not shown. The analysis involved 51 amino acid sequences. All ambiguous positions were removed for each sequence pair. There were a total of 1487 positions in the final dataset. For more details, see Material and Methods. The Panstrongylus lignarius sequence is shown with a red marker.


Triatoma virus

Previous sialotranscriptomes from triatomines reported low expression of diverse viruses. For example, an Illumina-based sialotranscriptome of Panstrongylus megistus reported seven transcripts that best matched viral proteins. However, these transcripts were poorly expressed, with expression indexes below 0.0025 [40]. Another Illumina-based sialotranscriptome, of Triatoma infestans, reported 38 transcripts similarly coding for putative viruses, two of which had an expression index between 1 and 2.7 and were most similar to Deformed wing virus and to Drosophila A virus [39]. Similarly, the P. lignarius transcriptome uncovered 12 transcripts putatively coded by viruses. Remarkably, two transcripts coding for the capsid P1 polyprotein (GenBank AHB63946.1) and the nonstructural protein precursor (GenBank NP_620562.1) of Triatoma virus were very highly expressed, attaining expression indexes of 100 (the most expressed transcript) and 18 in the SG. Each transcript accrued more than 9% of all reads, so that nearly 20% of all transcriptome reads derived from the viral genome (Table 1 and S1 Spreadsheet). The relative expression was high in both the salivary gland and the fat body transcriptomes. Triatoma virus was first discovered infecting T. infestans in Argentina [105], and was later shown to infect other Triatoma species as well as Psammolestes coreodes [106]. A survey of laboratory-reared insects additionally detected several species harboring the virus, including specimens from the Barbacena insectary, from which the P. lignarius used in this work derived [107]. However, P. lignarius was not analyzed in that study, and thus this species can be added to the 15 previously found infected with this virus, which included a single Panstrongylus species (P. guentheri).
Inoculation of mice with the virus resulted in a non-infective immune response [108], and people with Chagas' disease living in Bolivia, Argentina and Mexico were found to have developed a detectable immune response to the virus [109]. The high levels of transcription found in the salivary glands suggest that, in addition to the previously proposed route of fecal contamination, humans and rodents could be infected via direct salivary inoculation. From the standpoint of insect-to-insect propagation, it has been proposed that viral transmission occurs via the fecal-oral route or by cleptohematophagy [107]. It should be added that, to the extent that mature viral particles are secreted in the bugs' saliva, co-feeding bugs could promote transmission between insects. Their feeding mechanism includes frequent reversal of the ingestion pump, possibly to dislodge incipient platelet plugs [110, 111], which would spread the virus in the skin vasculature, from where it could reach co-feeding insects.

Determination of relative tissue expression of randomly selected genes

The expression of randomly selected genes from the salivary gland and fat body transcriptomes of P. lignarius was confirmed by qRT-PCR. We selected twenty 10X overexpressed SG genes, twenty 10X overexpressed FB genes and twenty genes with similar expression in both tissues. As shown in Fig 7, qRT-PCR confirmed the SG overexpression, FB overexpression or similar expression predicted by the transcriptome analysis for most of the evaluated genes.
Fig 7

Validation by qPCR of the differential expression of transcripts between salivary gland and fat body libraries.

Relative gene expression was calculated by the ΔΔCT method using PhSigP-51408_FR4_55–276 as the reference gene (similar expression in salivary gland and fat body). The scatter plot presents the SG:FB expression ratio of FB overexpressed, similarly expressed and SG overexpressed genes. The Y axis is the log2 of the ratio between SG and FB observed by qPCR for three groups of arbitrarily selected transcripts that are overexpressed in either tissue or similarly expressed. S1 Table provides the transcript names and primer sequences used in this experiment.
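The ΔΔCT calculation named in the caption can be sketched as follows. This is the textbook form of the method, which assumes close to 100% amplification efficiency (a doubling per cycle) for both target and reference gene. The function names and CT values are hypothetical and are not data from this study.

```python
def log2_sg_fb_ratio(ct_target_sg: float, ct_ref_sg: float,
                     ct_target_fb: float, ct_ref_fb: float) -> float:
    """log2 of the SG:FB expression ratio by the delta-delta-CT method.

    delta CT (per tissue) = CT(target) - CT(reference)
    delta-delta CT        = delta CT(SG) - delta CT(FB)
    SG:FB ratio           = 2 ** (-delta-delta CT)
    so log2(ratio)        = -delta-delta CT
    """
    delta_sg = ct_target_sg - ct_ref_sg
    delta_fb = ct_target_fb - ct_ref_fb
    return -(delta_sg - delta_fb)


def sg_fb_ratio(ct_target_sg: float, ct_ref_sg: float,
                ct_target_fb: float, ct_ref_fb: float) -> float:
    """SG:FB expression ratio on the linear scale."""
    return 2.0 ** log2_sg_fb_ratio(ct_target_sg, ct_ref_sg, ct_target_fb, ct_ref_fb)


# Hypothetical CT values: the target amplifies 2 cycles before the reference
# in SG and 5 cycles after it in FB, so it is more abundant in SG.
print(log2_sg_fb_ratio(18.0, 20.0, 25.0, 20.0))
print(sg_fb_ratio(18.0, 20.0, 25.0, 20.0))
```

A positive log2 ratio indicates higher expression in the salivary gland, a negative one higher expression in the fat body, and values near zero indicate similar expression.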


Conclusions

Insect species differ in the molecular composition of their saliva, which is shaped by their evolutionary history, including habitat distribution and food source. Sialome studies of several species have already identified a variety of molecules with potential industrial or clinical use, either for their pharmacological activity or for their discriminating properties as biomarkers of vector exposure [112]. The present work publicly discloses over 9,000 protein sequences that should contribute to the discovery of new pharmacologically active compounds or new immunological markers of vector exposure, while also serving as a protein database for mass-spectrometric protein identification studies.

Hyperlinked spreadsheet with protein and coding sequences obtained from the transcriptome assemblies. (XLSX)

Classified hyperlinked spreadsheet with worksheets containing salivary gland and fat body overexpressed gene products. (XLSX)

List of transcripts and primer sequences used in Fig 7 experiment. (DOCX)