Agata K Jakubowska1, Remziye Nalcacioglu2, Anabel Millán-Leiva3, Alejandro Sanz-Carbonell4, Hacer Muratoglu5, Salvador Herrero6, Zihni Demirbag2.
Abstract
Thaumetopoea pityocampa (pine processionary moth) is one of the most important pine pests in the forests of Mediterranean countries, Central Europe, the Middle East and North Africa. Apart from causing significant damage to pinewoods, T. pityocampa is also a public and animal health concern, because contact with its irritating hairs causes dermatological reactions in humans and animals. High-throughput sequencing technologies have allowed the fast and cost-effective generation of genetic information that helps to understand different biological aspects of non-model organisms and to identify potential pathogens. Using these technologies, we have obtained and characterized the transcriptome of T. pityocampa larvae collected at 12 different geographical locations in Turkey. cDNA libraries for Illumina sequencing were prepared from four larval tissues: head, gut, fat body and integument. By pooling the sequences from the Illumina platform with those previously published using the Roche 454-FLX and Sanger methods, we generated the largest reference transcriptome of T. pityocampa. In addition, this study has allowed the identification of possible viral pathogens with potential application in future biocontrol strategies.
Year: 2015 PMID: 25626148 PMCID: PMC4353898 DOI: 10.3390/v7020456
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Figure 1. Map representing locations of insect sampling: (1) Samsun-Ankara highway; (2) Samsun-Alaçam; (3) Sinop vicinity; (4) Sinop-Gerze; (5) Sinop-Boyabat; (6) Sinop-Boyabat exit; (7) Samsun-Vezirköprü; (8) Amasya-Tokat highway; (9) Tokat-Sivas highway; (10) Tokat-Niksar highway; (11) Tokat-Reşadiye exit; and (12) Sivas-Koyulhisar.
Sequencing features of the T. pityocampa transcriptome libraries (MG: midgut; FB: fat body; T: tegument; HE: head).
| | MG | FB | T | HE |
|---|---|---|---|---|
| Nr of raw reads | 111,505,976 | 117,988,640 | 114,664,330 | 118,378,516 |
| Total sequence (kb) | 11,262,103 | 11,916,852 | 11,581,097 | 11,956,230 |
| Sequence quality average | 34 | 34 | 34 | 34 |
| Nr of processed reads | 78,680,476 | 87,668,878 | 86,497,974 | 89,424,098 |
| Sequence quality average a | 36 | 36 | 36 | 36 |
a After processing.
Assembly statistics of the T. pityocampa transcriptome.
| Statistic | Value |
|---|---|
| Nr of unigenes | 152,669 |
| Nr of transcripts | 161,682 |
| Average (median) transcript length | 610 (322) |
| Min-Max transcript length | 201–49,848 |
| N50 transcript length a | 924 |
| Total nr of residues | 98,648,698 |
a N50: the contig length such that contigs of this length or longer account for at least half of the summed size of the assembly.
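The N50 definition in the footnote above can be illustrated with a minimal sketch. The function name `n50` and the toy lengths are hypothetical and are not taken from the study; the real value in the table was computed over all 161,682 transcripts.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or transcript) lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed assembly size.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, for illustration only (not data from the paper).
toy_lengths = [201, 322, 610, 924, 1500]
print(n50(toy_lengths))
```

Because the statistic is weighted by length, a few long contigs can raise the N50 well above the median, which is consistent with the table reporting an N50 of 924 against a median of 322.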
Figure 2. Transcriptome and functional analysis characteristics. (A) Length distribution of transcripts obtained in the T. pityocampa transcriptome; (B) Gene ontology level distribution of T. pityocampa annotated unigenes in the three main GO categories: P, biological process; F, molecular function; and C, cellular component.
Figure 3. Gene ontology (GO) assignments for the T. pityocampa transcriptome. Unigenes were classified into functional groups based on level 3 GO assignments as predicted for their involvement in (A) biological process and (B) molecular function. The number of unigenes assigned to each GO term is shown. To simplify the visualization of the results, only classes that represent more than 1% of the total number of classified sequences are included.
Figure 4. Venn diagram showing the number of orthologs shared between T. pityocampa and B. mori (Bm), D. melanogaster (Dm), and T. castaneum (Tc).
Unigenes in the transcriptome showing similarity to sequences from RNA viruses.
| Virus | Unigene | Sequence Length | BLAST Match | e-Value |
|---|---|---|---|---|
| Cytoplasmic polyhedrosis virus | TPUC35905_TC01 | 4102 | | 0.0 |
| | TPUC31106_TC01 | 3898 | | 0.0 |
| | TPUC32941_TC01 | 3698 | | 0.0 |
| | TPUC35768_TC01 | 3304 | | 0.0 |
| | TPUC32710_TC01 | 2780 | | 0.0 |
| | TPUC25109_TC01 | 1813 | | 0.0 |
| | TPUC37981_TC01 | 1914 | | 0.0 |
| | TPUC80863_TC01 | 1249 | | 0.0 |
| | TPUC98042_TC01 | 1058 | | 0.0 |
| | TPUC71859_TC01 | 924 | | 1.60E-175 |
| Iflavirus | TPUC51699_TC01 | 9816 | | 0.0 |
| | TPUC109136_TC01 | 357 | | 3.30E-58 |
| | TPUC143735_TC01 | 335 | | 2.35E-18 |
| | TPUC71578_TC01 | 328 | | 1.26E-49 |
| | TPUC147958_TC01 | 323 | | 1.15E-07 |
| | TPUC03472_TC01 | 298 | | 9.18E-56 |
| | TPUC100622_TC01 | 230 | | 3.18E-39 |
| | TPUC83511_TC01 | 225 | | 1.82E-37 |
| | TPUC151007_TC01 | 217 | | 3.84E-23 |
| | TPUC91335_TC01 | 214 | | 1.36E-37 |
| Rhabdovirus | TPUC44929_TC01 | 4501 | Maraba virus L polymerase protein | 0.0 |
| | TPUC38841_TC01 | 2830 | | 6.50E-11 |
| | TPUC14459_TC01 | 1900 | Jurona virus L polymerase protein | 1.09E-115 |
| | TPUC75494_TC01 | 1016 | Dolphin rhabdovirus L polymerase protein | 7.40E-12 |
| | TPUC37175_TC01 | 914 | | 5.31E-48 |
| | TPUC48042_TC01 | 895 | | 1.35E-15 |
| | TPUC98122_TC01 | 631 | | 1.51E-55 |
| | TPUC56532_TC01 | 516 | China fish rhabdovirus L polymerase protein | 7.95E-72 |
| | TPUC99390_TC01 | 361 | | 2.79E-34 |
Figure 5. T. pityocampa cypovirus 5 phylogenetic relationship, genome structure, and tissue distribution mapping. (A) Inferred Bayesian phylogenetic tree based on the nucleotide sequence of the polyhedrin gene (segment 10, 1071 nt) of the described Cypovirus species. Posterior probabilities are indicated on the branches. The scale bar indicates distance as the number of nucleotide substitutions per position. The host species order is shown in brackets (D: Diptera, L: Lepidoptera). (B) Schematic representation of the T. pityocampa cypovirus 5 genomic structure, including the mapping of virus reads in head, fat body, midgut and tegument (below). Segments are drawn to scale. Coverage histograms were obtained from BAM files after normalizing by the millions of reads obtained in each library.
Figure 6. Thaumetopoea pityocampa iflavirus-1 phylogenetic relationship, genome structure, and tissue distribution mapping. (A) Inferred Bayesian phylogenetic tree based on the amino acid sequences of the RdRp comprising the conserved domains I to VIII (359 aa) of members of the family Iflaviridae. Drosophila C virus, from the family Dicistroviridae, was used as the outgroup. Posterior probabilities are indicated on the branches. The scale bar indicates distance measured as the number of amino acid substitutions per position. The host species order is shown in brackets (Hy: Hymenoptera, He: Heteroptera, D: Diptera, L: Lepidoptera, O: Orthoptera). (B) Schematic genome representation of Thaumetopoea pityocampa iflavirus-1, including the mapping of virus reads in head, fat body, midgut and tegument (below). The conserved domains for the helicase (Hel), protease (Pro) and RNA-dependent RNA polymerase (RdRp) are indicated. The limits of the VP1–VP4 polypeptides were predicted by comparison with other iflaviruses. The hypothetical binding of the small viral protein VPg is included in the scheme. Coverage histograms were obtained from BAM files after normalizing by the millions of reads obtained in each library.
Figure 7. Phylogenetic relationships, genome structure, and tissue distribution mapping of rhabdovirus-like sequences detected in Thaumetopoea pityocampa. (A) Molecular phylogenetic analysis, within the family Rhabdoviridae, of the predicted Thaumetopoea pityocampa rhabdo-like virus, based on an alignment of 158 residues of domain III of the L protein and inferred by a Bayesian method using the WAG model. Posterior probabilities are indicated on the branches. The scale bar indicates evolutionary distance measured as the number of amino acid substitutions per position. The host species class is shown in brackets. (B) Schematic representation of the contig enclosing the L-protein sequence from the Thaumetopoea pityocampa rhabdo-like virus, including the mapping of reads in the four studied tissues: head, fat body, midgut and tegument (below). Asterisks indicate indel mutations that introduce a frameshift; "stop" indicates premature termination codons. Coverage histograms were obtained from BAM files after normalizing by the millions of reads obtained in each library.
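The per-library normalization mentioned in the captions of Figures 5 to 7 can be sketched as dividing each per-position read depth by the library size expressed in millions of reads. This is only an illustration under stated assumptions: the function name `normalize_coverage` and the depth values are hypothetical, and using the processed read counts from the sequencing table (rather than the raw counts) as library sizes is an assumption, since the captions do not say which count was used.

```python
def normalize_coverage(depths, library_reads):
    """Scale per-position read depths to depth per million library reads."""
    if library_reads <= 0:
        raise ValueError("library_reads must be positive")
    per_million = library_reads / 1_000_000
    return [depth / per_million for depth in depths]


# Processed read counts per tissue library, taken from the sequencing table.
# Assumption: these, not the raw counts, are the normalizing library sizes.
library_reads = {
    "MG": 78_680_476,
    "FB": 87_668_878,
    "T": 86_497_974,
    "HE": 89_424_098,
}

# Hypothetical depths at three positions of a viral contig (not real data).
example_depths = [0, 50, 100]

for tissue, reads in library_reads.items():
    scaled = normalize_coverage(example_depths, reads)
    print(tissue, [round(value, 3) for value in scaled])
```

Scaling in this way makes histograms from libraries of different sizes directly comparable, so a taller peak in one tissue reflects a higher share of viral reads rather than deeper sequencing.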