Literature DB >> 27576083

Combinatory annotation of cell membrane receptors and signalling pathways of Bombyx mori prothoracic glands.

Panagiotis Moulos1,2, Martina Samiotaki2, George Panayotou2, Skarlatos G Dedos3.   

Abstract

The cells of prothoracic glands (PG) are the main site of synthesis and secretion of ecdysteroids, the biochemical products of cholesterol conversion to steroids that shape the morphogenic development of insects. Despite the availability of genome sequences from several insect species and the extensive knowledge of certain signalling pathways that underpin ecdysteroidogenesis, the spectrum of signalling molecules and ecdysteroidogenic cascades is still not fully comprehensive. To fill this gap and obtain the complete list of cell membrane receptors expressed in PG cells, we used combinatory bioinformatic, proteomic and transcriptomic analysis and quantitative PCR to annotate and determine the expression profiles of genes identified as putative cell membrane receptors of the model insect species, Bombyx mori, and subsequently enrich the repertoire of signalling pathways that are present in its PG cells. The genome annotation dataset we report here highlights modules and pathways that may be directly involved in ecdysteroidogenesis and aims to disseminate data and assist other researchers in the discovery of the role of such receptors and their ligands.

Entities:  

Mesh:

Substances:

Year:  2016        PMID: 27576083      PMCID: PMC5004587          DOI: 10.1038/sdata.2016.73

Source DB:  PubMed          Journal:  Sci Data        ISSN: 2052-4463            Impact factor:   6.444


Background & Summary

Insect cell membrane receptors are excellent drug targets for pest control and management[1,2]. As such, an extensive amount of research has been carried out on various cell membrane receptors expressed in model insects, such as Drosophila melanogaster[3] and Bombyx mori[4,5], or insects that are major pests of crops or agents of animal and human diseases[6-9]. Identifying species-specific receptors expressed in cells crucial for the development of insects, such as the prothoracic gland (PG) cells, is even more important as these receptors can become targets for highly specialised, precise and species-specific drug discovery. The PG cells are the main site of synthesis of ecdysteroids from cholesterol and their subsequent secretion[10,11]. Research efforts have shown that the signalling pathways that govern ecdysteroidogenesis are quite complex and involve a very broad array of second messengers and signalling modules with a high degree of overlapping integration and redundancy, albeit a large number of unidentified ligands, receptors and signalling components[10,12-16]. Much work on the ligands and receptors that stimulate or inhibit ecdysteroidogenesis has been carried out in model insects such as Bombyx mori[12,14,17], whose extensive genome annotation is publicly available with genes mapped to chromosome and scaffold locations and the sequences of their coding protein(s)[18,19]. In this study we further annotated the existing dataset of Bombyx mori genes[18,19] (http://sgp.dna.affrc.go.jp/ComprehensiveGeneSet/) to include 1) genes expressed in the PG cells identified by our proteomic (using liquid chromatography-tandem mass spectrometry (LC-MS/MS)) and transcriptomic (RNA-seq) datasets (Fig. 1) as described in a supporting paper[20], 2) annotations of signalling pathways that are present in PG cells (Fig. 2), 3) annotations of Bombyx mori genes with the LocTree3[21] protein subcellular localisation prediction dataset (Data Citation 1) and 4) annotation of the coding sequences of the proteins (Data Citation 1), by integrating results of our bioinformatics, LC-MS/MS and RNA-seq analyses from PG samples from day 0 (V−0) and day 6 (V−6) of the 5th instar (Data Citation 1) and further analysis of the expression patterns of these receptors by quantitative PCR (qPCR).
Figure 1

Schematic workflow of the experimental design.

a: The figure shows the iterative approaches we followed to identify the cell membrane receptors expressed in prothoracic gland cells of Bombyx mori. False positive and false negative hits based on the liquid chromatography-tandem mass spectrometry (LC-MS/MS) and transcriptomic (RNA-seq) analyses were analysed by qPCR assays and further examined by visualization in the University of California, Santa Cruz (UCSC) Genome Browser. Abbreviations indicate the time of ecdysis (E) to the final larval stage, the time of head critical period (HCP; see text for details), the time of feeding cessation and onset of wandering (W) behaviour, the time of metamorphosis to pupa (P) and the time of initiation of apoptosis (A) by the prothoracic gland cells. b: Venn diagrams showing the results of liquid chromatography-tandem mass spectrometry (LC-MS/MS) and transcriptomic (RNA-seq) analyses of the 3 biological replicates from day 0 (V−0) and day 6 (V−6) from prothoracic gland cells of the 5th instar of Bombyx mori.

Figure 2

KEGG database-adopted signalling pathways identified in the prothoracic gland cells of the silkworm, Bombyx mori.

Signalling pathways from KEGG database were manually edited to highlight the signalling cascade components i) identified by our proteomic and transcriptomic data (yellow boxes), ii) identified by our transcriptomic (and qPCR analyses for the cell membrane receptors) but not by our proteomic data analyses (blue boxes), iii) not identified by both our proteomic and transcriptomic data analyses (green boxes) to be expressed by the prothoracic gland cells of the silkworm, Bombyx mori. White boxes indicate signalling cascade components not identified in the genome of the silkworm, Bombyx mori. Readers are referred to the KEGG database and Data Citation 1 for detailed description of the abbreviated terms.

To implement our workflow (Fig. 1a), first we generated experimental data and analysis of LC-MS/MS datasets coupled with experimental data and analysis of RNA-seq datasets (Fig. 1 and Data Citation 1,Data Citation 2 and Data Citation 3). We next created a Bombyx mori reference genome dataset (Data Citation 4) to map and visualise the RNA-seq data and, in parallel, we generated literature-based and bioinformatics-assisted annotated lists of cell membrane receptors present in the genome of Bombyx mori. Next, we validated the expression of cell membrane receptors in PG cells through qPCR and clarified the presence of false negative and false positive hits in our LC-MS/MS and/or RNA-seq datasets (Fig. 1a and Tables 1 and 2). Critical to our approach was the visualization of RNA-seq data through the University of California, Santa Cruz (UCSC) Genome Browser[22] track data hubs webpage, hosting the Bombyx mori reference genome (Data Citation 4), that allowed us to explain false positive and false negative hits (Tables 1 and 2) in our LC-MS/MS, RNA-seq and qPCR data.
Table 1

False negative results of cell membrane receptors expressed in prothoracic glands cells of Bombyx mori.

NoReceptor Typepublic ID (gene)chr. IDchr. start positionchr. end positionProtein descriptionIdentified by qPCRIdentified by RNA-seqIdentified by LC-MS/MSrpgm V−0rpgm V−6False negative reasonForward primerReverse primer
The 29 genes presented in this table were not identified by both our transcriptomic (RNA-seq) and proteomic (LC-MS/MS) analysis and conclusive evidence of their expression came from the quantitative PCR (qPCR) assays using the indicated primers. Abbreviations: GPCR: G protein-coupled receptor; RTK: receptor tyrosine kinase; RTSK: receptor serine/threonine kinase; RTGC: receptor type guanylate cyclase; Chr: Chromosome; rpgm: reads per gene model; V−0: Day 0 of the 5th instar; V−6: Day 6 of the 5th instar.              
1Class A GPCRBMgn003863chr12168430721697706Neuropeptide receptor A180.0010.005Expressed on days other than V−0/V−6 ATGAAGGAGGAGATGGAGAGG CCGAAGTTGGTGTTGAAGAAC
2Class A GPCRBMgn010037chr7698359701773Putative prostanoid receptor0.0330.017Incorrect annotation TGCAGTCTATGAACCACAACG TGACACAGAGTGGAGGGAGATAC
3Class A GPCRBMgn006929chr101104427811051377Serotonin receptor 4 (Ser-4)0.0190.003Expressed on days other than V−0/V−6 ACGAGCCCCGAAAAACAATC CTTCACAATCACACGTCGGAAC
4Class A GPCRBMgn011918chr1162343276250382Serotonin receptor 1A0.0060.021 AAACTGCTACGTTGCAAGCG AATCACCCCTCTCCGAATCAAC
5Class A GPCRBMgn011934chr1166884206738515Protein trapped in endoderm-1-like/Trehalose receptor-like0.0030.019 CCTCAGCTGTGTCAACTTTTCG TCGTGCACGAAAACGTGTTC
6Class A GPCRBMgn012114chr111717622317270624Neuropeptide receptor A100.0190.020 GAAGATGCGGAAAGAGAACG TCGTTGAGCCAAGCATAGAG
7Class A GPCRBMgn010612chr121196844012002234Diapause hormone receptor0.0080.023 TCAAGATGACCCTGTGCTG TTCGGCTGTCTGGTTTGC
8Class A GPCRKAIKOGA006380chr181056090110565919Orphan G protein-coupled receptor (moody-like)0.0020.014No public ID AGCTGCTTTATCCTGCCCTTAG ATCGGCAGCAACCAAATGAC
9Class A GPCRBMgn008534chr181083344610842187Orphan G protein-coupled receptor (moody-like)0.0470.049Expressed on days other than V−0/V−6 GCACCGTAGACAAAAGATCCAC ACTGCGCGTTCATTATCACG
10Class A GPCRBMgn004033chr1978767907923680Neuropeptide receptor A30.0230.026 GGCAGCCTATCTTCTATTCCAC AAGGATCATCGGACCTGTTG
11Class A GPCRBMgn000093chr2466638166707304Sex peptide receptor0.0210.006 TACATCGCACCGTTTCTGC AACATCGCTGGAATGACCTC
12Class B GPCRBMgn000547chr194873029507493Neuropeptide receptor B40.00090.0007 TCCTTCCTATTTCTGGTGAACG CTTGTGAGCAACGCTGAAAC
13Class B GPCRBMgn009926chr81879717018877387Neuropeptide receptor B10.0040.006 ACAAGGAATGCACTGCGAAC TTGTTCACCGCAAACGAAC
14Class B GPCRBMgn012453chr995276119546519Neuropeptide receptor B30.0050.010 AGGCCGAACACAACAGATTG TTCTACCGGAACTAGGGTCAAC
15Unclassified GPCRBMgn010416chr1251282185148676Progestin and adiponectin receptor family member 3-like0.0280.022 GTTGCTCGCCATATACACATCG TATCACACGGGGGAAGAACATC
16RTKBMgn005972chr459406255947929Receptor tyrosine-protein kinase Ror-like0.0010.0006 TTTGCTTGCGAGTTGATGCG CGGGCTCATGATTTTCATCACC
17RTKBMgn003800chr42035234620366934Venus kinase receptor 20.0360.028 AAACCGGACTCTACGACTGAG TTGGCTTTCGTTGCGTCTTC
18RTKBMgn002597chr557198215726335Receptor tyrosine kinase (RYK) Dnt-like-2 (Doughnut)0.0160.045 GTTAGCATTGAGGACCAAACGG ATCCACTATGCAGTTCCTAGCG
19RTKBMgn008036chr96431975212Receptor tyrosine kinase Cad96Ca-like0.0080.006 ACCCTGAAAGAGAATGCCTCAG AGATCACGCGAGGTAAGGAAC
20RTKBMgn000354chr221033222410334147Neurospecific receptor kinase (Nrk/HOP)NANAIncorrect annotation CACCGTGAAGATCTACGATTGC TCCACCTTACTGGGATTGCATC
21RTKBMgn010869chr222316712523172280Receptor tyrosine kinase Cad96Ca-like 20.0010.0009Expressed on days other than V−0/V−6 GTGTTGGCAGGATTGGGTTTC TTGGCCCAGTAACGTCTGTAG
22RTKBMgn011535chr231010658010130411Proto-oncogene tyrosine kinase ROS-like (Sevenless)0.0020.006 TGGTATGAGGAAACGGGTCATC ATTTGCAGCCACTCATGTCG
23RTSKBMgn009697chr243293634343148Similar to wishful thinking0.0390.031 CAGGGCCAGTATCAATTTGGAG AGCGGTTGTGGTTCTTGTTG
24RTGCBMgn006801chr101029105010306675Receptor type guanylate cyclase 32E-like isoform0.0140.020 CGGCGAAGAGTTGTTTCGTC TCAGCACGTGACCAATAAATGC
25Other cell membrane receptorsBMgn002138chr137650953884662Furrowed homologue0.0080.007 GTTTGCAATAGGGACGGCAAG TTGGCAATTTCCCGCAATCG
26Other cell membrane receptorsBMgn002495chr91454033614576082Similar to lipocalin-1 interacting membrane receptor (limr)0.0230.024 AAAGAAGAATGCCGCATCGC ACCGCTTTGATCCCGATGAG
27Other cell membrane receptorsBMgn006572chr1056389055646813Patched domain-containing protein 3-like0.0050.017 AAATGACAACGGACGCAAGC TGTCACTGGTTTTGCGTGTTG
28Other cell membrane receptorsBMgn007929chr1523501502409987Similar to Notch0.0030.023 AACTGTGCGACGTCGAAATG AACCTTTGGCGCATTGACAC
29Other cell membrane receptorsBMgn008552chr181188810911902975Interference hedgehog0.0170.013 GGAAACGCCGAAAACTTTGC TATGTGGGGAGGGATTTCTTGC
Table 2

False positive results of cell membrane receptors that are not expressed in prothoracic glands cells of Bombyx mori.

NoReceptor Typepublic ID (gene)chr. IDchr. start positionchr. end positionProtein descriptionIdentified by qPCRIdentified by RNA-seqIdentified by LC-MS/MSrpgm V−0rpgm V−6False positive reasonForward primerReverse primer
The 19 genes presented in this table were identified by either our transcriptomic (RNA-seq) or proteomic (LC-MS/MS) analysis but the evidence of their expression were not supported by the quantitative PCR (qPCR) assays using the indicated primers. Critical to the assessment of these hits as false positives was the visalisation of the reads in the University of California. Santa Cruz (UCSC) Genome Browser (see text for details). Abbreviations: GPCR: G protein-coupled receptor; RTGC: receptor type guanylate cyclase; Chr: Chromosome; rpgm: reads per gene model; V−0: Day 0 of the 5th instar; V−6: Day 6 of the 5th instar.              
1Class A GPCRBMgn012112chr111747858117480570Neuropeptide receptor A7√ (V−6)0.00050Incorrect entry in Uniprot ACCGGCAAGAACAATGAACG TGTCTGCATCGCCTTGTTTC
2Class A GPCRBMgn010894chr222180291921873834Neuropeptide receptor A30√ (V−6)0.0010.001Incorrect entry in Uniprot GTTCCGGACACACAAGTAAACG TTCTTCGATGTGAGGACGTGAG
3Class A GPCRBMgn002244chr2690248599062392Neuropeptide receptor A6-A√ (V−6)0.0030.003Possible LC-MS/MS artefact AGCGGAAACCAACAGCAAAG TTACGGCGCAACGAATTAGC
4Class A GPCRBMgn001412chr211208693312126114Neuropeptide receptor A28/ACP receptor√ (V−6)0.0060.007Possible LC-MS/MS artefact TTGGGCGTTCGCTTTATTGC TTTGACGGCGGTTCCTTTTC
5Class A GPCRBMgn012539chr975109107518997Opsin√ (V−0)0.0010.0006Possibly expressed in the 4th instar or LC-MS/MS artefact GACCAGCTTTAGGCAAAACTGG AGCTTGTCAGGAATCCTTCGG
6Class A GPCRBMgn007930chr1523348652343080Serotonin receptor (1D-like)0.0660.003Reads aligned in intron TCCTTCGCCGAAGCATTTTC CCACATTTACCACGACAGCATC
7Class A GPCRBMgn004428chr2020307772152042Neuropeptide receptor A50.1871,840Reads aligned in intron ACACCGCTTTGGAAATTCGG TCGAAGTGAAACAGCGAACG
8Class B GPCRBMgn014256chr8261912058Metabotropic glutamate receptor 2-like√ (V−0)0.00020.0004Possibly expressed in the 4th instar or LC-MS/MS artefact TCCTGGTGTGTACAGTATACGC ACCTTGTTCGCCAACTTTCG
9Odorant receptorBMgn014699chr11237743712385061Odorant receptor (BmOr-1)0.0010.246Reads aligned in intron ATGTCATCACAACGCCCAAC TCAAGGTTAAGGGTCCGTAACG
10Odorant receptorBMgn017182chr251112783511131624Odorant receptor (BmOr-40)√ (V−6)0.010.01Possible LC-MS/MS artefact GTTTTGAACCCTTCGGAGGTTC TTGTGAGTTGCCCGACATTG
11RTGCBMgn002197chr151949806119508229Guanylate cyclase√ (V−0)0.00020.0007Possibly expressed in the 4th instar or LC-MS/MS artefact ATCATGCAGGAGGTCATAGTGC GAGCATTTCGAACATGGAGTCG
12IntegrinBMgn010454chr1225612622592387Integrin alpha 3√ (V−0)0.0010.007Incorrect entry in Uniprot AGAAGGCGCAAGAAGCTTTG TTAACGTGTCGCCACAAAGG
13IntegrinBmgn006002chr445023104508762Integrin beta 2√ (V−6)0.00010.0003Possible LC-MS/MS artefact AAACGTGGCCGAAGATTTGG AACCTCATTCTCGTCTGACCAC
14Immunity related receptorBMgn006244chr61121617111233908BmToll 12√ (V−0)(V−6)0.0020.002Possible LC-MS/MS artefact GTACACAAACGCGATCAGGTG TTGTGTTCGTATCGCGTTGG
15Immunity-related receotorBMgn000673chr194149359415375BmPGRP-L60.1770.163Possible incorrect annotation in genome CAGAGCCAATGTGTTCTTCGTG TTCCGGCAAGCTTCCAATTG
16Immunity related receptorBMgn011085chr2376982237702050BmToll 8√ (V−0)(V−6)0.0330.042Possible LC-MS/MS artefact ACGCTTTTGTGTGCTACAGC ATCAGTCTTTTCCGTCGGTCTC
17Immunity related receptorBMgn011216chr231895386818963731BmToll 9–1√ (V−6)0.000030.00003Possible LC-MS/MS artefact ATTGCAACATCAGCGTCTGC TCGTCAGCATCAGATAGTGGAG
18Other cell membrane receptorsBMgn001256chr1365495956569328Low density lipoprotein receptor-related protein 4 (LRPL)√ (V−6)0.010.01Possible LC-MS/MS artefact TTCCGTTGTGGTTGCATGAC TTGCGCAACACGGATAACTC
19Other cell membrane receptorsBMgn002138chr137650953884662Furrowed/Selectin/Sushi. von Willebrand factor type A. EGF and pentraxin domain-containing protein 1√ (V−6)0.0080.007Possible LC-MS/MS artefact GTTTGCAATAGGGACGGCAAG TTGGCAATTTCCCGCAATCG
All these datasets were integrated back into i) the annotated list of Bombyx mori genes that is publicly available (Fig. 1 and Data Citation 1) ii) the publicly available KEGG[23] signalling pathways of Bombyx mori that we further annotated, visualized and present in Fig. 2 iii) the UCSC Genome Browser[22] track data hubs webpage where results of the RNA-Seq samples analyses can be visualised by pasting the following link (http://epigenomics.fleming.gr/tracks/hs_trackhubs/ekpa_dedos_2/hub.txt) and then pasting the coordinates of a gene of interest. We provide a substantially increased list of Bombyx mori annotated genes (Data Citation 1) integrated with our LC-MS/MS and RNA-seq datasets (Fig. 1b) and other bioinformatic annotations of Bombyx mori genes (http://epigenomics.fleming.gr/metaseqr_runs/HS/000004/) and most importantly we identify and visually depict here (Fig. 2) the signalling pathways in PG cells reported in a supporting paper[20]. Our datasets can be valuable tools for researchers who want to study the role of signalling pathways in PG cells or conduct comparative studies on the presence of cell membrane receptors in other insect species. Additional information for a comprehensive understanding of the cell membrane receptors and the signalling pathways in PG cells is described in a related research paper[20].

Methods

Animals

The hybrid J106xDAIZO of Bombyx mori was used in this study. In this hybrid, the 5th instar period lasts about ~208 h, the onset of pupal commitment occurs after 60 h (day 3) and the onset of wandering behaviour occurs 144 h (day 6) after the final larval ecdysis. This hybrid has a short period of cocoon spinning that lasts ~38 h followed by a period of ~26 h before pupal metamorphosis. In this study, each day of the 5th (V) instar is designated with its numerical number (i.e., V−0, V−1 etc.) while the first day of the pupal stage is designated as P−0. Larvae were reared on fresh mulberry leaves under a 12:12-L:D photoperiod at 25±1 °C and 60% relative humidity. Larvae were staged after every larval ecdysis, and the day of each ecdysis was designated as day 0. Since larvae mainly moult to the final (5th) instar during the scotophase, all larvae that ecdysed during the scotophase were segregated immediately after the onset of photophase. This time was designated as 0 h of the 5th instar and 4 h later samples of prothoracic glands were taken (day 0 samples) while samples of prothoracic glands from day 6 were taken 144 h later, at the onset of wandering behaviour.

Experimental design

The aim of our present study was to provide a thorough map of signalling cascades present in the PG cells of Bombyx mori during the crucial final larval stage and the onset of the pupal stage before these cells initiate apoptosis (Fig. 1a). We have chosen to analyse PG samples from day 0 and day 6 of the 5th instar of Bombyx mori because there are striking differences in the hormonal milieu in these two developmental time points. On day 0, PG cells secrete very low amounts of ecdysteroids while the juvenile hormone titre is high[24,25], whereas on day 6 PG cells secrete high amounts of ecdysteroids while the juvenile hormone titre is low[24,25] while in both days the PG cells are not fully stimulated by prothoracicotropic hormone[25]. In addition, the onset of wandering behaviour on day 6 is the safest benchmark that the animals have been developing orderly and there has been extensive research carried out on the signalling pathways of PG cells of Bombyx mori on day 6. We identified the receptors that participate in these cascades, determined their expression profile during this developmental stage and then illustrated and linked the expression of these receptors to KEGG database signalling pathways[23,26] present in PG cells (Fig. 2). The datasets presented in this study were generated from PG cells from day 0 and day 6 (onset of wandering stage) of the final larval instar, the 5th instar (Fig. 1a,b). A set of 3 biological replicates of PG samples from day 0, and 3 biological replicates of PG samples from day 6 were subjected to proteomic analysis by liquid chromatography-tandem mass spectrometry (LC-MS/MS) and an identical set was subjected to transcriptomic (RNA-seq) analysis (Fig. 1). To isolate the PGs, larvae were anesthetized by submersion in water and the two PGs of each larva were dissected rapidly (~2 min/animal) in sterile saline (0.85% NaCl). For LC-MS/MS analysis, following pre-incubation for 15–30 min in Grace’s medium (Invitrogen), glands were meticulously cleared of any associated tissue or debris, pooled and successively transferred to gradually diminishing volumes of Grace’s medium drops (n=5) before being snap frozen in dry ice and stored at −80 °C before further processing. For RNA-seq analysis, total RNA was isolated from PGs from day 0 and day 6 of the 5th instar as described below. PGs were meticulously cleared of any associated tissue or debris and total RNA was immediately extracted with TRIzol (Invitrogen) according to the manufacturer’s instructions. The lllumina mRNA-Seq Sample Prep Kit was used according to the manufacturer’s instructions (1,004,898 Rev.D). Briefly, using oligo-dT magnetic beads, mRNA was isolated from total RNA and after mRNA fragmentation, cDNA was synthesised, ligated with sequencing adapters and amplified by PCR. Quality and yield after sample preparation was measured with the Agilent 2,100 Bioanalyzer (Agilent Technologies). Resulting products size distribution had a broad peak between 200–500 bp on a DNA 1,000 chip. Next, 17 pM of DNA was used for clustering and DNA sequencing on lllumina cBot and HiSeq2500 (HCS v2.2.58 software) according to the manufacturer’s protocols. For qPCR, total RNA was isolated from PGs from each day of the 5th instar and the 1st day of the pupal stage (P−0; Fig. 1) as described above (n=7 for each day of the investigated developmental stage). Using 200 U Superscript III reverse transcriptase (Invitrogen) in 20 μl reaction volumes, first strand cDNA was synthesized from 2 μg total RNA with an oligo(dT)20 primer (Invitrogen) according to manufacturer’s instructions (n=7 for each day of the investigated developmental stage) and used in qPCR. MIQE guidelines-adopted[27] complete protocols of the quantitative PCR are fully described in our supporting paper[20].

Proteomic analysis of Bombyx mori PG cells

Prothoracic glands were resuspended in 150 μl lysis buffer containing 100 mM Tris-HCl, pH 7.6, 4% SDS and freshly made 100 mM DTT. Samples were incubated for 3 min at 95 °C, followed by 20 min incubation in a sonication water bath and then centrifuged at 17,000×g for 30 min at 4 °C. Protein extracts were processed according to the Filter Aided Sample Preparation (FASP) protocol[28] using spin filter devices with 10kDa cutoff (Sartorius, VN01H02). The 150 μl lysate was diluted in 8 M Urea/100 mM Tris-HCl pH 8.5, the filters were extensively washed with the urea solution, treated with 10 mgml−1 iodoacetamide in the urea solution and incubated for 30 min in the dark for cysteine alkylation. Proteins on the top of the filters were washed three times with 50 mM ammonium bicarbonate and finally digested by adding 1 μg trypsin/LysC mix in 80 μl of 50 mM ammonium bicarbonate solution (Mass spec grade, Promega) and incubated overnight at 37 °C. Peptides were eluted by centrifugation and upon speed-vac-assisted solvent removal eluted peptides were reconstituted in 0.1% formic acid, 2% acetonitrile in water and transferred into glass sample vials. Peptide concentration was determined by nanodrop absorbance measurement at 280 nm and 2.5 μg peptides were pre-concentrated with a flow of 3 μlmin−1 for 10 min using a C18 trap column (Acclaim PepMap100, 100 μm×2 cm, Thermo Scientific) and then loaded onto a 50 cm C18 column (75 μm ID, particle size 2 μm, 100 Å, Acclaim PepMap RSLC, Thermo Scientific). The binary pumps of the HPLC (RSLCnano, Thermo Scientific) consisted of solution A (2% (v/v) acetonitrile in 0.1% (v/v) formic acid) and solution B (80% acetonitrile in 0.1% formic acid). The peptides were separated using a linear gradient of 4% solution B up to 40% in 450 min for an 8 h gradient run with a flow rate of 300 nl/min. The column was placed in an oven operating at 35 °C. For LC-MS/MS, purified peptides were analysed by HPLC MS/MS coupled to an LTQ Orbitrap XL Mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA) equipped with a nanospray source. Full scan MS spectra were acquired in the orbitrap (m/z 300–1,600) in profile mode and data-dependent acquisition, with the resolution set to 60,000 at m/z 400 and automatic gain control target at 106 ions. The six most intense ions were sequentially isolated for collision-induced (CID) MS/MS fragmentation and detection in the linear ion trap. Dynamic exclusion was set to 1 min and activated for 90 s. Ions with single charge states were excluded. Lock mass of m/z 445, 120025 was used for internal calibration. Xcalibur (Thermo Scientific) was used to control the system and acquire the raw files. Peptides were identified using the Proteome Discoverer 1.4 software (Thermo Scientific). The Orbitrap raw data (Data Citation 3), with peak S/N threshold set to 1.5, were searched using SEQUEST HT against the Uniprot Bombyx mori entries (14,788 sequences; Data Citation 5) with strict trypsin specificity and with maximum two missed cleavages and variable modifications of methionine oxidation, deamidation of glutamine and asparagine residues and acetylation of the N-terminus. Carbamidomethylation of cysteines was set as static modification. The identified peptides were filtered based on their Xcorr values versus peptide charge states (XCorr >2 for charge state +2 and XCorr >2.5 for charge state +3). The raw results without any further processing to remove false positives and duplicate entries are presented in Data Citation 5.

The Bombyx mori reference genome

For our analyses, a Bombyx mori reference genome was assembled[20] (Data Citation 4) by incorporating assembled scaffolds anchored to chromosomes (http://sgp.dna.affrc.go.jp/pubdata/genomicsequences.html) and genome contigs assembled to scaffolds, but not anchored to chromosomes. Scaffolds that were less than 20 kb in length were excluded. The remaining sequences (chromosomes and scaffolds) were merged to a final FASTA file which was used to construct a Bowtie2[29] (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) index for subsequent use with TopHat2[30] and Bowtie2[29] aligners. In addition, a comprehensive gene set was constructed from genes anchored to chromosomes and additional genes that were inferred in the scaffold sequences and are available in KAIKObase (http://sgp.dna.affrc.go.jp/KAIKObase/). This comprehensive gene set was used to construct a gene file in GTF format to supply it to the TopHat2[30] aligner. The source code and all files pertinent to this analysis are included in Data Citation 4.

Sequencing and alignment of short reads

Image analysis, base calling, and quality check were performed with the lllumina data analysis pipeline RTA v1.18.64 and Bcl2fastq v1.8.4. Reads were on average 13.26 Gb for day 0 (V−0) samples (n=3) and 13.69 Gb for day 6 (V−6) samples (n=3) (Data Citation 2), 90.6% of clusters passed lllumina filters[27] and percentage of bases with Q-score ≥30 were 89.36%. The resulting FASTQ files containing pair-end 125 bp sequence reads were subjected to quality control using the FastQC[31] package and mapped on the reference genome using TopHat2[30] with the standard parameters for reads obtained with Illumina platforms apart from the following: • the --GTF parameter was supplied with additional transcript annotation data for the Bombyx mori reference genome as described above. • the --mate-inner-dist and --mate-std-dev parameters which are crucial for paired-end reads were estimated from the Bioanalyzer reports provided from the sequencing of each sample. • --read-gap-length and --read-edit-dist were set to 3 (default is 2) to allow some more freedom in the strictness of the overall alignment procedure since the Bombyx mori genome is not fully annotated yet. After completing a first round of spliced alignment with TopHat2, reads which failed to map to the reference genome were supplied to the bedtools bamtofastq command from the BEDTools suite (https://github.com/arq5x/bedtools2) to create a FASTQ subset of the original raw short reads. These short reads subset was subjected to a second round of unspliced alignment with Bowtie2 in sensitive mode (options applied: --local --very-sensitive-local --maxins 1,000 --dovetail) to allow mapping of part of this subset back to the reference genome. Mapping of this subset back to the reference genome occurred when a sufficient and continuous proportion of each read (first 50 bases) was successfully aligned. This procedure allowed for the alignment of paired-end reads located quite further than the average pair distance. These two rounds of alignment procedure led to increased alignment rates.

Statistical analysis

Statistical analysis was performed using the Bioconductor package metaseqR[32]. Specifically, the BAM files, one for each RNA-seq sample, were summarized to a gene read counts table, using the Bioconductor package GenomicRanges[33]. In the final read counts table, each row represented one gene, each column one RNA-Seq sample and each cell the corresponding read counts associated with each row and column. The gene counts table was normalized for inherent systematic or experimental biases (e.g., sequencing depth, gene length) using the Bioconductor package edgeR[34] after removing genes that had zero counts over all the RNA-seq samples (891 genes). The output of the normalization algorithm was a table with normalized counts. Prior to the statistical testing procedure, the gene read counts were filtered for possible artefacts that could affect the subsequent statistical testing procedures. Genes presenting read counts below the median read counts of the total normalized count distribution (7,139 genes with cut-off value 199 normalized read counts) were excluded from further analysis. The total number of genes excluded due to the application of gene filters was 8,051. The resulting gene counts table was subjected to differential expression analysis for the contrasts day 0, 5th instar versus day 6, 5th instar using the Bioconductor packages DESeq[35], edgeR[34], limma[36], NBPSeq[37], NOISeq[38] and baySeq[39]. To combine the statistical significance from multiple algorithms so as to optimize the trade-off between true positives and false hits, we applied the PANDORA[32] weighted p-value method across all results. Setting the p-value (FDR or adjusted p-value) threshold to 0.05, we identified 5,539 differentially expressed genes with statistically significant p-value and of these 928 were up-regulated, 1,077 were down-regulated and 3,534 were not differentially expressed according to an absolute fold change cut-off value of 1 in log2 scale.

RNA-seq data visualisation

To create the UCSC Genome Browser visualization track data hub (http://epigenomics.fleming.gr/tracks/hs_trackhubs/ekpa_dedos_2/hub.txt), BAM files resulting from the alignment procedure were converted to BED format (https://genome.ucsc.edu/FAQ/FAQformat.html#format1) using the bedtools bamtobed command from the BEDTools suite (http://bedtools.readthedocs.org/en/latest/index.html#) with the -split option to report RNA-seq reads split by the TopHat2 algorithm as separate alignments, referred hereafter as ‘tags’. The RNA signal from these files was extracted by reformatting them in BedGraph (http://genome.ucsc.edu/goldenPath/help/bedgraph.html) format using the bedtools genomecov command from the BEDTools suite with the -bg option and then to bigwig (https://genome.ucsc.edu/goldenpath/help/bigWig.html) format using the bedGraphToBigWig program supplied by UCSC. The bigWig tracks were visualized in a custom UCSC Genome Browser track data hub, hosting the Bombyx mori (bmori2) reference genome and the normalized (total signal of 1010) RNA-seq samples. The track data hub is available at http://epigenomics.fleming.gr/tracks/hs_trackhubs/ekpa_dedos_2/hub.txt and this link must be pasted in the My Hubs tab of the UCSC Genome Browser application.

Signalling pathway analysis and annotation through KEGG database

KEGG pathway maps (http://www.genome.ad.jp/kegg/pathway.html) where available for the silkworm Bombyx mori[23,26] or the fruit fly Drosophila melanogaster[23] were used to identify proteins that are components of each pathway and examine the presence of each protein in our LC-MS/MS datasets (Data Citation 1,Data Citation 3 and Data Citation 5) and its coding gene expression in our RNA-seq datasets (Data Citation 1 and Data Citation 2). Each component of the map was subjected to blastp at the KAIKObase[18] website to identify the corresponding gene in our dataset. Signalling pathway maps were manually curated to highlight the different features we identified (Fig. 2). Each of the illustrated pathways in Fig. 2 met the criterion that it had at least three of its components identified in our LC-MS/MS datasets and was not previously reported to be comprehensively present in PG cells of Bombyx mori.

Code availability

The code used to analyse the RNA-seq data, described at various levels herein, is available in an archive within Data Citation 4 with further explanations in this link (http://epigenomics.fleming.gr/metaseqr_runs/HS/000004/). Briefly, the code consists of: i) Custom Perl and Linux shell scripts used to 1) create the Bombyx mori reference genome assembly for RNA-seq short read alignment, 2) create the GTF gene file and 3) generate files suitable for custom visualization of the UCSC Genome Browser. ii) Custom Linux shell scripts that can be used to reproduce the two rounds of alignment procedure described in ‘Methods’. iii) Custom R script that can be used to reproduce the differential expression analysis.

Data Records

The annotated list of Bombyx mori genes file (Data Citation 1) is available on Figshare (doi:10.6084/m9.figshare.3420235). The raw reads of our transcriptomic (RNA-seq) data have been deposited to the NCBI Short Read Archive (SRA, http://www.ncbi.nlm.nih.gov/sra/) under accession number SRP062258 (Data Citation 2). This record combines the 3 biological replicates from day 0 (http://www.ncbi.nlm.nih.gov/biosample/SAMN03978782) and the 3 biological replicates from day 6 (http://www.ncbi.nlm.nih.gov/biosample/SAMN03978783) presented in this study. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE[40] partner repository with the dataset identifier PXD004265 and 10.6019/PXD004265 (Data Citation 3). The Bombyx mori reference gene file in GTF format together with a series of files centred around the Bombyx mori genome (Data Citation 4) is available on Figshare (doi:10.6084/m9.figshare.3420412). The Proteome Discoverer data analysis files (Data Citation 5) are available on Figshare (doi:10.6084/m9.figshare.342031).

Technical Validation

Assessment of false negative and false positive hits of Bombyx mori cell membrane receptors in RNA-seq and LC-MS/MS datasets

Based on the initial bioinformatic identification of 369 cell membrane receptors present in the genome of Bombyx mori, we analysed by qPCR the expression of 339 transcripts out of the initial 369 cell membrane receptor genes. Due to high sequence similarity between some transcripts a total of 30 genes could not be analysed by qPCR[20]. Of those 339 genes that were analysed by qPCR, 29 genes were found to be false negative, i.e., expressed in the PG cells (Table 1), mainly because their expression showed peaks on days other than day 0 (V−0) or day 6 (V−6) from which the RNA-seq and LC-MS/MS samples were derived. In 2 of the 29 instances, the RNA-seq data returned negative results due to incorrect annotation of the Bombyx mori genome and in 1 instance (i.e., putative prostanoid receptor; Table 1), gene BMgn010037 is annotated as only a fragment of the full open reading frame. The RNA-seq and LC-MS/MS data was more complicated for the false positive results, i.e., receptors that were not found to be expressed in the PG cells by qPCR assays (Table 2). These cases were 19 in total (Table 2). In 3 cases the reason for being false positives could be attributed to no longer being expressed in the 5th instar but were expressed in the 4th instar (Table 2). In 3 cases the RNA-seq reads were aligning with sequences found within introns while in 3 other cases the LC-MS/MS data returned hits incorrectly annotated in the UniProt Bombyx mori sequence database (14,788 sequences). In the remaining 10 cases the reason for getting negative results in qPCR assays and the RNA-seq data is probably due to a combination of LC-MS/MS analyses artefacts and incorrect annotation of the Bombyx mori genome.

Quantitative PCR analysis of false negative and false positive hits of Bombyx mori cell membrane receptors in RNA-seq and LC-MS/MS datasets

The MIQE guidelines-adopted[27] complete protocols of the qPCR assays are fully described in our supporting paper[20]. All primers, including those shown in Tables 1 and 2, were designed with an online tool (http://primer3plus.com/cgi-bin/dev/primer3plus.cgi) using the following custom settings: 1) amplicons size range 230–270 bp, 2) primer size minimum 18 bases, optimum 20 bases, maximum 23 bases, 3) primer Tm minimum 59 °C, optimum 60 °C, maximum 62 °C, maximum difference 1 °C, 4) Primer GC% minimum 30, optimal 50, maximum 80, and the other settings were left to default values. Among the various combinations of primers returned by the online tool, we selected the pair that combined the following criteria: 1) minimal penalty value, 2) amplicon at the 3′ end of the cDNA sequence, 3) unique hit in the NCBI Primer-BLAST webpage (http://www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC=BlastHome) using the RefseqmRNA database and Bombyx mori as the organism, 4) unique hit in the KAIKObase website using Blastn.

Usage Notes

The annotation of KEGG pathway maps we provide in Fig. 2 can serve as a reference tool for other researchers who want to analyse the contribution of each of these signalling pathways to ecdysteroidogenesis. However, it is critical to note that all these signalling modules may not simply serve ecdysteroidogeneis, i.e., ecdysteroids secretion can not be the only readout of the activation or inhibition of components of each pathway in PG cells. The perplexing array of these pathways and the scarcity of data regarding other functions that PG cells may carry out in the insect body makes it difficult at present to assign discrete roles to each one of them and maybe there can not be a single role to be assigned to each one of them. For example, the Hedgehog signalling pathway has been linked to ecdysteroidogenesis in Drosophila melanogaster[13] but does this link mean that the Wnt and the TGF signalling pathways (Fig. 2) are also involved in ecdysteroidogenesis? Data Citation 2 contains the raw RNA-seq data from Bombyx mori PGs. The files have been analysed with up-to-date software packages such as TopHat2[30] and Bowtie2[29] aligners for short read mapping to the Bombyx mori reference genome, and the Bioconductor package metaseqR with the PANDORA[32] method for normalization and statistical analyses. Our data can be used to examine the performance of other short read aligners or it can be a suitable resource in studies that map differential gene expression in various tissues or organs of this or other insect species. Data Citation 3 contains the raw proteomics data from Bombyx mori PGs. The files have been analysed with the software package Proteome Discoverer 1.4. We believe that extending the data analysis to obtain quantitative protein profiles from the existing raw files using available proteomic analysis platforms would be quite interesting because it will provide a dimension of protein abundance in PG cells. Furthermore, identification and differential quantitation of post-translational modifications will be of major importance. Being datasets derived from control, untreated whole tissue extracts, our data can be a suitable resource when combined with future studies on organelle-enriched proteomic analysis or phosphoproteomic analysis of ecdysteroidogenesis in insects. We believe that the publicly available custom track data hub resource on UCSC server (http://epigenomics.fleming.gr/tracks/hs_trackhubs/ekpa_dedos_2/hub.txt) hosting our RNA-seq data can be the appropriate visual tool in deciphering gene and exon/intron boundaries of yet uncharacterized Bombyx mori genes and identifying previously unidentified genes through the extensive visual inspection features of the tracks. We encourage other research teams to compare our Proteome Discoverer version 1.4. data analysis files (Data Citation 5) with different or more up-to-date software and analyse our raw data (Data Citation 3) to yield further insight than we already report. The annotated gene list of Bombyx mori that we provide here (Data Citation 1) can be integrated with the publicly available datasets on KAIKObase[18] and can serve as a template for further annotation of Bombyx mori genes. The Bombyx mori reference gene file in GTF format (contained within Data Citation 4) can be the appropriate resource to use in follow-up research on the expression profiles of other genes that participate in ecdysteroidogenesis. Data Citation 4 contains i) the assembled Bombyx mori genome as described in the ‘Methods’ section, ii) the assembled Bombyx mori transcriptome assembled like the Bombyx mori genome and used for optimizing the short read alignment procedure, iii) the assembled Bombyx mori genes in GFF, GTF and text tab-delimited format, iv) bowtie2 indexes for the Bombyx mori genome, v) bowtie2 indexes for the Bombyx mori transcriptome and vi) custom code used for the analysis. With the full report produced by the Bioconductor package metaseqR[32] (http://epigenomics.fleming.gr/metaseqr_runs/HS/000004/), researchers may explore a rich set of quality controls regarding the raw data (Data Citation 2), as well as several analytics regarding sample quality, normalization effects and differential expression analysis. In addition, researchers may download the full list of Bombyx mori genes coupled with RNA abundance values and other quality metrics as well as all the necessary parameters that will help them reproduce the differential expression analysis.

Samples, subjects, and data outputs

Detailed information accounting for each sample, the data-generating assays applied to each sample and the resulting data outputs are provided in Data Citation 1.

Additional Information

How to cite this article: Moulos, P. et al. Combinatory annotation of cell membrane receptors and signalling pathways of Bombyx mori prothoracic glands. Sci. Data 3:160073 doi: 10.1038/sdata.2016.73 (2016).
  36 in total

1.  KEGG: kyoto encyclopedia of genes and genomes.

Authors:  M Kanehisa; S Goto
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

3.  Neuroendocrine regulation of Drosophila metamorphosis requires TGFbeta/Activin signaling.

Authors:  Ying Y Gibbens; James T Warren; Lawrence I Gilbert; Michael B O'Connor
Journal:  Development       Date:  2011-05-25       Impact factor: 6.868

4.  The G protein-coupled receptors in the silkworm, Bombyx mori.

Authors:  Yi Fan; Peng Sun; Yu Wang; Xiaobai He; Xiaoyan Deng; Xiaopan Chen; Guozheng Zhang; Xin Chen; Naiming Zhou
Journal:  Insect Biochem Mol Biol       Date:  2010-06-08       Impact factor: 4.714

5.  Basic pattern of fluctuation in hemolymph PTTH titers during larval-pupal and pupal-adult development of the silkworm, Bombyx mori.

Authors:  Akira Mizoguchi; Skarlatos G Dedos; Hajime Fugo; Hiroshi Kataoka
Journal:  Gen Comp Endocrinol       Date:  2002-06-15       Impact factor: 2.822

Review 6.  Human disease models in Drosophila melanogaster and the role of the fly in therapeutic drug discovery.

Authors:  Udai Bhan Pandey; Charles D Nichols
Journal:  Pharmacol Rev       Date:  2011-03-17       Impact factor: 25.468

7.  Bombyx prothoracicostatic peptides activate the sex peptide receptor to regulate ecdysteroid biosynthesis.

Authors:  Naoki Yamanaka; Yue-Jin Hua; Ladislav Roller; Ivana Spalovská-Valachová; Akira Mizoguchi; Hiroshi Kataoka; Yoshiaki Tanaka
Journal:  Proc Natl Acad Sci U S A       Date:  2010-01-19       Impact factor: 11.205

8.  baySeq: empirical Bayesian methods for identifying differential expression in sequence count data.

Authors:  Thomas J Hardcastle; Krystyna A Kelly
Journal:  BMC Bioinformatics       Date:  2010-08-10       Impact factor: 3.169

9.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

10.  TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions.

Authors:  Daehwan Kim; Geo Pertea; Cole Trapnell; Harold Pimentel; Ryan Kelley; Steven L Salzberg
Journal:  Genome Biol       Date:  2013-04-25       Impact factor: 13.583

View more
  5 in total

1.  Transcriptomic analysis of the prothoracic gland from two lepidopteran insects, domesticated silkmoth Bombyx mori and wild silkmoth Antheraea pernyi.

Authors:  Hai-Xu Bian; Dong-Bin Chen; Xi-Xi Zheng; Hong-Fang Ma; Yu-Ping Li; Qun Li; Run-Xi Xia; Huan Wang; Yi-Ren Jiang; Yan-Qun Liu; Li Qin
Journal:  Sci Rep       Date:  2019-03-29       Impact factor: 4.379

2.  Microcurrent Stimulation Triggers MAPK Signaling and TGF-β1 Release in Fibroblast and Osteoblast-Like Cell Lines.

Authors:  Evangelia Konstantinou; Zoi Zagoriti; Anastasia Pyriochou; Konstantinos Poulas
Journal:  Cells       Date:  2020-08-19       Impact factor: 6.600

3.  Neurotransmitters Affect Larval Development by Regulating the Activity of Prothoracicotropic Hormone-Releasing Neurons in Drosophila melanogaster.

Authors:  Shun Hao; Julia Yvonne Gestrich; Xin Zhang; Mengbo Xu; Xinwei Wang; Li Liu; Hongying Wei
Journal:  Front Neurosci       Date:  2021-12-17       Impact factor: 4.677

4.  RNAi-Mediated Silencing of Putative Halloween Gene Phantom Affects the Performance of Rice Striped Stem Borer, Chilo suppressalis.

Authors:  Muhammad Faisal Shahzad; Atif Idrees; Ayesha Afzal; Jamshaid Iqbal; Ziyad Abdul Qadir; Azhar Abbas Khan; Ayat Ullah; Jun Li
Journal:  Insects       Date:  2022-08-15       Impact factor: 3.139

5.  Refining a steroidogenic model: an analysis of RNA-seq datasets from insect prothoracic glands.

Authors:  Panagiotis Moulos; Alexandros Alexandratos; Ioannis Nellas; Skarlatos G Dedos
Journal:  BMC Genomics       Date:  2018-07-13       Impact factor: 3.969

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.