| Literature DB >> 35893024 |
Pauline Nicole O de la Peña1, Adria Gabrielle D Lao1, Ma Anita M Bautista1.
Abstract
RNA sequencing was used to assemble transcriptome data for Philippine-reared silkworm and compare gene expression profiles of strains reared in high- and low-temperature environments. RNA was isolated from the silk glands of fifth instar larvae and mRNA-enriched libraries were sequenced using Illumina NextSeq 500. Transcriptome reads were assembled using reference-based and de novo assemblers, and assemblies were evaluated using different metrics for transcriptome quality, including the read mapping rate, E90N50, RSEM-eval, and the presence of single-copy orthologs. All transcriptome assemblies were able to reconstruct >40,000 transcripts. Differential expression analysis found 476 differentially expressed genes (DEGs; 222 upregulated, 254 downregulated) in strains reared in different temperatures. Among the top DEGs were myrosinase, heat shock proteins, serine protease inhibitors, dehydrogenases, and regulators of the juvenile hormone. Validation of some of the top DEGs using qPCR supported the findings of the in silico analysis. GO term enrichment analysis reveals an overrepresentation of GO terms related to nucleotide metabolism and biosynthesis, lipid and carbohydrate metabolic processes, regulation of transcription, nucleotide binding, protein binding, metal binding, catalytic activity, oxidoreductase activity, and hydrolase activity. The data provided here will serve as a resource for improving local strains and increasing silk production of Philippine-reared B. mori strains.Entities:
Keywords: Bombyx mori; RNA-seq; silk gland; silkworm; transcriptome analysis
Year: 2022 PMID: 35893024 PMCID: PMC9329738 DOI: 10.3390/insects13080669
Source DB: PubMed Journal: Insects ISSN: 2075-4450 Impact factor: 3.139
Primers used for qPCR validation of differentially expressed genes in Bombyx mori strains.
| Primers | Sequences (5′ to 3′) | Product Length (bp) | Protein ID |
|---|---|---|---|
| Downregulated genes (CAR vs. TCMO) | |||
| Hsp 20 | GCCAACGATGTCCAGAGATT | 196 | heat shock protein hsp20.1 |
| odcdl | AAATGTCTGGGTGGAAGCAG | 235 | oxygen-dependent choline dehydrogenase-like |
| myr | GGAGACCGAGTGAAGACCTG | 152 | myrosinase 1 |
| Upregulated genes (CAR vs. TCMO) | |||
| pl | TGTTGTCGCTTTTCATCTGC | 207 | peroxidase-like |
| LOC105842393 | AAATGGCGCACAAATAGAGG | 177 | uncharacterized protein |
| ts | AAACCGAGCGTCCACTTATG | 186 | uncharacterized protein (tetraspanin family) |
| Housekeeping gene for normalization | |||
| rp49 | CAGGCGGTTCAAGGGTCAATAC | 213 | ribosomal protein 49 |
Characteristics of silkworm strains used in the study.
| Strain | Origin | Strain Classification | Cocoon | Cocoon Shell Percentage | Raw Silk Percentage |
|---|---|---|---|---|---|
| MO204 | TCMO | Priority | Oval, white | 22.92% | 22.42% |
| DMMMSU119 | TCMO | Poor performing | Peanut, white | 21.81% | 21.12% |
| LAT51 | CAR | Priority | Peanut, white | 21.57% | 17.44% |
| ITA | CAR | Priority | Oval, yellow gold | 20.27% | no data |
Cocoon shell percentage: Ratio of the weight of silk shell to the weight of the whole cocoon; Raw silk percentage: Ratio of the weight of raw silk to the weight of the whole cocoon.
Statistics of raw RNA-seq read sequences of Bombyx mori strains.
| Sample | Read Count | % ≥ Q30 | Mean Q Score | GC Content (%) |
|---|---|---|---|---|
| MO204-1 | 17,531,032 | 94.11 | 34.60 | 41 |
| MO204-3 | 8,590,562 | 93.49 | 34.47 | 43 |
| MO204-4 | 13,085,562 | 94.10 | 34.59 | 44 |
| DMMMSU-0 | 11,406,238 | 92.68 | 34.29 | 46 |
| DMMMSU-2 | 21,789,720 | 92.38 | 34.22 | 46 |
| DMMMSU-4 | 15,464,914 | 91.51 | 34.04 | 46 |
| LAT51-1 | 12,321,992 | 93.70 | 34.51 | 42 |
| LAT51-2 | 22,767,422 | 93.91 | 34.56 | 41 |
| LAT51-4 | 8,793,834 | 94.74 | 34.73 | 44 |
| ITA-1 | 3,621,664 | 86.80 | 33.05 | 48 |
| ITA-2 | 4,633,324 | 88.96 | 33.51 | 48 |
| ITA-3 | 3106 | 65.36 | 28.55 | 75 |
Figure 1Bombyx mori raw read counts and proportion of reads after processing.
Figure 2Summary of STAR alignment statistics of Bombyx mori RNA-seq reads to the reference genome. (PE: Paired-end reads, SE: Single-end reads). Generated using STAR.log files and MultiQC [23].
Figure 3Overall alignment rate of RNA-seq reads to the transcriptome assemblies.
Contig statistics of the different transcriptome assemblies from Bombyx mori strains. Statistics generated from TrinityStats.pl.
| Trinity | Cufflinks | StringTie | |
|---|---|---|---|
| Total Trinity “genes” | 47,562 | 40,593 | 41,851 |
| Total Trinity transcripts | 69,948 | 40,593 | 41,851 |
| Percent GC | 38.12% | 42.08% | 41.66% |
| N10 | 3257 | 7944 | 7333 |
| N20 | 2487 | 6108 | 5595 |
| N30 | 1992 | 4987 | 4491 |
| N40 | 1635 | 4126 | 3640 |
| N50 | 1313 | 3414 | 2981 |
| Median contig length | 468 | 1738 | 1226 |
| Average contig length | 800.21 | 2305.35 | 1789.3 |
Figure 4ExN50 profiles of the (a) Cufflinks, (b) StringTie, and (c) Trinity merged Bombyx mori transcriptome assemblies.
DETONATE scores of the different Bombyx mori transcriptome assemblies.
| Assembly | RSEM-Eval | Nucleotide F1 (Unweighted) | Contig F1 (Unweighted) | Weighted | k-Mer Compression Score |
|---|---|---|---|---|---|
| Trinity | −8.35 × 109 | 0.38019 | 0.00239295 | 0.678414 | −7.32371 |
| Cufflinks | −1.34 × 1010 | 0.682955 | 0.690038 | 0.545637 | −22.5079 |
| StringTie | −1.55 × 1010 | 0.775391 | 0.677147 | 0.545727 | −17.3472 |
Figure 5Measure of completeness of transcriptome assemblies from Bombyx mori strains using BUSCO (dataset: insecta_odb9).
Figure 6Heat map of the 112 DEGs from Bombyx mori with padj < 0.1. The values shown are scaled to the distances from the average of each row (gene).
Figure 7Volcano plot of DEGs from Bombyx mori, log fold change vs. −log P. Gray: Nonsignificant genes; green: Genes with significant fold changes but insignificant adjusted p-values; blue: Genes that passed the significance threshold according to adjusted p-value but did not have significant fold changes; red: Genes that have adjusted p-value < 0.1 and significant fold change (|log2 FC| > 1). Graph generated in R package EnhancedVolcano [28].
Summary of select top DEGs between different Bombyx mori strains (CAR vs. TCMO).
| Downregulated Genes (CAR vs. TCMO) | ||||
|---|---|---|---|---|
| Gene ID | log2FC |
| Protein ID | Function |
| LOC101740965 | −12.8030 | 1.31 × 10−20 | myrosinase 1 isoform X2 | hydrolysis of glucosinolates, defense enzyme |
| Hsp20.1 | −4.4182 | 7.45 × 10−17 | heat shock protein hsp20.1 | chaperone protein |
| LOC101739591 | −4.4249 | 7.92 × 10−20 | transient receptor potential channel pyrexia isoform X1 | ion transport, contains ankyrin repeat |
| LOC101741105 | −10.2153 | 5.85 × 10−13 | myrosinase 1 | hydrolysis of glucosinolates, defense enzyme |
| LOC101738036 | −10.7102 | 2.66 × 10−10 | serine protease inhibitor swm-1 | serine protease inhibitor, trypsin inhibitor |
| LOC105842960 | −3.8422 | 4.35 × 10−15 | oxygen-dependent choline dehydrogenase-like | oxidoreductase activity, FAD/NAD(P)-binding |
|
| ||||
|
|
|
|
|
|
| LOC101739033 | 5.7239 | 7.09 × 10−16 | zinc finger protein 664 | dimer formation, metal binding |
| LOC101740169 | 8.3899 | 1.81 × 10−14 | uncharacterized protein LOC101740169 | tetraspanin family, scaffolding, cell adhesion, proliferation |
| LOC101741524 | 6.9289 | 1.99 × 10−12 | synaptic vesicle glycoprotein 2C | sugar transporter |
| LOC105842878 | 3.3250 | 2.85 × 10−9 | peroxidase-like | heme peroxidase, redox activity, oxidoreductase activity, defense enzyme |
| LOC119629954 | 7.5484 | 1.91 × 10−8 | zinc finger BED domain-containing protein 4-like isoform X2 | contains hAT dimerization domain, ribonuclease H-like superfamily: nucleotide metabolism and transport |
| LOC119630495 | 9.2422 | 5.08 × 10−7 | uncharacterized protein LOC119630495 | contains motifs: peptidase, Hsp70, TetR |
| LOC105842393 | 6.7307 | 1.61 × 10−6 | uncharacterized protein LOC105842393 | contains motifs: DUF5641 (pfam18701), unknown function, found in retrotransposons |
| Defensin | 2.2251 | 5.08 × 10−7 | defensin like protein 2 precursor | contains signal peptide |
Figure 8Results of qPCR validation of differentially expressed genes. Heat shock protein 20 (Hsp20), oxygen-dependent choline dehydrogenase-like (odcdl), and myrosinase 1 (myr) were downregulated in CAR vs. TCMO, while uncharacterized protein (LOC105842393), peroxidase-like (pl), and uncharacterized protein tetraspanin family (ts) were upregulated in CAR vs. TCMO. Log2FC values calculated from qPCR results are compared with values obtained during DESeq2 analysis. Error bars indicate standard deviation; asterisks denote significant change in CAR vs. TCMO (Student’s t-test), p < 0.05.