Hidemasa Bono, Takuma Sakamoto, Takeya Kasukawa, Hiroko Tabunoki.
Abstract
Next-generation sequencing has revolutionized entomological study, making it possible to analyze the genomes and transcriptomes of non-model insects. However, use of this technology is often limited to obtaining the nucleotide sequences of target or related genes, and many of the acquired sequences remain unused because other available sequences are not sufficiently annotated. To address this issue, we have developed a functional annotation workflow for transcriptome-sequenced insects to determine transcript descriptions, which represents a significant improvement over the previous method (functional annotation pipeline for insects). The developed workflow attempts to annotate not only the protein sequences obtained from transcriptome analysis but also the ncRNA sequences obtained simultaneously. In addition, the workflow integrates the expression-level information obtained from transcriptome sequencing for use as functional annotation information. Using the workflow, functional annotation was performed on the sequences obtained from transcriptome sequencing of the stick insect (Entoria okinawaensis) and silkworm (Bombyx mori), yielding richer functional annotation information than that obtained in our previous study. The improved workflow allows more comprehensive exploitation of transcriptome data and is applicable to other insects because the workflow has been openly developed on GitHub.
Keywords: RNA sequencing; functional annotation; silkworm; stick insect; transcriptome assembly
Year: 2022 PMID: 35886762 PMCID: PMC9319598 DOI: 10.3390/insects13070586
Source DB: PubMed Journal: Insects ISSN: 2075-4450 Impact factor: 3.139
Source of reference databases.
| Category | Name of Resource | URL |
|---|---|---|
| Protein | Ensembl | 1 |
| Protein | UniProtKB | |
| Non-coding RNA sequences | Ensembl | 1 |
| Non-coding RNA sequences | EnsemblGenomes | 1 |
| Protein and RNA domain | Pfam | |
| Protein and RNA domain | Rfam | |
1 Only the URL for reference data of the typical organism is listed.
Figure 1. Overview of the annotation workflow, Fanflow4Insects.
Figure 2. Annotation workflow for transcript description from sequence information. Left half: protein sequence-level annotation; right half: nucleotide sequence-level annotation.
Figure 3. Annotation workflow for transcript description from expression information.
Protein-level annotation for E. okinawaensis.
| Annotation Category | Annotation Level | Number of | Cumulative Number | Percentage |
|---|---|---|---|---|
| Protein homolog from tophit | Human or mouse homolog | 44,351 | 44,351 | 65.1 |
| Protein homolog from tophit | | 3739 | 48,090 | 70.6 |
| Protein homolog from tophit | | 2349 | 50,439 | 74.0 |
| Protein homolog from tophit | Homolog found in UniProtKB | 4629 | 55,068 | 80.8 |
| No protein homolog | Protein domain | 3808 | 58,876 | 86.4 |
| No protein homolog | Hypothetical protein | 9286 | 68,162 | 100 |
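
The cumulative numbers and percentages in this table follow directly from the per-level counts. As a minimal sketch of that arithmetic, the Python snippet below recomputes both columns from the counts. The function name `cumulative_table` is hypothetical, and the second and third levels carry placeholder labels because their names are missing from the table.

```python
# Per-level counts copied from the E. okinawaensis protein-level table.
# Two level labels are missing from the table, so placeholders are used.
counts = [
    ("Human or mouse homolog", 44351),
    ("(level 2, label missing)", 3739),
    ("(level 3, label missing)", 2349),
    ("Homolog found in UniProtKB", 4629),
    ("Protein domain", 3808),
    ("Hypothetical protein", 9286),
]


def cumulative_table(rows):
    """Return (label, count, cumulative count, cumulative percent) per level."""
    total = sum(n for _, n in rows)
    running = 0
    out = []
    for label, n in rows:
        running += n
        out.append((label, n, running, round(100 * running / total, 1)))
    return out


for label, n, cum, pct in cumulative_table(counts):
    print(f"{label:28s} {n:>7,d} {cum:>7,d} {pct:6.1f}")
```

The percentage is the cumulative count divided by the total number of sequences (68,162), so each row reports the share annotated at that level or any level above it.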
Functional annotation from expression for E. okinawaensis.
| Annotation Level | All | Hypothetical Protein (5699) | Unclassifiable Transcript (253,176) |
|---|---|---|---|
| Fat body-specific expression | 48,675 | 622 | 42,790 |
| Midgut-specific expression | 28,918 | 315 | 25,005 |
| Constitutive expression | 228,103 | 4606 | 181,521 |
| Not expressed | 5661 | 156 | 3860 |
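
The categories in this table (tissue-specific, constitutive, not expressed) can be illustrated with a small classifier. The sketch below is only an assumption about how such labels might be derived: this record does not state the criteria, so the expression unit, the `min_value` and `fold` cutoffs, and the function name `classify_expression` are all hypothetical.

```python
def classify_expression(values_by_tissue, min_value=1.0, fold=2.0):
    """Label one transcript from its per-tissue expression values.

    min_value and fold are hypothetical cutoffs chosen for illustration;
    they are not taken from the workflow described in this record.
    """
    # Not expressed: no tissue reaches the minimum expression value.
    if all(v < min_value for v in values_by_tissue.values()):
        return "Not expressed"

    top_tissue, top_value = max(values_by_tissue.items(), key=lambda kv: kv[1])
    others = [v for t, v in values_by_tissue.items() if t != top_tissue]

    # Tissue-specific: the top tissue exceeds every other tissue by `fold`.
    if all(top_value >= fold * v for v in others):
        return f"{top_tissue}-specific expression"

    # Otherwise the transcript is expressed without a clear tissue bias.
    return "Constitutive expression"


print(classify_expression({"Fat body": 10.0, "Midgut": 1.0}))
print(classify_expression({"Fat body": 3.0, "Midgut": 4.0}))
print(classify_expression({"Fat body": 0.2, "Midgut": 0.0}))
```

With two tissues, as in the E. okinawaensis data, every transcript falls into exactly one of the four rows of the table under this scheme.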
Protein-level annotation for B. mori.
| Annotation Category | Annotation Level | Number of | Cumulative Number | Percentage |
|---|---|---|---|---|
| Protein homolog from tophit | Human or mouse homolog | 31,354 | 31,354 | 68.6 |
| Protein homolog from tophit | | 1752 | 33,106 | 72.4 |
| Protein homolog from tophit | | 2113 | 35,219 | 77.0 |
| Protein homolog from tophit | Homolog found in UniProtKB | 2888 | 38,107 | 83.4 |
| No protein homolog | Protein domain | 5398 | 43,505 | 95.2 |
| No protein homolog | Hypothetical protein | 2214 | 45,719 | 100 |
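
The cumulative counts in the protein-level tables suggest that each sequence is counted once, at the first level where supporting evidence is found. The Python sketch below illustrates that kind of tiered assignment under stated assumptions: the dictionary keys (`human_or_mouse`, `uniprotkb`, `protein_domain`) and the function name `annotation_level` are hypothetical, and the two levels whose labels are missing from the tables are left out.

```python
# Levels in priority order, following the row order of the protein-level
# tables. The two levels with missing labels are omitted from this sketch.
LEVELS = [
    ("human_or_mouse", "Human or mouse homolog"),
    ("uniprotkb", "Homolog found in UniProtKB"),
    ("protein_domain", "Protein domain"),
]


def annotation_level(hits):
    """Return the first level with evidence; fall back to hypothetical protein.

    `hits` maps a hypothetical evidence key to a truthy value when a search
    against that resource returned a result for the sequence.
    """
    for key, label in LEVELS:
        if hits.get(key):
            return label
    return "Hypothetical protein"


print(annotation_level({"human_or_mouse": True, "protein_domain": True}))
print(annotation_level({"protein_domain": True}))
print(annotation_level({}))
```

Because the first matching level wins, a sequence with both a homolog and a domain hit is counted only under the homolog level, which keeps the per-level counts non-overlapping.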
Functional annotation of B. mori transcripts based on expression.
| Annotation Level | All | Hypothetical Protein | Unclassifiable Transcript (12,845) |
|---|---|---|---|
| Fat body-specific expression | 39 | 1 | 11 |
| Midgut-specific expression | 108 | 0 | 29 |
| Malpighian tubule-specific expression | 83 | 3 | 14 |
| Silk gland-specific expression | 365 | 12 | 92 |
| Testis-specific expression | 861 | 19 | 492 |
| Ovary-specific expression | 179 | 9 | 55 |
| Constitutive expression | 7825 | 114 | 1205 |
| No expression | 609 | 24 | 125 |
Figure 4. Comparison of ncRNA transcripts between E. okinawaensis and B. mori. (a) The numbers of ncRNA transcripts predicted from fruit fly annotation. (b) The numbers of ncRNA transcripts predicted from human annotation.
ncRNA transcripts expressed both in E. okinawaensis and B. mori.
| Annotation Source | Transcript ID | Gene Name | Gene Description |
|---|---|---|---|
| Annotation from fruit fly | FBtr0100888 | mt:lrRNA | mitochondrial large ribosomal RNA |
| Annotation from fruit fly | FBtr0345722 | asRNA:CR45330 | antisense RNA:CR45330 |
| Annotation from fruit fly | FBtr0346876 | 28SrRNA: | 28S ribosomal RNA:CR45837 |
| Annotation from fruit fly | FBtr0346877 | pre-rRNA:CR45846 | ribosomal RNA primary transcript:CR45846 |
| Annotation from fruit fly | FBtr0346881 | pre-rRNA:CR45847 | ribosomal RNA primary transcript:CR45847 |
| Annotation from fruit fly | FBtr0346882 | 18SrRNA: | 18S ribosomal RNA:CR45841 |
| Annotation from human | ENST00000450451 | | novel transcript |
| Annotation from human | ENST00000501016 | | novel transcript |
| Annotation from human | ENST00000518947 | HOXA-AS3 | HOXA cluster antisense RNA 3 [Source:HGNC Symbol;Acc:HGNC:43748] |
| Annotation from human | ENST00000547387 | | novel transcript, antisense to TUBA1B |
| Annotation from human | ENST00000618978 | U2 | U2 spliceosomal RNA [Source:RFAM;Acc:RF00004] |
| Annotation from human | ENST00000623543 | | novel transcript, antisense to TUBA8 |
| Annotation from human | ENST00000631211 | | novel transcript, similar to YY1 associated myogenesis RNA 1 YAM1 |
| Annotation from human | ENST00000638356 | | novel transcript, antisense to ATP4A |