Ahmed Sayadi, Elina Immonen, Helen Bayram, Göran Arnqvist.
Abstract
Despite their unparalleled biodiversity, the genomic resources available for beetles (Coleoptera) remain relatively scarce. We present an integrative, high-quality annotated transcriptome of the beetle Callosobruchus maculatus, an important and cosmopolitan agricultural pest as well as an emerging model species in ecology and evolutionary biology. Using Illumina sequencing technology, we sequenced 492 million read pairs generated from 51 samples of different developmental stages (larvae, pupae and adults) of C. maculatus. Reads were assembled de novo with the Trinity software, both into a single combined assembly and into three separate assemblies based on data from the different developmental stages. The combined assembly generated 218,192 transcripts and 145,883 putative genes. Putative genes were annotated with the Blast2GO software and the Trinotate pipeline. In total, 33,216 putative genes were successfully annotated using Blastx against the Nr (non-redundant) database and 13,382 were assigned to 34,100 Gene Ontology (GO) terms. We classified 5,475 putative genes into Clusters of Orthologous Groups (COG), and 116 metabolic pathway maps were predicted based on the annotation. Our analyses suggest that transcriptional specificity increases with ontogeny. For example, out of 33,216 annotated putative genes, 51 were expressed only in larvae, 63 only in pupae and 171 only in adults. Our study illustrates the importance of including samples from several developmental stages when the aim is to provide an integrative, high-quality annotated transcriptome. Our results will represent an invaluable resource for those working with the ecology, evolution and pest control of C. maculatus, as well as for comparative studies of the transcriptomics and genomics of beetles more generally.
Year: 2016 PMID: 27442123 PMCID: PMC4956038 DOI: 10.1371/journal.pone.0158565
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The overall workflow, summarizing the steps of the transcriptome assembly.
Summary statistics of sequencing data and the combined de novo transcriptome assembly of C. maculatus.
| Statistic | Value |
|---|---|
| Raw reads (2×101 bp) | 492,095,358 |
| Filtered paired-end reads (2×101 bp) | 474,915,945 |
| Total assembled bases | 199,346,342 |
| Number of transcripts | 218,192 |
| Number of genes | 145,883 |
| Average transcript length (bp) | 914 |
| Min gene length (bp) | 224 |
| Max gene length (bp) | 26,805 |
| Number of genes > 1 Kb | 26,215 |
| Number of genes > 5 Kb | 1,443 |
| Number of genes > 10 Kb | 107 |
| Transcript N50 (bp) | 1,818 |
| GC content (%) | 38.98 |
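
The transcript N50 in the table above is the length L at which transcripts of length L or longer account for at least half of all assembled bases. The following is a minimal sketch of how N50, mean length and GC content can be computed, assuming the assembly is available as plain sequence strings. The function names and the toy sequences are illustrative only and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def gc_percent(sequences):
    """Return GC content as a percentage of all A/C/G/T bases."""
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * gc / acgt


# Toy sequences (hypothetical), standing in for assembled transcripts.
toy = ["ATGC" * 50, "AATT" * 30, "GGCA" * 10, "AT" * 10]
lengths = [len(seq) for seq in toy]

print("N50:", n50(lengths))
print("Mean length:", sum(lengths) / len(lengths))
print("GC %:", round(gc_percent(toy), 2))
```

Because N50 is weighted by sequence length, it is typically larger than the mean transcript length, as is the case for the values reported here (1,818 bp versus 914 bp).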
Summary statistics of the individual and the combined transcriptome assemblies.
| | Larvae | Pupae | Adults | Combined |
|---|---|---|---|---|
| Number of transcripts | 72,299 | 79,647 | 71,523 | 218,192 |
| Number of genes | 57,061 | 62,374 | 53,793 | 145,883 |
| Transcript N50 (bp) | 1,819 | 1,969 | 2,072 | 1,818 |
| Average transcript length (bp) | 953 | 962 | 1,037 | 914 |
| GC content (%) | 39.52 | 39.33 | 39.34 | 38.98 |
| Total assembled bases | 68,882,917 | 76,609,446 | 74,156,506 | 199,346,342 |
| | 19,219 | 21,366 | 21,617 | 54,358 |
| | 1,254 | 1,606 | 1,576 | 3,889 |
| | 93 | 130 | 132 | 283 |
Fig 2. Blast2GO BLAST results.
(A) Species distribution for the top BLAST hits for genes in the Nr database. (B) E-value distribution of BLAST hits with a cutoff E-value of 1.0E-5. (C) Similarity distribution of the top BLAST hits.
Fig 3. Histogram of GO classifications of C. maculatus unigenes.
Fig 4. Histogram of the clusters of orthologous groups (COG).
Fig 5. KEGG pathway distribution.
The number of private genes during ontogeny in C. maculatus.
| Stage | Expression level | All | ORFs | Blast Nr | ORFs with Blast Nr |
|---|---|---|---|---|---|
| Total | | 145,883 | 27,878 | 33,129 | 22,401 |
| Larvae | FPKM ≥ 2 | 212 | 51 | 51 | 38 |
| Larvae | 2 > FPKM ≥ 0.5 | 1,623 | 114 | 179 | 80 |
| Larvae | 2 > FPKM > 0 | 8,453 | 895 | 1,288 | 628 |
| Pupae | FPKM ≥ 2 | 531 | 64 | 63 | 35 |
| Pupae | 2 > FPKM ≥ 0.5 | 2,946 | 159 | 266 | 101 |
| Pupae | 2 > FPKM > 0 | 14,823 | 1,437 | 2,197 | 1,017 |
| Adults | FPKM ≥ 2 | 455 | 222 | 171 | 151 |
| Adults | 2 > FPKM ≥ 0.5 | 2,365 | 367 | 447 | 283 |
| Adults | 2 > FPKM > 0 | 16,669 | 2,196 | 2,894 | 1,593 |
Here, private genes are defined as those expressed at low (either 2 > FPKM > 0 or 2 > FPKM ≥ 0.5) or higher levels (FPKM ≥ 2) in a particular developmental stage, but not expressed in any of the other stages (FPKM = 0).
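
The definition above can be written as a small classification rule. The sketch below assumes that per-stage FPKM values are available as a dictionary for each gene. The function name, the data layout and the example genes are hypothetical and only illustrate the thresholds stated in the text; they are not the authors' code.

```python
STAGES = ("larvae", "pupae", "adults")


def classify_private(fpkm):
    """Classify one gene from its per-stage FPKM values.

    Returns (stage, bins) if the gene is expressed in exactly one stage
    (FPKM = 0 in all others), otherwise None.
    """
    expressed = [stage for stage in STAGES if fpkm[stage] > 0]
    if len(expressed) != 1:
        return None  # not expressed at all, or expressed in several stages
    stage = expressed[0]
    value = fpkm[stage]
    if value >= 2:
        return stage, ["FPKM >= 2"]
    # The two low-expression bins are nested: every gene with
    # 2 > FPKM >= 0.5 also satisfies 2 > FPKM > 0.
    bins = ["2 > FPKM > 0"]
    if value >= 0.5:
        bins.append("2 > FPKM >= 0.5")
    return stage, bins


# Hypothetical example genes and FPKM values.
genes = {
    "gene_a": {"larvae": 3.1, "pupae": 0.0, "adults": 0.0},
    "gene_b": {"larvae": 0.0, "pupae": 0.7, "adults": 0.0},
    "gene_c": {"larvae": 0.0, "pupae": 0.0, "adults": 0.2},
    "gene_d": {"larvae": 1.5, "pupae": 4.0, "adults": 0.0},
}

for gene_id, values in genes.items():
    print(gene_id, classify_private(values))
```

Because the two low-expression bins overlap, a gene with 2 > FPKM ≥ 0.5 is counted in both, which is consistent with the counts for 2 > FPKM > 0 in the table being the larger of the two.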
The top expressed private genes in larvae, pupae and adults.
| Gene ID | Predicted function (Blast2GO) | Length (bp) | FPKM |
|---|---|---|---|
| TR64718\|c0_g1_i3 | larval cuticle protein | 684 | 2.802 |
| TR24068\|c0_g1_i4 | catalase-like | 1619 | 7.605 |
| TR28212\|c0_g1_i1 | equilibrative nucleoside transporter 3- partial | 262 | 2.85 |
| TR8452\|c1_g2_i3 | glyoxylate reductase hydroxypyruvate reductase-like | 1740 | 15.646 |
| TR1265\|c0_g1_i1 | glycosyl hydrolase | 1335 | 22.887 |
| TR52474\|c1_g4_i1 | glycoside hydrolase family 1 | 579 | 149.367 |
| TR18717\|c0_g2_i1 | beta-galactosidase-1-like protein 2 | 2098 | 3.517 |
| TR55315\|c3_g1_i2 | cathepsin b-like cysteine protease | 1515 | 17.053 |
| TR55185\|c0_g2_i1 | cuticle protein 7 | 1477 | 3.359 |
| TR68734\|c1_g2_i1 | resilin isoform x1 | 2028 | 164.682 |
| TR73641\|c7_g7_i1 | cuticle protein 8-like | 807 | 52.115 |
| TR7965\|c0_g2_i1 | endothelin-converting enzyme 2-like | 456 | 2.292 |
| TR10464\|c0_g1_i1 | myosin-VIIa | 3706 | 5.021 |
| TR16797\|c0_g1_i1 | probable h aca ribonucleoprotein complex subunit 1 | 1510 | 12.089 |
| TR64463\|c0_g1_i1 | tektin-2 | 1565 | 6.473 |
| TR20413\|c0_g1_i1 | tubulin alpha-1 chain | 1667 | 7.276 |
| TR9448\|c0_g1_i2 | odorant-binding protein 4 | 542 | 5.003 |
| TR29765\|c3_g1_i1 | bone morphogenetic protein 10 isoform x2 | 1287 | 2.578 |
| TR2044\|c0_g1_i1 | calmodulin isoform x1 | 899 | 2.811 |
| TR37403\|c3_g1_i11 | digestive cysteine protease intestain | 2269 | 2.088 |