Sheina B Sim, Bernarda Calla, Brian Hall, Theodore DeRego, Scott M Geib.
Abstract
BACKGROUND: Bactrocera cucurbitae is a serious global agricultural pest. Basic genomic information, which would help inform methods of control, damage mitigation, and eradication, is lacking for this species. Here, we have sequenced, assembled, and annotated a comprehensive transcriptome for a mass-rearing sexing strain of this species. This forms a foundational genomic and transcriptomic resource that can be used to better understand the physiology and biochemistry of this insect and that serves as a useful tool for population genetics.
Keywords: Bactrocera cucurbitae; Melon fly; RNA-Seq; SIT; Sterile insect technique; Tephritidae; Translocation; White-pupae
Year: 2015 PMID: 25830018 PMCID: PMC4379760 DOI: 10.1186/s13742-015-0053-x
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Transcriptome assembly and annotation statistics compared with other Tephritid transcriptomes and the genome
| Species | B. cucurbitae | B. dorsalis (a) | C. capitata (b) | D. melanogaster (c) |
|---|---|---|---|---|
| Number of read pairs used in assembly (SRA accession number) | | | | |
| Egg (SRA: SRS691534) | 43741314 | 12462204 | - | - |
| Larvae (SRA: SRS691533) | 51568835 | 11753084 | - | - |
| Pupae (SRA: SRS691532) | 47093178 | 13291147 | 93256673 | - |
| Adult (SRA: SRS691531) | 46515243 | 47250123 | 96929532 | - |
| Total | 188918570 | 84756558 | 190186205 | - |
| Normalized reads ( | 12792085 | 7796491 | 17217414 | - |
| | | | | |
| Number of unigenes (or | 50220 | 47216 | 118793 | - |
| N50 longest transcript/unigene | 2191 | 1882 | 1187 | - |
| Sum longest transcript/unigene (Mb) | 49.63 | 40.20 | 81.56 | - |
| Number of transcripts | 76688 | 80345 | 190958 | - |
| N50 transcript length (bp) | 2626 | 2802 | 2686 | - |
| Sum transcript length (Mb) | 100.20 | 109.48 | 236.18 | - |
| Transcripts per unigene | 1.53 | 1.70 | 1.61 | - |
| GC % | 38.10 | 39.11 | 36.21 | - |
| | | | | |
| Number of unigenes | 10425 | 10784 | 10741 | 15504 |
| N50 unigene length (longest transcript/unigene) | 3464 | 3043 | 3383 | 2979 |
| Sum longest transcript/unigene (Mb) | 28.12 | 24.46 | 28.34 | 30.53 |
| Number of transcripts | 17654 | 23539 | 21761 | 25205 |
| N50 transcript length (bp) | 3477 | 3460 | 3913 | 3633 |
| Sum transcript length (Mb) | 48.28 | 62.06 | 66.65 | 68.47 |
| Isoforms per unigene | 1.69 | 2.18 | 2.03 | 1.63 |
| GC % | 40.17 | 40.32 | 39.41 | 49.70 |
| N50 protein length (amino acids) | 323 | 301 | 310 | 370 |
| Number of proteins with complete ORF (%) | 12936 (73.2) | 13017 (55.3) | 15740 (72.3) | - |
| | | | | |
| Number of proteins with PFAM domains identified | 13029 | 16612 | 13646 | - |
| Number of proteins with Gene Ontology Terms | 10640 | - | 13648 | - |
| Number of proteins with gene names | 15956 | 17093 | 15841 | - |
| Number of proteins with significant hit to | 16070 | 20713 | 19245 | - |
(a) Data from Geib et al., 2014 [2]; (b) Data from Calla et al., 2014 [8]; (c) Data from Flybase r6.03 [11]; (d) BLASTP hit with e-value cutoff 1e-5.
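
The summary statistics in the table (N50, transcripts or isoforms per unigene, and total read pairs) can be illustrated with a minimal sketch. It assumes the common definition of N50: the length L such that sequences of length L or longer contain at least half of all bases. The record does not state which tool produced the published values, so the function names `n50` and `isoforms_per_unigene` are hypothetical, and the toy length list is invented for illustration. The counts passed to the ratio and total checks are copied from the first data column of the table.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def isoforms_per_unigene(n_transcripts, n_unigenes):
    """Mean number of transcripts per unigene, rounded to two decimals."""
    return round(n_transcripts / n_unigenes, 2)


# Toy lengths, invented for illustration only (not data from the paper).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)

# Counts copied from the first data column of the table.
ratio_all = isoforms_per_unigene(76688, 50220)       # full assembly
ratio_filtered = isoforms_per_unigene(17654, 10425)  # filtered set

# Read pairs per life stage, first data column of the table.
stage_read_pairs = {
    "egg": 43741314,
    "larvae": 51568835,
    "pupae": 47093178,
    "adult": 46515243,
}
total_read_pairs = sum(stage_read_pairs.values())
```

With the table's counts, the two ratios reproduce the reported 1.53 and 1.69, and the per-stage read pairs sum to the reported total of 188918570.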
Figure 1. Comparison of the B. cucurbitae transcriptome to related fly species. Distribution of (A) transcript length and (B) predicted polypeptide length of the B. cucurbitae transcriptome compared with published Bactrocera dorsalis and Ceratitis capitata de novo transcriptome assemblies and the current Drosophila melanogaster transcript/protein set (Flybase r6.03).