Abhijeet Shah, Joseph I Hoffman, Holger Schielzeth.
Abstract
BACKGROUND: The club-legged grasshopper Gomphocerus sibiricus is a Gomphocerinae grasshopper with a promising future as a model species for studying the maintenance of colour polymorphism, the genetics of sexual ornamentation and genome size evolution. However, limited molecular resources are available for this species. Here, we present a de novo transcriptome assembly as a reference resource for gene expression studies. We used high-throughput Illumina sequencing to generate 5,070,036 paired-end reads after quality filtering. We then combined the best-assembled contigs from three different de novo transcriptome assemblers (Trinity, SOAPdenovo-trans and Oases/Velvet) into a single assembly.
Keywords: Acrididae; Gomphocerinae; Insects; Mitochondria; Orthoptera; Transcriptome; Wolbachia
Year: 2019 PMID: 31088494 PMCID: PMC6518663 DOI: 10.1186/s12864-019-5756-4
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 The club-legged grasshopper Gomphocerus sibiricus, an alpine-dwelling species exhibiting prominent sexual dimorphism of the front leg. Photo credit: Holger Schielzeth
TransRate assembly metrics and BUSCO completeness assessment for Gomphocerus sibiricus (this study) in comparison to the other two Acridid transcriptomes published (Stenobothrus lineatus, [13], Chorthippus biguttulus, [34]). Higher TransRate assembly scores indicate better quality assemblies. BUSCO completeness assessment was conducted using the insect ortholog database (orthoDB9)
| TransRate metrics | | | |
|---|---|---|---|
| Number of contigs | 82,251 | 57,778 | 67,733 |
| Number of contigs with ORF | 21,347 | 12,717 | 30,018 |
| N50 of contig length | 1357 | 1207 | 1246 |
| Length of longest contig | 43,026 | 22,561 | 34,437 |
| Length of shortest contig | 301 | 200 | 600 |
| Proportion of read fragments mapped | 0.88 | 0.70 | 0.51 |
| Proportion of good read pairs mapping | 0.82 | 0.63 | 0.43 |
| TransRate assembly score | 0.325 | 0.162 | 0.106 |
| BUSCOs (number of BUSCO units found) | | | |
| Complete BUSCOs | 1405 | 1337 | 1489 |
| Complete and Single-Copy BUSCOs | 1093 | 1244 | 1323 |
| Complete and duplicated BUSCOs | 312 | 93 | 166 |
| Fragmented BUSCOs | 137 | 142 | 99 |
| Missing BUSCOs | 116 | 179 | 70 |
| Total BUSCOs searched | 1658 | 1658 | 1658 |
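The BUSCO rows above can be turned into percentages of the 1658 orthologs searched, which makes the three assemblies easier to compare. The following is a minimal sketch, not part of the original analysis: the counts are copied from the table, the columns are kept in table order because the column headers are not preserved here, and the names `busco_counts` and `busco_summary` are illustrative.

```python
# Counts copied from the BUSCO rows of the table, one dict per data column,
# in the order the columns appear (species labels are not assumed here).
busco_counts = [
    {"single": 1093, "duplicated": 312, "fragmented": 137, "missing": 116},
    {"single": 1244, "duplicated": 93, "fragmented": 142, "missing": 179},
    {"single": 1323, "duplicated": 166, "fragmented": 99, "missing": 70},
]
TOTAL_SEARCHED = 1658  # "Total BUSCOs searched" row


def busco_summary(counts, total):
    """Return BUSCO categories as percentages of the orthologs searched."""
    complete = counts["single"] + counts["duplicated"]
    # Sanity check: complete + fragmented + missing must equal the total.
    if complete + counts["fragmented"] + counts["missing"] != total:
        raise ValueError("BUSCO categories do not add up to the total")
    categories = dict(counts, complete=complete)
    return {name: round(100 * value / total, 1) for name, value in categories.items()}


summaries = [busco_summary(c, TOTAL_SEARCHED) for c in busco_counts]
for column, summary in enumerate(summaries, start=1):
    print(f"column {column}: {summary}")
```

The consistency check also confirms that, in every column, the complete count equals the sum of the single-copy and duplicated counts reported in the table.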
Fig. 2 A two-dimensional kernel density plot showing minor allele frequency (MAF) plotted against log10 sequence coverage. The dark purple regions indicate higher densities, whereas the light blue regions indicate lower densities. Marginal histograms show frequency distributions of minor allele frequency (top axis) and coverage (right axis)
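For readers unfamiliar with the two quantities on these axes, the sketch below shows one common way to derive them for a single site. It assumes a biallelic SNP described by reference and alternative read counts; this input format and the function names are illustrative assumptions, not the authors' pipeline.

```python
import math


def minor_allele_frequency(ref_reads, alt_reads):
    """Frequency of the rarer allele at a biallelic site (assumed read counts)."""
    depth = ref_reads + alt_reads
    if depth == 0:
        raise ValueError("site has no coverage")
    return min(ref_reads, alt_reads) / depth


def plot_coordinates(ref_reads, alt_reads):
    """Return (MAF, log10 coverage), the two quantities shown in the figure."""
    depth = ref_reads + alt_reads
    return minor_allele_frequency(ref_reads, alt_reads), math.log10(depth)


# Hypothetical site: 90 reads support one allele and 10 the other.
maf, log_coverage = plot_coordinates(90, 10)
print(maf, log_coverage)
```

By construction the MAF can never exceed 0.5, which is why the corresponding axis of such a plot is bounded there.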
Fig. 3 Transcripts from Gomphocerus sibiricus mapping to the reference mitochondrial genome assembly. The short vertical lines indicate positions of putative SNPs. The grey shaded blocks indicate protein and rRNA coding regions
Fig. 4 Coverage of the Wolbachia genome estimated by mapping G. sibiricus transcripts against the pel wPi strain of Wolbachia. The major peaks around positions 1,136,000 bp and 1,236,000 bp correspond to the 16S and 23S rRNA coding regions
Fig. 5 Flow chart showing a summary of the de novo assembly procedure and downstream analyses