Bin Chen, Yu-Juan Zhang, Zhengbo He, Wanshun Li, Fengling Si, Yao Tang, Qiyi He, Liang Qiao, Zhentian Yan, Wenbo Fu, Yanfei Che.
Abstract
BACKGROUND: Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for the malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq).
Year: 2014 PMID: 25000941 PMCID: PMC4105132 DOI: 10.1186/1756-3305-7-314
Source DB: PubMed Journal: Parasit Vectors ISSN: 1756-3305 Impact factor: 3.876
Statistics of RNA-seq based sequencing, assembly and functional annotation for An. sinensis
| | Number of total clean reads | 51,606,364 |
| | Number of total clean nucleotides (nt) | 4,644,572,760 |
| | Q20 percentage of total clean reads | 95.92% |
| | GC percentage of total clean nucleotides | 51.26% |
| | N percentage of total clean nucleotides | 0.00% |
| | Number of unigenes | 38,504 (5,372 into distinct clusters; 33,132 singletons) |
| | Total length (nt) of total unigenes | 21,977,286 |
| | Mean length (nt) of total unigenes | 571 |
| | N50 (nt) of total unigenes | 711 |
| | Unigenes with Nr database | 25,456 (66% of 38,504 unigenes) |
| (E-value <= 1e-5) | Unigenes with Nt database | 20,554 (53%) |
| | Unigenes with Swiss-Prot database | 17,651 (46%) |
| | Unigenes with KEGG database | 16,622 (43%), 257 pathways |
| | Unigenes with COG database | 7,204 (19%), 25 functional categories |
| | Unigenes with GO database | 16,588 (43%), 62 subcategories grouped to 3 main categories |
| | Biological process | 27 sub-categories |
| | Cellular component | 17 sub-categories |
| | Molecular function | 18 sub-categories |
| | Total unigenes annotated | 26,650 (69% of 38,504 unigenes) |
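The assembly rows above report a mean length and an N50. As a minimal sketch of how these two summary statistics are conventionally computed from a list of unigene lengths (the function names and the toy lengths are illustrative assumptions, not taken from the study's pipeline):

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembled nucleotides."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)


# Toy lengths, made up for illustration only.
toy = [100, 200, 300, 400, 500]
print(n50(toy), mean_length(toy))

# Consistency check using only figures from the table:
# total length / number of unigenes should round to the reported mean.
print(round(21_977_286 / 38_504))
```

The last line divides the reported total length by the reported number of unigenes, which reproduces the tabulated mean length of 571 nt.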
Figure 1. Distribution of block count (A) and coverage (B) of alignment between transcriptome and reference genome. Coverage is the ratio of match-to-length for each unigene.
Figure 2. NR classification of unigenes. (A) E-value distribution; (B) similarity distribution; (C) species distribution.
Figure 3. GO function classification of unigenes.
Figure 4. COG classification of unigenes.
Homology analysis between An. sinensis and other Dipteran genomes using BLASTX with a cut-off E-value of 1E-5
| | An. gambiae | Ae. aegypti | Cx. quinquefasciatus | D. melanogaster |
| Number of a.a. sequences | 14324 | 17408 | 19018 | 27538 |
| Number of sequences with a.a. > 50 | 14296 | 17345 | 18906 | 27410 |
| Number of one-directional BLAST hits | 12973 | 11997 | 11990 | 10119 |
| Number of Bi-directional BLAST hits | 6586 | 6116 | 6084 | 4919 |
| Genome sequence version | AgamP3.6 | AaegL1.3 | CpipJ1.3 | r5.47 |
| Source of genome sequence | Vectorbase | Vectorbase | Vectorbase | Flybase |
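The table distinguishes one-directional from bi-directional BLAST hits. A minimal sketch of that distinction follows, assuming "bi-directional" means a reciprocal best hit ranked by bit score; the record does not state the exact criterion, and the identifiers and scores below are invented for illustration.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples,
    e.g. parsed from tabular BLAST output.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Return (query, subject) pairs that are each other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Hypothetical hits: unigenes searched against a reference proteome (forward)
# and the reference proteins searched back against the unigenes (reverse).
forward = [
    ("uni1", "AGAP1", 250.0),
    ("uni1", "AGAP2", 90.0),
    ("uni2", "AGAP2", 120.0),
    ("uni3", "AGAP2", 300.0),
]
reverse = [
    ("AGAP1", "uni1", 240.0),
    ("AGAP2", "uni3", 310.0),
]

one_directional = best_hits(forward)
bi_directional = reciprocal_best_hits(forward, reverse)
print(len(one_directional), bi_directional)
```

In this toy input all three unigenes have a one-directional hit, but only two pairs survive the reciprocal test, which mirrors why the bi-directional counts in the table are lower than the one-directional counts.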
Figure 5. Homologous gene numbers between An. sinensis and each of An. gambiae, Ae. aegypti, Cx. quinquefasciatus and D. melanogaster, detected by one- and bi-directional BLAST searches. The numbers decreased with the phylogenetic distance between An. sinensis and the other Dipteran species (the divergence times were adapted from Grimaldi et al. [43]).
Figure 6. GO terms similarity distribution among the compared species. The bar graph was plotted using the web-based tool WEGO.
GC content and codon bias in predicted ORFs of unigenes in An. sinensis.
| % GC of 24,361 ORFs | 55.25% |
| GC3 (% GC at 3rd codon position) | 65.40% |
| Nc (Effective number of codons) | 46.71 |
| Total number of codons | 4,048,458 |
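For readers unfamiliar with the GC and GC3 measures in this table, here is a minimal sketch of how they can be computed for a single in-frame ORF. The function names and the toy sequence are assumptions for illustration; the sketch does not cover Nc, which needs per-amino-acid codon usage and is not reproduced here.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_fraction(orf):
    """Fraction of G or C at third codon positions.

    Assumes `orf` starts in frame and its length is a multiple of 3.
    """
    third_positions = orf.upper()[2::3]
    return gc_fraction(third_positions)


# Toy ORF (made up): codons ATG GCC AAG TTC TGA
toy_orf = "ATGGCCAAGTTCTGA"
print(f"GC  = {gc_fraction(toy_orf):.2%}")
print(f"GC3 = {gc3_fraction(toy_orf):.2%}")
```

A dataset-level value such as the one tabulated would pool bases over all predicted ORFs rather than average per-ORF fractions; which of the two the study used is not stated in this record.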
Features of SSRs identified in the transcriptome
| Number of unigenes longer than 1 kb | 4,921 |
| Total nucleotides screened (knt) | 21,977 |
| Number of unigenes containing SSRs | 1,223 |
| Number of identified SSRs | 1,904 |
| Kinds of identified SSRs | 307 |
| Number of unigenes containing more than one SSR | 681 |
| Frequency of SSR in transcriptome | 1/11.5 kb |
Frequency of SSRs in transcriptome
| Di | - | - | 297 | 168 | 111 | 58 | 24 | 31 | 689 | 36.2 |
| Tri | - | 685 | 263 | 126 | 18 | 1 | 1 | 4 | 1098 | 57.7 |
| Tetra | 70 | 22 | 4 | 1 | 3 | - | - | 2 | 102 | 5.3 |
| Penta | 7 | 2 | 1 | - | - | - | - | - | 10 | 0.5 |
| Hexa | 4 | 1 | - | - | - | - | - | - | 5 | 0.3 |
| Total | 81 | 710 | 565 | 295 | 132 | 59 | 25 | 37 | 1904 | |
| % | 4.3 | 37.3 | 30.0 | 15.5 | 6.9 | 3.1 | 1.3 | 1.9 | | |
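The SSR tables above count perfect di- to hexanucleotide repeats by repeat number. The following is a rough sketch of how such repeats can be located with a regular expression. The minimum repeat thresholds in `MIN_REPEATS` are assumed values chosen for illustration; the record does not give the thresholds or the tool used in the study.

```python
import re

# Assumed minimum repeat counts per motif length (NOT from the source).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    `start` is a 0-based offset into `seq`.
    """
    seq = seq.upper()
    found = []
    for unit_len, min_rep in min_repeats.items():
        # One motif of `unit_len` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AAA" or "ATAT".
            if any(
                motif == motif[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            ):
                continue
            found.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(found)


# Toy sequence (made up): a 7-unit AT repeat and a 5-unit CAG repeat.
toy_seq = "GGC" + "AT" * 7 + "CCGT" + "CAG" * 5 + "TT"
print(find_ssrs(toy_seq))

# Frequency check using only figures from the tables:
# screened length (knt) divided by the number of SSRs, in kb per SSR.
print(round(21_977 / 1_904, 1))
```

The final line divides the screened length by the number of identified SSRs and gives about 11.5 kb per SSR, in line with the tabulated frequency of 1/11.5 kb.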