Wen Xie1, Chunhai Chen2, Zezhong Yang1, Litao Guo1, Xin Yang1, Dan Wang2, Ming Chen2, Jinqun Huang2, Yanan Wen1, Yang Zeng1, Yating Liu1, Jixing Xia1, Lixia Tian1, Hongying Cui1, Qingjun Wu1, Shaoli Wang1, Baoyun Xu1, Xianchun Li3, Xinqiu Tan4, Murad Ghanim5, Baoli Qiu6, Huipeng Pan6, Dong Chu7, Helene Delatte8, M N Maruthi9, Feng Ge10, Xueping Zhou11, Xiaowei Wang12, Fanghao Wan11, Yuzhou Du13, Chen Luo14, Fengming Yan15, Evan L Preisser16, Xiaoguo Jiao17, Brad S Coates18, Jinyang Zhao2, Qiang Gao2, Jinquan Xia2, Ye Yin2, Yong Liu4, Judith K Brown3, Xuguo Joe Zhou19, Youjun Zhang1.
Abstract
The sweetpotato whitefly Bemisia tabaci is a highly destructive agricultural and ornamental crop pest. It damages host plants through both phloem feeding and vectoring plant pathogens. Introductions of B. tabaci are difficult to quarantine and eradicate because of its high reproductive rates, broad host plant range, and insecticide resistance. A total of 791 Gb of raw DNA sequence from whole genome shotgun sequencing and 13 BAC pooling libraries was generated by Illumina sequencing using different combinations of mate-pair and paired-end libraries. Assembly gave a final genome with a scaffold N50 of 437 kb and a total length of 658 Mb. Annotation of repetitive elements and coding regions resulted in 265.0 Mb TEs (40.3%) and 20 786 protein-coding genes with putative gene family expansions, respectively. Phylogenetic analysis based on orthologs across 14 arthropod taxa suggested that MED/Q is clustered into a hemipteran clade containing A. pisum and is a sister lineage to a clade containing both R. prolixus and N. lugens. Genome completeness, as estimated using the CEGMA and Benchmarking Universal Single-Copy Orthologs pipelines, reached 96% and 79%, respectively. These MED/Q genomic resources lay a foundation for future 'pan-genomic' comparisons of invasive vs. noninvasive, invasive vs. invasive, and native vs. exotic Bemisia, which, in turn, will open up new avenues of investigation into whitefly biology, evolution, and management.
Keywords: Annotation; Assembly; Genomics; Whitefly Bemisia tabaci
Year: 2017 PMID: 28327996 PMCID: PMC5467035 DOI: 10.1093/gigascience/gix018
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Comparison of genome assembly and annotation statistics between MED/Q and MEAM1/B
| Sequencing summary | MED/Q scaffold | MED/Q contig | MEAM1/B scaffold | MEAM1/B contig |
|---|---|---|---|---|
| Total number | 4954 | 29 618 | 19 761 | 52 036 |
| Total length (bp) | 658 272 463 | 638 061 971 | 615 029 878 | 599 923 598 |
| Gap number (bp) | 19 828 575 | 0 | 14 380 491 | 0 |
| Average length (bp) | 132 877 | 21 543 | 31 123 | 11 529 |
| N50 length (bp) | 436 791 | 44 366 | 3 232 964 | 29 918 |
| N90 length (bp) | 111 835 | 11 504 | 381 346 | 6117 |
| Maximum length (bp) | 2 857 362 | 362 835 | 11 178 615 | 269 706 |
| Minimum length (bp) | 501 | 500 | 500 | 500 |
| GC content (%) | 39.46 | 39.46 | 39.64 | 39.64 |

| Genome-level summary | MED/Q | MEAM1/B |
|---|---|---|
| TEs, total size (proportion of genome) | 265 Mb (40%) | 269 Mb (44%) |
| CEGMA evaluation (%) | 96 | 100 |
| BUSCO evaluation (%) | 78 | 96.8 |
| Gene number | 20 786 | 15 664 |
| Average gene length (bp) | 10 065 | 22 762 |
| Average CDS length (bp) | 1952 | 1470 |
| Average exons per gene | 6 | 6 |
| Average exon length (bp) | 351 | 234 |
| Average intron length (bp) | 1776 | 3125 |
| Annotated genes (%) | 79.97 | 81 |
| Assembly software | SOAPdenovo | Platanus |
MED/Q: from this study.
MEAM1/B: from the published MEAM1/B genome [11].
Only contigs and scaffolds ≥500 bp were included in the genome assembly.
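The N50 and N90 lengths reported in the table follow the usual assembly convention: sort the sequences from longest to shortest and report the length of the sequence at which the running total first reaches 50% (or 90%) of the assembly length. The sketch below illustrates that definition on a toy list of lengths. The function name `nxx` and the example values are hypothetical and are not taken from the study, whose own statistics came from its assembly pipeline.

```python
from typing import Iterable


def nxx(lengths: Iterable[int], fraction: float = 0.5) -> int:
    """Return the Nxx length (N50 when fraction=0.5, N90 when fraction=0.9).

    Illustrative helper only (hypothetical name): the Nxx value is the length
    of the shortest sequence in the smallest set of longest sequences that
    together cover at least `fraction` of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    threshold = sum(ordered) * fraction
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= threshold:
            return length
    return ordered[-1]  # only reachable through floating-point edge cases


# Toy scaffold lengths (made-up values, total = 54).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
n50 = nxx(toy_lengths, 0.5)  # 10 + 9 + 8 = 27 reaches half of 54
n90 = nxx(toy_lengths, 0.9)  # the running total first exceeds 48.6 at length 4
print(n50, n90)
```

Because Nxx depends on the sequences that pass the length filter, applying a cutoff such as the ≥500 bp threshold noted above changes the total length and can therefore shift the reported N50 and N90.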
Figure 1: Phylogenetic relationships and genomic comparisons between Bemisia tabaci and other insect species. (A) Phylogenetic relationships of B. tabaci (BEMTA) to insects and other arthropods based on single-copy orthologous genes present in their complete genomes. The following 12 insect species were used for this analysis: Acyrthosiphon pisum (ACYPI), Anopheles gambiae (ANOGA), Apis mellifera (APIME), BEMTA, Bombyx mori (BOMMO), Danaus plexippus (DANPL), Drosophila melanogaster (DROME), Nasonia vitripennis (NASVI), Nilaparvata lugens (NILLU), Pediculus humanus (PEDHU), Rhodnius prolixus (RHOPR), and Tribolium castaneum (TRICA). The two arthropods Daphnia pulex (DAPPU) and Tetranychus urticae (TETUR) were used as outgroup taxa. Branch lengths represent divergence times estimated for the second codon position of 308 single-copy genes, using PhyML with a gamma distribution across sites and an HKY85 substitution model. The branch supports were inferred based on the approximate likelihood ratio test (aLRT). Gene orthology was determined by comparing the genomes of these 14 arthropod species. 1:1:1 refers to single-copy gene orthologs found across all 14 lineages, and N:N:N refers to multi-copy gene paralogs found across the 14 lineages. Diptera, Hemiptera, Hymenoptera, Lepidoptera, and Insecta refer to taxon-specific genes present only in the particular lineage. SD indicates species-specific duplicated genes, and ND indicates species-specific unclustered genes. (B) Image of adult MED/Q. (C) A Venn diagram showing the orthologous groups shared among the hemipteran genomes of A. pisum, B. tabaci, N. lugens, and R. prolixus. Our analysis found 3341 gene families common to all four hemipteran genomes, and 2921 common to the genomes of the six vascular (blood and phloem) feeders.