Dena Leshkowitz, Shirley Gazit, Eli Reuveni, Murad Ghanim, Henryk Czosnek, Cindy McKenzie, Robert L Shatters, Judith K Brown.
Abstract
BACKGROUND: The past three decades have witnessed a dramatic increase in interest in the whitefly Bemisia tabaci, owing to its nature as a taxonomically cryptic species, the damage it causes to a large number of herbaceous plants through its specialized feeding in the phloem, and its ability to serve as a vector of plant viruses. Among the most important plant viruses transmitted by B. tabaci are those in the genus Begomovirus (family Geminiviridae). Surprisingly, little is known about the genome of this whitefly. The haploid genome size of male B. tabaci has been estimated by flow cytometry to be approximately one billion bp, about five times the size of the genome of the fruit fly Drosophila melanogaster. The genes involved in whitefly development, in host range plasticity, and in begomovirus vector specificity and competency are unknown.
Year: 2006 PMID: 16608516 PMCID: PMC1488848 DOI: 10.1186/1471-2164-7-79
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Number of sequences from the various libraries and number of sequences assembled into contigs (sequences of mitochondrial origin were removed from the contig assembly process)
| | Total | Individual libraries (five columns) | | | | |
| Total number of sequenced clones | 18,976 | 673 | 3,745 | 4,321 | 5,857 | 4,380 |
| Clones of mitochondrial origin | 5,542 | 59 | 866 | 1,576 | 1,465 | 1,576 |
| Sequences in contigs and singletons | 9,110 | 201 | 1,816 | 2,093 | 2,704 | 2,296 |
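As a quick consistency check on the table above, the first numeric column can be compared with the sum of the remaining five. The Python sketch below assumes that the first column is the total across libraries; this is inferred from the arithmetic rather than stated in the table, and the row keys are shortened labels chosen only for illustration.

```python
# Rows of the table above: the first value is assumed to be the total,
# the remaining five values are the per-library counts.
rows = {
    "sequenced_clones": [18976, 673, 3745, 4321, 5857, 4380],
    "mitochondrial_clones": [5542, 59, 866, 1576, 1465, 1576],
    "in_contigs_and_singletons": [9110, 201, 1816, 2093, 2704, 2296],
}


def check_row(values):
    """Return True if the first value equals the sum of the others."""
    total, per_library = values[0], values[1:]
    return total == sum(per_library)


for name, values in rows.items():
    print(name, "consistent" if check_row(values) else "inconsistent")

# Share of all sequenced clones that were of mitochondrial origin.
mito_share = rows["mitochondrial_clones"][0] / rows["sequenced_clones"][0]
print(f"mitochondrial share: {mito_share:.1%}")
```

All three rows pass the check, which supports reading the first column as a total.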
Contig and singleton statistics. The number of sequences, GC content, average length and % annotated are shown. A sequence was considered annotated if it had a homolog in any of the databases searched with an E-value of 1.0e-06 or smaller. Although the Candidatus Portiera aleyrodidarum bacterial DNA was included in the assembly process, its sequences were removed before the statistics were calculated (values marked with *) to avoid distorting the results
| | Number | GC content | Average length (bases) | % annotated |
| All sequences | 4,860 | 0.34* | 515* | 45.5% |
| Number of contigs | 1,017 | 0.38* | 785.5* | 58% |
| Number of singletons | 3,843 | 0.34 | 443 | 42% |
*excluding sequences from the endosymbiont Candidatus Portiera aleyrodidarum
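The "All sequences" row can be related to the contig and singleton rows by pooling them. The Python sketch below weights the two average lengths by the sequence counts from the table. The starred values exclude the endosymbiont sequences, so using the listed counts as weights is an approximation made here, not a description of how the authors computed the figure.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean of `values` using `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Values copied from the table above.
n_contigs, n_singletons = 1017, 3843
len_contigs, len_singletons = 785.5, 443

n_all = n_contigs + n_singletons
mean_len = weighted_mean([len_contigs, len_singletons],
                         [n_contigs, n_singletons])

print(n_all)            # contigs plus singletons
print(round(mean_len))  # pooled average length, for comparison with the table
```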
Figure 1. Number of sequences building a contig versus the contig length. Scatter plot of the number of sequences that make up a contig (x axis) versus the contig length (y axis; bases). The colour scale represents the number of sequences; the size of the square represents the number of HBT sequences. The annotations of the ten contigs with the highest number of sequences are shown.
Information on the most abundant contigs. The number of sequences that compose each contig, their source library, and the contig length, GC content and annotation are indicated. The contig with the highest number of sequences was identified as part of the B. tabaci primary symbiont, Candidatus Portiera aleyrodidarum, based on a partial genomic sequence (gi|32423678|gb|AY268081.1|). This DNA sequence was included in the assembly process. Among the most abundant contigs were B. tabaci mitochondrial genome sequences (most of them were removed during the preassembly stage)
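| Contig | Number of sequences | Length (bases) | GC content | Sequences per source library (five columns) | | | | | Annotation |
| | 425 | 31,123* | 0.3 | 22 | 393 | 3 | 2 | 4 | |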
| Bt-HInst-045-1-B11-T3_B11 | 167 | 598 | 0.16 | 8 | 159 | 0 | 0 | 0 | Mitochondria |
| Bt_TYLCV004_B07 | 155 | 2,883 | 0.48 | 0 | 0 | 51 | 57 | 47 | Vitellogenin precursor |
| Bt-HInst-008-1-D2-T3_D02 | 103 | 589 | 0.15 | 0 | 103 | 0 | 0 | 0 | Mitochondria |
| Bt-ToMoV-020-1-D2-T3_D02 | 60 | 1,660 | 0.64 | 0 | 0 | 22 | 13 | 25 | Vitellogenin precursor |
| Bt-H-024-1-D9-T3_D09 | 60 | 1,775 | 0.47 | 0 | 0 | 43 | 15 | 2 | Large subunit rRNA |
| Bt-HInst-032-1-E6-T3_E06 | 58 | 792 | 0.46 | 2 | 56 | 0 | 0 | 0 | Unknown |
| Bt-TYLCV-043-1-D2-T3_D02 | 52 | 1,922 | 0.46 | 0 | 0 | 8 | 20 | 24 | Vitellogenin precursor |
| Bt-HInst-013-1-H10-T3_H10 | 47 | 260 | 0.38 | 1 | 46 | 0 | 0 | 0 | Novel |
| Bt-HInst-003-1-G5-T3_G05 | 46 | 970 | 0.4 | 0 | 19 | 13 | 9 | 5 | Myosin |
| Bt-ToMoV-034-1-B10-T3_B10 | 39 | 1,039 | 0.65 | 0 | 0 | 0 | 30 | 9 | Putative senescence |
| Bt-ToMoV-023-1-C12-T3_C12 | 28 | 542 | 0.46 | 0 | 0 | 7 | 12 | 9 | Novel (signal peptide & transmembranal domain) |
| Bt-TYLCV-030-1-C9-T3_C09 | 28 | 1,959 | 0.44 | 0 | 0 | 3 | 23 | 2 | Vitellogenin precursor |
* The sequence of the whitefly primary endosymbiotic bacterium Candidatus Portiera aleyrodidarum [21] (AY268081.1) was included in the assembly process (see Results, EST assembly into contigs)
Figure 2. Library distribution for the most abundant contigs. The figure shows the library distribution for each of the 13 most abundant contigs, as revealed by their sequence composition. The number of sequences building each contig ranges from 28 to 425. The percentage is based on the total number of sequences forming the contig. The contigs are represented by their annotation.
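The percentages described in this caption can be obtained by dividing each per-library count by the contig total. A minimal Python sketch follows, using the counts of one vitellogenin precursor contig (155 sequences) from the table of the most abundant contigs. The function name is illustrative, and the order of the five libraries simply follows that table, whose column labels are not given in this record.

```python
def library_percentages(counts):
    """Convert per-library sequence counts into percentages of the contig total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("contig has no sequences")
    return [100.0 * c / total for c in counts]


# Per-library counts for one vitellogenin precursor contig (155 sequences).
counts = [0, 0, 51, 57, 47]
percentages = library_percentages(counts)
print([round(p, 1) for p in percentages])
```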
Figure 3. NR homologies. a. NR E-value distribution. The E-value distribution is shown as a percent of the total top homologies. Represented are the 1,544 contigs and singletons that had a homolog in the nr protein database with an E-value of 1.0e-06 or smaller. b. NR species distribution. The species distribution is shown as a percent of the total top homologies, for the same 1,544 contigs and singletons (abbreviations used: Homo for Homo sapiens, Mus for Mus musculus and Rat for Rattus norvegicus).
Number of contigs and singletons (out of the 4,860) that had a significant hit (E-value equal to or smaller than 1.0e-06) in the listed databases
| Database | Number with a significant hit |
| Drosophila | 1,053 |
| Nr | 1,544 |
| Nt | 1,207 |
| Swissprot | 1,224 |
| EST other | 1,224 |
| Contigs and singletons found in at least one of the above databases | 2,211 |
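The hit counts above can be expressed as percentages of the 4,860 contigs and singletons, which links this table to the 45.5% annotated figure reported in the contig and singleton statistics. The short Python sketch below does only that arithmetic; the dictionary keys are copied from the table and the helper name is illustrative.

```python
TOTAL = 4860  # contigs plus singletons

hits = {
    "Drosophila": 1053,
    "Nr": 1544,
    "Nt": 1207,
    "Swissprot": 1224,
    "EST other": 1224,
    "at least one database": 2211,
}


def percent(count, total=TOTAL):
    """Percentage of `total` represented by `count`."""
    return 100.0 * count / total


for database, count in hits.items():
    print(f"{database}: {percent(count):.1f}%")
```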
Figure 4. Ontology using Swiss-Prot homologies. The Swiss-Prot homologs were used as a query for the FatiGO tool. The FatiGO output is summarized here at level 2 in three main categories: biological process, molecular function and cellular component.
Drosophila homologs used to discover over-represented ontologies in the B. tabaci contigs and singletons
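| Ontology term (max: 50) | Count among the Drosophila homologs (total 732) | Count in the Drosophila reference set (total 10,309) | P-value | |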
| Ribosome | 106 | 202 | 0 | |
| cytoplasm | 357 | 1,800 | 0 | |
| ribonucleoprotein complex | 124 | 370 | 0 | |
| cytosolic ribosome (sensu Eukaryota) | 74 | 98 | 0 | |
| structural constituent of ribosome | 106 | 194 | 0 | |
| protein complex | 306 | 1623 | 0 | |
| large ribosomal subunit | 58 | 103 | 6.50E-81 | |
| small ribosomal subunit | 44 | 74 | 3.73E-65 | |
| Cytosol | 122 | 440 | 2.52E-63 | |
| mitochondrion | 131 | 509 | 2.87E-60 | |
| intracellular | 467 | 3,678 | 2.12E-58 | |
| cellular biosynthesis | 205 | 1,077 | 1.06E-55 | |
| intracellular organelle | 394 | 2,919 | 2.21E-54 | |
| Organelle | 394 | 2,919 | 2.21E-54 | |
| non-membrane-bound organelle | 160 | 744 | 3.87E-54 | |
| intracellular non-membrane-bound organelle | 160 | 744 | 3.87E-54 | |
| cellular metabolism | 562 | 5,047 | 1.27E-52 | |
| biosynthesis | 208 | 1,145 | 2.01E-51 | |
| metabolism | 584 | 5,438 | 7.70E-50 | |
| protein biosynthesis | 151 | 731 | 4.65E-47 | |
| macromolecule biosynthesis | 154 | 760 | 2.74E-46 | |
| mitochondrial inner membrane | 64 | 187 | 3.25E-45 | |
| inner membrane | 64 | 187 | 3.25E-45 | |
| oxidative phosphorylation | 56 | 152 | 6.10E-44 | |
| mitochondrial electron transport chain | 38 | 82 | 1.29E-40 | |
| mitochondrial membrane | 68 | 222 | 1.37E-40 | |
| ATP synthesis coupled electron transport | 35 | 72 | 8.45E-40 | |
| cellular physiological process | 645 | 6,729 | 3.39E-39 | |
| mitochondrial matrix | 52 | 151 | 7.69E-37 | |
| cytosolic large ribosomal subunit (sensu Eukaryota) | 42 | 57 | 1.12E-34 | |
| macromolecule metabolism | 354 | 2,902 | 2.19E-34 | |
| Structural molecule activity | 140 | 760 | 2.73E-34 | |
| primary metabolism | 512 | 4,908 | 4.18E-34 | |
| eukaryotic 43S preinitiation complex | 43 | 64 | 8.21E-33 | |
| cellular macromolecule metabolism | 330 | 2,676 | 1.98E-32 | |
| monovalent inorganic cation transporter activity | 49 | 151 | 1.09E-31 | |
| cellular process | 661 | 7,297 | 2.03E-31 | |
| Hydrogen ion transporter activity | 48 | 149 | 1.10E-30 | |
| physiological process | 674 | 7,566 | 1.67E-30 | |
| Cell | 512 | 5,045 | 3.62E-30 | |
| generation of precursor metabolites and energy | 106 | 532 | 4.41E-30 | |
| Organelle membrane | 82 | 382 | 1.28E-26 | |
| cytosolic small ribosomal subunit (sensu Eukaryota) | 32 | 43 | 1.28E-26 | |
| eukaryotic 48S initiation complex | 32 | 43 | 1.28E-26 | |
| cellular protein metabolism | 293 | 2,433 | 1.48E-25 | |
| primary active transporter activity | 52 | 192 | 3.21E-25 | |
| protein metabolism | 293 | 2,452 | 7.25E-25 | |
| intracellular membrane-bound organelle | 291 | 2,524 | 1.50E-21 | |
| membrane-bound organelle | 291 | 2,524 | 1.50E-21 | |
| mitochondrial ribosome | 28 | 76 | 1.69E-21 | |
| Ribosome, mitochondrial and metabolism terms enriched |
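The statistical test behind the last column is not stated in this record. One common choice for over-representation is the one-sided hypergeometric test, sketched below in Python under the assumption that the second and third columns are counts out of 732 and 10,309 respectively. The function is illustrative and will not necessarily reproduce the values in the table.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n items are drawn without replacement from a
    population of N items, K of which carry the ontology term."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    # Exact integer arithmetic; the final division yields a float.
    return tail / total


# "Ribosome" row: 106 of 732 in the sample, 202 of 10,309 in the reference.
p_ribosome = hypergeom_upper_tail(106, 732, 202, 10309)
print(f"{p_ribosome:.3g}")
```

Exact integer binomials are used instead of floating-point factorials so that very small tail probabilities are not lost to rounding before the final division.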
Drosophila homologs used to discover under-represented ontologies in the B. tabaci contigs and singletons
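| Ontology term | Count among the Drosophila homologs (total 732) | Count in the Drosophila reference set (total 10,309) | P-value | |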
| molecular_function unknown | 8 | 939 | 3.31E-13 | |
| biological_process unknown | 8 | 810 | 9.08E-11 | |
| cellular_component unknown | 17 | 983 | 2.48E-10 | |
| signal transducer activity | 39 | 1,092 | 3.28E-05 | |
| receptor activity | 14 | 584 | 0.000109 | |
| transmembrane receptor activity | 9 | 452 | 0.000286 | |
| signal transduction | 58 | 1,303 | 0.000994 |
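A simple way to read the under-representation in this table is the ratio between the fraction of a term in the sample and its fraction in the reference set; values below 1 indicate under-representation. The Python sketch below assumes the two numeric columns are counts out of 732 and 10,309, and the function name is chosen for illustration.

```python
SAMPLE_TOTAL = 732
REFERENCE_TOTAL = 10309


def representation_ratio(sample_count, reference_count):
    """Fraction of the term in the sample divided by its fraction in the reference."""
    sample_fraction = sample_count / SAMPLE_TOTAL
    reference_fraction = reference_count / REFERENCE_TOTAL
    return sample_fraction / reference_fraction


# (sample count, reference count) for three rows of the table above.
terms = {
    "molecular_function unknown": (8, 939),
    "receptor activity": (14, 584),
    "signal transduction": (58, 1303),
}

for term, (sample, reference) in terms.items():
    print(f"{term}: {representation_ratio(sample, reference):.2f}")
```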
Mapping the pathways for the 37 unique EC numbers extracted by the BiocloneDB application for the nr homologs. The output was produced by the 'KEGG gpath tool'.
| KEGG pathway | EC number |
| map00010 Glycolysis / Gluconeogenesis | EC 1.2.1.3 |
| | EC 3.1.3. |
| | EC 5.3.1. |
| map00030 Pentose phosphate pathway | EC 1.1.1.44 |
| | EC 1.1.99.10 |
| | EC 3.1.3.11 |
| map00031 Inositol metabolism | EC 5.3.1.1 |
| map00040 Pentose and glucuronate interconversions | EC 2.4.1.17 |
| map00051 Fructose and mannose metabolism | EC 3.1.3.11 |
| | EC 5.3.1.1 |
| map00052 Galactose metabolism | EC 2.7.7.12 |
| map00053 Ascorbate and aldarate metabolism | EC 1.2.1.3 |
| map00071 Fatty acid metabolism | EC 1.2.1.3 |
| map00120 Bile acid biosynthesis | EC 1.2.1.3 |
| | EC 1.2.1.3 |
| map00130 Ubiquinone biosynthesis | EC 1.6.5.3 |
| map00150 Androgen and estrogen metabolism | EC 2.4.1.17 |
| map00190 Oxidative phosphorylation | EC 1.6.5.3 |
| | EC 1.9.3.1 |
| | EC 1.10.2.2 |
| | EC 3.6.3.14 |
| map00193 ATP synthesis | EC 3.6.3.14 |
| map00195 Photosynthesis | EC 3.6.3.14 |
| map00220 Urea cycle and metabolism of amino groups | EC 2.6.1.11 |
| | EC 6.3.4.5 |
| map00230 Purine metabolism | EC 2.7.4.6 |
| | EC 2.7.7.6 |
| map00240 Pyrimidine metabolism | EC 2.7.4.6 |
| | EC 2.7.4.9 |
| | EC 2.7.7.6 |
| map00251 Glutamate metabolism | EC 6.3.1.2 |
| map00252 Alanine and aspartate metabolism | EC 6.3.4.5 |
| map00260 Glycine, serine and threonine metabolism | EC 1.1.1.95 |
| | EC 2.3.1.37 |
| | EC 2.7.1.32 |
| map00280 Valine, leucine and isoleucine degradation | EC 1.2.1.3 |
| map00310 Lysine degradation | EC 1.2.1.3 |
| map00330 Arginine and proline metabolism | EC 1.2.1.3 |
| | EC 6.3.4.5 |
| map00340 Histidine metabolism | EC 1.2.1.3 |
| map00380 Tryptophan metabolism | EC 1.2.1.3 |
| map00400 Phenylalanine, tyrosine and tryptophan biosynthesis | EC 4.2.3.4 |
| | EC 6.1.1.20 |
| map00410 beta-Alanine metabolism | EC 1.2.1.3 |
| map00480 Glutathione metabolism | EC 2.5.1.18 |
| map00500 Starch and sucrose metabolism | EC 2.4.1.1 |
| | EC 2.4.1.17 |
| map00520 Nucleotide sugars metabolism | EC 2.7.7.12 |
| map00530 Aminosugars metabolism | EC 3.2.1.14 |
| map00550 Peptidoglycan biosynthesis | EC 6.3.1.2 |
| map00561 Glycerolipid metabolism | EC 1.2.1.3 |
| map00564 Glycerophospholipid metabolism | EC 2.7.1.32 |
| map00620 Pyruvate metabolism | EC 1.2.1.3 |
| | EC 2.7.9.2 |
| map00630 Glyoxylate and dicarboxylate metabolism | EC 4.1.1.39 |
| map00631 1,2-Dichloroethane degradation | EC 1.2.1.3 |
| map00640 Propanoate metabolism | EC 1.2.1.3 |
| map00650 Butanoate metabolism | EC 1.2.1.3 |
| map00710 Carbon fixation | EC 3.1.3.11 |
| map00720 Reductive carboxylate cycle (CO2 fixation) | EC 2.7.9.2 |
| map00860 Porphyrin and chlorophyll metabolism | EC 2.4.1.17 |
| | EC 4.99.1.1 |
| map00903 Limonene and pinene degradation | EC 1.2.1.3 |
| map00910 Nitrogen metabolism | EC 6.3.1.2 |
| map00970 Aminoacyl-tRNA biosynthesis | EC 6.1.1.20 |
| map02040 Flagellar assembly | EC 3.6.3.14 |
| map03020 RNA polymerase | EC 2.7.7.6 |
| map03050 Proteasome | EC 3.4.25.1 |
| map03070 Type III secretion system | EC 3.6.3.14 |
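Because a single EC number can map to several pathways, it can help to invert the table and list the pathways per EC number. The Python sketch below does this for a handful of rows copied from the table; it is an illustrative regrouping and not the output of the BiocloneDB application or the KEGG tool.

```python
from collections import defaultdict

# A few (pathway, EC number) pairs copied from the table above.
pairs = [
    ("map00010 Glycolysis / Gluconeogenesis", "EC 1.2.1.3"),
    ("map00030 Pentose phosphate pathway", "EC 3.1.3.11"),
    ("map00051 Fructose and mannose metabolism", "EC 3.1.3.11"),
    ("map00071 Fatty acid metabolism", "EC 1.2.1.3"),
    ("map00190 Oxidative phosphorylation", "EC 3.6.3.14"),
    ("map00193 ATP synthesis", "EC 3.6.3.14"),
    ("map00710 Carbon fixation", "EC 3.1.3.11"),
]


def pathways_by_ec(pairs):
    """Group pathway names under the EC number they were mapped from."""
    index = defaultdict(list)
    for pathway, ec in pairs:
        index[ec].append(pathway)
    return dict(index)


index = pathways_by_ec(pairs)
# List EC numbers from the most to the least widely mapped.
for ec, pathways in sorted(index.items(), key=lambda item: -len(item[1])):
    print(ec, len(pathways), pathways)
```

Seen this way, a long pathway list can reflect a few broadly acting enzymes, such as EC 1.2.1.3, rather than broad pathway coverage.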
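BLAST results for nine contigs searched against gi|2522237|dbj|BAA22791.1|, vitellogenin from the sawfly Athalia rosae
| Contig | % identity | Alignment length | Query start | Query end | Subject start | Subject end | E-value |
| BT-TYLCV-021-1-E5-T3_E05_1 | 50.72 | 207 | 618 | 1 | 1 | 207 | 9.20E-60 |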
| BT-TOMOV-035-1-E6-T3_E06_1 | 58.33 | 84 | 595 | 344 | 241 | 324 | 4.10E-25 |
| BT-TYLCV-039-1-F7-T3_F07_1 | 46.97 | 132 | 398 | 3 | 478 | 607 | 3.90E-31 |
| BT_TYLCV004_B07_1 | 30.97 | 649 | 5 | 1,885 | 1,141 | 1,755 | 9.80E-89 |
| BT-TYLCV-024-1-F5-T3_F05_1 | 39.02 | 123 | 372 | 4 | 1,256 | 1,378 | 3.60E-20 |
| BT-TYLCV-030-1-C9-T3_C09_1 | 30.3 | 396 | 3 | 1,187 | 1,391 | 1,755 | 6.80E-54 |
| BT-TYLCV-043-1-D2-T3_D02_1 | 34.14 | 331 | 8 | 997 | 1,463 | 1,759 | 2.90E-49 |
| BT-TOMOV-020-1-D2-T3_D02_1 | 35.25 | 244 | 46 | 777 | 1,550 | 1,759 | 1.50E-35 |
| TMVBT002_D07_T3_061_1 | 48.68 | 76 | 20 | 247 | 1,684 | 1,759 | 2.80E-18 |
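Rows like these can be parsed to see which strand of a contig matched and how much of the vitellogenin protein each hit covers. The Python sketch below assumes the seven numeric columns are, in order, percent identity, alignment length, query start, query end, subject start, subject end and E-value; that ordering is inferred from the values and is not labelled in this record. The type and function names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    contig: str
    identity: float
    aln_length: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float


def parse_row(row: str) -> Hit:
    """Parse one pipe-delimited row, dropping thousands separators."""
    cells = [c.strip().replace(",", "")
             for c in row.strip().strip("|").split("|")]
    name, identity, aln, qs, qe, ss, se, ev = cells
    return Hit(name, float(identity), int(aln),
               int(qs), int(qe), int(ss), int(se), float(ev))


def strand(hit: Hit) -> str:
    """A query start larger than the query end indicates a minus-strand match."""
    return "minus" if hit.q_start > hit.q_end else "plus"


def subject_span(hit: Hit) -> int:
    """Number of subject (protein) positions covered by the hit."""
    return abs(hit.s_end - hit.s_start) + 1


rows = [
    "| BT-TYLCV-021-1-E5-T3_E05_1 | 50.72 | 207 | 618 | 1 | 1 | 207 | 9.20E-60 |",
    "| BT_TYLCV004_B07_1 | 30.97 | 649 | 5 | 1,885 | 1,141 | 1,755 | 9.80E-89 |",
]
hits = [parse_row(r) for r in rows]
for h in hits:
    print(h.contig, strand(h), subject_span(h))
```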
Figure 5. Multiple alignment of vitellogenin-like contigs. Protein multiple alignment of five contigs, with BAA22791.1 (amino acids 1371 to 1770) used as a profile. The numbers in the figure designate the following sequences: (1) BAA22791.1, (2) BT-TOMOV-020-1-D2-T3_D02, (4) BT-TYLCV-030-1-C9-T3_C09, (6) BT-TYLCV-043-1-D2-T3_D02, (7) BT_TYLCV004_B07, (8) TMVBT002_D07. MView parameters: identities computed with respect to (1) gi_2522237_dbj_BAA22791.1; coloured by consensus/50% and property.
Figure 6. Repeats in the vitellogenin-like contig. Translation of BT-TOMOV-020-1-D2-T3_D02 from 1 to 1611. The region of the AAC repeats is highlighted in yellow. The region rich in serine codons is highlighted in blue.
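Tandem codon repeats such as the AAC run mentioned in this caption can be located with a regular expression. The Python sketch below uses a made-up toy sequence, not the actual contig, and a hypothetical helper name. It is not frame-aware: it reports any run of at least three consecutive AAC triplets regardless of reading frame.

```python
import re


def find_codon_repeats(seq, codon="AAC", min_repeats=3):
    """Return (start, end, n_repeats) for each tandem run of `codon`.

    Coordinates are 0-based and half-open.
    """
    pattern = re.compile(f"(?:{re.escape(codon)}){{{min_repeats},}}")
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(codon))
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequence for illustration only: one run of five AAC triplets and
# one run of two, which is below the reporting threshold.
toy = "ATGGCT" + "AAC" * 5 + "TCTAGC" + "AAC" * 2 + "TGA"
print(find_codon_repeats(toy))
```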