Beatriz Sabater-Muñoz, Fabrice Legeai, Claude Rispe, Joël Bonhomme, Peter Dearden, Carole Dossat, Aymeric Duclert, Jean-Pierre Gauthier, Danièle Giblot Ducray, Wayne Hunter, Phat Dang, Srini Kambhampati, David Martinez-Torres, Teresa Cortes, Andrès Moya, Atsushi Nakabachi, Cathy Philippe, Nathalie Prunier-Leterme, Yvan Rahbé, Jean-Christophe Simon, David L Stern, Patrick Wincker, Denis Tagu.
Abstract
Aphids are the leading pests in agricultural crops. A large-scale sequencing of 40,904 ESTs from the pea aphid Acyrthosiphon pisum was carried out to define a catalog of 12,082 unique transcripts. A strong AT bias was found, indicating a compositional shift between Drosophila melanogaster and A. pisum. An in silico profiling analysis characterized 135 transcripts specific to pea-aphid tissues (relating to bacteriocytes and parthenogenetic embryos). This project is the first to address the genetics of the Hemiptera and of a hemimetabolous insect.
Year: 2006 PMID: 16542494 PMCID: PMC1557754 DOI: 10.1186/gb-2006-7-3-r21
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Schematic phylogenetic tree representing the insect Orders comprising species for which genome sequencing projects have been completed or are at an advanced stage. The figure is a greatly simplified version of a phylogeny shown in [9], representing the largely agreed relationships between these Orders, plus the major evolutionary transitions for insects (as deduced from synapomorphic characters, that is, novel characters derived from preexisting ones) along a time scale expressed in millions of years before present. For each Order with species involved in a genome sequencing project, the node corresponding to its separation from its most closely related Order (extant or extinct) is shown (dashed lines represent sister clades).
List of pea aphid libraries used for the EST database
| Biological source | Aphid line | Library | RNA | Vector | Sequencing center | Accession Number |
| Antennae | YR2 | ApAL3SD | Total | pDNR-LIB | Roscoff | [GenBank: |
| | YR2 | ID0AEE | Total | λ Uni-Zap | Genoscope | [GenBank: |
| Bacteriocyte | ISO | ApBac | Total | λ FLC-I | RIKEN | [DDBJ:BP535536 to BP537955] |
| Digestive tract | LL01 | ApDT | Total | pDNR-LIB | Roscoff | [GenBank: |
| Head | YR2 | ApHL3LD | Total | pDNR-LIB | Roscoff | [GenBank: |
| | YR2 | ApHL3SD | Total | pDNR-LIB | Valencia | [GenBank: |
| | P123 | ID0ACC | Total | λ Uni-Zap | Genoscope | [GenBank: |
| Parthenogenetic embryo | YR2 | ID0ADD | Total | λ Uni-Zap | Genoscope | [GenBank: |
| Whole body, multistage | Unknown | ApMS; 14419; 14436 | Poly(A)+ | λ Uni-Zap | Genoscope and Fort Pierce | [GenBank: |
Number of raw sequences, selected ESTs, sizes, contigs formed, and redundancy in A. pisum EST database
| Biological source | Library | EST | Rejected: bacterial | Rejected: rRNA | Rejected: short sequences | Rejected: vector sequences | Selected | M bp | Contig | Singletons | Redundancy |
| Antennae | ApAL3SD | 1,031 | 10 | 39 | 84 | 0 | 898 | 398 | 305 | 283 | 34.52 |
| | ID0AEE | 5,424 | 23 | 431 | 46 | 1 | 4,923 | 622 | 1,037 | 2,414 | 29.90 |
| Bacteriocyte | ApBac | 2,345 | 1 | 0 | 3 | 0 | 2,341 | 871 | 275 | 40 | 86.54 |
| Digestive tract | ApDT | 1,184 | 52 | 333 | 94 | 0 | 705 | 403 | 267 | 211 | 32.20 |
| Head | ApHL3LD | 1,245 | 24 | 30 | 359 | 0 | 832 | 394 | 366 | 201 | 31.85 |
| | ApHL3SD | 2,068 | 7 | 33 | 739 | 0 | 1,289 | 363 | 382 | 438 | 36.38 |
| | ID0ACC | 10,706 | 3 | 902 | 221 | 3 | 9,577 | 574 | 2,012 | 1,564 | 62.66 |
| Parthenogenetic embryo | ID0ADD | 5,473 | 136 | 541 | 105 | 0 | 4,691 | 717 | 210 | 151 | 92.30 |
| Whole body, multistage | ApMS; | 17,964 | 479 | 1,455 | 382 | 0 | 15,648 | 716 | 5,153 | 3,027 | 47.72 |
| GenBank | mRNA | 3 | 0 | 0 | 0 | 0 | 3 | 1,220 | 2 | 1 | 0.00 |
| Total | | 47,443 | 735 | 3,764 | 2,033 | 4 | 40,907 | 628 | 4,300 | 7,782 | 70.46 |
M bp: mean size of ESTs in base pairs.
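The columns of this table can be cross-checked with simple arithmetic. The sketch below assumes that the selected ESTs are the raw ESTs minus the four rejected classes, and that redundancy is the percentage of selected ESTs that did not yield a distinct contig or singleton. The second formula is inferred from the tabulated numbers and is not stated in the record; the function names are illustrative only.

```python
# Rows copied from the table:
# (library, raw ESTs, bacterial, rRNA, short, vector,
#  selected, contigs, singletons, reported redundancy in %)
ROWS = [
    ("ApAL3SD", 1031, 10, 39, 84, 0, 898, 305, 283, 34.52),
    ("ID0AEE", 5424, 23, 431, 46, 1, 4923, 1037, 2414, 29.90),
    ("ApBac", 2345, 1, 0, 3, 0, 2341, 275, 40, 86.54),
    ("Total", 47443, 735, 3764, 2033, 4, 40907, 4300, 7782, 70.46),
]


def selected_count(raw, *rejected):
    """Selected ESTs = raw ESTs minus every rejected class."""
    return raw - sum(rejected)


def redundancy_pct(selected, contigs, singletons):
    """Assumed definition: share of selected ESTs beyond one per unique transcript."""
    unique = contigs + singletons
    return 100.0 * (selected - unique) / selected


for lib, raw, bact, rrna, short, vec, sel, contigs, singles, reported in ROWS:
    computed = redundancy_pct(sel, contigs, singles)
    assert selected_count(raw, bact, rrna, short, vec) == sel
    assert abs(computed - reported) < 0.01
    print(f"{lib}: selected={sel}, unique={contigs + singles}, redundancy={computed:.2f}%")
```

Under the same assumption, the Total row gives 4,300 contigs plus 7,782 singletons, which matches the 12,082 unique transcripts quoted in the abstract.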
Figure 2. Size distribution of the 12,082 EST-derived unique transcripts from A. pisum. Contigs and singletons with (filled bars) or without (open bars) a significant hit were selected with a cutoff value of 10⁻⁵ in a BLASTX search against UniProt. Size classes (in base pairs) were binned (for sequences shorter than 200 bp and longer than 1,500 bp) so that each class contains a minimum of 20 sequences for both 'hits' and 'no-hits' contigs. The curves (hits, filled diamonds; no-hits, open diamonds) show the percentage of contigs for which a coding sequence was predicted by FrameD. Contigs with no predicted coding sequence are presumably entirely UTR.
Gene Ontology annotation of pea aphid unique transcripts after GoToolBox statistical analysis
| Gene Ontology | Putative orthologs set | D. melanogaster gene set | Corrected p value |
| Biological process | 2,397 | 10,032 | |
| Physiological process | 2,260 | 7,986 | En (2e-112) |
| Cellular process | 2,201 | 7,727 | En (9e-101) |
| Regulation of biological process | 431 | 1,583 | En (8e-4) |
| Growth | 35 | 98 | En (4e-5) |
| Pigmentation | 19 | 51 | / |
| Behavior | 129 | 637 | / |
| Reproduction | 162 | 812 | / |
| Development | 470 | 2,227 | D (4e-4) |
| Unknown | 49 | 833 | D (2e-46) |
| Cellular component | 1,684 | 7,428 | |
| Cell | 1,532 | 5,150 | En (3e-124) |
| Protein complex | 733 | 1,747 | En (9e-98) |
| Organelle | 1,020 | 2,966 | En (3e-84) |
| Extracellular matrix | 23 | 82 | / |
| Extracellular region | 85 | 450 | / |
| Unknown | 73 | 1,865 | D (3e-140) |
| Molecular function | 2,397 | 10,104 | |
| Catalytic activity | 1,290 | 4,070 | En (1e-52) |
| Binding | 1,334 | 4,301 | En (8e-49) |
| Structural molecule activity | 298 | 757 | En (8e-23) |
| Translation regulator activity | 54 | 92 | En (5e-12) |
| Transporter activity | 372 | 1,237 | En (9e-8) |
| Enzyme regulator activity | 110 | 379 | / |
| Antioxidant activity | 16 | 39 | / |
| Motor activity | 26 | 88 | / |
| Transcription regulator activity | 187 | 841 | / |
| Signal transducer activity | 215 | 1,093 | D (1e-3) |
| Unknown | 63 | 1,798 | D (1e-144) |
The set of A. pisum contigs orthologous to D. melanogaster sequences was compared with the whole set of D. melanogaster genes using FlyBase Gene Ontology terms. The last column gives the corrected p value of the hypergeometric test. En, enriched in A. pisum transcripts; D, depleted in A. pisum transcripts; /, no bias.
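As an illustration of the hypergeometric test named in this legend, the sketch below computes one-sided tail probabilities for two rows of the table using only the Python standard library. It assumes the ortholog set can be treated as a draw without replacement from the D. melanogaster gene set, and it applies no multiple-testing correction, so its values are not expected to match the corrected p values reported by GoToolBox. The function names are hypothetical.

```python
from math import exp, lgamma, log


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_pmf(k, N, K, n):
    """log P(X = k) when n genes are drawn from N, of which K carry the term."""
    return log_comb(K, k) + log_comb(N - K, n - k) - log_comb(N, n)


def log10_tail(k_values, N, K, n):
    """log10 of the probability summed over k_values (log-sum-exp for stability)."""
    logs = [log_pmf(k, N, K, n) for k in k_values]
    m = max(logs)
    return (m + log(sum(exp(x - m) for x in logs))) / log(10)


def go_bias(k, n, K, N):
    """Return ('En' or 'D', log10 p) for the smaller of the two one-sided tails.

    k: ortholog-set genes with the term, n: size of the ortholog set,
    K: D. melanogaster genes with the term, N: size of the D. melanogaster set.
    """
    lowest = max(0, n - (N - K))   # smallest possible count
    highest = min(n, K)            # largest possible count
    upper = log10_tail(range(k, highest + 1), N, K, n)   # P(X >= k)
    lower = log10_tail(range(lowest, k + 1), N, K, n)    # P(X <= k)
    return ("En", upper) if upper < lower else ("D", lower)


# Counts taken from the biological-process block of the table.
print(go_bias(2260, 2397, 7986, 10032))  # Physiological process
print(go_bias(470, 2397, 2227, 10032))   # Development
```

Working in log space avoids floating-point underflow for the very small probabilities that arise with term counts of this size.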
Base composition (%GC) at different positions for reconstructed coding sequences of the collection of aphid contigs and their putative homologs (best hits) in D. melanogaster
| | | %GC1 | %GC2 | %GC3s | 5' UTR | 3' UTR |
| A. pisum | Mean | 47.4% | 37.0% | 34.5% | 34.9% | 23.1% |
| | SD | 6.6% | 7.2% | 14.2% | 10.0% | 8.3% |
| D. melanogaster | Mean | 56.4% | 39.8% | 68.8% | ND | ND |
| | SD | 4.6% | 5.7% | 9.0% | ND | ND |
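The position-specific GC values in this table can be illustrated with a short sketch that computes %GC1, %GC2 and %GC3s for a single in-frame coding sequence. It assumes the common convention that stop codons are excluded and that GC3s is measured only on codons with synonymous alternatives (that is, excluding the Met codon ATG and the Trp codon TGG); the record does not state which convention was used, and the function names are illustrative.

```python
STOPS = {"TAA", "TAG", "TGA"}
NON_DEGENERATE = {"ATG", "TGG"}  # Met and Trp each have a single codon


def gc_fraction(bases):
    """Fraction of G or C among an iterable of single bases (nan if empty)."""
    bases = list(bases)
    if not bases:
        return float("nan")
    return sum(b in ("G", "C") for b in bases) / len(bases)


def gc_by_codon_position(cds):
    """GC fraction at codon positions 1, 2 and synonymous position 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    coding = [c for c in codons if c not in STOPS]
    synonymous = [c for c in coding if c not in NON_DEGENERATE]
    return {
        "GC1": gc_fraction(c[0] for c in coding),
        "GC2": gc_fraction(c[1] for c in coding),
        "GC3s": gc_fraction(c[2] for c in synonymous),
    }


# Toy sequence (not from the dataset): ATG GCC TTT TGA
print(gc_by_codon_position("ATGGCCTTTTGA"))
```

Applying such a function to every reconstructed coding sequence and then taking the mean and standard deviation across sequences would give per-species summaries of the kind tabulated here.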