Matthew W Blair, Andrea C Fernandez, Manabu Ishitani, Danilo Moreta, Motoaki Seki, Sarah Ayling, Kazuo Shinozaki.
Abstract
BACKGROUND: Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full-length clone EST sets for common bean.
Year: 2011 PMID: 22118559 PMCID: PMC3240127 DOI: 10.1186/1471-2229-11-171
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Comparison of major EST sequencing efforts in common bean.
| Sequence read | Full-length (this study) | Ramírez et al. (2005) | Thibivilliers et al. (2009) |
|---|---|---|---|
| Clones | 9,984 | 21,096 | 20,736 |
| Sequence reads (after LQ & vector trimming) | 7,079 | 15,781 | 37,919 |
| Singletons | 1,238 | 5,703 | 3,544 |
| Contigs | 2,981 | 2,266 | 7,510 |
| Unigenes | 4,219 | 7,969 | 10,581 |
| Proportion of unigenes per sequence (%) | 59.6% | 50.5% | 27.9% |
| Average EST length (nt) | 563.8 | 606.2 | 656.4 |
| Average contig length (nt) | 677.9 | 606.2 | 1024.2 |
| Average singleton length (nt) | 568.3 | 594.7 | 691.7 |
Two EST sequencing efforts are compared to the ESTs generated for the full-length cDNA libraries made in this study.
1 Clones from Thibivilliers et al. (2009) were sequenced from both ends of the insert, while those of Ramírez et al. (2005) and this study were sequenced from the 5' end.
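The "Proportion of unigenes per sequence" row can be re-derived from the counts in the table. The sketch below assumes the row is simply unigenes divided by trimmed sequence reads, which is consistent with the published values; the dictionary layout and the function name `unigene_proportion` are illustrative and do not come from the paper.

```python
# Counts copied from the comparison table; the dict layout is an illustrative assumption.
datasets = {
    "Full-length": {"reads": 7079, "unigenes": 4219},
    "Ramirez": {"reads": 15781, "unigenes": 7969},
    "Thibivilliers": {"reads": 37919, "unigenes": 10581},
}


def unigene_proportion(unigenes: int, reads: int) -> float:
    """Percentage of trimmed sequence reads that resolve to distinct unigenes."""
    return round(100.0 * unigenes / reads, 1)


proportions = {
    name: unigene_proportion(d["unigenes"], d["reads"])
    for name, d in datasets.items()
}

for name, pct in proportions.items():
    print(f"{name}: {pct}%")
```

A higher value means fewer redundant reads per unigene. The Thibivilliers set, sequenced from both ends of each insert, yields roughly two reads per clone and so shows the lowest proportion.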
Figure 1. Comparison of unigene length among the full-length cDNA libraries and two other EST sequencing efforts. The full-length cDNA clones were sequenced from the 5' end, as were the clones from Ramírez et al. (2005), while those of Thibivilliers et al. (2009) were sequenced from both the 5' and 3' ends.
Figure 2. Overlap in homology at E-threshold of 1E. The previous EST sequencing efforts were from Ramírez et al. (2005) and Thibivilliers et al. (2009).
Figure 3. Example of alternative splicing and 5' end location of an EST for aquaporin from the full-length library. Comparison of non-full-length ESTs (below line 11) to full-length ESTs (above line 11) for a gene with high homology to GenBank accession ABU94631 encoding a putative aquaporin PIP2;3 from Phaseolus vulgaris. The last arrow at the bottom of the figure shows the homology to the equivalent ortholog in the soybean genome.
Figure 4. Mapping databases with (a) top hits for gene ontology (GO) and (b) gene ontology evidence code distribution for blastx hits in terms of numbers of GO terms or sequences (y-axes), respectively. Based on the full collection of unigenes from the full-length cDNA library sequencing project.
Differences in categorization of the unigenes from the full length cDNA library compared to two other recent EST libraries of common bean.
| Gene Ontology | Full-length library (this study) | % | Root library (Blair et al., in press) | % |
|---|---|---|---|---|
| Biological regulation | 524 | 8.9 | 204 | 6.5 |
| Carbon utilization | 13 | 0.2 | 3 | 0.1 |
| Cell proliferation | 0 | 0.0 | 4 | 0.1 |
| Cellular component organization | 225 | 3.8 | 119 | 3.8 |
| Cellular processes | 1669 | 28.4 | 920 | 29.4 |
| Cellular component biogenesis | 98 | 1.7 | 80 | 2.6 |
| Developmental process | 307 | 5.2 | 136 | 4.3 |
| Death | 22 | 0.4 | 7 | 0.2 |
| Immune system process | 48 | 0.8 | 11 | 0.4 |
| Localization | 374 | 6.4 | 212 | 6.8 |
| Metabolic process | 1567 | 26.7 | 871 | 27.8 |
| Multi-organism process | 0 | 0.0 | 54 | 1.7 |
| Multi-cellular organismal process | 251 | 4.3 | 115 | 3.7 |
| Nitrogen utilization | 0 | 0.0 | 4 | 0.1 |
| Pigmentation | 0 | 0.0 | 1 | 0.0 |
| Rhythmic processes | 0 | 0.0 | 2 | 0.1 |
| Reproduction | 141 | 2.4 | 72 | 2.3 |
| Response to stimulus | 638 | 10.9 | 310 | 9.9 |
| Sulfur utilization | 0 | 0.0 | 4 | 0.1 |
| Antioxidant activity | 34 | 0.9 | 28 | 1.3 |
| Binding | 1656 | 42.5 | 841 | 40.4 |
| Catalytic activity | 1472 | 37.8 | 704 | 33.8 |
| Electron carrier activity | 108 | 2.8 | 61 | 2.9 |
| Enzyme regulator activity | 34 | 0.9 | 21 | 1.0 |
| Metallochaperone activity | 2 | 0.1 | 1 | 0.0 |
| Molecular transducer activity | 53 | 1.4 | 15 | 0.7 |
| Nutrient reservoir activity | 10 | 0.3 | 5 | 0.2 |
| Protein tag | 2 | 0.1 | 0 | 0.0 |
| Structural molecular activity | 101 | 2.6 | 193 | 9.3 |
| Translation regulator activity | 51 | 1.3 | 40 | 1.9 |
| Transporter activity | 227 | 5.8 | 113 | 5.4 |
| Transcription regulator activity | 148 | 3.8 | 60 | 2.9 |
| Apoplast | 95 | 1.9 | 58 | 2.0 |
| Cell part | 2176 | 44.6 | 1200 | 40.9 |
| Extracellular region | 0 | 0.0 | 2 | 0.1 |
| Organelle part | 404 | 8.3 | 364 | 12.4 |
| Organelle - membrane bounded | 1500 | 30.7 | 779 | 26.5 |
| Organelle - non membrane bound | 181 | 3.7 | 265 | 9.0 |
| Protein complex | 271 | 5.6 | 134 | 4.6 |
| Protein-DNA complex | 17 | 0.3 | 23 | 0.8 |
| Vesicle | 236 | 4.8 | 112 | 3.8 |
Gene ontology according to biological processes, molecular function and cellular compartmentalization.
1 Refers to the number of unigenes in each case for high phosphorus (HP) or low phosphorus (LP) root libraries.
2 Performed at level 2 for greater detail on biological process and molecular function.
3 Performed at level 3 for greater detail on cellular component placement.
Figure 5. Example of KEGG pathways found for the full-length cDNA clone ESTs. The examples of Citrate Cycle and Peroxisome function are given, with genes in green represented by full-length cDNA clones from the newly constructed library.
Simple sequence repeats found by two software programs in the unigene set of full-length cDNA sequences.
| SSR class | Repeat Finder | % | SciRoKo w/o mono-nt | % | SciRoKo w/mono-nt | % |
|---|---|---|---|---|---|---|
| Mono-nt | 0 | 0 | 0 | 0.0 | 468 | 24.2 |
| Di-nt | 29 | 16.6 | 322 | 22.0 | 322 | 16.7 |
| Tri-nt | 89 | 50.9 | 562 | 38.4 | 562 | 29.1 |
| Tetra-nt | 28 | 16.0 | 104 | 7.1 | 104 | 5.4 |
| Penta-nt | 13 | 7.4 | 193 | 13.2 | 193 | 10.0 |
| Hexa-nt | 16 | 9.1 | 283 | 19.3 | 283 | 14.6 |
| Total | 175 | 100 | 1464 | 100 | 1932 | 100 |
Comparison of RepeatFinder and SciRoKo software, shown as the number of repeats found and the percentage of the total for each SSR class, from mono-nucleotide (nt) to hexa-nt repeats.
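The percentage columns in the SSR table follow from the class counts: each class is divided by that program's column total. The sketch below reproduces this arithmetic under that assumption; the names `ssr_counts` and `class_percentages` are illustrative and are not part of either program.

```python
# SSR counts per class copied from the table; keys and layout are illustrative.
ssr_counts = {
    "RepeatFinder": {
        "mono": 0, "di": 29, "tri": 89, "tetra": 28, "penta": 13, "hexa": 16,
    },
    "SciRoKo w/o mono-nt": {
        "mono": 0, "di": 322, "tri": 562, "tetra": 104, "penta": 193, "hexa": 283,
    },
    "SciRoKo w/ mono-nt": {
        "mono": 468, "di": 322, "tri": 562, "tetra": 104, "penta": 193, "hexa": 283,
    },
}


def class_percentages(counts: dict) -> dict:
    """Share of each SSR class as a percentage of the column total."""
    total = sum(counts.values())
    return {cls: round(100.0 * n / total, 1) for cls, n in counts.items()}


totals = {program: sum(counts.values()) for program, counts in ssr_counts.items()}
percentages = {program: class_percentages(counts) for program, counts in ssr_counts.items()}

for program in ssr_counts:
    print(program, totals[program], percentages[program])
```

Tri-nucleotide repeats are the largest class in all three columns. Including mono-nucleotide repeats adds 468 loci to the SciRoKo total and lowers the share of every other class accordingly.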