Pavana J Hiremath, Andrew Farmer, Steven B Cannon, Jimmy Woodward, Himabindu Kudapa, Reetu Tuteja, Ashish Kumar, Amindala Bhanuprakash, Benjamin Mulaosmanovic, Neha Gujaria, Laxmanan Krishnamurthy, Pooran M Gaur, Polavarapu B Kavikishor, Trushar Shah, Ramamurthy Srinivasan, Marc Lohse, Yongli Xiao, Christopher D Town, Douglas R Cook, Gregory D May, Rajeev K Varshney.
Abstract
Chickpea (Cicer arietinum L.) is an important legume crop in the semi-arid regions of Asia and Africa. Gains in crop productivity have been low, however, particularly because of biotic and abiotic stresses. To help enhance crop productivity using molecular breeding techniques, next generation sequencing technologies such as Roche/454 and Illumina/Solexa were used to determine the sequence of most gene transcripts and to identify drought-responsive genes and gene-based molecular markers. A total of 103,215 tentative unique sequences (TUSs) have been produced from 435,018 Roche/454 reads and 21,491 Sanger expressed sequence tags (ESTs). Putative functions were determined for 49,437 (47.8%) of the TUSs, and gene ontology assignments were determined for 20,634 (41.7%) of these annotated TUSs. Comparison of the chickpea TUSs with the Medicago truncatula genome assembly (Mt 3.5.1 build) resulted in 42,141 aligned TUSs with putative gene structures (including 39,281 predicted intron/splice junctions). Alignment of ∼37 million Illumina/Solexa tags generated from drought-challenged root tissues of two chickpea genotypes against the TUSs identified 44,639 differentially expressed TUSs. The TUSs were also used to identify a diverse set of markers, including 728 simple sequence repeats (SSRs), 495 single nucleotide polymorphisms (SNPs), 387 conserved orthologous sequence (COS) markers, and 2088 intron-spanning region (ISR) markers. This resource will be useful for basic and applied research for genome analysis and crop improvement in chickpea.
Year: 2011 PMID: 21615673 PMCID: PMC3437486 DOI: 10.1111/j.1467-7652.2011.00625.x
Source DB: PubMed Journal: Plant Biotechnol J ISSN: 1467-7644 Impact factor: 9.803
Figure 1. Read length distribution of Roche/454 reads and ESTs before and after assembly. Roche/454 read lengths ranged from 50 bp to a maximum of 300 bp, with the largest number of reads between 201 and 250 bp. High-quality EST lengths ranged from 50 to 900 bp, with the largest number of reads at 551–600 bp. A size comparison between raw and assembled Roche/454 reads (contigs) showed that the majority of sequences in each case fell between 201 and 300 bp, while the same comparison between raw ESTs and contigs showed a range of 551–600 bp. However, the largest share of TUS contigs (12.67%) fell between 251 and 300 bp.
Figure 2. Functional categorization of chickpea TUSs. Chickpea TUSs (≤1E-10) were categorized hierarchically according to the three principal gene ontologies: biological process, molecular function and cellular component. The most highly represented subcategories were binding (46.35%) and catalytic activity (37.92%) under molecular function, cell part (47.12%) and organelle (28.17%) under cellular component, and metabolic process (28.19%) and cellular process (27.62%) under biological process.
Figure 3. Transcription factors (TFs) identified by conserved domain annotation. TUSs showing significant annotation to transcription factors were classified based on conserved domain characteristics. Zinc fingers of alternating composition, MADS box and AP2/ERF were more highly represented than other TFs.
Figure 4. Overview of differentially regulated genes involved in different metabolic processes. Gene transcripts induced or repressed by drought stress are shown in red and green, respectively, according to the colour bar ranging from −2.5 to +2.5. A total of 2823 TUSs out of 2974 genes related to various metabolic pathways were grouped under 31 BINs and mapped using MapMan software to show the functional categories involved. The list of BINs was described earlier by Thimm. The genes in each BIN are listed in Dataset S4.
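The colour scale in Figure 4 can be illustrated with a minimal sketch that turns tag counts for one TUS into a value on a −2.5 to +2.5 bar. The source does not state how the expression values were computed, so the log2 ratio, the library-size normalization, the pseudocount and all function names below are assumptions chosen for illustration only.

```python
import math

# Colour-bar limits taken from the Figure 4 caption.
COLOUR_MIN, COLOUR_MAX = -2.5, 2.5


def log2_ratio(stressed, control, total_stressed, total_control, pseudocount=1.0):
    """Log2 ratio of library-size-normalized tag counts (assumed measure).

    The pseudocount is a hypothetical choice that avoids taking log of zero.
    """
    s = (stressed + pseudocount) / total_stressed
    c = (control + pseudocount) / total_control
    return math.log2(s / c)


def clamp_to_colour_bar(value):
    """Limit a value to the range displayed by the colour bar."""
    return max(COLOUR_MIN, min(COLOUR_MAX, value))


def colour(value):
    """Red for induced, green for repressed, as in the caption."""
    if value > 0:
        return "red"
    if value < 0:
        return "green"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical counts for a single TUS in two equally sized tag libraries.
    ratio = log2_ratio(63, 0, 1_000_000, 1_000_000)
    shown = clamp_to_colour_bar(ratio)
    print(shown, colour(shown))
```

Clamping keeps strongly induced or repressed transcripts at the ends of the scale rather than stretching the colour range for every other gene.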
SSR identification using MISA search tool
| Parameter | Value |
|---|---|
| Total number of TUSs examined | 103 215 |
| Total size of examined sequences (bp) | 34 718 996 |
| Total number of identified SSRs | 26 252 |
| Number of SSR containing sequences | 23 330 |
| Number of sequences containing >1 SSR | 2480 |
| Number of SSRs present in compound formation | 2012 |
| Mono-nucleotide repeats | 24 428 |
| Di-nucleotide repeats | 743 |
| Tri-nucleotide repeats | 893 |
| Tetra-nucleotide repeats | 91 |
| Penta-nucleotide repeats | 51 |
| Hexa-nucleotide repeats | 46 |
| Primer pairs designed | 3172 |
| Class-I primer pairs selected for synthesis | 728 |
MISA, MIcroSAtellite; SSR, simple sequence repeat; TUS, tentative unique sequence.
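To make the repeat classes in the table concrete, the following is a minimal sketch of perfect-SSR detection with regular expressions. It is not MISA itself, and the minimum repeat counts in `MIN_REPEATS` are assumed values, since the source does not give the search settings; `find_ssrs` and `is_degenerate` are hypothetical helper names.

```python
import re

# Assumed minimum repeat counts (motif length -> repeats); NOT from the source.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_degenerate(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    return motif in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    start is a 0-based offset into seq.
    """
    seq = seq.upper()
    hits = []
    for unit, n in sorted(min_repeats.items()):
        # One motif of length `unit`, followed by at least n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, n - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_degenerate(motif):
                # Belongs to a shorter repeat class; retry one base further on.
                pos = m.start() + 1
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
            pos = m.end()
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence with one mono-, one di- and one tri-nucleotide repeat.
    toy = "GG" + "A" * 12 + "CT" + "AG" * 7 + "TTC" + "CAG" * 5 + "T"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Skipping motifs that are repeats of a shorter unit keeps a run such as `AAAAAAAAAAAA` in the mono-nucleotide class instead of also counting it as a di- or tri-nucleotide repeat.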
Number of SNPs classified based on allele frequency and read depth
Column headings: number of reads/tentative contigs.

| Frequency difference range | >500 | 101–500 | 11–100 | 3–10 |
|---|---|---|---|---|
| <0.1 | 389 | 751 | 2109 | 158 |
| 0.10–0.19 | 107 | 414 | 2431 | 500 |
| 0.20–0.29 | 17 | 123 | 3856 | 827 |
| 0.30–0.39 | 4 | 47 | 1478 | 992 |
| 0.40–0.49 | 1 | 13 | 746 | 828 |
| 0.50–0.59 | 8 | 18 | 502 | 1442 |
| 0.60–0.69 | – | 17 | 297 | 1361 |
| 0.70–0.79 | – | 1 | 85 | 374 |
| 0.80–0.89 | – | – | 55 | 166 |
| 0.90–1.0 | – | – | 40 | 1463 |
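As a reading aid, the table above can be tallied per read-depth class with a short sketch. The values are copied from the table; treating the "–" cells as zero is an assumption, and the names `SNP_COUNTS` and `column_totals` are hypothetical.

```python
# Read-depth classes (column headings of the table).
DEPTH_CLASSES = [">500", "101-500", "11-100", "3-10"]

# SNP counts per frequency-difference range; "–" cells are assumed to be 0.
SNP_COUNTS = {
    "<0.1":      [389, 751, 2109, 158],
    "0.10-0.19": [107, 414, 2431, 500],
    "0.20-0.29": [17, 123, 3856, 827],
    "0.30-0.39": [4, 47, 1478, 992],
    "0.40-0.49": [1, 13, 746, 828],
    "0.50-0.59": [8, 18, 502, 1442],
    "0.60-0.69": [0, 17, 297, 1361],
    "0.70-0.79": [0, 1, 85, 374],
    "0.80-0.89": [0, 0, 55, 166],
    "0.90-1.0":  [0, 0, 40, 1463],
}


def column_totals(counts):
    """Sum SNP counts down each read-depth column."""
    return [sum(column) for column in zip(*counts.values())]


if __name__ == "__main__":
    totals = column_totals(SNP_COUNTS)
    for depth, total in zip(DEPTH_CLASSES, totals):
        print(f"{depth:>8}: {total}")
    print(f"{'all':>8}: {sum(totals)}")
```

The per-column sums show how the tabulated SNPs are distributed across read-depth classes.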