Himabindu Kudapa, Sarwar Azam, Andrew G Sharpe, Bunyamin Taran, Rong Li, Benjamin Deonovic, Connor Cameron, Andrew D Farmer, Steven B Cannon, Rajeev K Varshney.
Abstract
A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprises 46,369 transcript assembly contigs (TACs) and has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as with the chickpea genome. Comparative analysis of CaTA v2 against the transcriptomes of three legumes (Medicago, soybean and common bean) identified 27,771 TACs common to all three legumes, indicating strong conservation of genes across legumes. CaTA v2 was also used to identify simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at the chromosomal level were identified using the transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups was validated on five chickpea genotypes and showed 20% polymorphism with an average polymorphic information content (PIC) of 0.27.
In summary, the hybrid transcriptome assembly and the novel markers identified can be used for applications such as gene discovery, marker-trait association and diversity analysis to advance genetics research and breeding in chickpea and other related legumes.
Year: 2014 PMID: 24465857 PMCID: PMC3900451 DOI: 10.1371/journal.pone.0086039
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Details on NGS (FLX/454 and Illumina) and Sanger sequencing datasets used for developing comprehensive chickpea transcriptome assembly (CaTA v2).
| Dataset/sequencing platform | Genotype | Tissues | Source | Number of reads |
| Illumina GAII | ICC 4958 | root+shoot+leaves+buds | NIPGR | 65,900,072 |
| Illumina GAII | ICC 4958 | root+shoot | NIPGR | 69,054,282 |
| Roche/454 | ICC 4958 | RNA from 5 different tissues | NIPGR | ∼2,500,000 |
| Roche/454 | ICC 4958 | 22 different developmental stages | ICRISAT/JCVI | ∼400,000 |
| Roche/454 | Amit | RNA from 5 different tissues | NRC | 496,109 |
| Roche/454 | CDC Frontier | RNA from 5 different tissues | NRC | 490,245 |
| Roche/454 | CDC Xena | RNA from 5 different tissues | NRC | 531,970 |
| Roche/454 | Cr5-10 | RNA from 5 different tissues | NRC | 610,889 |
| Roche/454 | ICC12512-1 | RNA from 5 different tissues | NRC | 507,801 |
| Roche/454 | ICCV96029 | RNA from 5 different tissues | NRC | 520,733 |
| Roche/454 | ILWC 118 | RNA from 5 different tissues | NRC | 560,321 |
| Roche/454 | Y9563-28 | RNA from 5 different tissues | NRC | 509,682 |
| Sanger Sequencing | AAFC | mix of tissues | NRC | 30,537 |
| Sanger Sequencing | CDC Frontier | mix of tissues | NRC | 66,720 |
| Sanger Sequencing | C235, Castellana, Digvijay, ICC 4958, ICC 1882, ICC 3996, ICCV 2, JG 315, JG 11, JG 62, Pedrosillano, Pusa, Pusa 362, WR 315, XJ 209, Azad | mix of tissues | NCBI | 41,984 (
Tissues collected: a) 2-week-old leaf, b) stem before flowering, c) 1-week-old etiolated seedling, d) mixed flower stages and e) developing seed at mixed stages.
NIPGR- National Institute of Plant Genome Research, India; ICRISAT- International Crops Research Institute for the Semi-Arid Tropics, India; JCVI- J. Craig Venter Institute, USA; NRC- National Research Council Canada.
Comparative analysis of chickpea transcriptome assemblies.
| | Comprehensive TA (CaTA v2) | Agarwal et al. | CaTA v1 (Hiremath et al.) | Garg et al. | Garg et al. | Deokar et al. | Varshney et al. |
| Sequence data used | ∼7 million 454 reads + ∼100 million Illumina reads + ∼150K Sanger ESTs | 1.8 million 454 reads + 121 million Illumina reads | 435,018 454 reads + ∼37 million Illumina reads + 21,491 Sanger ESTs | ∼2 million 454 reads + ∼107 million Illumina reads | ∼107 million Illumina reads | 5,494 Sanger ESTs | 20,162 Sanger ESTs |
| Number of genotypes providing sequence data | 10 (where more than half of the data was from ICC 4958) | 1 (ICCV 2) | 4 (ICC 4958, ICC 1882, JG 11, ICCV 2) | 1 (ICC 4958) | 1 (ICC 4958) | 2 (ICC 4958, ICC 1882) + 20 RILs (10 resistant and 10 sensitive) | 4 (ICC 4958, ICC 1882, JG 11, ICCV 2) |
| Programme(s) used for assembly | MIRA (Newbler+ABySS) | TGICL, Newbler, CLC Genomics Workbench | CAP3 | TGICL (Newbler+Velvet) | Velvet | CAP3 | CAP3 |
| Total number of TACs | 46,369 | 43,389 contigs | 103,215 (44,845 contigs and 58,370 singletons) | 34,760 contigs | 74,651 contigs | 3,062 (638 contigs and 2,424 singletons) | 6,404 (1,590 contigs and 4,814 singletons) |
| N50 (bp) | 1,726 | 1,653 | 364 (515 bp for contigs) | 1,671 | 730 | – | – |
| Largest contig (bp) | 15,644 | 15,605 | 3,346 | 13,803 | 7,827 | – | – |
| Shortest contig (bp) | 100 | 100 | 51 | 100 | 100 | – | – |
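The N50 row above summarizes each assembly's contig length distribution. As a minimal sketch (not the authors' pipeline), N50 can be computed as the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)  # longest contig first
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembly length defines N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Hypothetical contig lengths, not data from the paper.
    print(n50([100, 200, 300, 400]))
```

Because N50 is weighted by length, a few long contigs can raise it considerably, which is why the table also reports the largest and shortest contigs.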
Figure 1. Functional categorization of chickpea Transcript Assembly Contigs (TACs) of the CaTA v2.
Chickpea TACs representing the distribution of genes based on their annotations to terms in the GO were categorized hierarchically according to three principal gene ontologies, viz. biological processes, molecular functions and cellular components. The number of TACs representing each subcategory is shown on the Y-axis.
Figure 2. Enzyme classification of chickpea Transcript Assembly Contigs (TACs) among the six enzyme classes.
The graph displays the proportion of genes belonging to each enzyme class.
Figure 3. Distribution of chickpea transcripts in different transcription factor (TF) families.
Transcript Assembly Contigs (TACs) showing significant annotation to transcription factors were classified based on conserved domain annotation.
Mapping of chickpea TACs onto Medicago genome.
| Medicago chromosome | Total CaTA v2 hits | Hits in non-genic regions (hit does not overlap a genic region) | Hits in genic regions | Genes covered | Total number of genes on the chromosome | Percentage of genes covered on the chromosome |
| Mt01 | 3,112 | 153 | 2,959 | 1,437 | 4,585 | 31.34 |
| Mt02 | 3,435 | 95 | 3,340 | 1,488 | 5,022 | 29.63 |
| Mt03 | 4,388 | 120 | 4,268 | 1,789 | 5,858 | 30.54 |
| Mt04 | 4,380 | 113 | 4,267 | 2,083 | 6,529 | 31.90 |
| Mt05 | 4,406 | 93 | 4,313 | 2,096 | 7,274 | 28.81 |
| Mt06 | 1,861 | 54 | 1,807 | 537 | 2,840 | 18.91 |
| Mt07 | 3,423 | 80 | 3,343 | 1,667 | 5,524 | 30.18 |
| Mt08 | 2,888 | 99 | 2,789 | 1,387 | 4,486 | 30.92 |
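The percentage column in the table above is the number of genes covered divided by the total number of genes on each chromosome. The following sketch recomputes it from the tabulated values and also derives a pooled figure for the eight listed chromosomes; the names `ROWS`, `coverage_percent` and `summarize` are illustrative assumptions, and the pooled value is a derived quantity that the source does not report.

```python
# (chromosome, hits in non-genic regions, hits in genic regions,
#  genes covered, total genes on chromosome), copied from the table above.
ROWS = [
    ("Mt01", 153, 2959, 1437, 4585),
    ("Mt02", 95, 3340, 1488, 5022),
    ("Mt03", 120, 4268, 1789, 5858),
    ("Mt04", 113, 4267, 2083, 6529),
    ("Mt05", 93, 4313, 2096, 7274),
    ("Mt06", 54, 1807, 537, 2840),
    ("Mt07", 80, 3343, 1667, 5524),
    ("Mt08", 99, 2789, 1387, 4486),
]


def coverage_percent(covered: int, total: int) -> float:
    """Percentage of genes on a chromosome hit by at least one TAC."""
    return 100.0 * covered / total


def summarize(rows):
    """Return per-chromosome percentages and the pooled percentage."""
    per_chrom = {
        name: round(coverage_percent(covered, total), 2)
        for name, _, _, covered, total in rows
    }
    pooled = coverage_percent(
        sum(row[3] for row in rows), sum(row[4] for row in rows)
    )
    return per_chrom, pooled


if __name__ == "__main__":
    per_chrom, pooled = summarize(ROWS)
    for name, pct in per_chrom.items():
        print(f"{name}: {pct:.2f}%")
    print(f"Pooled over listed chromosomes: {pooled:.2f}%")
```

The same rows also allow a consistency check: for every chromosome, the non-genic and genic hits add up to the total number of CaTA v2 hits.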
Multiple mapping of CaTA v2 onto Medicago genome.
| Number of times mapped | Number of CaTA v2 TACs |
| 1 | 15,263 |
| 2 | 2,919 |
| 3 | 772 |
| 4 | 370 |
| 5 | 218 |
| 6 | 356 |
| 7 | 83 |
| 8 | 67 |
| 9 | 46 |
| 10 | 25 |
| Total | 20,119 |
Figure 4. A sample view of chickpea TACs, markers and candidate ISR markers mapped onto the Medicago genome sequence.
This image, from the Legume Information System (LIS) GBrowse viewer at http://medtr.comparative-legumes.org/gb2/gbrowse/3.5.1/, shows a 1 Mb region (17,648,842..18,648,841) of Medicago chromosome Mt1. Red: at least one additional CaTA v2 alignment was reported. Green: no other alignments were reported.
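Candidate ISR markers such as those in this view come from aligning TACs to the Medicago genome, as stated in the abstract. The sketch below shows one way intron positions could be inferred from a spliced alignment: adjacent aligned blocks that are contiguous on the transcript but separated on the genome imply an intron at the block boundary. This is a simplified illustration under stated assumptions, not the authors' procedure; the `Block` type, the `intron_junctions` function, the forward-strand coordinate convention and the `min_intron` threshold are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    """One gap-free aligned segment (0-based, end-exclusive coordinates)."""
    q_start: int  # start on the transcript (TAC)
    q_end: int    # end on the transcript
    t_start: int  # start on the genome
    t_end: int    # end on the genome


def intron_junctions(
    blocks: List[Block], min_intron: int = 20
) -> List[Tuple[int, int]]:
    """Return (transcript position, inferred intron length) pairs.

    Assumes a forward-strand alignment. A junction is reported where two
    consecutive blocks touch on the transcript but are at least
    `min_intron` bases apart on the genome.
    """
    ordered = sorted(blocks, key=lambda b: b.q_start)
    junctions = []
    for left, right in zip(ordered, ordered[1:]):
        query_gap = right.q_start - left.q_end
        genome_gap = right.t_start - left.t_end
        if query_gap == 0 and genome_gap >= min_intron:
            junctions.append((left.q_end, genome_gap))
    return junctions


if __name__ == "__main__":
    # Hypothetical alignment: one 400 bp intron and one 5 bp gap
    # that is too short to be treated as an intron.
    example = [
        Block(0, 100, 1000, 1100),
        Block(100, 250, 1500, 1650),
        Block(250, 300, 1655, 1705),
    ]
    print(intron_junctions(example))
```

Primers placed in the exons flanking such a junction amplify across the intron, which is the region the marker is intended to span.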
Identification of simple sequence repeats: their distribution and primer design for chickpea genetics and breeding applications.
| Total number of sequences examined | 46,369 |
| Total size of examined sequences (bp) | 44,740,166 |
| Total number of identified SSRs | 5,342 |
| Number of SSR containing TACs | 4,373 |
| Number of TAC containing more than 1 SSR | 734 |
| Number of SSRs present in compound formation | 472 |
| Number of di-nucleotide repeats | 2,094 |
| Number of tri-nucleotide repeats | 2,993 |
| Number of tetra-nucleotide repeats | 113 |
| Number of penta-nucleotide repeats | 56 |
| Number of hexa-nucleotide repeats | 86 |
| Number of TACs used to design primer pairs | 2,231 |
| Total numbers of primer pairs designed | 2,474 |
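The table above counts di- to hexa-nucleotide repeats found in the TACs. The source excerpt does not name the SSR search tool or the minimum repeat numbers used, so the following is only a minimal regular-expression sketch of how such repeats can be located. The thresholds in `MIN_REPEATS` and the function names are hypothetical, and compound SSRs are not handled.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical minimum repeat counts per motif length (assumption, not
# taken from the source): motif length -> minimum number of repeats.
MIN_REPEATS: Dict[int, int] = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(
    seq: str, min_repeats: Dict[int, int] = MIN_REPEATS
) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat count) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1)
        )
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. skip "AA" runs or "ATAT" reported as tetra
            hits.append(
                (match.start(), motif, len(match.group(0)) // unit_len)
            )
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence containing an (AT)7 repeat.
    print(find_ssrs("GGC" + "AT" * 7 + "CCG"))
```

Filtering out non-primitive motifs keeps a dinucleotide run from being counted again as a tetra- or hexa-nucleotide repeat, so each repeat class in a summary like the one above stays distinct.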
Correspondences of chickpea genic molecular markers to Medicago.
| Chickpea linkage groups | Chickpea unique loci (no.) | Mt1 | Mt2 | Mt3 | Mt4 | Mt5 | Mt6 | Mt7 | Mt8 | Mtx | Total |
| CaLG01 | 69 | 0 | 62 | 1 | 2 | 0 | 0 | 0 | 0 | 2 | 67 |
| CaLG02 | 61 | 1 | 0 | 1 | 2 | 43 | 8 | 4 | 0 | 2 | 61 |
| CaLG03 | 62 | 2 | 0 | 0 | 3 | 2 | 1 | 50 | 1 | 3 | 62 |
| CaLG04 | 95 | 58 | 0 | 5 | 11 | 8 | 3 | 4 | 0 | 6 | 95 |
| CaLG05 | 93 | 3 | 0 | 74 | 2 | 4 | 2 | 2 | 1 | 5 | 93 |
| CaLG06 | 76 | 0 | 2 | 3 | 35 | 4 | 2 | 2 | 24 | 4 | 76 |
| CaLG07 | 46 | 1 | 0 | 1 | 35 | 1 | 0 | 0 | 8 | 0 | 46 |
| CaLG08 | 53 | 0 | 0 | 0 | 2 | 42 | 4 | 4 | 0 | 1 | 53 |
Distribution of ISRs on chickpea linkage groups.
| Chickpea linkage group | Number of ISR markers showing inferred position | Markers selected for analysis | Markers amplified |
| CaLG01 | 1,773 | 21 | 8 |
| CaLG02 | 1,257 | 20 | 14 |
| CaLG03 | 1,216 | 20 | 5 |
| CaLG04 | 1,643 | 23 | 11 |
| CaLG05 | 1,764 | 25 | 5 |
| CaLG06 | 2,203 | 13 | 7 |
| CaLG07 | 1,392 | 20 | 4 |
| CaLG08 | 861 | 16 | 2 |
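The abstract reports an average polymorphic information content (PIC) of 0.27 for the validated ISR markers but does not state the formula used. A minimal sketch, assuming the widely used definition PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², where pᵢ is the frequency of allele i at a marker, is given below. The function name `pic` and the example allele counts are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def pic(allele_counts: Sequence[float]) -> float:
    """Polymorphic information content for one marker.

    `allele_counts` may hold raw counts or frequencies; they are
    normalized to frequencies before the formula is applied.
    """
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("allele counts must sum to a positive value")
    freqs = [count / total for count in allele_counts]
    sum_squares = sum(p * p for p in freqs)
    cross_term = sum(
        2 * (p * p) * (q * q) for p, q in combinations(freqs, 2)
    )
    return 1.0 - sum_squares - cross_term


if __name__ == "__main__":
    # Hypothetical marker scored on five genotypes: four carry one
    # allele and one carries the other.
    print(round(pic([4, 1]), 3))
```

Under this definition a monomorphic marker scores 0, and a marker with two equally frequent alleles scores 0.375, the maximum for a biallelic locus.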
Distribution of ISRs on chickpea contigs.
| Number of contigs | Number of ISRs per contig | Total ISRs |
| 2,473 | 1 | 2,473 |
| 1,342 | 2 | 2,684 |
| 771 | 3 | 2,313 |
| 468 | 4 | 1,872 |
| 265 | 5 | 1,325 |
| 154 | 6 | 924 |
| 88 | 7 | 616 |
| 70 | 8 | 560 |
| 32 | 9 | 288 |
| 34 | 10 | 340 |
| 14 | 11 | 154 |
| 9 | 12 | 108 |
| 3 | 13 | 39 |
| 10 | 14 | 140 |
| 4 | 15 | 60 |
| 2 | 16 | 32 |
| 3 | 19 | 57 |
| 1 | 20 | 20 |
| 1 | 21 | 21 |
| 1 | 27 | 27 |
| 1 | 100 | 100 |
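Each "Total ISRs" value in the table above is the number of contigs multiplied by the number of ISRs per contig. The sketch below reproduces that arithmetic and sums the columns; the names `DISTRIBUTION` and `totals` are illustrative, and the column sums and mean are derived here rather than reported in the source.

```python
# (number of contigs, ISRs per contig), copied from the table above.
DISTRIBUTION = [
    (2473, 1), (1342, 2), (771, 3), (468, 4), (265, 5), (154, 6),
    (88, 7), (70, 8), (32, 9), (34, 10), (14, 11), (9, 12), (3, 13),
    (10, 14), (4, 15), (2, 16), (3, 19), (1, 20), (1, 21), (1, 27),
    (1, 100),
]


def totals(distribution):
    """Return (contigs with ISRs, total ISRs, mean ISRs per contig)."""
    contigs = sum(n for n, _ in distribution)
    isrs = sum(n * per_contig for n, per_contig in distribution)
    return contigs, isrs, isrs / contigs


if __name__ == "__main__":
    contigs, isrs, mean = totals(DISTRIBUTION)
    print(f"{contigs} contigs, {isrs} ISRs, {mean:.2f} ISRs per contig")
```

With the tabulated values this gives 5,746 contigs carrying 14,153 candidate ISRs, or about 2.46 ISRs per contig, with single-ISR contigs forming the largest class.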