Zunzheng Wei, Zhenzhen Sun, Binbin Cui, Qixiang Zhang, Min Xiong, Xian Wang, Di Zhou.
Abstract
Colored calla lily is the short name for the species or hybrids in section Aestivae of genus Zantedeschia. It is currently one of the most popular flowering plants in the world due to its beautiful flower spathe and long postharvest life. However, little genomic information and few molecular markers are available for its genetic improvement. Here, de novo transcriptome sequencing was performed to generate a large set of transcript sequences for Z. rehmannii cv. 'Rehmannii' using an Illumina HiSeq 2000 instrument. More than 59.9 million cDNA sequence reads were obtained and assembled into 39,298 unigenes with an average length of 1,038 bp. Among these, 21,077 unigenes showed significant similarity to protein sequences in the non-redundant protein database (Nr) and in the Swiss-Prot, Gene Ontology (GO), Cluster of Orthologous Group (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Moreover, 117 unique transcripts that might regulate the flower spathe development of colored calla lily were identified. Additionally, 9,933 simple sequence repeats (SSRs) and 7,162 single nucleotide polymorphisms (SNPs) were identified as putative molecular markers. High-quality primers for 200 SSR loci were designed and selected, of which 58 produced reproducible amplicons that were polymorphic among 21 accessions of colored calla lily. The sequence information and molecular markers in the present study will provide valuable resources for genetic diversity analysis, germplasm characterization and marker-assisted selection in the genus Zantedeschia.
Keywords: Colored calla lily; EST-SSRs; Illumina transcriptome sequencing; The genus Zantedeschia; de novo assembly
Year: 2016 PMID: 27635342 PMCID: PMC5012260 DOI: 10.7717/peerj.2378
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Summary of transcriptome statistics and functional annotation for colored calla lily ‘Rehmannii’.
| Statistic | Number | Percentage |
|---|---|---|
| Raw reads | 59,882,890 | |
| Total sizes (nt) | 6,048,171,890 | |
| Clean reads | 46,343,613 | |
| Transcripts | 62,382 | |
| Unigenes | 39,298 | |
| Unigenes (300–500 nt) | 13,367 | 34.02% |
| Unigenes (500–1,000 nt) | 13,088 | 33.30% |
| Unigenes (1,000–1,500 nt) | 4,975 | 12.65% |
| Unigenes (1,500–2,000 nt) | 2,988 | 7.60% |
| Unigenes (>2,000 nt) | 4,880 | 12.41% |
| Mean length (nt) | 1,038 | |
| N50 (nt) | 1,476 | |
| GC% | 45.74% | |
| Annotated in Nr | 21,029 | 53.51% |
| Annotated in Swiss-Prot | 16,908 | 43.03% |
| Annotated in COG | 6,731 | 17.13% |
| Annotated in GO | 15,552 | 39.57% |
| Annotated in KEGG | 4,532 | 11.53% |
| Annotated in at least one database | 21,077 | 53.63% |
| Total unigenes | 39,298 | 100% |
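The N50 in the table is conventionally defined as the length L such that sequences of length L or longer account for at least half of all assembled bases. A minimal Python sketch of that definition follows; the function name `n50` and the toy lengths are illustrative assumptions, not the study's data or the assembler's own implementation.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2             # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (made-up lengths, not the calla lily unigenes)
toy_lengths = [300, 450, 800, 1200, 2500]
print(n50(toy_lengths))
```

Because N50 weights long sequences more heavily than the arithmetic mean does, it is normally larger than the mean length, as with the 1,476 nt N50 and 1,038 nt mean reported here.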
Figure 1. Characteristics of the homology search for colored calla lily ‘Rehmannii’ unigenes against the non-redundant protein database (Nr) with an E-value cutoff of 1E-5.
(A) E-value distribution of the BLASTx hits for each assembled unigene; (B) similarity distribution of the BLASTx hits for each assembled unigene; (C) species-based distribution of the top BLASTx hits for each assembled unigene.
Figure 2. The classification of colored calla lily ‘Rehmannii’ unigenes.
(A) Distribution of Cluster of Orthologous Group (COG) classification. A total of 6,731 assembled unigenes were annotated and assigned to 24 functional categories. The x-axis indicates the subgroups in the COG classification while the y-axis indicates the number of genes in each main category. (B) Gene ontology (GO) classification of assembled unigenes at level 2. A total of 15,552 unigenes were grouped into three main GO categories: ‘Biological Processes’, ‘Cellular Component’, and ‘Molecular Function’. The x-axis indicates the subgroups in GO annotation while the y-axis indicates the percentage of specific categories of genes in each main category. (C) The top 20 KEGG metabolic pathways of assembled unigenes. The x-axis indicates the number of genes in each metabolic pathway while the y-axis indicates the subgroups in the top 20 KEGG metabolic pathways.
Features of the SSR repeat types identified in colored calla lily ‘Rehmannii’ unigenes.
| Feature | Colored calla lily |
|---|---|
| Total number of sequences examined | 39,298 |
| Total size of examined sequences (Mb) | 40.78 |
| Total number of identified SSRs | 9,933 |
| Number of SSR-containing sequences | 7,997 |
| Number of sequences containing more than one SSR locus | 1,556 |
| Number of SSRs present in compound formation | 580 |
Summary of EST-SSRs identified from the unigenes of colored calla lily ‘Rehmannii’.
| Repeat motif (columns give the number of repeats) | 5 | 6 | 7 | 8 | 9 | 10 | >10 | Total |
|---|---|---|---|---|---|---|---|---|
| Di- (3,482, 59.78%) | ||||||||
| AG/CT | 0 | 652 | 568 | 525 | 536 | 381 | 106 | 2,768 |
| AT/AT | 0 | 108 | 68 | 61 | 46 | 49 | 28 | 360 |
| AC/GT | 0 | 132 | 70 | 54 | 25 | 32 | 15 | 328 |
| CG/CG | 0 | 14 | 7 | 0 | 3 | 2 | 0 | 26 |
| Tri- (2,261, 38.82%) | ||||||||
| AGG/CCT | 316 | 116 | 43 | 4 | 1 | 0 | 0 | 480 |
| AAG/CTT | 258 | 136 | 67 | 6 | 0 | 0 | 1 | 468 |
| AGC/CTG | 252 | 99 | 39 | 2 | 0 | 0 | 0 | 392 |
| CCG/CGG | 245 | 85 | 39 | 3 | 0 | 0 | 0 | 372 |
| ATC/ATG | 96 | 37 | 21 | 3 | 0 | 0 | 0 | 157 |
| ACC/GGT | 84 | 42 | 20 | 3 | 1 | 0 | 0 | 150 |
| Other | 146 | 49 | 35 | 11 | 1 | 0 | 0 | 242 |
| Tetra- (62, 1.06%) | ||||||||
| AAAG/CTTT | 14 | 2 | 0 | 0 | 0 | 0 | 0 | 16 |
| AGAT/ATCT | 12 | 1 | 0 | 0 | 0 | 0 | 0 | 13 |
| ACAT/ATGT | 4 | 1 | 1 | 0 | 0 | 0 | 0 | 6 |
| AAAT/ATTT | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
| Others | 17 | 3 | 1 | 1 | 0 | 0 | 0 | 22 |
| Penta- (13, 0.22%) | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 13 |
| Hexa- (7, 0.12%) | 3 | 0 | 2 | 1 | 0 | 0 | 1 | 7 |
| Total | 1,465 | 1,477 | 981 | 674 | 613 | 464 | 151 | 5,825 |
| Percentage | 25.15% | 25.36% | 16.84% | 11.57% | 10.52% | 7.97% | 2.59% | 100% |
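As a rough illustration of how perfect SSRs like those tabulated above can be located in a sequence, the sketch below scans for tandem repeats with a regular expression. The minimum repeat counts (6 for di-nucleotides, 5 for tri- to hexa-nucleotides) are an assumption inferred from the table columns, and the helper names `find_ssrs` and `is_primitive` are hypothetical; this is not the tool or the settings used in the study.

```python
import re

# Assumed minimum repeat counts per motif length (inferred from the table)
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) exact copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. "AA" or "AGAG" units
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: (AG)7 and (AAG)5 embedded in flanking bases
example = "TTGC" + "AG" * 7 + "CCTA" + "AAG" * 5 + "GT"
print(find_ssrs(example))
```

The sketch reports motifs as they appear on the given strand. Grouping complementary motifs into classes such as AG/CT, as the table does, would need an extra normalization step that is not shown.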
Summary of SNPs identified from unigenes of colored calla lily ‘Rehmannii’.
| Transitions | Number | Transversions | Number |
|---|---|---|---|
| C/T | 2,262 | A/T | 650 |
| A/G | 2,188 | A/C | 647 |
| | | T/G | 625 |
| | | C/G | 790 |
| Total | 4,450 | Total | 2,712 |
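The split above follows the usual rule that a transition swaps two purines (A, G) or two pyrimidines (C, T), while a transversion swaps a purine with a pyrimidine. The short Python sketch below applies that rule to the counts in the table and derives the transition/transversion ratio; the function name `snp_class` is an illustrative assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_class(a, b):
    """Classify a biallelic SNP as a transition or a transversion."""
    pair = {a.upper(), b.upper()}
    if len(pair) != 2 or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# SNP counts copied from the table above
counts = {"C/T": 2262, "A/G": 2188, "A/T": 650, "A/C": 647, "T/G": 625, "C/G": 790}

totals = {"transition": 0, "transversion": 0}
for pair, n in counts.items():
    a, b = pair.split("/")
    totals[snp_class(a, b)] += n

ts_tv = totals["transition"] / totals["transversion"]
print(totals, round(ts_tv, 2))
```

The totals reproduce the 4,450 transitions and 2,712 transversions in the table, which together account for the 7,162 SNPs reported in the abstract.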
Characteristics of the 58 polymorphic EST-SSR markers in 21 colored calla lily accessions.
| Locus | Na | Ne | Ho | He | PIC | Locus | Na | Ne | Ho | He | PIC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CallaRe015 | 3 | 2.057 | 0.684 | 0.514 | 0.425 | CallaRe110 | 2 | 1.324 | 0.286 | 0.245 | 0.215 |
| CallaRe016 | 6 | 3.320 | 0.600 | 0.699 | 0.654 | CallaRe117 | 3 | 2.410 | 0.000 | 0.585 | 0.513 |
| CallaRe028 | 4 | 3.756 | 1.000 | 0.734 | 0.685 | CallaRe118 | 4 | 2.028 | 0.667 | 0.507 | 0.462 |
| CallaRe030 | 3 | 2.095 | 1.000 | 0.523 | 0.409 | CallaRe120 | 4 | 2.932 | 0.333 | 0.659 | 0.593 |
| CallaRe031 | 2 | 1.446 | 0.286 | 0.308 | 0.261 | CallaRe128 | 4 | 3.556 | 0.550 | 0.719 | 0.670 |
| CallaRe032 | 2 | 1.930 | 0.810 | 0.482 | 0.366 | CallaRe129 | 2 | 1.984 | 0.545 | 0.496 | 0.373 |
| CallaRe036 | 2 | 1.296 | 0.263 | 0.229 | 0.202 | CallaRe131 | 4 | 2.930 | 0.286 | 0.659 | 0.601 |
| CallaRe040 | 4 | 2.139 | 0.278 | 0.532 | 0.483 | CallaRe135 | 2 | 1.220 | 0.200 | 0.180 | 0.164 |
| CallaRe041 | 3 | 1.156 | 0.048 | 0.135 | 0.130 | CallaRe144 | 4 | 2.766 | 0.588 | 0.638 | 0.589 |
| CallaRe042 | 2 | 1.265 | 0.238 | 0.210 | 0.188 | CallaRe146 | 3 | 2.085 | 0.952 | 0.520 | 0.408 |
| CallaRe049 | 2 | 1.960 | 0.857 | 0.490 | 0.370 | CallaRe147 | 3 | 1.841 | 0.353 | 0.457 | 0.411 |
| CallaRe050 | 6 | 3.630 | 0.571 | 0.724 | 0.683 | CallaRe151 | 3 | 2.455 | 0.667 | 0.593 | 0.505 |
| CallaRe055 | 3 | 2.057 | 0.526 | 0.514 | 0.425 | CallaRe155 | 2 | 1.724 | 0.000 | 0.420 | 0.332 |
| CallaRe056 | 2 | 1.946 | 0.833 | 0.486 | 0.368 | CallaRe156 | 4 | 2.309 | 0.667 | 0.567 | 0.486 |
| CallaRe061 | 4 | 3.469 | 0.286 | 0.712 | 0.661 | CallaRe160 | 2 | 1.893 | 0.000 | 0.472 | 0.360 |
| CallaRe066 | 2 | 1.265 | 0.238 | 0.210 | 0.188 | CallaRe165 | 2 | 2.000 | 1.000 | 0.412 | 0.375 |
| CallaRe075 | 2 | 1.358 | 0.313 | 0.264 | 0.229 | CallaRe166 | 4 | 2.520 | 0.333 | 0.603 | 0.541 |
| CallaRe078 | 4 | 2.303 | 0.952 | 0.566 | 0.471 | CallaRe170 | 3 | 2.597 | 0.737 | 0.615 | 0.536 |
| CallaRe080 | 4 | 2.285 | 0.381 | 0.562 | 0.519 | CallaRe175 | 6 | 4.762 | 0.800 | 0.790 | 0.757 |
| CallaRe081 | 3 | 1.407 | 0.333 | 0.289 | 0.266 | CallaRe178 | 2 | 1.835 | 0.700 | 0.455 | 0.351 |
| CallaRe082 | 3 | 2.182 | 0.000 | 0.542 | 0.460 | CallaRe179 | 2 | 1.992 | 0.188 | 0.498 | 0.374 |
| CallaRe089 | 2 | 1.995 | 0.857 | 0.499 | 0.374 | CallaRe180 | 3 | 1.340 | 0.190 | 0.254 | 0.237 |
| CallaRe090 | 3 | 2.829 | 0.500 | 0.646 | 0.571 | CallaRe185 | 2 | 1.637 | 0.412 | 0.389 | 0.314 |
| CallaRe095 | 3 | 2.524 | 1.000 | 0.604 | 0.525 | CallaRe187 | 2 | 1.960 | 0.000 | 0.490 | 0.370 |
| CallaRe097 | 3 | 1.956 | 0.619 | 0.489 | 0.407 | CallaRe189 | 3 | 2.111 | 0.550 | 0.526 | 0.431 |
| CallaRe100 | 3 | 2.246 | 0.500 | 0.555 | 0.456 | CallaRe190 | 2 | 1.498 | 0.316 | 0.332 | 0.277 |
| CallaRe101 | 3 | 1.893 | 0.095 | 0.472 | 0.397 | CallaRe191 | 2 | 1.205 | 0.188 | 0.170 | 0.155 |
| CallaRe106 | 4 | 2.431 | 0.684 | 0.589 | 0.506 | CallaRe194 | 2 | 1.600 | 0.500 | 0.375 | 0.305 |
| CallaRe109 | 3 | 2.256 | 0.619 | 0.557 | 0.462 | CallaRe198 | 3 | 2.492 | 0.650 | 0.599 | 0.514 |
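For readers unfamiliar with the column abbreviations, the sketch below computes the number of alleles (Na), the effective number of alleles (Ne), the expected heterozygosity (He) and the polymorphism information content (PIC) from allele counts at one locus, using the standard textbook definitions Ne = 1/Σp², He = 1 − Σp², and PIC = 1 − Σp² − ΣΣ 2p_i²p_j². The function name `diversity_stats` and the example counts are assumptions for illustration; the software and any sample-size corrections used in the study are not stated here.

```python
from itertools import combinations


def diversity_stats(allele_counts):
    """Compute Na, Ne, He and PIC for one locus from allele counts."""
    total = sum(allele_counts)
    freqs = [c / total for c in allele_counts if c > 0]
    homozygosity = sum(p * p for p in freqs)      # sum of squared frequencies
    ne = 1 / homozygosity                         # effective number of alleles
    he = 1 - homozygosity                         # expected heterozygosity
    # PIC subtracts the uninformative matings between identical heterozygotes
    pic = he - sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return {"Na": len(freqs), "Ne": ne, "He": he, "PIC": pic}


# Hypothetical locus with two equally frequent alleles (21 copies each)
print(diversity_stats([21, 21]))
```

With two equally frequent alleles these definitions give Ne = 2, He = 0.5 and PIC = 0.375. Observed heterozygosity (Ho) is not derived from allele frequencies; it is the fraction of genotyped accessions that are heterozygous at the locus.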
Figure 3. A neighbor-joining (NJ) dendrogram of 21 colored calla lily accessions based on 58 polymorphic EST-SSR markers.