Merianne Alkio, Uwe Jonas, Myriam Declercq, Steven Van Nocker, Moritz Knoche.
Abstract
The exocarp, or skin, of fleshy fruit is a specialized tissue that protects the fruit, attracts seed-dispersing fruit eaters, and is of great economic relevance for fruit quality. Development of the exocarp involves the regulated activity of many genes. This research analyzed global gene expression in the exocarp of developing sweet cherry (Prunus avium L., 'Regina'), a fruit crop species with few public genomic resources. A catalog of transcript models (contigs) representing expressed genes was constructed from de novo assembled short complementary DNA (cDNA) sequences generated from developing fruit at 14 time points between flowering and maturity. Expression levels in each sample were estimated for 34 695 contigs from the number of reads mapping to each contig. Contigs were functionally annotated based on BLAST, gene ontology and InterProScan analyses. Coregulated genes were detected using partitional clustering of expression patterns. The results are discussed with emphasis on genes putatively involved in cuticle deposition, cell wall metabolism and sugar transport. The high temporal resolution of the expression patterns presented here reveals finely tuned developmental specialization of individual members of gene families. Moreover, the de novo assembled sweet cherry fruit transcriptome, with 7760 full-length protein coding sequences and over 20 000 other annotated cDNA sequences together with their developmental expression patterns, is expected to accelerate molecular research on this important tree fruit crop.
Year: 2014 PMID: 26504533 PMCID: PMC4591669 DOI: 10.1038/hortres.2014.11
Source DB: PubMed Journal: Hortic Res ISSN: 2052-7276 Impact factor: 6.793
Figure 1. Growth and development of the sweet cherry ‘Regina’ fruit analyzed in this study. (a) Fruit mass and surface area from flowering to maturity. Stage I, cell division and expansion; Stage II (gray shading), seed development and pit hardening; Stage III, cell expansion. Color change from green to red occurred between 59 and 66 DAFB (arrow). (b) Mass of CM per fruit and calculated rate of CM deposition. Data points in a and b show the average of 30 measurements; error bars represent s.e. (not visible if smaller than symbols). Time is given in DAFB. (c) Representative photos of the analyzed fruit and sample codes identifying the RNA-seq samples. The numbers in images indicate the developmental age of the fruit in DAFB. Sample codes contain information on fruit age in DAFB (3–94), tissue type (G, whole ovaries after removal of other floral organs; E, exocarp-enriched tissue; M, mesocarp only) and replicate number (1 or 2) if applicable. Photos not to scale.
Figure 2. Summary of the RNA-seq experiment, pre-processing of raw reads and de novo assembly of the sequence data. Details are given in the section on ‘Materials and methods’, Supplementary Method S1 and Supplementary Table S2.
Summary of all contigs. Contigs were assembled de novo from Illumina-sequenced cDNA fragments generated from sweet cherry ‘Regina’ fruit sampled at different developmental stages. Groups 1 and 3 were termed ‘high abundance’ and Group 2 ‘low abundance’ contigs, based on the number of mapped reads per contig (threshold: 30 mapped reads per sample or 75 reads in total across all 24 samples). Group 3 consists of contigs with BLASTn hits (e-value <1×10⁻¹⁰⁰) to bacterial, viral, rRNA or other sources, as described in the section on ‘Materials and methods’.
| | Total | Group 1 ‘high abundance’ | Group 2 ‘low abundance’ | Group 3 ‘contaminants’ |
|---|---|---|---|---|
| Number of contigs | 68 101 | 34 695 | 32 712 | 694 |
| Number of bases in all contigs | 45 000 984 | 34 807 121 | 9 602 971 | 590 892 |
| Minimum contig length (bp) | 107 | 200 | 107 | 200 |
| Maximum contig length (bp) | 14 149 | 12 485 | 1482 | 14 149 |
| Median contig length (bp) | 370 | 740 | 261 | 436 |
| N50 contig length (bp) | 1070 | 1413 | 291 | 1526 |
| Number of contigs ≥1 kb | 12 867 | 12 727 | 9 | 131 |
| Number of contigs ≥N50 | 11 874 | 7881 | 12 228 | 87 |
| GC content of the contigs (%) | 42.3 | 43.0 | 39.5 | 45.5 |
| Number of mapped reads | 438 980 173 | 330 282 577 | 2 398 623 | 106 298 973 |
| Percentage of mapped reads | 66.0 | 49.6 | 0.4 | 16.0 |
50% of the bases in the contig set are in contigs longer/shorter than N50.
Raw reads were trimmed to 65 bp and mapped to contigs, one mismatch allowed.
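The N50 footnote above can be made concrete with a short calculation. The following is a minimal Python sketch, not the authors' code: the function name `n50` and the toy contig lengths are illustrative assumptions, and none of the numbers come from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    contain at least half of all bases in the contig set.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical, not data from the study)
toy_lengths = [100, 200, 300, 400, 500, 600]
print(n50(toy_lengths))
```

Sorting in descending order and accumulating until half of the total bases is reached follows the usual definition; a set dominated by short contigs, such as Group 2, therefore yields a low N50 even when it contains many sequences.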
Figure 3. Length distribution of assembled sweet cherry ‘Regina’ contigs in Group 1 (‘high abundance’, 34 695 contigs, length 200–12 485 bp) and Group 2 (‘low abundance’, 32 712 contigs, 107–1482 bp). The length distribution of predicted transcripts in the P. persica genome (v.1.0) (28 702 sequences, 96–15 738 bp) is shown for reference. The x-values give the center of each bin; bin width is 100 bp, except for the first bin, which spans 1 to 98 bp. Note the logarithmic scale of the y-axis; bins with 0 sequences are not shown.
Figure 4. Distribution of the mapped reads between the contigs in Groups 1, 2 and 3 in each of the 24 RNA samples from sweet cherry ‘Regina’ fruit. For sample codes, see Figure 1.
Annotation summary of all Group 1F contigs and of the Group 1F contigs containing a full-length ORF. Group 1F consists of Group 1 contigs whose relative expression levels differ by less than 10 percentage points between the two biological replicates. Similarity searches were performed via the PAVE platform against P. persica, M. domestica, V. vinifera, A. thaliana and P. avium peptides and via Blast2GO (B2G) against the Swiss-Prot, NCBI RefSeq and nr databases, as described in the section on ‘Materials and methods’.
| | All contigs (number) | All contigs (%) | Contigs with complete ORF (number) | Contigs with complete ORF (%) |
|---|---|---|---|---|
| Total | 29 955 | 100 | 7628 | 100 |
| Has BLASTx hit(s) (PAVE) | 23 603 | 79 | 7628 | 100 |
| Best hit to P. persica | 21 381 | 71 | 7350 | 96 |
| Best hit to M. domestica | 1845 | 6.2 | 221 | 2.9 |
| Best hit to V. vinifera | 263 | 0.9 | 21 | 0.3 |
| Best hit to A. thaliana | 70 | 0.2 | 2 | 0.03 |
| Best hit to P. avium | 44 | 0.1 | 34 | 0.4 |
| Has BLASTx hit(s) (B2G) | 22 259 | 74 | 7581 | 99 |
| First hit to Swiss-Prot | 15 696 | 52 | 5849 | 77 |
| First hit to RefSeq | 5557 | 19 | 1554 | 20 |
| First hit to nr | 1006 | 3.3 | 178 | 2.3 |
| Has GO annotation (B2G) | 17 610 | 59 | 6268 | 82 |
| GO biological process | 13 724 | 46 | 4992 | 65 |
| GO cellular component | 14 492 | 48 | 5528 | 72 |
| GO molecular function | 14 041 | 51 | 4930 | 65 |
| InterProScan match | 21 459 | 72 | 7407 | 97 |
e-value ≤1×10⁻¹⁵.
e-value ≤1×10⁻¹⁰.
Figure 5. GO terms of 7628 Group 1F contigs with predicted full-length open reading frames. GO terms in the categories biological process, molecular function and cellular component were retrieved from combined graph analyses performed on the Blast2GO platform (sequence filter 100, score alpha 0.6, node score filter 100). The GO terms are sorted in descending graph score order; numbers in parentheses indicate annotation levels.
Over-represented GO terms in 219 Group 1F contigs with a predicted full-length open reading frame and preferential expression in the exocarp (FPKM values in the mesocarp <1% of total FPKM values in all 24 samples). GO term enrichment was determined with Fisher’s exact test and the result was reduced to the most specific terms; the 20 most over-represented terms are shown. Complete results, including contig identifiers, are available as Supplementary Table S6.
| GO term (category) | Nr. | FDR |
|---|---|---|
| Wax biosynthetic process (P) | 10 | 6.04×10⁻⁸ |
| Cutin biosynthetic process (P) | 7 | 4.03×10⁻⁷ |
| Response to salicylic acid stimulus (P) | 20 | 4.03×10⁻⁷ |
| Carboxylesterase activity (F) | 10 | 1.07×10⁻⁶ |
| Response to jasmonic acid stimulus (P) | 19 | 1.07×10⁻⁶ |
| Trichome morphogenesis (P) | 10 | 8.12×10⁻⁵ |
| Cinnamic acid biosynthetic process (P) | 6 | 8.12×10⁻⁵ |
| Response to wounding (P) | 17 | 5.42×10⁻⁴ |
| Response to UV-B (P) | 9 | 7.92×10⁻⁴ |
| Sequence-specific DNA binding TF activity (F) | 21 | 1.10×10⁻³ |
| Mucilage biosynthetic process involved in seed coat development (P) | 5 | 1.45×10⁻³ |
| Response to ethylene stimulus (P) | 13 | 2.35×10⁻³ |
| Flavonoid biosynthetic process (P) | 10 | 2.76×10⁻³ |
| Long-chain fatty acid transport (P) | 4 | 3.15×10⁻³ |
| Response to auxin stimulus (P) | 15 | 3.36×10⁻³ |
| Monooxygenase activity (F) | 10 | 4.27×10⁻³ |
| Apoplast (C) | 13 | 4.68×10⁻³ |
| Response to abscisic acid stimulus (P) | 21 | 8.27×10⁻³ |
| Defense response to fungus (P) | 12 | 9.46×10⁻³ |
| Very long-chain fatty acid metabolic process (P) | 5 | 9.46×10⁻³ |
Abbreviations: C, cellular component; F, molecular function; FDR, false discovery rate; FPKM, fragments (reads) per kilobase of contig per million fragments mapped; Nr, number of contigs annotated with the given GO term; P, biological process; TF, transcription factor; UV, ultraviolet.
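The enrichment test behind this table can be illustrated as follows. This is a hedged sketch only: the source states that Fisher's exact test was used and that FDR values are reported, but it does not name the FDR procedure or the reference set, so the one-sided test, the Benjamini-Hochberg correction, the function names and all counts below are assumptions made for illustration.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= k) for X ~ Hypergeometric(N, K, n), where
    k = test-set contigs annotated with the GO term,
    n = size of the test set,
    K = reference contigs annotated with the GO term,
    N = size of the reference set (test set included).
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts for three GO terms: (k, n, K, N); not from the study
toy_terms = {"term_A": (3, 10, 5, 50), "term_B": (1, 10, 20, 50), "term_C": (2, 10, 4, 50)}
pvals = [fisher_right_tail(*counts) for counts in toy_terms.values()]
for name, fdr in zip(toy_terms, benjamini_hochberg(pvals)):
    print(name, fdr)
```

Exact integer arithmetic via `math.comb` avoids an external dependency and keeps the tail probability exact until the final division.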
Forty most abundant mRNA sequences (contigs) in the transcriptome of developing sweet cherry ‘Regina’ fruit. Additional information about these sequences is given in Supplementary Table S4.
| Contig ID | Length (bp) | FPKM total | B2G best hit (e-value) | B2G best hit definition | Tentative function or process |
|---|---|---|---|---|---|
| Pa_24402 | 537 | 523 781 | 1.3×10⁻¹³⁵ | Glucan endo-1,3-beta-glucosidase; TLP; allergen Pru av 2 | Stress |
| Pa_20288 | 606 | 269 571 | 6.9×10⁻³⁵ | Extensin-3 | Cell wall |
| Pa_49067 | 411 | 250 297 | 2.9×10⁻⁴⁹ | Extensin-3 | Cell wall |
| Pa_47876 | 472 | 169 152 | 1.8×10⁻³¹ | Proline-rich extensin-like protein EPR1 | Cell wall |
| Pa_11846 | 690 | 160 873 | 3.6×10⁻¹¹² | Major allergen Pru av 1 | Stress |
| Pa_25400 | 432 | 152 331 | No hit | | Unknown |
| Pa_47716 | 483 | 140 243 | 5.4×10⁻²⁸ | Dehydrin COR47; cold-induced COR47 protein | Stress |
| Pa_29501 | 581 | 112 730 | 5.3×10⁻⁷⁹ | Nonspecific lipid-transfer protein; allergen Pru av 3 | Transport |
| Pa_53685 | 298 | 109 269 | No hit | | Unknown |
| Pa_50270 | 369 | 104 741 | No hit | | Unknown |
| Pa_10726 | 820 | 99 920 | 8.1×10⁻⁴² | Protein E6 | Unknown |
| Pa_39995 | 357 | 94 862 | 1.8×10⁻²⁷ | Phosphoprotein ECPP44 | Stress |
| Pa_49183 | 405 | 85 504 | No hit | | Unknown |
| Pa_13183 | 905 | 76 502 | 0 | Phosphoenolpyruvate carboxykinase | Gluconeogenesis |
| Pa_46196 | 679 | 75 477 | 4.2×10⁻¹⁰⁷ | Probable xyloglucan endotransglucosylase/hydrolase | Cell wall |
| Pa_21625 | 829 | 69 722 | 2.4×10⁻¹³⁸ | Glucan endo-1,3-beta-glucosidase; TLP; allergen Pru av 2 | Stress |
| Pa_28260 | 537 | 65 818 | 1.2×10⁻²¹ | Conserved hypothetical protein | Unknown |
| Pa_11084 | 806 | 62 814 | 3.0×10⁻¹³ | Metallothionein-like protein type 3 | Transport |
| Pa_35173 | 469 | 62 774 | 4.0×10⁻¹⁴ | Arabinogalactan peptide 15 | Cell wall |
| Pa_21347 | 750 | 58 838 | 1.2×10⁻¹⁰⁰ | Major allergen Pru ar 1 | Stress |
| Pa_21722 | 398 | 57 020 | 3.0×10⁻¹² | Polygalacturonase inhibitor 1; PGIP-1 | Stress |
| Pa_06786 | 1089 | 55 517 | 0 | Leucoanthocyanidin dioxygenase | Flavonoid bs. |
| Pa_11167 | 1090 | 54 012 | 4.5×10⁻⁵⁹ | Extensin-1 | Cell wall |
| Pa_30361 | 338 | 53 527 | 1.6×10⁻⁴⁷ | Chalcone synthase 2; Naringenin-chalcone synthase 2 | Flavonoid bs. |
| Pa_59559 | 240 | 48 770 | No hit | | Unknown |
| Pa_13457 | 894 | 48 742 | 2.1×10⁻¹⁸ | Major latex allergen Hev b 5 | Stress |
| Pa_44191 | 283 | 47 095 | No hit | | Unknown |
| Pa_54252 | 291 | 46 745 | No hit | | Unknown |
| Pa_16626 | 537 | 46 455 | 1.0×10⁻⁶¹ | Ribulose bisphosphate carboxylase small chain | Photosynth. |
| Pa_11118 | 1257 | 44 566 | 1.4×10⁻¹⁵¹ | Catalase isozyme 1 | ROS detoxification |
| Pa_23720 | 638 | 41 644 | 7.9×10⁻¹⁷ | Universal stress protein A-like protein | Stress |
| Pa_05806 | 1627 | 41 123 | 0 | Polyubiquitin | Protein turn-over |
| Pa_18352 | 1480 | 37 174 | 2.7×10⁻¹⁰⁴ | Cysteine proteinase RD21a | Protein turn-over |
| Pa_32379 | 348 | 36 122 | 6.6×10⁻⁵⁵ | Glycine-rich cell wall structural protein; precursor | Cell wall |
| Pa_01373 | 771 | 35 506 | 4.0×10⁻⁶² | Probable xyloglucan endotransglucosylase/hydrolase | Cell wall |
| Pa_21518 | 645 | 35 305 | 3.6×10⁻⁴⁰ | Metallothionein-like protein type 2 | Transport |
| Pa_21558 | 303 | 35 275 | 1.8×10⁻²⁹ | Tubulin alpha chain | Cytoskeleton |
| Pa_10497 | 1760 | 35 073 | 4.0×10⁻¹⁷⁵ | | Stress |
| Pa_29840 | 342 | 34 516 | 1.3×10⁻⁴³ | Elongation factor 1-alpha | Translation |
| Pa_44188 | 348 | 33 983 | No hit | | Unknown |
Abbreviations: bs., biosynthesis; FPKM, fragments (reads) per kilobase of contig per million fragments mapped; ROS, reactive oxygen species.
Contig was predicted to contain a full-length open reading frame.
Expression preferentially in exocarp (FPKM values in mesocarp 24 and 80 DAFB below 1% of the total sum of FPKM values in all samples).
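The FPKM values in this table follow the definition spelled out in the abbreviations: fragments per kilobase of contig per million fragments mapped. A minimal sketch of that arithmetic is shown below. The function names and the toy counts are illustrative assumptions, and treating "FPKM total" as the sum of per-sample FPKM values is an interpretation of the footnote wording, not a confirmed detail of the authors' pipeline.

```python
def fpkm(fragments, contig_length_bp, total_mapped_fragments):
    """Fragments per kilobase of contig per million fragments mapped."""
    if contig_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("contig length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (contig_length_bp * total_mapped_fragments)


def fpkm_total(per_sample_fpkm):
    """Sum of a contig's FPKM values over all samples (assumed meaning of 'FPKM total')."""
    return sum(per_sample_fpkm)


# Toy example: 500 fragments on a 1000 bp contig in a library of 10 million mapped fragments
value = fpkm(500, 1000, 10_000_000)
print(value)
print(fpkm_total([value, 2 * value, 0.0]))
```

Normalizing by both contig length and library size is what makes values comparable between short and long contigs and between samples sequenced to different depths.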
Figure 6. Selected expression patterns within the sweet cherry ‘Regina’ fruit skin transcriptome. (a) All 29 955 contigs in Group 1F (G1F) were first grouped into five clusters by applying the NG algorithm to the normalized expression patterns. Clusters NG1, NG3 and NG4 are shown. (b) Each NG cluster was reclustered by applying the QT clustering algorithm (cluster diameters adapted to data, minimum cluster size 20 contigs). Numbers in parentheses indicate the number of contigs in each cluster. Selected clusters are shown. Sample codes as in Figure 1. The complete set of cluster plots is available as Supplementary Fig. S3.
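The QT reclustering step named in the caption can be sketched as follows. This is a generic quality-threshold clustering routine, not the authors' implementation: the Euclidean distance, the `max_diameter` and `min_size` parameter names, and the toy profiles are assumptions, and the initial NG clustering step is not shown.

```python
from math import dist


def qt_cluster(profiles, max_diameter, min_size):
    """Quality-threshold clustering of expression profiles.

    profiles: dict mapping contig ID -> sequence of normalized expression values.
    Returns (clusters, unclustered); each cluster is a list of contig IDs.
    """
    remaining = set(profiles)
    clusters = []
    while remaining:
        best = []
        for seed in sorted(remaining):
            candidate = [seed]
            # For each outside item: distance to its farthest candidate member
            farthest = {o: dist(profiles[o], profiles[seed])
                        for o in remaining if o != seed}
            while farthest:
                # Adding this item enlarges the cluster diameter the least
                nxt = min(sorted(farthest), key=farthest.get)
                if farthest[nxt] > max_diameter:
                    break
                candidate.append(nxt)
                del farthest[nxt]
                for o in farthest:
                    farthest[o] = max(farthest[o], dist(profiles[o], profiles[nxt]))
            if len(candidate) > len(best):
                best = candidate
        if len(best) < min_size:
            break  # whatever is left stays unclustered
        clusters.append(best)
        remaining -= set(best)
    return clusters, sorted(remaining)


# Hypothetical two-point "profiles"; real profiles would span all samples
toy = {"a": (0, 0), "b": (0, 1), "c": (1, 0), "d": (10, 10), "e": (10, 11), "f": (50, 50)}
clusters, unclustered = qt_cluster(toy, max_diameter=2, min_size=2)
print(clusters, unclustered)
```

Unlike k-means-style methods, this approach does not fix the number of clusters in advance; items that cannot join a cluster of at least `min_size` members are left unclustered, which corresponds to the ‘uncl.’ label used for some contigs.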
Selected contigs representing sweet cherry genes with predicted functions in cell wall modification, lipid metabolism, regulation of transcription, transport or other processes.
| Process or function | Contig | Length (bp) | FPKM total | B2G best hit definition | Exo | Cluster |
|---|---|---|---|---|---|---|
| Cell wall modif. | Pa_03454 | 2238 | 1440 | Endo-1,4-beta glucanase 10 | | NG1-QT 44 |
| Cell wall modif. | Pa_06233 | 1789 | 170 | Polygalacturonase At1g48100 | | NG1-QT 44 |
| Cell wall modif. | Pa_07062 | 1753 | 1622 | Probable pectate lyase 20 | | NG1-QT 44 |
| Cell wall modif. | Pa_07115 | 848 | 4649 | Pectin methylesterase 3 | | NG1-QT 44 |
| Defense | Pa_11565 | 719 | 6818 | Snakin-1, precursor | | NG1-QT 44 |
| Lipid metabolism | Pa_11433 | 2293 | 537 | Long chain acyl-CoA synthetase 1 | Exo | NG1-QT 44 |
| TF | Pa_22147 | 1828 | 521 | Transcription factor MYB76 | Exo | NG1-QT 44 |
| Cell wall modif. | Pa_06731 | 1621 | 10 223 | Probable pectate lyase 5 | | NG1-QT 43 |
| Lipid metabolism | Pa_00817 | 2053 | 2401 | Probable glycerol-3-phosphate acyltransferase 8 | Exo | NG1-QT 43 |
| Lipid metabolism | Pa_01043 | 2508 | 2335 | Cytochrome P450 86A2 (ATT1) | Exo | NG1-QT 43 |
| Lipid metabolism | Pa_01742 | 2243 | 1935 | 3-ketoacyl-CoA synthase 10 | Exo | NG1-QT 43 |
| Lipid metabolism | Pa_01940 | 1555 | 4216 | GDSL esterase/lipase At2g04570 | Exo | NG1-QT 43 |
| Lipid metabolism | Pa_06204 | 2351 | 614 | Long-chain acyl-CoA synthetase 2 | Exo | NG1-QT 43 |
| Lipid metabolism | Pa_10336 | 1604 | 1524 | Protein HOTHEAD | Exo | NG1-QT 43 |
| TF | Pa_02691 | 991 | 8695 | Ethylene-responsive transcription factor ERF023 | | NG1-QT 43 |
| TF | Pa_19618 | 1277 | 351 | Ethylene-responsive transcription factor WIN1; SHN1 | Exo | NG1-QT 43 |
| Cell wall modif. | Pa_10720 | 2015 | 784 | Endo-1,4-beta glucanase 11 | | NG1-QT 39 |
| Lipid metabolism | Pa_00867 | 2120 | 2553 | BAHD acyltransferase DCR | Exo | NG1-QT 39 |
| Lipid metabolism | Pa_04572 | 1399 | 1908 | GDSL esterase/lipase; lithium tolerant lipase 1 | Exo | NG1-QT 39 |
| Lipid metabolism | Pa_14044 | 1946 | 447 | Putative aminoacrylate hydrolase RutD | Exo | NG1-QT 39 |
| Lipid metabolism | Pa_14662 | 1595 | 66 | Long-chain-alcohol O-fatty-acyltransferase WSD1 | Exo | NG1-QT 39 |
| TF | Pa_00973 | 1757 | 241 | AP2-like ethylene-responsive transcription factor | Exo | NG1-QT 39 |
| TF | Pa_23194 | 761 | 176 | Transcription factor WER; AtMYB66 | Exo | NG1-QT 39 |
| Transport | Pa_22556 | 1120 | 1708 | Uncharacterized GPI-anchored protein At1g27950 | Exo | NG1-QT 39 |
| TF | Pa_08841 | 1284 | 1090 | Ethylene-responsive transcription factor SHINE 2 | Exo | NG1-QT 37 |
| Lipid metabolism | Pa_16669 | 1477 | 5943 | GDSL esterase/lipase At5g33370 | Exo | NG1-QT 36 |
| Lipid metabolism | Pa_22209 | 1638 | 645 | GDSL esterase/lipase At1g29670 | Exo | NG1-QT 36 |
| Cell wall modif. | Pa_13423 | 887 | 25 363 | 21 kDa protein; pectin methylesterase inhibitor | | NG1-QT 32 |
| Lipid metabolism | Pa_00478 | 2422 | 9571 | Protein WAX2 | Exo | NG1-QT 28 |
| Lipid metabolism | Pa_06166 | 2334 | 2920 | 3-ketoacyl-CoA synthase 6 | Exo | NG1-QT 28 |
| Cell wall modif. | Pa_03006 | 2240 | 1688 | Probable pectate lyase 1 | | NG1-QT 18 |
| Cell wall modif. | Pa_00883 | 2902 | 4084 | Endo-1,4-beta glucanase 6 | | NG1-QT uncl. |
| Lipid metabolism | Pa_08907 | 2100 | 256 | 3-ketoacyl-CoA synthase 19 | Exo | NG1-QT uncl. |
| Hormone related | Pa_06858 | 1988 | 823 | ABA 8′-hydroxylase 4; Cytochrome P450 707A4 | | NG3-QT 51 |
| Transport | Pa_01687 | 2127 | 3767 | Polyol transporter 5; sugar-proton symporter PLT5 | Exo | NG3-QT 50 |
| Cell wall modif. | Pa_03056 | 1375 | 766 | Xyloglucan endotransglucosylase/hydrolase protein 33 | | NG3-QT 46 |
| Lipid metabolism | Pa_04929 | 1898 | 6030 | Cytochrome P450 716B2; CYPA2 | Exo | NG3-QT 46 |
| TF | Pa_05584 | 2841 | 1563 | Homeobox-leucine zipper protein ANL2 | Exo | NG3-QT 38 |
| Lipid metabolism | Pa_05276 | 2709 | 884 | Protein WAX2 | Exo | NG3-QT 37 |
| Transport | Pa_06635 | 2918 | 701 | WBC11 | Exo | NG3-QT 37 |
| Transport | Pa_07137 | 2417 | 400 | ABC transporter G family member 15; AtWBC22 | Exo | NG3-QT uncl. |
| Cell wall modif. | Pa_01792 | 2212 | 1802 | Pectin methylesterase 3 | | NG4-QT 71 |
| Cell wall modif. | Pa_07321 | 1445 | 8058 | Xyloglucan endotransglucosylase/hydrolase protein 28 | | NG4-QT 71 |
| TF | Pa_02092 | 2197 | 7361 | NAC domain containing protein 18 | | NG4-QT 71 |
| TF | Pa_06479 | 1790 | 6415 | Transcription factor MYB75 | | NG4-QT 71 |
| TF | Pa_00639 | 2490 | 9253 | Ethylene-responsive transcription factor ERF071 | | NG4-QT 70 |
| Cell wall modif. | Pa_09434 | 1516 | 6951 | Expansin A1 | | NG4-QT 67 |
| Cell wall modif. | Pa_14097 | 1560* | 10 667 | Probable pectate lyase 18 | | NG4-QT 67 |
| Other | Pa_02283 | 2605 | 2991 | CTP synthase 1 | | NG4-QT 67 |
| Transport | Pa_05395 | 1898 | 3094 | Polyol transporter 5; sugar-proton symporter PLT5 | Exo | NG4-QT 66 |
| Lipid metabolism | Pa_14729 | 1878* | 2560 | Glycerol-3-phosphate acyltransferase 6 | Exo | NG4-QT uncl. |
Abbreviations: FPKM, fragments (reads) per kilobase of contig per million fragments mapped; uncl., unclustered.
See also Supplementary Table S4.
Exo, expression mostly in exocarp (FPKM values in mesocarp samples <1% of the total FPKM sum).
See Figure 6 and Supplementary Fig. S4.
Contig contains a complete open reading frame.
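The ‘Exo’ flag defined in the footnotes (mesocarp FPKM below 1% of the total FPKM sum) can be expressed as a simple filter. The sketch below is one reading of that criterion, in which the mesocarp values are summed before comparison with the total; the function name, the sample codes such as `24M` and the FPKM values are hypothetical.

```python
def is_exocarp_preferential(fpkm_by_sample, mesocarp_samples, max_fraction=0.01):
    """Flag a contig whose mesocarp FPKM is below max_fraction of its total FPKM.

    fpkm_by_sample: dict mapping sample code -> FPKM of one contig.
    mesocarp_samples: codes of the mesocarp-only samples.
    """
    total = sum(fpkm_by_sample.values())
    if total <= 0:
        return False  # a contig with no expression cannot be classified
    mesocarp = sum(fpkm_by_sample[s] for s in mesocarp_samples)
    return mesocarp / total < max_fraction


# Hypothetical FPKM values for one contig; sample codes are made up for illustration
contig = {"24E": 60.0, "80E": 39.5, "24M": 0.2, "80M": 0.3}
print(is_exocarp_preferential(contig, ["24M", "80M"]))
```

If the intended criterion applies to each mesocarp sample separately rather than to their sum, the comparison would be made per sample instead; the source wording does not settle this.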