
Genome-wide identification of pistil-specific genes expressed during fruit set initiation in tomato (Solanum lycopersicum).

Kentaro Ezura1, Kim Ji-Seong2, Kazuki Mori3, Yutaka Suzuki4, Satoru Kuhara3, Tohru Ariizumi1,2, Hiroshi Ezura1,2.   

Abstract

Fruit set involves the developmental transition of an unfertilized quiescent ovary in the pistil into a fruit. While fruit set is known to involve the activation of signals (including various plant hormones) in the ovary, many biological aspects of this process remain elusive. To further expand our understanding of this process, we identified genes that are specifically expressed in tomato (Solanum lycopersicum L.) pistils during fruit set through comprehensive RNA-seq-based transcriptome analysis using 17 different tissues including pistils at six different developmental stages. First, we identified 532 candidate genes that are preferentially expressed in the pistil based on their tissue-specific expression profiles. Next, we compared our RNA-seq data with publicly available transcriptome data, further refining the candidate genes that are specifically expressed within the pistil. As a result, 108 pistil-specific genes were identified, including several transcription factor genes that function in reproductive development. We also identified genes encoding hormone-like peptides with a secretion signal and cysteine-rich residues that are conserved among some Solanaceae species, suggesting that peptide hormones may function as signaling molecules during fruit set initiation. This study provides important information about pistil-specific genes, which may play specific roles in regulating pistil development in relation to fruit set.

Year:  2017        PMID: 28683065      PMCID: PMC5500324          DOI: 10.1371/journal.pone.0180003

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The pistil is a single reproductive organ that develops into a fruit after fruit set. The efficiency of fruit set is one of the most important traits that determine yield in many fruit-bearing crops such as tomato (Solanum lycopersicum L.). Because of its worldwide production and availability, tomato has been widely accepted as a model system for investigating fruit set. In general, fruit set is induced after successful development of the pistil upon pollination and following fertilization [1]. Through conventional molecular, genetic, and biochemical analyses of tomato, plant hormones such as auxin and gibberellic acid (GA) have been shown to play important roles in various plant developmental processes, including inducing fruit set in the pistil [1-5]. Mimicking fruit set signals by exogenous application of these hormones and mutation of the genes related to hormone signaling or metabolism induce fruit set without pollination/fertilization, a process known as parthenocarpy [6]. Furthermore, endogenous induction of auxin biosynthesis in ovules through genetic engineering is one of the most effective approaches for inducing parthenocarpy [7]. However, the key mechanisms and signals that induce fruit set in conjunction with plant hormones in the pistil remain largely unknown. To investigate this issue, it would be useful to obtain transcriptome profiles in the pistil to uncover genes regulated by signals related to fruit set. Microarray and next generation sequencing of transcripts (RNA-Seq) are two major transcriptome profiling systems that have been widely used in molecular biology [8]. One of the benefits of transcriptome analysis is that it allows the global gene expression profiles of thousands to nearly 40,000 genes to be investigated in a single experiment. Recently, RNA-seq has become more popular than microarray analysis for obtaining transcriptome profiles and the associated quantitative data. 
Comparative transcriptomics by RNA-seq produces massive amounts of accurate information about differentially expressed genes between various biological events and among related individuals, providing many clues about the mechanisms underlying plant development, growth, responses to various environmental signals, and the evolution of plant species [9-15]. In studies investigating fruit development, RNA-seq-based transcriptome analyses have revealed important biological pathways and gene sets associated with fruit development and ripening [16-22]. However, only a limited number of transcriptome studies have targeted pistils during fruit set in tomato [20,23-25]. These studies have identified various gene sets that appear to be expressed during fruit set, such as genes related to plant hormone metabolism and sensitivity, transcription factors regulating meristem differentiation and floral organ development, and those involved in carbohydrate metabolism [20,26]. Because these genes have multiple effects on various aspects of plant development, it is still difficult to narrow down candidate genes or biological pathways that directly influence the induction and completion of fruit set downstream of plant hormone signaling.
The pistil comprises a mixture of heterogeneous tissues consisting of ovules, style, placenta, and pericarp (ovary wall), and this inherent complexity often hinders the elucidation of the detailed mechanism of early fruit development. The development of each tissue may directly influence the success of fruit set and subsequent fruit growth. After pollination, pollen enters the ovule through the style. The fertilized ovules become seeds, which provide growth signals to the entire fruit, while the rate of cell division in the ovary wall and placenta determines the final size of the fruit [1]. Recently, two independent groups uncovered cell-type-specific transcriptomes of the pistil during fruit set using wild tomato S. pimpinellifolium and tomato cultivar ‘Moneymaker’ [23,27]. In addition, several individual pistil-specific genes (PSGs) have been identified that play important roles in processes such as pollen tube extension, pollen-pistil interactions, and ovule development. These include two polygalacturonase genes (PG7 and TAPG4) in tomato [28], one extensin-like glycoprotein gene (PELP3) in Nicotiana tabacum [29-31], and one endo-1,4-β-D-glucanase gene and one MADS box transcription factor gene (SEEDSTICK/AGL11) in Arabidopsis thaliana [32,33], highlighting the importance of PSGs in the regulation of tissue-specific development in the pistil.
Nonetheless, few studies have focused on the isolation of PSGs due to technical difficulties such as the small size of the tissue. Recently, anther-specific genes that play important roles in tissue differentiation and specification were identified in various species using a transcriptomic approach [34,35]. The isolation of genes expressed in specific tissues not only provides new insights into the development of each tissue, but it also provides genetic engineering tools for molecular breeding [36]. Therefore, to extend our understanding of the molecular mechanism underlying fruit set and to generate new tools for pistil-specific regulation of fruit set-associated genes, it is important to identify PSGs that are specifically expressed during fruit set initiation. In this study, we conducted genome-wide analysis of PSGs in tomato by RNA-seq and compared the results with publicly available data. As a result, we identified about one hundred PSGs, including genes encoding signaling-related proteins, several transcription factors, and peptide hormone-like proteins, in addition to many genes of unknown function. Further analysis of these mined genes would increase our understanding of the mechanisms underlying pistil development and fruit set and would be useful for generating genetic engineering tools, such as tissue-specific promoters.

Material and methods

Plant materials, hormone treatment, and cDNA synthesis

Tomato cv. ‘Micro-Tom’ was used in this study. The seeds were incubated on wet filter paper in a Petri dish at 25°C to stimulate germination, and the seedlings were then grown in a cultivation room under a 16 h/8 h light/dark cycle at 25°C/22°C (day/night). Total RNA was extracted using an RNeasy Plant Mini Kit (Qiagen, USA) from 17 samples of different organs at different developmental stages. Pistil and fruit samples (#1–8): pistils of 2–2.5 mm buds (#1), pistils of 3–4 mm buds (#2), pistils at 1 day before flowering (1 DBF) (#3), pistils at anthesis (#4), pistils at 5 days after flowering (5 DAF) (#5), 5 mm ovaries at 7 days after flowering (7 DAF) (#6), mature green fruits at 33 days after flowering (MG) (#7), and red fruits at 44 days after flowering (RED) (#8). Stamen and other floral organ samples (#9–13): stamens of 3–4 mm buds (#9), stamens at 1 DBF (#10), stamens at anthesis (#11), sepals at anthesis (#12), and petals at anthesis (#13). Vegetative organ samples (#14–17): 3-week-old leaves (#14), mature leaves (#15), stems (#16), and roots (#17). The total RNA was treated with DNase to remove contaminating DNA using a DNA-free RNA Kit (Zymo Research, USA). The cDNA was synthesized from 2 μg of total RNA using SuperScript VILO MasterMix (Thermo Fisher, USA) according to the manufacturer’s instructions. The cDNA libraries for RNA-seq were prepared using a TruSeq RNA Sample Prep Kit v2 (Illumina) according to the manufacturer’s protocol.

RNA-seq, processing, mapping of Illumina reads, and detection of PSGs

Single-end sequencing with 35-nt and 100-nt reads was conducted on the Illumina Genome Analyzer IIx and Illumina HiSeq 2000 systems, respectively. To identify the transcriptome of each tissue, a “direct-mapping method” was used. For the direct-mapping method, the quality of the Illumina raw FASTQ data was checked by FastQC before and after trimming with Trimmomatic according to the instruction manual (S1 Fig) [37]. After trimming, only sequences with a minimum length of 20 bp were retained. The trimmed sequence data were imported into CLC Genomics Workbench ver 7.0.4 (QIAGEN, Germany) and mapped to the tomato genome SL2.50 and gene model SL2.40. Gene expression levels were normalized by gene length and obtained as reads per kilobase of exon per million mapped reads (RPKM) values [38]. The data were normalized by the quantile method to reduce obscuring variation among samples, and the normalized data were then subjected to logarithmic transformation after adding 1 to each value for heat map analysis [39,40]. To narrow down the candidate genes expressed specifically in pistils, an RPKM value of 0.5 was used as the cutoff value to determine specific expression in each sample. Based on this criterion, candidate tomato PSGs with values higher than 0.5 in pistils and lower than 0.5 in other tissues were identified by comparing the transcriptome data for each tissue.
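The expression measure and the candidate filter described above can be illustrated with a minimal Python sketch. This is not the CLC Genomics Workbench implementation: the function names and sample labels are hypothetical, quantile normalization is omitted, and the base-2 logarithm is assumed from the "Log2-transformed" wording of the Fig 2 legend.

```python
import math


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def log2_plus_one(value):
    """Log2 transformation after adding 1, as used for heat map display."""
    return math.log2(value + 1.0)


def is_candidate_psg(expression, pistil_samples, cutoff=0.5):
    """Return True if a gene passes the pistil-specific candidate filter.

    expression: dict mapping sample label -> RPKM (labels are hypothetical).
    A gene passes if it exceeds the cutoff in at least one pistil sample
    and stays below the cutoff in every non-pistil sample.
    """
    pistil = [v for s, v in expression.items() if s in pistil_samples]
    other = [v for s, v in expression.items() if s not in pistil_samples]
    return any(v > cutoff for v in pistil) and all(v < cutoff for v in other)


# Toy example with made-up values
gene = {"P1": 3.2, "P4": 0.1, "leaf": 0.0, "root": 0.3}
print(rpkm(500, 2000, 10_000_000))
print(is_candidate_psg(gene, pistil_samples={"P1", "P4"}))
```

The strict inequalities follow the wording "higher than 0.5" and "lower than 0.5"; how a value of exactly 0.5 was handled is not stated in the text.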

Data mining of publicly available RNA-seq data

To examine the expression patterns of the identified genes in tissues other than pistils, publicly available data were downloaded from transcriptome analyses of tomato from the Tomato Functional Genomics Database (http://ted.bti.cornell.edu/cgi-bin/TFGD/digital/home.cgi). Data from nine different vegetative samples from tomato cv. Heinz and wild tomato species S. pimpinellifolium were extracted and investigated to determine whether the candidate genes were expressed in these tissues. To estimate the regions in the pistil in which the candidate genes are expressed, tissue-specific transcriptome data from the pistils of tomato wild relative S. pimpinellifolium [27] were used to identify genes with expression levels higher than RPM (reads per million mapped reads) = 2 in at least one sample. The expression levels of the top-ten genes in each tissue were then examined. To confirm the expression patterns of the candidate genes in the pistil, their expression levels were also investigated using transcriptome data from tomato cv. ‘Moneymaker’ [23]. If the expression level was higher than FPKM (Fragments Per Kilobase of exon per Million mapped fragments) 0.5 in at least one sample, the gene was judged to be expressed. To investigate the responses of the genes to plant hormone treatment, a publicly available dataset from the transcriptomes of pollinated or parthenocarpic fruit induced by hormone treatment was utilized [20]. To compare the list of differentially expressed genes with the candidate genes, unigene numbers were converted to ITAG IDs using the Unigene converter in the SGN database.
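The "expressed in at least one sample" rule applied to the public datasets can be sketched as follows. The function names and toy values are hypothetical; only the RPM definition and the two cutoffs (RPM > 2, FPKM > 0.5) come from the text.

```python
def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads."""
    return read_count * 1e6 / total_mapped_reads


def expressed_in_any(values, cutoff):
    """True if the expression value exceeds the cutoff in at least one sample."""
    return any(v > cutoff for v in values)


# Toy values for one gene across samples (made up for illustration)
pimpinellifolium_rpm = [0.0, 1.9, 2.4]   # judged against RPM > 2
moneymaker_fpkm = [0.4, 0.5]             # judged against FPKM > 0.5

print(expressed_in_any(pimpinellifolium_rpm, cutoff=2.0))
print(expressed_in_any(moneymaker_fpkm, cutoff=0.5))
```

A strict comparison is used because the text says "higher than"; a value exactly at the cutoff is therefore treated as not expressed in this sketch.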

Gene ontology analysis

ITAG IDs of the candidate PSGs were used as input with the AgriGO agricultural gene ontology (GO) analysis tool (http://bioinfo.cau.edu.cn/agriGO/analysis.php) to elucidate enriched GO terms. A false discovery rate (FDR; e-value corrected for list size) of ≤0.05 was used as the criterion to obtain enriched GO terms.
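As an illustration of how an FDR threshold of 0.05 selects enriched terms, the sketch below applies the Benjamini-Hochberg adjustment to a list of raw p-values. This is an assumption made for illustration only: the text does not state which multiple-testing correction was selected in AgriGO, and the GO term labels and p-values here are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical GO terms with made-up raw p-values
terms = ["term_a", "term_b", "term_c", "term_d"]
raw_p = [0.001, 0.02, 0.03, 0.5]

fdr = benjamini_hochberg(raw_p)
enriched = [t for t, q in zip(terms, fdr) if q <= 0.05]
print(enriched)
```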

Gene expression analysis by RT-PCR

To confirm the expression patterns of the candidate genes identified by RNA-seq analysis, RT-PCR was performed using cDNA samples derived from vegetative and reproductive organs, including young leaves, mature leaves, mature stems, mature roots, flower buds, and flowers from 3-week-old plants. To analyze the expression patterns of the genes in ovaries or fruits before/after pollination, RT-PCR was performed using cDNA samples from tomato pistils and fruits at the corresponding developmental stages: A, pistils from 2–2.5 mm flower buds at 10 days before flowering (10 DBF); B, pistils from 3–4 mm flower buds at 7 days before flowering (7 DBF); C, pistils at 1 day before flowering (1 DBF); D, pistils at anthesis/pollination (0 DAF); E, pistils at 5 days after flowering (5 DAF); F, 5 mm ovaries at 7 days after flowering (7 DAF); G, mature green fruits at 33 days after flowering (MG); H, red fruits at 44 days after flowering (RED). Semi-quantitative reverse transcription polymerase chain reaction (RT-PCR) analysis was performed with a Mastercycler ProS (Eppendorf, Germany) using an ExTaq Kit (TaKaRa Bio, Japan) and the primer sets listed in S5 Table. As an internal control for expression analysis in different organs, SAND expression was monitored using the primers SAND-F (5’-TTGCTTGGAGGAACAGACG-3’) and SAND-R (5’-GCAAACAGAACCCCTGAATC-3’) [41].

Sequence analysis of genes with unknown functions

Protein sequences were downloaded from the Sol Genomics Network (SGN). Sequence alignments were conducted using ClustalW in DDBJ (http://clustalw.ddbj.nig.ac.jp/). Phylogenetic trees were generated using CLC Genomics Workbench. The presence of secretion signals in the small proteins was investigated using SignalP 4.1 Server (http://www.cbs.dtu.dk/services/SignalP/). The conserved domains and motifs within the identified proteins were searched using NCBI's Conserved Domain Database (CDD) (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?) [42].

Availability of RNA-seq dataset

Transcriptome data are available at the GEO database under accession number DRA005810.

Results and discussion

Transcriptome analysis of various tomato tissues

To obtain transcriptome profiles of various tomato organs in order to identify PSGs, we performed RNA-seq analysis of 17 different floral and vegetative samples at different developmental stages (Fig 1A). We initially selected six different stages for the pistil samples (P1 to P6) and three different stages for the anther samples (A1 to A3). P1 to P3 and A1 to A2 represent pre-anthesis samples: P1 corresponds to pistils in 2–2.5 mm flower buds, P2 and A1 correspond to pistils and anthers, respectively, in 3–4 mm flower buds, and P3 and A2 correspond to those in flower buds 1 day before flowering (1 DBF). P4 and A3 correspond to pistils and anthers, respectively, in flowers at anthesis (0 DAF). P5 and P6 represent post-anthesis samples: P5 corresponds to pistils/fruits in flowers at 5 days after flowering (5 DAF), and P6 corresponds to 5 mm ovaries at 7 days after flowering (7 DAF). In addition, we used eight samples from different tissues. We conducted 35 nt and 50 nt single-read sequencing using the Illumina GAIIx and HiSeq 2000, respectively (S1 Table). We used the “direct-mapping method” to identify sets of PSGs (Fig 1B). In the direct-mapping method, whole sequenced short reads were directly mapped onto the tomato reference genome.
Fig 1

Experimental design for RNA-seq analysis.

(A) The 17 samples used for transcriptome analysis. For vegetative organs, four samples were collected, including mature leaves from 3-week-old plants and young leaves, stems, and roots from 1-week-old plants. For reproductive organs and fruits, 13 samples were collected, including pistils and anthers of 2–2.5 mm buds, pistils and anthers of 3–4 mm buds, pistils at 1 day-before-flowering (1 DBF), pistils and anthers at anthesis (0 DAF), ovaries of 5-days after flowering (5 DAF), 5 mm ovaries (7 DAF), sepals and petals at anthesis, mature green fruits (MG), and red fruits (RED). (B) Work flow of transcriptome analyses. For the direct-mapping method, whole transcriptome data from short reads were obtained, which were directly mapped onto the tomato reference genome, and expressed genes were identified.

The direct-mapping method is a common approach for transcriptome analysis in which sequence reads are mapped onto the reference genome of a target organism [8,38]. Our RNA-seq generated different amounts of raw data ranging from 8.32 to 39.42 million reads. After quality checking and trimming of low quality reads and adapter sequences, we obtained 7.28 to 35.23 million clean reads for mapping (S1 Table). We analyzed the reads using CLC Genomics Workbench ver. 7.0.4, a user-friendly mapping tool; 82.3% to 90.5% of the clean reads from each sample were mapped to the tomato genome SL2.40 [43] (S1 Table). An RPKM cutoff value of 0.5 was utilized to declare a locus expressed, resulting in an average of approximately 25,000 genes above the expression threshold across the 17 samples (S2A Fig).
Before isolating PSGs, we examined the quality of our transcriptome data, as we used only one replicate per sample. Initially, to characterize the transcriptome data, we conducted principal component analysis (PCA) with CLC Genomics Workbench ver. 7.0.4. Component 1 explained 91% of the variation, while component 2 explained 2% of the variation, indicating that the two components together explained 93% of the variation of the 17 original variables. Samples from vegetative and reproductive organs were separated into two groups, with samples such as petals and sepals (which are components of reproductive organs but are composed of vegetative cells) located in the middle of the two groups (Fig 2A), indicating specialized transcriptomes. We investigated the expression patterns of homologs of genes that had already been identified as tissue-specific, such as YABBY transcription factor genes. In Arabidopsis, two YABBY transcription factors, CRABS CLAW (CRC) and INNER NO OUTER (INO), show pistil-specific expression and are involved in pistil and early fruit development [44-46]. Thus, we expected their homologs in tomato to show pistil-specific expression, and we therefore examined this possibility. Three of nine YABBY transcription factor genes found in the tomato genome, SlCRCa (Solyc01g0101240), SlCRCb (Solyc05g012050), and SlINO (Solyc05g005240), were specifically expressed in flower buds and flowers at the anthesis stage, which is consistent with the results obtained in a previous study [47] (S2B Fig). SlCRCa was expressed in the early stage of pistil development, and SlCRCb and SlINO were expressed through all stages of pistil development, while they were barely expressed in the other tissues (S2B Fig). These data support the quality of the transcriptome dataset.
Next, according to RPKM values, we narrowed down the list of genes to those with RPKM values greater than 0.5 in at least one pistil sample and less than 0.5 in the other tissues, resulting in the identification of 532 initial candidate PSGs by the direct-mapping method (Fig 2B).
Fig 2

Identification of genes preferentially expressed in pistils based on the direct-mapping method.

(A) PCA analysis of the RNA-seq data. (B) Identification and validation of the expression of 532 candidate genes with RPKM values greater than 0.5 in at least one pistil sample and less than 0.5 in the other tissue samples using publicly available data. Out of 532 genes, 206 were found to be expressed in the transcriptome produced by Pattison et al. (2015) [27]. On the other hand, the expression of 376 genes was detected in at least one sample from many different tissues and conditions, and 275 of these showed RPKM values less than 1 in nine vegetative samples (Floral organ specific genes). Finally, by comparing the two gene sets, 108 genes were found in both sets and were identified as pistil-specific genes (PSGs). (C) Heatmap of the expression of 108 genes in 17 different samples. Normalized Log2-transformed expression data were visualized by constructing a heatmap using MeV software. Hierarchical clustering by Pearson correlation was conducted. (D) Gene ontology analysis was performed using AgriGO (http://bioinfo.cau.edu.cn/agriGO/). Only one category, carboxylesterase activity (GO:0004091), was significantly (FDR<0.05) represented in the gene set, while 56 genes were not annotated and were not assigned to GO terms. Bottom table: gene IDs and functional annotations described in the SGN database.

Validation of the expression specificity of the candidate genes using publicly available datasets

To reconfirm the tissue-specific expression of the 532 candidate genes, we performed comparative analyses between our transcriptome dataset and two publicly available transcriptome datasets (Experiment 1 and Experiment 2) from 26 samples, including vegetative tissues and floral tissues derived from tomato cv. Heinz and wild relative S. pimpinellifolium (strain LA1589), available in the Tomato Functional Genomics Database (http://ted.bti.cornell.edu/): Experiment 1 (Exp1; Tomato Genome Consortium, 2012) and Experiment 2 (Exp2; accession no. PRJNA179156). As a result, 376 of the 532 candidate genes were detected in at least one of the 26 samples from the public data, suggesting that these genes are most likely expressed in tomato plants. We investigated the expression levels of the 376 genes in nine different vegetative samples. We then excluded genes whose RPKM values were >1 in any of the nine vegetative samples and identified 275 genes as “Floral organ-specific genes” (Fig 2B). In parallel, to obtain information about the cell types in which the candidate genes are expressed, we investigated their expression patterns in cell-type-specific transcriptome data from pistils of wild tomato (S. pimpinellifolium) [27]. We then selected genes expressed in pistils based on the criterion used by Pattison et al. [27]; genes with RPM values > 2 in at least one sample were chosen. In total, 206 genes were defined as “Pistil expressed genes”; their expression was evident in the pistil, especially after anthesis, while the other 326 genes excluded by this step may not be expressed in the pistil or may be expressed only at stages earlier than 1 DBF (Fig 2B). We compared “Floral organ-specific genes” and “Pistil expressed genes” and selected the genes common to both sets, ultimately identifying 108 genes as PSGs by the direct-mapping method (Fig 2B and 2C, Table 1 and S2 Table). Among these, 56 genes had not been characterized.
Public transcriptome data analysis provided information about both the organs and cell types in which the 108 PSGs were expressed. Using the cell-type-specific transcriptome dataset from pistils of wild relative S. pimpinellifolium [27], we obtained spatial information about the expression of PSGs within the pistil (S3 Table). Hierarchical heat mapping clearly showed their cell-type-specific expression profiles (Fig 3A). Remarkably, roughly two-thirds of the genes appeared to show highly tissue-specific expression in the ovule and/or seed tissues (embryo, endosperm, seed coat). While many genes were preferentially expressed in the ovule and the seed tissues except seed coat, several genes were preferentially expressed in the pericarp at anthesis, in the placenta, and in the seed coat after pollination (Fig 3A). For example, five genes were preferentially expressed in the pericarp before pollination: genes encoding cinnamoyl CoA reductase-like protein (Solyc01g008540), Unknown Protein (Solyc04g074890), homeobox-leucine zipper-like protein (Solyc01g010600), B3 domain-containing protein Os03g0212300–like protein (Solyc06g074160), and Unknown Protein (Solyc03g123770). Furthermore, the genes encoding cytokinin oxidase/dehydrogenase 8 (SlCKX8, Solyc10g017990), TNFR/CD27/30/40/95 cysteine-rich region (Solyc04g014750), Unknown Protein (Solyc03g031660), Unknown Protein (Solyc07g053400), and Ramosa1 C2H2 zinc-finger transcription factor (Solyc09g089590) were preferentially expressed in the seed coat. Solyc09g089590 encodes one of two homologous proteins of Arabidopsis SUPERMAN (SUP), which regulates auxin biosynthesis [48].
In addition, the expression of 63 out of 108 genes was also detected in the recently published ovary transcriptome dataset derived from cultivated tomato ‘Moneymaker’ [23], in which RNA-seq analyses of ovule and ovary wall tissues were conducted; their average expression levels were over FPKM of 0.5 [23] (S4 Table). The 55 other genes were not detected in that dataset, indicating that these 55 genes were barely expressed in cultivated tomato or were only expressed in other types of tissues such as the placenta and septum, which were excluded from their experiment.
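The two-branch filtering that narrows the candidates down to PSGs (vegetative exclusion on one side, pistil expression on the other, followed by an intersection) can be sketched with Python sets. The gene labels and expression values below are invented, and for brevity both filters are applied to the same toy candidate set, whereas in the text the vegetative filter was applied to the 376 genes detected in the public data.

```python
def floral_organ_specific(genes, vegetative_rpkm, cutoff=1.0):
    """Keep genes whose RPKM does not exceed the cutoff in any vegetative sample."""
    return {g for g in genes if all(v <= cutoff for v in vegetative_rpkm[g])}


def pistil_expressed(genes, pistil_rpm, cutoff=2.0):
    """Keep genes whose RPM exceeds the cutoff in at least one pistil sample."""
    return {g for g in genes if any(v > cutoff for v in pistil_rpm[g])}


# Hypothetical candidates with made-up expression values
candidates = {"geneA", "geneB", "geneC", "geneD"}
vegetative_rpkm = {
    "geneA": [0.0, 0.2],
    "geneB": [0.1, 3.5],   # excluded: RPKM > 1 in a vegetative sample
    "geneC": [0.0, 0.0],
    "geneD": [0.9, 1.0],
}
pistil_rpm = {
    "geneA": [5.1, 0.0],
    "geneB": [8.0, 9.0],
    "geneC": [0.4, 1.2],   # excluded: never above RPM 2 in the pistil data
    "geneD": [2.0, 2.5],
}

floral = floral_organ_specific(candidates, vegetative_rpkm)
pistil = pistil_expressed(candidates, pistil_rpm)
psgs = floral & pistil  # genes present in both sets
print(sorted(psgs))
```

Keeping the two filters as independent sets mirrors the layout of Fig 2B, where the final PSG list is the overlap of "Floral organ-specific genes" and "Pistil expressed genes".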
Table 1

List of 108 pistil-specific genes (PSGs) identified by the direct-mapping-based method.

# | ITAG ID | Description in ITAG2.40 | Homologue in Arabidopsis | Length (aa) | Identities (%) | Arabidopsis homologue description
--- | --- | --- | --- | --- | --- | ---
1 | Solyc01g007270 | Cytokinin riboside 5'-monophosphate phosphoribohydrolase LOG (AHRD V1 **-- LOG_ORYSJ) | AT5G06300 | 217 | 56/68 (82) | -
2 | Solyc01g008540 | Cinnamoyl CoA reductase-like protein (AHRD V1 ***- B9HNY0_POPTR); Interpro domain(s) IPR016040 NAD(P)-binding domain | AT5G19440 | 326 | 223/315 (71) | NAD(P)-binding Rossmann-fold superfamily protein
3 | Solyc01g010600 | Homeobox-leucine zipper-like protein (AHRD V1 *-*- Q3HRT1_PICGL); contains Interpro domain(s) IPR001356 Homeobox | AT1G69780 | 294 | 150/300 (50) | ATHB13
4 | Solyc01g016530 | Unknown Protein (AHRD V1); contains Interpro domain(s) IPR008507 Protein of unknown function DUF789 | AT1G73210 | 314 | 32/69 (46) | Protein of unknown function (DUF789)
5 | Solyc01g068440 | Os06g0207500 protein (Fragment) (AHRD V1 ***- Q0DDQ9_ORYSJ); contains Interpro domain(s) IPR004253 Protein of unknown function DUF231, plant | AT2G42570 | 367 | 166/341 (49) | TBL39 (TRICHOME BIREFRINGENCE-LIKE 39)
6 | Solyc01g079560 | B3 domain-containing protein Os11g0197600 (AHRD V1 ***- Y1176_ORYSJ); contains Interpro domain(s) IPR003340 Transcriptional factor B3 | AT3G18990 | 341 | 30/92 (33) | VRN1, REM39
7 | Solyc01g081360 | Unknown Protein (AHRD V1) | - | - | - | -
8 | Solyc01g090300 | Ethylene responsive transcription factor 1b (AHRD V1 *-*- C0J9I8_9ROSA); contains Interpro domain(s) IPR001471 Pathogenesis-related transcriptional factor and ERF, DNA-binding | AT2G44840 | 226 | 69/107 (64) | ATERF13, EREBP, ERF13
9 | Solyc01g090820 | Expansin B1 (AHRD V1 ***- C8CC40_RAPSA); contains Interpro domain(s) IPR007112 Expansin 45, endoglucanase-like | AT1G65680 | 273 | 119/249 (48) | ATEXPB2, EXPB2, ATHEXP BETA 1.4
10 | Solyc01g095760 | UDP-glucosyltransferase (AHRD V1 ***- Q8LKG3_STERE); contains Interpro domain(s) IPR002213 UDP-glucuronosyl/UDP-glucosyltransferase | AT5G49690 | 460 | 164/471 (35) | UDP-Glycosyltransferase superfamily protein
11 | Solyc01g104390 | Blue copper protein (AHRD V1 **-- B6TT37_MAIZE); contains Interpro domain(s) IPR003245 Plastocyanin-like | AT1G17800 | 129 | 49/116 (42) | ARPN
12 | Solyc01g106140 | F-box protein family-like (AHRD V1 *-*- Q6ZCS3_ORYSJ); contains Interpro domain(s) IPR005174 Protein of unknown function DUF295 | AT3G25750 | 348 | 41/162 (25) | F-box family protein with a domain of unknown function (DUF295)
13 | Solyc01g106730 | MADS box transcription factor 1 (AHRD V1 *-*- D9IFM1_ONCHC); contains Interpro domain(s) IPR002100 Transcription factor, MADS-box | AT5G60440 | 299 | 95/160 (59) | AGL62
14 | Solyc01g106980 | Endo-1 4-beta-xylanase (AHRD V1 *--- B6SW51_MAIZE); contains Interpro domain(s) IPR013781 Glycoside hydrolase, subgroup, catalytic core | AT4G33840 | 576 | 276/545 (51) | Glycosyl hydrolase family 10 protein
15 | Solyc01g108380 | Protease inhibitor protein (AHRD V1 -**- B3FNP9_HEVBR); contains Interpro domain(s) IPR000864 Proteinase inhibitor I13, potato inhibitor I | AT2G38900 | 88 | 27/61 (44) | Serine protease inhibitor, potato inhibitor I-type family protein
16 | Solyc02g022860 | FAD-binding domain-containing protein (AHRD V1 **-- D7MFI0_ARALY); contains Interpro domain(s) IPR006094 FAD linked oxidase, N-terminal | AT4G20820 | 532 | 243/532 (46) | FAD-binding Berberine family protein
17 | Solyc02g032150 | Unknown Protein (AHRD V1) | - | - | - | -
18 | Solyc02g067630 | Polygalacturonase 1 (AHRD V1 ***- O22311_SOLLC); contains Interpro domain(s) IPR000408 Regulator of chromosome condensation, RCC1 IPR000743 Glycoside hydrolase, family 28 | AT2G43860 | 384 | 232/389 (60) | Pectin lyase-like superfamily protein
19 | Solyc02g069330 | Unknown Protein (AHRD V1); contains Interpro domain(s) IPR006501 Pectinesterase inhibitor | AT5G64620 | 180 | 26/80 (33) | C/VIF2, ATC/VIF2
20 | Solyc02g072280 | Subtilisin-like protease (AHRD V1 **-- Q9LWA3_SOLLC); contains Interpro domain(s) IPR015500 Peptidase S8, subtilisin-related | AT5G67360 | 757 | 335/761 (44) | ARA12
21 | Solyc02g077170 | X1 (Fragment) (AHRD V1 *--- Q7FSP8_MAIZE); contains Interpro domain(s) IPR005379 Region of unknown function XH | AT1G15910 | 634 | 96/259 (37) | XH/XS domain-containing protein
22 | Solyc02g078090 | Unknown Protein (AHRD V1) | - | - | - | -
23 | Solyc02g079080 | F-box family protein (AHRD V1 ***- B9GFH4_POPTR); contains Interpro domain(s) IPR001810 Cyclin-like F-box | AT5G02930 | 469 | 108/440 (25) | F-box/RNI-like superfamily protein
24 | Solyc02g084140 | Unknown Protein (AHRD V1) | - | - | - | -
25 | Solyc02g085190 | GATA transcription factor 19 (AHRD V1 *-** B6TS85_MAIZE); contains Interpro domain(s) IPR000679 Zinc finger, GATA-type | AT3G50870 | 295 | 127/292 (43) | MNP, HAN, GATA18
26 | Solyc02g086290 | Receptor serine/threonine kinase (AHRD V1 ***- Q9FF31_ARATH) | AT1G66940 | 332 | 76/267 (28) | protein kinase-related
27 | Solyc02g087490 | Prolyl 4-hydroxylase alpha subunit-like protein (AHRD V1 ***- Q9LSI6_ARATH); contains Interpro domain(s) IPR006620 Prolyl 4-hydroxylase, alpha subunit | AT3G28490 | 288 | 176/265 (66) | Oxoglutarate/iron-dependent oxygenase
28 | Solyc02g092030 | Cbs domain containing protein expressed (Fragment) (AHRD V1 *--- A6N095_ORYSI); contains Interpro domain(s) IPR002550 Protein of unknown function DUF21 | AT2G14520 | 423 | 283/423 (67) | CBS domain-containing protein with a domain of unknown function (DUF21)
29 | Solyc02g093540 | Cytochrome P450 | AT3G50660 | 513 | 193/470 (41) | DWF4, CYP90B1, CLM, SNP2, SAV1, PSC1
30 | Solyc03g020000 | Pentatricopeptide repeat-containing protein (AHRD V1 *-*- D7L041_ARALY); contains Interpro domain(s) IPR002885 Pentatricopeptide repeat | AT2G22410 | 681 | 181/487 (37) | SLO1
31 | Solyc03g025240 | Multidrug resistance protein mdtK (AHRD V1 *--- MDTK_YERP3); contains Interpro domain(s) IPR002528 Multi antimicrobial extrusion protein MatE | AT4G25640 | 514 | 273/398 (69) | DTX35
32 | Solyc03g031660 | Unknown Protein (AHRD V1) | - | - | - | -
33 | Solyc03g058330 | Unknown Protein (AHRD V1) | AT5G06760 | 158 | 57/144 (40) | LEA4-5
34 | Solyc03g096190 | Receptor like kinase, RLK | AT3G47570 | 1010 | 441/1003 (43) | Leucine-rich repeat protein kinase family protein
35 | Solyc03g111190 | Auxin-independent growth promoter-like protein (AHRD V1 ***- Q9FMW3_ARATH); contains Interpro domain(s) IPR004348 Protein of unknown function DUF246, plant | AT5G63390 | 559 | 343/557 (62) | O-fucosyltransferase family protein
36 | Solyc03g115350 | Expansin 2 (AHRD V1 ***- C0KLG9_PYRPY); contains Interpro domain(s) IPR002963 Expansin | AT5G39280 | 259 | 146/223 (65) | ATEXPA23, ATEXP23, ATHEXP ALPHA 1.17
37 | Solyc03g116410 | Zinc finger CCCH domain-containing protein 39 (AHRD V1 ***- C3H39_ARATH); contains Interpro domain(s) IPR000571 Zinc finger, CCCH-type | AT3G19360 | 386 | 54/199 (27) | Zinc finger (CCCH-type) family protein
38Solyc03g123770Unknown Protein (AHRD V1)-----
39Solyc03g123970Lipid-binding serum glycoprotein family protein (AHRD V1 *-*- D7LAX8_ARALY)AT3G2027072226/5151lipid-binding serum glycoprotein family 
40Solyc04g007310Thaumatin-like protein (AHRD V1 ***- C1K3P2_PYRPY); contains Interpro domain(s) IPR001938 Thaumatin, pathogenesis-relatedAT4G38670253108/25243Pathogenesis-related thaumatin superfamily protein
41Solyc04g008670Gibberellin 2-beta-dioxygenase 7 (AHRD V1 **** B6SZM8_MAIZE); contains Interpro domain(s) IPR005123 Oxoglutarate and iron-dependent oxygenaseAT4G21200336166/30255ATGA2OX8, GA2OX8 
42Solyc04g014750TNFR/CD27/30/40/95 cysteine-rich region (AHRD V1 ***- Q2HT38_MEDTR)AT1G1206410934/7347Unknown protein
43Solyc04g025740Homeobox-leucine zipper protein ROC3 (AHRD V1 ***- ROC3_ORYSJ); contains Interpro domain(s) IPR001356 HomeoboxAT1G7336072252/12542HDG11, EDT1, ATHDG11
44Solyc04g051070Unknown Protein (AHRD V1)-----
45Solyc04g058040Laccase (AHRD V1 ***- Q9AUI3_PINTA); contains Interpro domain(s) IPR011707 Multicopper oxidase, type 3AT5G0936056982/21239LAC14
46Solyc04g072870Beta-D-xylosidase (AHRD V1 **** Q8W011_HORVU); contains Interpro domain(s) IPR001764 Glycoside hydrolase, family 3, N-terminalAT1G78060767445/75659Glycosyl hydrolase family protein
47Solyc04g074320Zinc finger protein (AHRD V1 *—D7KHP2_ARALY); contains Interpro domain(s) IPR007087 Zinc finger, C2H2-typeAT1G34790303143/20072TT1, WIP1
48Solyc04g074890Unknown Protein (AHRD V1)-----
49Solyc04g078240Natural resistance associated macrophage protein (AHRD V1 *—B3W4E1_BRAJU); contains Interpro domain(s) IPR001046 Natural resistance-associated macrophage proteinAT1G4724053073/9577NRAMP2, ATNRAMP2
50Solyc04g081180Unknown Protein (AHRD V1)-----
51Solyc04g082520Ring zinc finger protein (Fragment) (AHRD V1 *—A6MH00_LILLO); contains Interpro domain(s) IPR008166 Protein of unknown function DUF23AT4G37420588233/50047Domain of unknown function (DUF23)
52Solyc05g005240YABBY-like transcription factor CRABS CLAW-like protein (AHRD V1 **-* Q6SRZ7_ANTMA); contains Interpro domain(s) IPR006780 YABBY proteinAT1G23420231100/18454INO
53Solyc05g008320Fasciclin-like arabinogalactan protein (AHRD V1 ***- B9N201_POPTR); contains Interpro domain(s) IPR000782 FAS1 domainAT5G40940424114/32835FLA20
54Solyc05g010190Unknown Protein (AHRD V1)AT3G4256511948/12140ECA1 gametogenesis related family protein
55Solyc05g010200Unknown Protein (AHRD V1)-----
56Solyc05g013230Unknown Protein (AHRD V1)AT3G2388036421/5737F-box and associated interaction domains-containing protein
57Solyc05g052440Os03g0291800 protein (Fragment) (AHRD V1 **—Q0DSS4_ORYSJ); contains Interpro domain(s) IPR004253 Protein of unknown function DUF231, plantAT2G40320425279/41168TBL33
58Solyc05g052530Endoglucanase 1 (AHRD V1 ***- B6U0J0_MAIZE); contains Interpro domain(s) IPR001701 Glycoside hydrolase, family 9AT2G44550490292/47656ATGH9B10
59Solyc06g007380Os08g0119500 protein (Fragment) (AHRD V1 *-*- Q0J8C9_ORYSJ)AT5G01710513258/51051methyltransferases
60Solyc06g048400Unknown Protein (AHRD V1); contains Interpro domain(s) IPR008502 Protein of unknown function DUF784, Arabidopsis thalianaAT3G3038711534/9735Protein of unknown function (DUF784)
61Solyc06g060450Transmembrane emp24 domain-containing protein 10 (AHRD V1 ***- B6SSF8_MAIZE); contains Interpro domain(s) IPR000348 emp24/gp25L/p24AT1G2190216108/21051emp24/gp25L/p24 family/GOLD family protein
62Solyc06g070950ATP-binding cassette (ABC) transporter 17 (AHRD V1 ***- Q4H493_RAT); contains Interpro domain(s) IPR003439 ABC transporter-likeAT3G47780935503/93754ATATH6, ATH6
63Solyc06g073100GDSL esterase/lipase At3g27950 (AHRD V1 ***- GDL54_ARATH); contains Interpro domain(s) IPR001087 Lipase, GDSLAT3G27950361197/37553GDSL-like Lipase/Acylhydrolase superfamily protein
64Solyc06g074160B3 domain-containing protein Os03g0212300 (AHRD V1 ***- Y3123_ORYSJ); contains Interpro domain(s) IPR003340 Transcriptional factor B3AT3G0616037438/13129AP2/B3-like transcriptional factor family protein
65Solyc06g075200Unknown Protein (AHRD V1)AT5G374748028/8334Putative membrane lipoprotein
66Solyc07g007520Unknown Protein (AHRD V1)-----
67Solyc07g032700Unknown Protein (AHRD V1)-----
68Solyc07g043410UDP-glucosyltransferase family 1 protein (AHRD V1 **** C6KI43_CITSI); contains Interpro domain(s) IPR002213 UDP-glucuronosyl/UDP-glucosyltransferaseAT2G15480484166/48734UGT73B5
69Solyc07g053400Unknown Protein (AHRD V1)-----
70Solyc07g054360Unknown Protein (AHRD V1)-----
71Solyc07g062320Unknown Protein (AHRD V1)-----
72Solyc07g064780Unknown Protein (AHRD V1)-----
73Solyc08g015750F-box family protein (AHRD V1 ***- B9I6K2_POPTR); contains Interpro domain(s) IPR001810 Cyclin-like F-boxAT5G0292046958/20031F-box/RNI-like superfamily protein 
74Solyc08g061120Unknown Protein (AHRD V1)-----
75Solyc08g066400Protein kinase (Fragment) (AHRD V1 *-*- A2Q5N5_MEDTR)AT2G25760676217/33365Protein kinase family protein
76Solyc08g074920Aspartic proteinase nepenthesin I (AHRD V1 **—A9ZMF9_NEPAL); contains Interpro domain(s) IPR001461 Peptidase A1AT5G33340437206/43747CDR1 
77Solyc08g080020Serine protease inhibitor potato inhibitor I-type family protein (AHRD V1 ***- D7LT19_ARALY); contains Interpro domain(s) IPR000864 Proteinase inhibitor I13, potato inhibitor IAT3G468608532/8637Serine protease inhibitor, potato inhibitor I-type family protein
78Solyc08g082260Integrin-linked kinase-associated serine/threonine phosphatase 2C (AHRD V1 **** ILKAP_RAT); contains Interpro domain(s) IPR015655 Protein phosphatase 2CAT2G29380362134/29845HAI3 
79Solyc09g011280Unknown Protein (AHRD V1); contains Interpro domain(s) IPR006501 Pectinesterase inhibitorAT3G1722017331/13124ATPMEI2
80Solyc09g011290Invertase inhibitor homolog (AHRD V1 ***- O49603_ARATH); contains Interpro domain(s) IPR006501 Pectinesterase inhibitorAT5G6462018052/17330C/VIF2, ATC/VIF2
81Solyc09g025200Ribosomal protein L18 (AHRD V1 *-*- B7FMF5_MEDTR); contains Interpro domain(s) IPR000039 Ribosomal protein L18eAT3G0559018731/5062RPL18
82Solyc09g042760ZIP4/SPO22 (AHRD V1 **—A5Y6I6_ARATH); contains Interpro domain(s) IPR013940 Meiosis specific protein SPO22AT5G48390936527/93656ATZIP4
83Solyc09g047860HAT family dimerisation domain containing protein (AHRD V1 *-*- Q2R1C3_ORYSJ); contains Interpro domain(s) IPR008906 HAT dimerisationAT5G3340650952/17330hAT dimerisation domain-containing protein / transposase-related
84Solyc09g056030Unknown Protein (AHRD V1)AT4G1257087317/4439UPL5
85Solyc09g056040Ubiquitin-protein ligase 1 (AHRD V1 ***- Q5CHN2_CRYHO); contains Interpro domain(s) IPR000569 HECTAT4G12570873153/41337UPL5
86Solyc09g066050Homeodomain-containing transcription factor FWA (AHRD V1 **-* B5BQ02_ARASU); contains Interpro domain(s) IPR002913 Lipid-binding STARTAT1G73360722211/58736HDG11, EDT1, ATHDG11
87Solyc09g073020Unknown Protein (AHRD V1)-----
88Solyc09g075110Unknown Protein (AHRD V1)-----
89Solyc09g089590Ramosa1 C2H2 zinc-finger transcription factor (AHRD V1 *-*- D0UTY8_ZEAMM); contains Interpro domain(s) IPR007087 Zinc finger, C2H2-typeAT3G2313020478/19278SUP, FON1, FLO10
90Solyc09g089960Unknown Protein (AHRD V1)---- 
91Solyc09g091300Self-incompatibility protein (Fragment) (AHRD V1 -**- C8C1B5_9MAGN); contains Interpro domain(s) IPR010264 Plant self-incompatibility S1AT3G2688016135/13533Plant self-incompatibility protein S1 family
92Solyc10g005170Purine permease (AHRD V1 *—* B6TET5_MAIZE); contains Interpro domain(s) IPR004853 Protein of unknown function DUF250AT1G30840382208/33063ATPUP4, PUP4 
93Solyc10g005440Serine/threonine-protein kinase receptor (AHRD V1 **** B6U2B7_MAIZE); contains Interpro domain(s) IPR002290 Serine/threonine protein kinaseAT4G21390849440/85851B120, S-locus lectin protein kinase family protein
94Solyc10g017990Cytokinin oxidase/dehydrogenase 2 (AHRD V1 *-** C0LPA7_SOLTU); contains Interpro domain(s) IPR015345 Cytokinin dehydrogenase 1, FAD and cytokinin bindingAT2G41510575214/52541ATCKX1, CKX1
95Solyc10g044690Annexin (AHRD V1 ***- D2D2Z9_GOSHI); contains Interpro domain(s) IPR009118 Annexin, type plantAT5G12380316173/31655ANNAT8
96Solyc10g047720Unknown Protein (AHRD V1)AT5G2680515644/16327unknown protein
97Solyc10g050750Pectinacetylesterase like protein (Fragment) (AHRD V1 *—Q56WP8_ARATH); contains Interpro domain(s) IPR004963 PectinacetylesteraseAT4G19420397234/38161Pectinacetylesterase family protein
98Solyc10g051370LRR receptor-like serine/threonine-protein kinase, RLPAT2G16250915105/19853Leucine-rich repeat protein kinase family protein
99Solyc10g055600S-phase kinase-associated protein 1A (AHRD V1 **—B2VUU5_PYRTR); contains Interpro domain(s) IPR001232 SKP1 componentAT4G3421015238/4781ASK11, SK11
100Solyc11g005500ECA1 protein (AHRD V1 *-*- Q53JF8_ORYSJ); contains Interpro domain(s) IPR010701 Protein of unknown function DUF1278AT1G7675015863/12451EC1.1
101Solyc11g005540ECA1 protein (AHRD V1 *-*- Q53JF8_ORYSJ); contains Interpro domain(s) IPR010701 Protein of unknown function DUF1278AT2G2175012561/13047EC1.3
102Solyc11g006840Unknown Protein (AHRD V1)-----
103Solyc11g012650TPD1 (AHRD V1 *-*- Q6TLJ2_ARATH)AT1G3258317966/11259TPD1-like
104Solyc11g043160Endo-1 4-beta-xylanase (AHRD V1 ***- B6SW51_MAIZE); contains Interpro domain(s) IPR013781 Glycoside hydrolase, subgroup, catalytic coreAT4G33840576217/54540Glycosyl hydrolase family 10 protein
105Solyc11g070010F8A5.6 protein (AHRD V1 **—Q9ZP57_ARATH)AT1G60500669117/39130DRP4C
106Solyc11g072650Trans-2-enoyl CoA reductase (AHRD V1 **—C5MRG3_9ROSI); contains Interpro domain(s) IPR002085 Alcohol dehydrogenase superfamily, zinc-containingAT3G45770375215/33564Polyketide synthase, enoylreductase
107Solyc12g019050Exostosin-like (AHRD V1 ***- A4Q7M8_MEDTR); contains Interpro domain(s) IPR004263 Exostosin-likeAT3G42180470203/34957Exostosin family protein
108Solyc12g042340Genomic DNA chromosome 5 P1 clone MAC9 (AHRD V1 ***- Q9FLS4_ARATH)AT5G61865417136/36835unknown protein
Fig 3

Characterization and validation of the 108 genes.

(A) Many of the 108 PSGs were predominantly expressed in ovules and/or seeds in the pistil. Reads per million (RPM) values of PSGs in S. pimpinellifolium were visualized as a heatmap constructed with MeV software, with hierarchical clustering based on Pearson correlation. The transcriptome data were obtained from [27]. OPE, ovule preferentially expressed genes; EPE, embryo preferentially expressed genes. (B) Validation of the expression of ovule preferentially expressed (OPE) genes, embryo preferentially expressed (EPE) genes, and several transcription factor genes by RT-PCR. Most of the genes were specifically expressed in the pistil. Three pistil-specific transcription factor genes, SlATHB13/23-like (Solyc01g010600), SlINO (Solyc05g005240), and SlTT1-like (Solyc04g074320), showed pistil-specific expression before anthesis. The bottom panel shows the expression of the internal control gene SAND [41].
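The Pearson-correlation clustering mentioned in panel (A) groups genes by the shape of their expression profile rather than by absolute level. The sketch below is only an illustration of that distance measure, not a reproduction of MeV: the function names and the RPM vectors are made up for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / norm


def pearson_distance(x, y):
    """Distance used for clustering: 0 for identical shapes, up to 2 for opposite ones."""
    return 1.0 - pearson(x, y)


# Hypothetical RPM values across four tissues (placeholders, not data from the study).
profiles = {
    "geneA": [0.0, 1.0, 8.0, 20.0],
    "geneB": [0.0, 2.0, 16.0, 40.0],   # same shape as geneA at twice the level
    "geneC": [20.0, 8.0, 1.0, 0.0],    # opposite trend
}

d_ab = pearson_distance(profiles["geneA"], profiles["geneB"])
d_ac = pearson_distance(profiles["geneA"], profiles["geneC"])
```

With this distance, geneA and geneB would be clustered together despite their different expression levels, whereas geneC would fall into a separate cluster.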

Validation of gene expression patterns by RT-PCR

We then verified the expression patterns of the PSGs by RT-PCR analysis. Since many of these genes were highly expressed in the ovule and/or seed, especially the embryo (Fig 3A), we initially focused on genes specifically expressed in these tissues. Among the 108 PSG candidates, the five PSGs most highly expressed in 0 DAF ovules were designated Ovule Preferentially Expressed genes 1–5 (OPE1–5) (S5 Table). We verified the tissue-specific expression of these five genes by RT-PCR analysis (Fig 3B). OPE1 was preferentially but not exclusively expressed in the pistil at anthesis, OPE2 and OPE5 were specifically expressed in the pistil at both 1 DBF and 0 DAF, and OPE4 was expressed in the pistil at 0 DAF and in mature green fruits; the expression of OPE3 in the pistil was not detected in this experiment. We also designated the five PSGs most highly expressed in 4 DAF embryos as Embryo Preferentially Expressed genes 1–5 (EPE1–5) (S5 Table). EPE1, which encodes a self-incompatibility protein-like protein according to SGN, might function in pollen-pistil interactions, whereas most of the EPEs had not been functionally characterized or annotated in previous studies. As with the OPEs, we investigated the expression of the five EPEs by RT-PCR to validate their tissue-specific expression patterns. Three genes, EPE1–EPE3, were specifically expressed in the pistil, and EPE5 was preferentially expressed in the pistil, especially before anthesis, although we failed to detect the expression of EPE4 in our RT-PCR analysis (Fig 3B). EPE1 was specifically expressed in the pistil throughout pistil/fruit development but was not expressed in mature red fruits. EPE3 was also specifically expressed in the pistil, but only after anthesis. EPE2 was expressed exclusively during fruit set initiation between 1 DBF and 0 DAF (Fig 3B). In summary, RT-PCR confirmed the pistil-specific or pistil-preferential expression of three OPEs and four EPEs (Fig 3B).
These results indicate that the direct-mapping method also successfully identified true PSGs.
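The designation of OPE1–5 and EPE1–5 amounts to ranking the PSGs by their expression in one reference tissue and keeping the top five. A minimal sketch of that ranking step follows; the function name, gene labels, and RPM values are hypothetical placeholders, and the actual values are those reported in S5 Table.

```python
def top_expressed(rpm_by_gene, n=5):
    """Return the n gene IDs with the highest expression value, highest first."""
    ranked = sorted(rpm_by_gene.items(), key=lambda item: item[1], reverse=True)
    return [gene for gene, _ in ranked[:n]]


# Hypothetical RPM values in a single tissue (placeholders, not data from the study).
ovule_rpm = {
    "psg_a": 12.5,
    "psg_b": 310.0,
    "psg_c": 0.4,
    "psg_d": 95.2,
    "psg_e": 48.9,
    "psg_f": 150.7,
    "psg_g": 3.1,
}

top_five = top_expressed(ovule_rpm, n=5)
```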

GO analysis using AgriGO

To elucidate the enriched functional categories of the 108 identified PSGs, we performed GO analysis using AgriGO. A false discovery rate (FDR; e-value corrected for list size) of < 0.05 was used as the criterion to obtain enriched GO terms. Consequently, only one category, Carboxylesterase activity (GO:0004091), was significantly enriched (p-value = 0.0017, FDR = 0.037) (Fig 2D). This category includes five genes (listed in Fig 2D), three of which (Solyc02g069330, Solyc09g011280, and Solyc09g011290) were assigned to the sub-term “Pectinesterase inhibitor”. Although Solyc09g011290 was classified as a “Pectinesterase inhibitor”, it is labeled as an “invertase inhibitor homolog” in the SGN database and has higher sequence homology with the invertase inhibitor group that includes invertase inhibitor 1 (INVINH1, Solyc12g099200), which specifically regulates cell wall invertase activity in early developing fruits [49]. Pectin, a major component of the primary cell walls of higher plants, is methyl-esterified by pectin methyltransferase (PMT) following its biosynthesis in Golgi bodies and before its transport to the cell wall [50,51], whereas pectin methylesterase (PME) catalyzes the removal of methyl esters from pectin [52-54]. The removal of methyl groups from pectin allows carboxyl groups to form Ca2+- and Mg2+-mediated linkages, leading to the hardening of pectin [55,56]. In addition, pectin methylesterase inhibitors (PMEIs) directly interact with PME and inhibit its activity, affecting pectin composition in the cell wall. Lionetti et al. [57] reported that overexpressing Arabidopsis PMEI increased the degree of pectin methylesterification by approximately 16%, resulting in longer roots due to the promotion of cell elongation. Therefore, the degree of methylation and demethylation of pectin determines the balance between extensibility and rigidity, affecting growth and cell shape.
In tomato, PMEU1, a ubiquitously expressed pectin methylesterase gene, is expressed during early fruit development [58]. Terao et al. [59] recently reported the occurrence of rapid pectin metabolism during the early stage of fruit development in tomato: immunolocalization analysis demonstrated that methyl-esterified pectin levels in the ovary increased from 1 DBF to 3 DAF. During fruit set, the transition of cell state from cell division to cell expansion occurs within a short period of time, and the regulation of this process is important for determining the size of the fruit. Therefore, it would be interesting to investigate whether PMEI plays a role in the post-translational regulation of PME and cell wall state during fruit set. In addition, the pectinesterase inhibitor protein family includes several enzyme inhibitors such as invertase (β-fructofuranosidase) inhibitors, each of which has a specific target [60,61]. Solyc09g011290 was annotated as an invertase inhibitor homolog in the SGN database. Invertase inhibitors regulate specific invertases in a post-translational manner, negatively affecting the enzyme activity of their targets [49,61]. We found that Solyc09g011290 was specifically and highly expressed in the ovule/seed (S3 Table). The expression of Solyc09g011290 was induced during anthesis and remained at high levels in the absence of pollination but was down-regulated by pollination and hormone treatments (S6 Fig). Several studies on cell wall invertase (CWIN) and INVINH1 in tomato suggest that these proteins play important roles in seed set and fruit set by regulating the unloading of sugar from the phloem during the ovary-to-fruit transition [4,49,62,63].
Thus, the expression pattern of Solyc09g011290, namely up-regulation during flowering and down-regulation by the fruit-set stimulus (S6 Fig), suggests that Solyc09g011290 may also participate in modulating sugar unloading to the unpollinated pistil via post-translational inhibition of invertase activity.
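For readers unfamiliar with the statistics behind a GO enrichment call such as the one reported above, the following sketch shows one common way to obtain a p-value and an FDR for a term. It rests on assumptions that are not stated in this study: a hypergeometric test for over-representation and Benjamini-Hochberg correction. AgriGO offers several tests and correction methods, so this is not necessarily the procedure used here, and all function names and input numbers are illustrative only.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, sample):
    """P(X >= k): chance of drawing at least k annotated genes in a sample.

    population: total genes in the background
    annotated:  background genes carrying the GO term
    sample:     size of the gene list being tested
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy numbers chosen for easy checking (not values from the study).
p_toy = hypergeom_upper_tail(k=1, population=10, annotated=5, sample=2)
fdr_toy = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A term would then be reported as enriched when its adjusted value falls below the chosen threshold (0.05 in the analysis above).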

Identification of pistil-specific transcriptional regulators

Pistil-specific transcription factors

Next, we performed similar RT-PCR analyses of several transcription factor genes listed among the PSGs and confirmed the tissue-specific expression of three transcription factor genes (Fig 3B). Solyc04g074320 (SlTT1-like) shares high homology (71.5%) with the Arabidopsis zinc-finger protein TRANSPARENT TESTA1 (TT1) [64]. Arabidopsis TT1 is expressed only in developing ovules and young seeds and functions in the accumulation of proanthocyanidin pigments in the seed coat [65,66], while in the current study, tomato SlTT1-like transcripts were exclusively detected in the ovule, embryo, and endosperm but not in the seed coat (Fig 3B). Mazzucato et al. [67] provided evidence that higher anthocyanin content is associated with increased early fruit growth in non-pollinated flowers. Furthermore, there is evidence that alteration of the flavonoid pathway via the regulation of biosynthesis genes induces seedless fruit development in both a pollination-dependent and pollination-independent manner [68,69]. Further elucidation of the function of SlTT1-like in the control of flavonoid-related genes may provide insight into the role of flavonoids during fruit set initiation. SlINO (Solyc05g005240) was identified as a pistil-specific YABBY transcription factor gene (S2B Fig and Fig 3B). YABBY family proteins contain two conserved domains, i.e., a C2C2 zinc-finger-like domain in their N-termini and a helix-loop-helix domain known as the YABBY domain [70]. In Arabidopsis, two YABBY genes, INO and CRC, show tissue-specific expression in the pistil and are involved in pistil and early fruit development [44-46]. Nine YABBY genes were previously identified in tomato, three of which (SlCRCa, Solyc01g010240; SlCRCb, Solyc05g012050; SlINO, Solyc05g005240) are specifically expressed in the flower bud and in open flowers at anthesis [50]. In the current study, we found that SlCRCa, SlCRCb, and SlINO were preferentially expressed in the pistil (S2B Fig).
SlCRCa was expressed in the early stage of pistil development, while SlCRCb and SlINO were expressed during all stages of pistil development. Furthermore, we confirmed the tissue-specific expression of SlINO in pistils by RT-PCR analysis (Fig 3B), suggesting a role in the regulation of pistil development [46]. Solyc01g010600 (SlATHB13/23-like), which encodes a homeodomain leucine zipper 1 transcription factor (HD-Zip TF), shares similarity with the Arabidopsis class-1 HD-Zip genes AtHB13 and AtHB23 and was specifically expressed in the pistil before anthesis (Fig 3B). The HD-Zip TF family forms a large gene family that is divided into four classes; 58 HD-Zip proteins found in both Arabidopsis and tomato are listed in PlantTFDB version 3.0 (http://planttfdb.cbi.pku.edu.cn) [71,72]. Although the molecular functions of class-1 HD-Zip proteins in the regulation of pistil development remain elusive, AtHB13 and AtHB23 were shown to play negative roles in inflorescence stem elongation by affecting cell division, and AtHB13 also regulates pollen hydration and development [73]. In tomato, the class-1 HD-Zip protein SlHZ24 functions as a transcriptional activator of SlGMP3 (encoding GDP-D-mannose pyrophosphorylase), which plays an important role in the production of the antioxidant ascorbate [74]. In addition, virus-induced gene silencing of the class-1 HD-Zip gene LeHB1 reduced the mRNA accumulation of LeACO1 and inhibited ripening [75]. Lin et al. [75] also reported that ectopic overexpression of LeHB1 led to the conversion of sepals into carpel-like structures. We also found that SlATHB13/23-like was highly expressed in the ovary wall in the pistil at anthesis (S3 and S4 Tables), suggesting a regulatory role in carpel and ovary wall development. In addition, we identified a type-I MADS box gene, Solyc01g106730, among the PSGs (Table 1).
MADS box genes encode transcription factors that are generally involved in homeotic regulation of reproductive organs and can be divided into two subfamilies (type-I and type-II) according to the presence of conserved domains [75, 76]. Some type-II MIKC MADS box genes play key roles as regulators of meristem identity, flowering time, and fruit and seed development [77,78], whereas little information is available about the function of type-I MADS box proteins. Most (38 out of 61) Arabidopsis type-I MADS box genes are expressed in the female gametophyte [79], whereas others exhibit highly specific expression, such as in the central cell and embryo sac [80]. For example, Arabidopsis type-I AGAMOUS-LIKE61 (AGL61)/DIANA (DIA) is expressed exclusively in the central cell and early endosperm and plays a crucial role in endosperm development after fertilization through transcriptional control in the central cells [79-82]. AtAGL62 also plays an important role in endosperm and seed coat development [83-85]. Gene expression analysis using tissue-specific transcriptome data from wild tomato S. pimpinellifolium [27] revealed that the type-I MADS box gene Solyc01g106730 is preferentially expressed in the ovule (S3 and S4 Tables). In addition to the relationship between type-I MADS box genes and seed development, there is evidence that down-regulation of or mutation in type-II MADS box genes, such as TM29, TAP3, TM8, SlAGL11, or SlAGL6, results in parthenocarpy [86-90]. Thus, it would be important to investigate the roles of Solyc01g106730 in pistil, seed, and fruit development.

Pistil-specific peptide hormone-like small peptide genes and receptor-like proteins

The role of peptide hormones in plant signaling pathways is a popular focus of study [91-95]. The peptide hormone signaling system involves two main components: (1) small ligand proteins such as small cysteine-rich peptides (CRPs) and (2) receptor proteins such as leucine-rich repeat receptor-like kinases (LRR-RLKs) [96]. CRPs function as signaling molecules (peptide hormones) in various plant species and are required for many processes, including antimicrobial defense, pollen tube guidance, stomatal patterning, and early embryo patterning [97-104]. CRPs contain four, six, or eight conserved cysteine residues at their C-termini in addition to a secretion signal at their N-termini. Interestingly, a substantial number of the PSGs identified in this study (44 of the 108 genes identified by the mapping-based method [40%]) encode small proteins less than 200 amino acids in length (Table 2). Small proteins are defined as proteins smaller than 200 amino acids according to previous reports [94,97,105]. Since peptide hormone-like small proteins share a conserved structure, we manually analyzed the 44 identified small proteins and one TAPETUM DETERMINANT 1 (TPD1)-like protein (204 aa) by BLAST sequence similarity searches and with the SignalP 4.1 server (http://www.cbs.dtu.dk/services/SignalP/) to investigate whether they have conserved residues or functional domains. Roughly half of these proteins also have a secretion signal in their N-termini (Table 2). Notably, through subsequent sequence analysis of these small proteins, four tissue-specific CRPs, including an unknown gene (Solyc06g075200), and two LRR-RLK-like proteins were identified (listed in Table 2; S5 Fig).
Table 2

List of pistil-specific or preferentially expressed small proteins.

#ITAG IDDescription in ITAG2.40length (aa)*Presence of predictedsecreted signal (aa)Homologue in Arabidopsislength (aa)Identities (%)Description"Expression in pistil of Moneymaker (Ovule and/or ovary wall) from Zhang et al (2016)[23]"
PSSP1Solyc01g016530Unknown Protein (AHRD V1); contains Interpro domain(s) IPR008507 Protein of unknown function DUF78987-AT1G7321031432/6946Protein of unknown function (DUF789)Ovule
PSSP2Solyc01g081360Unknown Protein (AHRD V1)1511–29-----Ovule
PSSP3Solyc01g108380Protease inhibitor protein (AHRD V1 -**- B3FNP9_HEVBR); contains Interpro domain(s) IPR000864 Proteinase inhibitor I13, potato inhibitor I77-AT2G389008827/6144Serine protease inhibitor, potato inhibitor I-type family proteinOvule
PSSP4Solyc02g069330Unknown Protein (AHRD V1); contains Interpro domain(s) IPR006501 Pectinesterase inhibitor1801–19AT5G6462018026/8033C/VIF2, ATC/VIF2Ovule
PSSP5Solyc03g058330Unknown Protein (AHRD V1)108-AT5G0676015857/14440LEA4-5Ovule
PSSP6Solyc04g081180Unknown Protein (AHRD V1)79------Ovule
PSSP7Solyc05g010200Unknown Protein (AHRD V1)1151–25-----Ovule
PSSP8Solyc06g048400Unknown Protein (AHRD V1); contains Interpro domain(s) IPR008502 Protein of unknown function DUF784, Arabidopsis thaliana155-AT3G3038711534/9735Protein of unknown function (DUF784)Ovule
PSSP9Solyc06g075200Unknown Protein (AHRD V1)811–22AT5G374748028/8334Putative membrane lipoproteinOvule
PSSP10Solyc07g062320Unknown Protein (AHRD V1)79------Ovule
PSSP11Solyc08g080020Serine protease inhibitor potato inhibitor I-type family protein (AHRD V1 ***- D7LT19_ARALY); contains Interpro domain(s) IPR000864 Proteinase inhibitor I13, potato inhibitor I1041–19AT3G468608532/8637Serine protease inhibitor, potato inhibitor I-type family proteinOvule
PSSP12Solyc09g011280Unknown Protein (AHRD V1); contains Interpro domain(s) IPR006501 Pectinesterase inhibitor1781–23AT3G1722017331/13124ATPMEI2Ovule
PSSP13Solyc09g089590Ramosa1 C2H2 zinc-finger transcription factor (AHRD V1 *-*- D0UTY8_ZEAMM); contains Interpro domain(s) IPR007087 Zinc finger, C2H2-type197-AT3G2313020478/19278SUP, FON1, FLO10Ovule
PSSP14Solyc11g005500ECA1 protein (AHRD V1 *-*- Q53JF8_ORYSJ); contains Interpro domain(s) IPR010701 Protein of unknown function DUF12781301–26AT1G7675015863/12451EC1.1Ovule
PSSP15Solyc11g005540ECA1 protein (AHRD V1 *-*- Q53JF8_ORYSJ); contains Interpro domain(s) IPR010701 Protein of unknown function DUF12781361–16AT2G2175012561/13047EC1.3Ovule
PSSP16Solyc11g006840Unknown Protein (AHRD V1)126------Ovule
PSSP17Solyc09g025200Ribosomal protein L18 (AHRD V1 *-*- B7FMF5_MEDTR); contains Interpro domain(s) IPR000039 Ribosomal protein L18e721–22AT3G0559018731/5062RPL18Ovary wall
PSSP18Solyc09g056030Unknown Protein (AHRD V1)82-AT4G1257087317/4439UPL5Ovary wall
PSSP19Solyc01g007270Cytokinin riboside 5&apos;-monophosphate phosphoribohydrolase LOG (AHRD V1 **—LOG_ORYSJ)70-AT5G0630021756/6882-Not detected
PSSP20Solyc01g079560B3 domain-containing protein Os11g0197600 (AHRD V1 ***- Y1176_ORYSJ); contains Interpro domain(s) IPR003340 Transcriptional factor B3109-AT3G1899034130/9233VRN1, REM39Not detected
PSSP21Solyc02g032150Unknown Protein (AHRD V1)147------Not detected
PSSP22Solyc02g084140Unknown Protein (AHRD V1)132------Not detected
PSSP23Solyc03g116410Zinc finger CCCH domain-containing protein 39 (AHRD V1 ***- C3H39_ARATH); contains Interpro domain(s) IPR000571 Zinc finger, CCCH-type117-AT3G1936038654/19927Zinc finger (CCCH-type) family proteinNot detected
PSSP24Solyc04g025740Homeobox-leucine zipper protein ROC3 (AHRD V1 ***- ROC3_ORYSJ); contains Interpro domain(s) IPR001356 Homeobox148-AT1G7336072252/12542HDG11, EDT1, ATHDG11Not detected
PSSP25Solyc04g051070Unknown Protein (AHRD V1)80------Not detected
PSSP26Solyc04g078240Natural resistance associated macrophage protein (AHRD V1 *-—B3W4E1_BRAJU); contains Interpro domain(s) IPR001046 Natural resistance-associated macrophage protein161-AT1G4724053073/9577NRAMP2, ATNRAMP2Not detected
PSSP27Solyc05g013230Unknown Protein (AHRD V1)118-AT3G2388036421/5737F-box and associated interaction domains-containing proteinNot detected
PSSP28Solyc07g054360Unknown Protein (AHRD V1)142------Not detected
PSSP29Solyc08g061120Unknown Protein (AHRD V1)190------Not detected
PSSP30Solyc09g073020Unknown Protein (AHRD V1)50------Not detected
PSSP31Solyc09g075110Unknown Protein (AHRD V1)63------Not detected
PSSP32Solyc10g047720Unknown Protein (AHRD V1)172-AT5G2680515644/16327unknown proteinNot detected
PSSP33Solyc10g055600S-phase kinase-associated protein 1A (AHRD V1 **-- B2VUU5_PYRTR); contains Interpro domain(s) IPR001232 SKP1 component51-AT4G3421015238/4781ASK11, SK11Not detected
PSSP34Solyc01g104390Blue copper protein (AHRD V1 **-- B6TT37_MAIZE); contains Interpro domain(s) IPR003245 Plastocyanin-like1221–27AT1G1780012949/11642ARPNBoth
PSSP35Solyc02g078090Unknown Protein (AHRD V1)1051–26-----Both
PSSP36Solyc03g123770Unknown Protein (AHRD V1)112------Both
PSSP37Solyc03g123970Lipid-binding serum glycoprotein family protein (AHRD V1 *-*- D7LAX8_ARALY)1161–17AT3G2027072226/5151lipid-binding serum glycoprotein familyBoth
PSSP38Solyc04g014750TNFR/CD27/30/40/95 cysteine-rich region (AHRD V1 ***- Q2HT38_MEDTR)1051–32AT1G1206410934/7347Unknown proteinBoth
PSSP39Solyc05g005240YABBY-like transcription factor CRABS CLAW-like protein (AHRD V1 **-* Q6SRZ7_ANTMA); contains Interpro domain(s) IPR006780 YABBY protein192-AT1G23420231100/18454INOBoth
PSSP40Solyc05g010190Unknown Protein (AHRD V1)1381–23AT3G4256511948/12140ECA1 gametogenesis related family proteinBoth
PSSP41Solyc07g032700Unknown Protein (AHRD V1)120------Both
PSSP42Solyc07g053400Unknown Protein (AHRD V1)97------Both
PSSP43Solyc09g011290Invertase inhibitor homolog (AHRD V1 ***- O49603_ARATH); contains Interpro domain(s) IPR006501 Pectinesterase inhibitor1881–24AT5G6462018052/17330C/VIF2, ATC/VIF2Both
PSSP44Solyc09g091300Self-incompatibility protein (Fragment) (AHRD V1 -**- C8C1B5_9MAGN); contains Interpro domain(s) IPR010264 Plant self-incompatibility S11481–23AT3G2688016135/13533Plant self-incompatibility protein S1 familyBoth
PSSP45Solyc11g012650TPD1 (AHRD V1 *-*- Q6TLJ2_ARATH)2041–28AT1G3258317966/11259TPD1-likeOvule

* Presence of secreted signal sequence in each protein was predicted by SignalP 4.1 Server with default setting.

OPE4 (Solyc11g012650) was homologous to Arabidopsis TAPETUM DETERMINANT 1 (AtTPD1, AT4G24972) (6e-37), which encodes a peptide hormone that functions as a ligand to regulate the specification of tapetum cells in coordination with the receptor protein EMS1/EXS [106, 107]. BLAST searches of the tomato genome identified three homologs of AtTPD1, designated SlTPD1L1 (Solyc03g097530), SlTPD1L2/OPE4 (Solyc11g012650), and SlTPD1L3 (Solyc11g006850), with 59.4% (1e-49), 55% (6e-37), and 50% (4e-33) sequence similarity, respectively. As in AtTPD1, we confirmed the presence of a secretion signal in the N-terminal region and conserved cysteine residues at the C-terminus in the three deduced proteins (S3A Fig). Although the sequence of the N-terminal secretion signal region varied among the three proteins, an alignment of each SlTPD1L with amino acids 26–179 of AtTPD1 revealed a high degree of similarity (48–56%) (S3B Fig). AtTPD1 is also known to be expressed in inflorescence meristems, floral meristems, carpel primordia, and ovule primordia, but its function in these tissues remains unknown [106, 107]. Given that SlTPD1L2/OPE4 (Solyc11g012650) showed pistil-specific expression (Fig 3B) and is preferentially expressed in the ovule in both S. pimpinellifolium and the tomato cultivar ‘Moneymaker’ (S3 and S4 Tables), SlTPD1L2/OPE4 might play a tissue-specific role in ovule development.

Solyc11g005540, Solyc11g005500, and Solyc05g010190 share sequence similarity with Arabidopsis ECA1-like protein (EC1) genes, which are involved in gamete fusion [108] (S4B and S4C Fig). mRNA from the five Arabidopsis EC1 genes (EC1.1 to EC1.5; belonging to the ECA1 [Early Culture Abundant] gametogenesis-related cysteine-rich protein subfamily) is present only in egg cells before fertilization. The small proteins encoded by EC1 genes are secreted into the outer region of the egg cell to redundantly regulate the fusion of the germ cell to the sperm cell during double fertilization [82,108]. Three tomato EC1-like protein homologs are expressed in the ovule/seed in S. pimpinellifolium (S3 and S4 Tables), and in the current study, we confirmed the pistil-specific expression of Solyc11g005500 and Solyc11g005540 (S4A Fig). Sequence similarity analysis of these protein sequences with AtEC1s identified five tomato homologs, which share six conserved cysteine residues in the C-terminal region and an N-terminal secretion signal peptide (S4B Fig). Therefore, these genes were designated S. lycopersicum EC1-like genes (SlECLs; e.g., SlECL1 [Solyc11g005500], SlECL3 [Solyc05g010190], and SlECL5 [Solyc11g005540]); their conserved features suggest that they play a conserved role in the pistil at fertilization.

In addition to several ligand-like, pistil-specific CRPs, we also identified two proteins with RLK-related domains: OPE5 (Solyc10g051370) and Solyc03g096190. OPE5 was expressed in the ovule and seed, and Solyc03g096190 was expressed in the placenta and septum (S3 Table). In tomato, the LRR-RLK family represents the largest family of RLKs [109]. Molecular genetic studies have shown that LRR-RLKs are involved in a wide range of plant developmental processes, such as stem cell maintenance [110,111], cell fate determination and patterning [103,112], and brassinosteroid signaling [113-115]. Generally, LRR-RLK proteins, such as EMS1/EXS and CLAVATA1 (CLV1), function as receptors by interacting with small ligand proteins at the LRR domain and subsequently transmit external signals into cells by activating the kinase domain, inducing various cellular responses [116-118]. While Solyc03g096190 possesses the LRR domain, one transmembrane domain, and the cytoplasmic kinase domain, OPE5 possesses only a small LRR domain composed of five repeats (S5 Fig). Some receptor-like proteins with extracellular LRR domains but lacking the internal kinase domain have also been identified in other plants, such as Arabidopsis CLAVATA2 (CLV2) [119], which regulates various developmental and immunity signaling pathways by interacting with other RLKs including CLV1 [119-122]. Unraveling the role of pistil-specific RLK-like proteins may help elucidate the detailed mechanism of signal transduction during fruit set.
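The two sequence features used above to describe the CRP-like candidates (an N-terminal secretion signal and a cluster of conserved cysteines near the C-terminus) can be illustrated with a minimal Python sketch. This is not the procedure used in this study, which relied on SignalP 4.1 and sequence alignments; the function names, the 20-residue window, and the toy sequence below are hypothetical and serve only to show the logic of such a screen.

```python
def count_c_terminal_cysteines(protein: str, signal_end: int, window: int = 20) -> int:
    """Count cysteines in the last `window` residues of the mature peptide.

    `signal_end` is the 1-based position of the last residue of the predicted
    signal peptide (e.g. 22 for a predicted signal spanning residues 1-22).
    The window size is an arbitrary, illustrative choice.
    """
    mature = protein[signal_end:]          # remove the predicted signal peptide
    return mature[-window:].upper().count("C")


def looks_like_crp(protein: str, signal_end: int, min_cys: int = 6) -> bool:
    """Flag a protein that has a predicted signal peptide and a cysteine-rich C-terminus."""
    has_signal = signal_end > 0
    return has_signal and count_c_terminal_cysteines(protein, signal_end) >= min_cys


# Hypothetical toy sequence (not a real tomato protein):
# 20-residue placeholder signal + short linker + cysteine-rich C-terminus.
toy = "M" + "A" * 19 + "GSTEDKLNPQ" + "CGSCKDCTECGNCSKC"
print(count_c_terminal_cysteines(toy, signal_end=20))
print(looks_like_crp(toy, signal_end=20))
```

A real screen would take the signal peptide boundary from a dedicated predictor and judge cysteine conservation from a multiple alignment rather than from a simple count.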

Relationship between PSGs and fruit set signaling

We further characterized the PSGs to identify those associated with fruit set in the pistil. Specifically, we investigated their expression in a dataset of differentially expressed genes (DEGs) in the pistils of plants treated with 2,4-D and GA3 from Tang et al. [20]. That dataset contains 4764 and 6875 DEGs in ovaries undergoing fruit set at 4 DAF (induced by plant hormones [auxin and GA] or pollination) compared to unpollinated ovaries at 2 DBF and unpollinated ovaries at 4 DAF, respectively. Three of the 108 PSGs were found among these DEGs: invertase inhibitor homolog (Solyc09g011290), SlATHB13/23-like (Solyc01g010600), and SlCKX8 (Solyc10g017990), suggesting that they play roles in pistil development and/or fruit set initiation (S6 Fig). The mRNA levels of SlATHB13/23-like (Solyc01g010600) were significantly reduced at 4 DAF by either plant hormone (auxin or GA) treatment or pollination compared to those at 2 DBF, while no significant difference was found compared to ovaries at 6 days post-emasculation (6 DPE). By contrast, the mRNA levels of SlCKX8 (Solyc10g017990) and invertase inhibitor homolog (Solyc09g011290) were significantly lower in ovaries undergoing fruit set than in those at 6 DPE, indicating that these genes are up-regulated in the absence of fruit set signaling after flowering.
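The comparison described above amounts to intersecting the PSG list with one or more published DEG lists. A minimal Python sketch of that step follows; the function name and the placeholder gene IDs are hypothetical and do not reproduce the actual lists from this study or from Tang et al. [20].

```python
def psgs_among_degs(psgs, *deg_sets):
    """Return the sorted PSG IDs that occur in at least one of the DEG sets."""
    all_degs = set().union(*deg_sets)      # pool the DEG lists from every comparison
    return sorted(set(psgs) & all_degs)    # keep only IDs that are also PSGs


# Placeholder IDs for illustration only (not real data).
psgs = {"geneA", "geneB", "geneC", "geneD"}
degs_vs_2dbf = {"geneA", "geneX", "geneY"}   # hypothetical DEGs relative to 2 DBF
degs_vs_4daf = {"geneC", "geneX", "geneZ"}   # hypothetical DEGs relative to unpollinated 4 DAF

print(psgs_among_degs(psgs, degs_vs_2dbf, degs_vs_4daf))
```

Pooling the DEG sets before intersecting reports a PSG if it is differentially expressed in any comparison; intersecting with each DEG set separately would instead show which comparison each PSG responds to.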

Regulatory regions of PSGs as genetic engineering tools

Tissue-specific promoters are useful tools for regulating the expression of genes of interest in a spatial and temporal manner, which could provide new insights into various biological mechanisms and facilitate their application to molecular breeding, such as generating genetically modified organisms [123]. Constitutive promoters such as the 35S promoter from Cauliflower mosaic virus (CaMV) are widely used to regulate the expression of target genes in plants [124,125]. However, the constitutive regulation of target genes is not always useful, since it sometimes induces additional undesirable effects that hamper agronomic applications. Thus, in the past several decades, many tissue-specific promoters have been isolated, including fruit-specific promoters [126-129], root-specific promoters [130-132], seed-specific promoters [133-135], and many pollen- and/or anther-specific promoters [36,136-139]. Several individual promoters targeting the stigma and/or style in the pistil have also been evaluated in several plant species, such as the stigma- and style-specific thaumatin/PR5-like protein (PsTL1) promoter from Japanese pear (Pyrus serotina) [140-142]. Importantly, the use of the ovule-specific DefH9 and INO promoters allows parthenocarpy to be efficiently induced via the ovule-specific activation of auxin signaling without producing substantial undesirable fruit traits in tomato [143]. Moreover, several reproductive tissue-specific promoters, such as sperm cell- and egg cell-specific promoters, have been successfully used to induce targeted mutagenesis by CRISPR/Cas9 [36,144]. For example, Wang et al. [144] used the promoter of egg cell-specific EC1.2 (homolog of SlEC1.2 and SlEC1.3 identified in the current study) for CRISPR/Cas9 to generate homozygous mutants for multiple target genes in a single generation in Arabidopsis. These findings indicate that pistil-specific promoters, especially ovule-specific promoters, can be highly effective tools for plant breeding. 
Here, we identified various types of PSGs in tomato, e.g., SlINO and two homologs of Arabidopsis EC1.2, which were specifically expressed in the ovule (Table 2). Thus, these promoters could contribute to tissue-specific regulation or genome editing of target genes to improve fruit set by inducing parthenocarpy.

Conclusion

In conclusion, we conducted a global analysis of tomato PSGs, which might be involved in ovary development or the fruit set process, by performing RNA-seq analysis and comparisons with publicly available data. This study successfully identified several genes encoding signaling-related transcription factors and peptide hormone-like proteins, in addition to many genes with unknown functions (Fig 4). Although their biological functions remain to be determined, our findings lay the foundation for further analysis of the precise gene regulatory network and developmental mechanisms underlying fruit set, as well as for the use of the promoter regions of PSGs in genetic engineering and molecular breeding.
Fig 4

PSGs identified from this study.

Many candidate transcriptional regulators in tomato pistils were identified, including cysteine-rich peptide (CRP)-like proteins, indicating their roles in the development of specific types of cells in the pistil. Ovule- and embryo-specific genes: SlINO, SlTT1-like, SlECL1, SlECL5. Pericarp-specific gene: SlATHB13/23-like. Seed coat-specific gene: SlCKX8.

Results of trimming and filtering of raw read data from each sample obtained by RNA-seq.

Trimming was performed using FastQC. #1 and #2 represent FastQC analysis of original and trimmed data from petals, respectively. Pistil and fruit samples (#3–10): pistils of 2–2.5 mm buds (#3), 3–4 mm buds (#4), 1 DBF (#5), anthesis (#6), 5 DAF (#7), 5 mm ovaries at 7 DAF (#8), mature green fruits (#9), and red fruits (#10); Stamen and other floral organ samples (#12–14): stamens of 3–4 mm buds (#12), 1 DBF (#13) and anthesis (#14). sepals; Vegetative organs (#15–18): 3-week-old leaves (#15), mature leaves (#16), stems (#17), roots (#18), from left to right, respectively. (TIF)

Identification of pistil-specific genes based on the direct-mapping method.

(A) Number of expressed genes in different tissue/stages. Genes with RPKM values greater than 0 and 0.5 are shown in the top and bottom panels, respectively. (B) Expression of tomato YABBY transcription factor family genes. The expression of nine YABBY transcription factor genes was examined. SlCRCa, SlCRCb, and SlINO appeared to be preferentially expressed in the pistil. Vertical axis represents the expression value (RPKM). Horizontal axis represents the 17 samples used for RNA-seq analysis. (TIF)

TPD1-like cysteine-rich peptides specifically expressed in pistils.

(A) Alignment of five TPD1-like proteins in tomato, including SlTPD1 (Solyc11g005500), SlTPD1-like1 (Solyc12g009850), TPD1-like2 (Solyc05g010190), TPD1-like3 (Solyc04g071640), and TPD1 (AT4G24972). rice MIL2/TPD1A (Os12g0472500), maize MAC1 (JN247438). (B) Phylogenetic tree of three tomato TPD1-like proteins and several orthologs of Arabidopsis, rice and maize. Numbers above the branches indicate bootstrap values (10,000 replicates). (TIF)

EC1-like cysteine-rich peptides specifically expressed in pistils.

(A) Expression of tomato EC1-like (ECL) genes. Both SlECL1 and SlECL5 were specifically expressed in the pistil at anthesis. The bottom panel represents the expression of the internal control gene SAND [41]. (B) Alignment of five ECA1-like proteins in tomato, including SlECL1 (Solyc11g005500), SlECL2 (Solyc05g010190), SlECL3 (Solyc12g009850), SlECL4 (Solyc04g071640), and SlECL5 (Solyc11g005540). Arabidopsis EC1s and tomato ECLs share six conserved cysteine residues at their C-termini (asterisk). (C) Neighbor-joining tree of amino acid sequences of five tomato ECL proteins and five Arabidopsis EC1s. Numbers above the branches indicate bootstrap values (10,000 replicates). (TIF)

Second structure of CRPs and LRR-RLKs identified in this study.

Conserved domains and motifs were searched using CDD in NCBI. The presence of secretion signal and transmembrane region was investigated using SignalP 4.1 Server and TMHMM Server v. 2.0 (http://www.cbs.dtu.dk/services/TMHMM/). (TIF)

Five PSGs that are differentially regulated during fruit set.

The data were obtained from Tang et al. 2015. Vertical axis represents the expression values normalized to transcripts per million (TPM) 6. Horizontal axis represents the pistil sample types. (TIF)

Tomato tissues subject to RNA-seq and summary of sequencing results.

(XLSX)

Gene expression patterns of the 108 genes in various tissues of Micro-Tom.

(XLSX)

Gene expression patterns of the 108 PSGs in the pistils of wild tomato relative S. pimpinellifolium based on published data from Pattison et al. (2015).

(XLSX)

Gene expression patterns of the 108 PSGs based on published data from Zhang et al. (2016).

(XLSX)

List of oligonucleotide primers used for RT-PCR in this study. List of ovule preferentially expressed genes (OPEs) and embryo preferentially expressed genes (EPEs).

(XLSX)