
Genome-wide mining seed-specific candidate genes from peanut for promoter cloning.

Cuiling Yuan1,2,3, Quanxi Sun2, Yingzhen Kong1.   

Abstract

Peanut seeds are ideal bioreactors for the production of foreign recombinant proteins and/or nutrient metabolites. Seed-Specific Promoters (SSPs) are important molecular tools for bioreactor research. However, few SSPs have been characterized in peanut seeds. The mining of Seed-Specific Candidate Genes (SSCGs) is a prerequisite for promoter cloning. Here, we described an approach for the genome-wide mining of SSCGs via comparative gene expression between seed and nonseed tissues. Three hundred thirty-seven SSCGs were ultimately identified, and the top 108 SSCGs were characterized. Gene Ontology (GO) analysis revealed that some SSCGs were involved in seed development, allergens, seed storage and fatty acid metabolism. RY REPEAT and GCN4 motifs, which are commonly found in SSPs, were dispersed throughout most of the promoters of SSCGs. Expression pattern analysis revealed that all 108 SSCGs were expressed specifically or preferentially in the seed. These results indicated that the promoters of the 108 SSCGs may perform functions in a seed-specific and/or seed-preferential manner. Moreover, a novel SSP was cloned and characterized from a paralogous gene of SSCG29 from cultivated peanut. Together with the previously characterized SSP of the SSCG5 paralogous gene in cultivated peanut, these results implied that the method for SSCG identification in this study was feasible and accurate. The SSCGs identified in this work could be widely applied to SSP cloning by other researchers. Additionally, this study identified a low-cost, high-throughput approach for exploring tissue-specific genes in other crop species.

Year:  2019        PMID: 30921362      PMCID: PMC6438489          DOI: 10.1371/journal.pone.0214025

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Peanut (Arachis hypogaea L., which is also referred to as groundnut) is one of the most important oil crop species worldwide and plays important roles in human nutrition [1]. Peanut seeds, which are rich in oleic acid, linoleic acid, proteins and other nutrients, are ideal bioreactors for the production of foreign recombinant proteins or other beneficial metabolites. As important molecular tools, promoters are usually used in gene functional analysis [2-4] and are also widely used for plant quality improvement [5-8]. Seed-specific promoters (SSPs), which can drive the expression of foreign genes specifically in seeds, are of great importance for genetic engineering of seeds. SSPs have been widely applied in plant molecular pharming, such as that involving golden rice [8], purple endosperm rice [9], purple embryo maize [7] and fish oil canola [10]. The use of SSPs can avoid constitutive expression, which can harm plants [11-13]. Moreover, repetitive use of the same promoter when expressing multiple foreign proteins simultaneously is considered inadvisable owing to the likelihood of transcriptional silencing [14-16]. Therefore, additional peanut SSPs are needed to overexpress or knock down specific genes, regulate seed development, and modify seed content, especially to produce foreign recombinant proteins or secondary metabolites. To date, few SSPs from peanut are available, and those that are available were identified from known genes expressed specifically in the seed [17-19]. Tissue-specific gene expression provides fundamental information for SSP mining. Several methods have been developed to analyze gene expression differences, such as subtractive hybridization [20], suppression subtractive hybridization [21], differential display reverse transcription PCR [22], and cDNA microarrays [23,24]. However, these methods are limited by their specific shortcomings; for example, only known genes can be recognized by microarray chips [23]. 
With the decreasing cost of transcriptome sequencing, comparative transcriptome sequencing has been widely used to analyze differences in gene expression [25-28]. The diploid peanut ancestors Arachis duranensis (AA) and Arachis ipaensis (BB) are considered the donors of the A and B subgenomes of the allotetraploid cultivated peanut Arachis hypogaea [1]. The release of A. duranensis and A. ipaensis genome sequences [1] made it convenient to obtain genetic information from cultivated peanut. Comparative transcriptome sequencing combined with peanut genome information is a powerful means of genome-wide mining of SSCGs for promoter cloning. In this study, we described a genome-wide comparative transcriptome sequencing-based approach to identify SSCGs for SSP cloning in peanut. A total of 337 SSCGs were identified from peanut, and the top 108 SSCGs according to their Fragments Per Kilobase of transcript per Million mapped reads (FPKMs) were characterized. On the basis of semiquantitative RT-PCR analysis, 94 SSCGs were expressed in a seed-specific manner, and 14 SSCGs were expressed in a seed-preferential manner. One novel SSP was cloned and characterized to verify its seed specificity in transgenic Arabidopsis. Our results could be widely used in the identification of future peanut SSPs.

Materials and methods

Plant materials and RNA extraction

Plants of the cultivated peanut ‘Shitouqi’ were grown at the Laixi experimental station of the Shandong Peanut Research Institute during the summer of 2016. Leaves, roots, stems, pegs and pod shells were collected at the pod-maturing stage. Developing seeds were collected between 20 and 80 days after flowering. All tissues were flash frozen in liquid nitrogen and then stored at -80°C for transcriptome sequencing. Total RNA was isolated from different tissues using TRIzol (Life Technologies, Carlsbad, CA, USA) reagent. The quality and quantity of each RNA sample were assayed using a NanoDrop device (Thermo Fisher, MA, USA).

Illumina sequencing and in silico analysis

The RNA extracted from seeds at different developmental stages was mixed together as Sample I (seed), while the RNA from the leaves, roots, stems, pegs and pod shells was pooled in equimolar amounts as Sample II (nonseed). Both samples were processed and sequenced using an Illumina HiSeq™ 2500 instrument at Gene Denovo Biotechnology Company (Guangzhou, China). Reads containing adaptor sequences were cleaned, and low-quality reads were filtered out. The clean reads of each sample were then mapped to the A. duranensis and A. ipaensis reference genomes [1] using TopHat2 [29]. Gene expression levels were normalized as FPKM values. To mine SSCGs, the FPKM value of each transcript in Sample I was divided by its value in Sample II using Excel software. Transcripts with an FPKM value less than 10 in Sample I or greater than 10 in Sample II were excluded, and the remaining transcripts with a Sample I/Sample II FPKM ratio greater than 50 were considered SSCGs. The SSCGs were subsequently listed according to their FPKM value.
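
The selection rule can be sketched in Python. This is a minimal illustration, not the authors' Excel workflow. It assumes the thresholds are read as a seed FPKM above 10, a nonseed FPKM below 10 and a seed/nonseed ratio above 50, which is what the values in Table 1 are consistent with. The names `Transcript`, `seed_ratio` and `is_sscg` are hypothetical, as is the choice to accept transcripts whose nonseed FPKM is zero (ratio undefined, shown as "-" in Table 1).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transcript:
    gene_id: str
    seed_fpkm: float      # FPKM in Sample I (seed)
    nonseed_fpkm: float   # FPKM in Sample II (nonseed)


def seed_ratio(t: Transcript) -> Optional[float]:
    """Seed/nonseed FPKM ratio, or None when the nonseed FPKM is zero."""
    if t.nonseed_fpkm == 0:
        return None
    return t.seed_fpkm / t.nonseed_fpkm


def is_sscg(t: Transcript, min_seed: float = 10.0,
            max_nonseed: float = 10.0, min_ratio: float = 50.0) -> bool:
    """Apply the three thresholds (assumed reading of the Methods)."""
    if t.seed_fpkm <= min_seed or t.nonseed_fpkm >= max_nonseed:
        return False
    ratio = seed_ratio(t)
    # Assumption: a transcript undetected in nonseed tissue passes the ratio test.
    return ratio is None or ratio > min_ratio


candidates = [
    Transcript("SSCG2", 29576.44, 7.9),     # values from Table 1
    Transcript("SSCG70", 287.19, 5.56),     # values from Table 1
    Transcript("SSCG72", 285.2, 0.0),       # values from Table 1
    Transcript("hypothetical_housekeeping", 120.0, 95.0),  # made-up control
]
selected = [t.gene_id for t in candidates if is_sscg(t)]
```

With these inputs, the three Table 1 genes are retained and the made-up housekeeping transcript is rejected because its nonseed FPKM exceeds the cutoff.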

GO annotation, chromosomal location and cis-acting element analysis

Functional annotation and Gene Ontology (GO) analyses of the SSCGs were carried out using BLAST2GO (http://www.geneontology.org/). All SSCG sequences and chromosomal location information were obtained from the PeanutBase database (www.peanutbase.org). The genes were mapped onto the chromosomes using the MapInspect software program (http://mapinspect.software.informer.com). To identify cis-acting elements, the 2500 bp promoter regions upstream of the ATG initiation codon of the SSCGs were analyzed using the New PLACE server (https://sogo.dna.affrc.go.jp/cgi_bin/sogo.cgi?lang=en&pj=640&action=page&page=newplace) [30].
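
As a rough local illustration of what a cis-element scan does, the Python sketch below searches both strands of a promoter sequence for two motifs. It is not the New PLACE server's implementation. The core sequences used here (CATGCA for the RY repeat, TGAGTCA for the GCN4 motif) are the forms commonly listed for these elements and are stated as an assumption; the function names and the toy promoter are hypothetical.

```python
import re

# Assumed core sequences for the two seed-associated elements.
MOTIFS = {"RY_REPEAT": "CATGCA", "GCN4": "TGAGTCA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(promoter: str, motif: str):
    """Return sorted (start, strand) hits; start is 0-based on the forward strand."""
    promoter = promoter.upper()
    hits = set()
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        # A lookahead match reports overlapping occurrences as well.
        for match in re.finditer(f"(?={pattern})", promoter):
            hits.add((match.start(), strand))
    return sorted(hits)


def upstream_region(chrom_seq: str, atg_index: int, length: int = 2500) -> str:
    """Up to `length` bp upstream of the ATG for a forward-strand gene."""
    return chrom_seq[max(0, atg_index - length):atg_index]


toy_promoter = "AACATGCATTTGACTCAGG"  # made-up sequence for illustration
ry_hits = find_motif(toy_promoter, MOTIFS["RY_REPEAT"])
gcn4_hits = find_motif(toy_promoter, MOTIFS["GCN4"])
```

In the toy sequence the RY repeat is found on the forward strand and the GCN4 motif on the reverse strand, which is why both orientations are scanned.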

Phylogenetic analysis

To study the phylogenetic relationships of the selected SSCGs, multiple alignments of their DNA sequences were performed using ClustalW. Unrooted phylogenetic trees were constructed by the neighbor-joining (NJ) method using MEGA 6.0 software, and the bootstrap test was carried out with 1000 replicates.

Expression analysis of SSCGs in A. duranensis and A. ipaensis

The FPKM data of the 108 selected SSCGs within 20 distinct tissues were retrieved from the work of Clevenger et al. [31]. The FPKM normalized read count data of the SSCGs were log2-transformed and displayed in the form of heat maps via HemI [32].
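
The log2 transformation applied before plotting can be sketched as follows. The text does not say how FPKM values of zero were handled, so the pseudocount of 1 used here is an assumption, and the function names and the example matrix are hypothetical.

```python
import math


def log2_fpkm(value: float, pseudocount: float = 1.0) -> float:
    """log2(FPKM + pseudocount); the pseudocount avoids log2(0)."""
    return math.log2(value + pseudocount)


def transform_matrix(fpkm_by_gene: dict, pseudocount: float = 1.0) -> dict:
    """Apply the transformation to every tissue value of every gene."""
    return {
        gene: [log2_fpkm(v, pseudocount) for v in values]
        for gene, values in fpkm_by_gene.items()
    }


# Made-up FPKM values for two genes across three tissues (leaf, root, seed).
example = {"geneA": [0.0, 1.0, 1023.0], "geneB": [3.0, 0.0, 255.0]}
heatmap_values = transform_matrix(example)
```

The transformed matrix can then be passed to any heat map tool; with a pseudocount of 1, an FPKM of 0 maps to 0 and an FPKM of 1023 maps to 10.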

Semiquantitative RT-PCR analysis in cultivated peanut

To further confirm the tissue expression specificity in cultivated peanut, RNA was extracted from leaves, roots, stems, pegs, pod shells and seeds collected at the pod-maturing stage. Three independent RNA preparations were used for semiquantitative RT-PCR. Twenty-six amplification cycles were used to evaluate the differences among transcript levels. RT-PCR was performed using the peanut Actin gene as an internal control [33]. PCR was performed using 2× Easy Taq PCR SuperMix (TransGen Biotech, Beijing, China). The PCR conditions were as follows: one initial denaturation step of 94°C for 3 min; 26 cycles of 94°C for 30 s, 58°C for 30 s and 72°C for 30 s; and one final extension step of 72°C for 10 min. The primers used for these experiments are listed in S3 Table.

Isolation of an SSP

Peanut genomic DNA was isolated from young leaves of the ‘Shitouqi’ cultivar using a DNAquick Plant System Kit (Tiangen, Beijing, China). Using AHSSP29-specific primers (S3 Table), we performed PCR with PrimeSTAR GXL DNA Polymerase (Takara, Dalian, China). The PCR products were separated by electrophoresis through a 1.5% agarose gel and purified using a gel extraction kit (TransGen Biotech, Beijing, China). All purified PCR products were subcloned into a pEASY-blunt simple vector (TransGen Biotech, Beijing, China). The DNA sequences were sequenced by the Shanghai Sangon Biotechnology Company (Shanghai, China). The promoter fragment AHSSP29 of SSCG29 was excised from the pEASY-blunt simple vector with the restriction enzymes HindIII and BamHI (Thermo Fisher, MA, USA) and ligated into the corresponding restriction sites of the plant transformation vector pBI121 to produce an AHSSP29::β-glucuronidase (GUS) construct.

Generation of transgenic Arabidopsis plants

The recombinant binary plasmid was transferred to Agrobacterium tumefaciens strain GV3101, and kanamycin-resistant colonies were selected on medium containing 50 μg ml⁻¹ kanamycin. A selected colony was grown to stationary phase at 28°C, and the cells were concentrated by centrifugation and then resuspended in a dipping solution that comprised 5% sucrose, 0.03% Silwet L-77, and 10 mM MgCl₂ [34]. The seeds were harvested and subsequently stored at room temperature. For screening, the seeds were sterilized in 75% (v/v) ethanol for 3 min and then in 2.6% NaClO for 10 min, followed by several washes with sterile water. The transformants were screened on one-half-strength Murashige and Skoog (MS) medium that contained 50 μg ml⁻¹ kanamycin.

Transgene detection in the transgenic progeny of Arabidopsis and GUS histochemical staining

Kanamycin-resistant transgenic Arabidopsis plants were identified by PCR using GUS gene-specific primers (S3 Table). The positive transgenic plants were then selfed, after which homozygous T2 progeny were obtained. GUS activity was assayed as described previously [35]. The samples were incubated with GUS staining buffer (0.1% Triton X-100 and 2 mM 5-bromo-4-chloro-3-indolyl-β-D-glucuronide (X-Gluc) cyclohexylammonium salt in 100 mM sodium phosphate buffer, pH 7.0) at 37°C overnight and then decolorized with 70% ethanol.

Results

Genome-wide mining of SSCGs via comparative transcriptome sequencing

To mine SSCGs, two samples of the cultivated peanut ‘Shitouqi’ (Sample I for seed samples and Sample II for nonseed samples) were used for transcriptome sequencing via an Illumina HiSeq™ 2500 system. Approximately 10 Gb of sequence data (approximately 76.79 million reads from Sample I and 78.93 million reads from Sample II, each 300 bp in length) were obtained; after filtering the adaptor sequences and low-quality reads, approximately 75.37 and 77.81 million reads, respectively, were used for transcriptome assembly (S1 Table). All of the reconstructed genes were aligned to the reference genomes of A. duranensis and A. ipaensis [1] and were subsequently annotated. A comparative transcript profile was established based on the FPKM values of the assembled transcripts. Three hundred thirty-seven SSCGs were ultimately identified and designated sequentially as SSCG1 to SSCG337 according to their FPKM value. The detailed information of these SSCGs, including their gene symbol, chromosomal location, FPKM value and putative function(s), is listed in Table 1 and S2 Table. GO annotation was performed using BLAST2GO, and the 337 SSCGs were categorized with particular GO annotations (S1 Fig, Table 2). As expected, these SSCGs were enriched in the metabolic process (120) and catalytic activity (108) GO terms, which suggested the presence of vigorous metabolic activity in the seed, in which fatty acids such as oleic acid are converted into linoleic acid by fatty acid desaturase [36]. To identify promoters that are strongly or specifically expressed in the seed, the 108 most abundant SSCGs were chosen for further analysis. With the decreasing cost of transcriptome sequencing and the release of the peanut ancestor genomes, comparative transcriptome sequencing has become an efficient approach for mining tissue-specific genes from peanut and other less studied crop species.
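
For readers unfamiliar with the unit, FPKM is conventionally computed as the number of fragments mapped to a transcript, scaled by transcript length in kilobases and by the total number of mapped fragments in millions. The Python sketch below shows that standard formula only; the counts are hypothetical because per-gene fragment counts are not reported in the text, and the function name `fpkm` is an assumption.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 1000 fragments on a 2000 bp transcript
# in a library of 50 million mapped fragments.
example_fpkm = fpkm(1000, 2000, 50_000_000)
```

Because the value is normalized for both transcript length and library size, the seed and nonseed FPKM columns of Table 1 can be divided directly to obtain the ratio column.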
Table 1

List of 108 SSCGs identified from A. duranensis and A. ipaensis by comparative transcriptome sequencing.

| ID | Gene symbol | Chromosomal location | Nonseed FPKM | Seed FPKM | Seed FPKM/Nonseed FPKM | Putative function |
|---|---|---|---|---|---|---|
| SSCG1 | Araip.D61U9 | B06:22151849..22156660 | 9.6 | 40464.62 | 3817.41698 | Nutrient reservoir protein, putative |
| SSCG2 | Aradu.F9TAJ | A06:1278068..1286732 | 7.9 | 29576.44 | 3743.85316 | Nutrient reservoir protein, putative |
| SSCG3 | Aradu.YGS80 | A06:1778238..1782613 | 6.13 | 23996.24 | 3914.55791 | Nutrient reservoir protein, putative |
| SSCG4 | Araip.T82B5 | B09:145805360..145809988 | 5.48 | 17844.62 | 3256.31752 | Vicilin 47 kDa protein |
| SSCG5 | Araip.WQE9Q | B06:19983167..19987898 | 3.65 | 11809.74 | 3235.54521 | Nutrient reservoir, putative |
| SSCG6 | Aradu.2H0R0 | A09:111189111..111193663 | 2.85 | 11232.05 | 3941.07018 | Allergen gly M Bd 28 kDa protein |
| SSCG7 | Aradu.YBK6Q | A06:1263038..1268055 | 1.84 | 7346.02 | 3992.40217 | Nutrient reservoir, putative |
| SSCG8 | Aradu.B98FL | A02:14075236..14078402 | 1.15 | 4528.01 | 3937.4 | Nutrient reservoir protein, putative |
| SSCG9 | Araip.5JB56 | B06:21966364..21970937 | 0.88 | 3850.03 | 4375.03409 | Nutrient reservoir protein, putative |
| SSCG10 | Aradu.I3E1J | A01:95190268..95194982 | 0.94 | 3292.1 | 3502.23404 | PREDICTED: desiccation-related protein PCC13-62-like [Glycine max] |
| SSCG11 | Araip.CF8RS | B01:132019082..132024234 | 0.93 | 2692.8 | 2895.48387 | PREDICTED: desiccation-related protein PCC13-62-like [Glycine max] |
| SSCG12 | Araip.4G9JR | B07:124167306..124175151 | 4.16 | 2538.22 | 610.149038 | PREDICTED: uncharacterized protein LOC100803807 isoform X4 [Glycine max] |
| SSCG13 | Araip.16S9Q | B06:1939514..1945848 | 0.71 | 2002.03 | 2819.76056 | Allergen gly M Bd 28 kDa protein |
| SSCG14 | Araip.DH1Z0 | B08:1478391..1487297 | 1.74 | 1730.87 | 994.752874 | Seed linoleate 9S-lipoxygenase |
| SSCG15 | Araip.UPW6L | B06:13539947..13544474 | 0.58 | 1639.78 | 2827.2069 | Short-chain dehydrogenase-reductase B |
| SSCG16 | Araip.XV8NA | B09:121459378..121466147 | 0.91 | 1567.95 | 1723.02198 | Caleosin-related family protein |
| SSCG17 | Araip.930A9 | B07:103614634..103618108 | 0.37 | 1391.54 | 3760.91892 | Plant EC metallothionein-like protein |
| SSCG18 | Aradu.L7CNH | A07:57016256..57019539 | 0.75 | 1093.32 | 1457.76 | Dehydrin family protein |
| SSCG19 | Araip.GWR7V | B06:3643904..3647892 | 0.38 | 1236.21 | 3253.18421 | Seed maturation protein |
| SSCG20 | Aradu.P54FB | A06:4694298..4698399 | 1.25 | 1207.86 | 966.288 | Short-chain dehydrogenase-reductase B |
| SSCG21 | Aradu.CPR44 | A08:16217242..16225065 | 1.32 | 1127.39 | 854.083333 | PREDICTED: uncharacterized protein LOC100803807 isoform X4 [Glycine max] |
| SSCG22 | Araip.TR541 | B07:64847542..64850823 | 0.27 | 935.31 | 3464.11111 | Dehydrin family protein |
| SSCG23 | Araip.E99Y9 | B08:1487051..1496731 | 2.19 | 885.49 | 404.333333 | Seed linoleate 9S-lipoxygenase |
| SSCG24 | Aradu.A02RY | A06:12429384..12433425 | 0.36 | 868.98 | 2413.83333 | Seed maturation protein |
| SSCG25 | Aradu.8NU6I | A03:108778760..108781946 | 7.04 | 778.99 | 110.651989 | Unknown protein |
| SSCG26 | Araip.GVB7U | B02:104940525..104947120 | 0.1 | 764.27 | 7642.7 | Adenine nucleotide α-hydrolase-like superfamily protein |
| SSCG27 | Aradu.TC8DF | A09:100677241..100681879 | 0.15 | 749.56 | 4997.06667 | Caleosin-related family protein |
| SSCG28 | Aradu.DWL7L | A05:84098086..84102786 | 0.46 | 706.68 | 1536.26087 | Water-selective transport intrinsic membrane protein 1 |
| SSCG29 | Aradu.YC8MH | A05:9432005..9436963 | 0.36 | 705.76 | 1960.44444 | Nutrient reservoir, putative |
| SSCG30 | Araip.MGW36 | B05:10060638..10064989 | 0.44 | 699.31 | 1589.34091 | Allergen gly M Bd 28 kDa protein |
| SSCG31 | Araip.SK1EN | B06:21959550..21964666 | 0.11 | 660.71 | 6006.45455 | Nutrient reservoir, putative |
| SSCG32 | Aradu.UQE92 | A07:7869609..7873957 | 0.41 | 630.02 | 1536.63415 | Unknown protein |
| SSCG33 | Araip.XXN6R | B05:14183132..14188591 | 0.37 | 607.63 | 1642.24324 | PREDICTED: vacuolar-processing enzyme-like [Glycine max] |
| SSCG34 | Araip.LJX8Z | B05:144093715..144097932 | 0.2 | 595.64 | 2978.2 | Water-selective transport intrinsic membrane protein 1 |
| SSCG35 | Aradu.F1JZ5 | A05:16468072..16472629 | 1.5 | 584.6 | 389.733333 | Hydroxysteroid dehydrogenase 5 |
| SSCG36 | Aradu.1QI16 | A06:1256618..1260495 | 0.17 | 529.57 | 3115.11765 | Nutrient reservoir, putative |
| SSCG37 | Aradu.WX5KP | A08:23296802..23303303 | 2.52 | 517.02 | 205.166667 | Seed linoleate 9S-lipoxygenase |
| SSCG38 | Aradu.7S7IW | A03:1743950..1749685 | 0.23 | 495.13 | 2152.73913 | Allergen gly M Bd 28 kDa protein |
| SSCG39 | Araip.S2F61 | B06:13664618..13669683 | 0.37 | 489.34 | 1322.54054 | PREDICTED: basic 7S globulin [Glycine max] |
| SSCG40 | Araip.YX1UI | B03:132064839..132069075 | 0.24 | 484.56 | 2019 | Alkyl hydroperoxide reductase/Thiol-specific antioxidant/Mal allergen |
| SSCG41 | Araip.LRG7E | B06:21098054..21102262 | 0.4 | 482.27 | 1205.675 | Nutrient reservoir, putative |
| SSCG42 | Aradu.G7AM5 | A06:1289026..1294066 | 0.07 | 472.41 | 6748.71429 | Nutrient reservoir, putative |
| SSCG43 | Araip.213GN | B08:22780923..22784155 | 1.49 | 442.09 | 296.704698 | Defensin related |
| SSCG44 | Aradu.I953D | A06:14925068..14929464 | 0.09 | 413.62 | 4595.77778 | Vicilin 47 kDa protein |
| SSCG45 | Araip.IGC50 | B06:22165994..22175182 | 0.17 | 399.54 | 2350.23529 | Nutrient reservoir, putative |
| SSCG46 | Araip.I9427 | B03:3690475..3698992 | 0.42 | 399.48 | 951.142857 | Allergen gly M Bd 28 kDa protein |
| SSCG47 | Araip.C23HG | B03:106887073..106891066 | 0.05 | 396.23 | 7924.6 | Short-chain dehydrogenase-reductase B |
| SSCG48 | Araip.Z0Q6Q | B10:134692303..134702146 | 3.72 | 394.85 | 106.142473 | Annexin 8 |
| SSCG49 | Araip.18621 | B06:62925430..62926840 | 0.04 | 386.85 | 9671.25 | 35 kDa seed maturation protein [Glycine max] |
| SSCG50 | Aradu.Y7IVD | A05:13381777..13387424 | 0.33 | 378.86 | 1148.06061 | PREDICTED: vacuolar-processing enzyme-like [Glycine max] |
| SSCG51 | Araip.K8QIE | B10:4310337..4313223 | 1.81 | 373.44 | 206.320442 | Maturation protein pPM32 [Glycine max] |
| SSCG52 | Araip.PTL0N | B06:11564210..11570467 | 2.1 | 364.85 | 173.738095 | Nodulin MtN21/EamA-like transporter family protein |
| SSCG53 | Araip.8Y88R | B07:15516942..15521234 | 2.43 | 363.4 | 149.547325 | Sugar transporter SWEET |
| SSCG54 | Araip.FWQ8E | B10:10536351..10540643 | 2.43 | 363.4 | 149.547325 | Sugar transporter SWEET |
| SSCG55 | Aradu.QK6K1 | A02:91132444..91136523 | 0.27 | 343.33 | 1271.59259 | Adenine nucleotide α-hydrolase-like superfamily protein |
| SSCG56 | Araip.U0WFW | B09:145740761..145744766 | 5.31 | 343.23 | 64.6384181 | Seed maturation protein |
| SSCG57 | Aradu.QV0LR | A03:131080569..131083815 | 0.31 | 329.27 | 1062.16129 | 1-Cysteine peroxiredoxin 1 |
| SSCG58 | Aradu.7HS2D | A06:5723487..5729852 | 2.23 | 328.45 | 147.286996 | Nodulin MtN21/EamA-like transporter family protein |
| SSCG59 | Araip.534K5 | B05:17042251..17046905 | 0.04 | 327.32 | 8183 | Oxidoreductase, short-chain dehydrogenase/reductase family protein, expressed |
| SSCG60 | Araip.YF7VP | B10:2845816..2849609 | 0.14 | 326.69 | 2333.5 | Protein of unknown function |
| SSCG61 | Araip.55BM4 | B06:4524371..4531866 | 0.06 | 323.4 | 5390 | PREDICTED: probable galactinol-sucrose galactosyltransferase 2-like isoform X2 [Glycine max] |
| SSCG62 | Aradu.ZQ8HD | A06:57618672..57622857 | 0.3 | 320.3 | 1067.66667 | 35 kDa Seed maturation protein [Glycine max] |
| SSCG63 | Aradu.F3KB2 | A09:120096458..120099936 | 0.06 | 320.12 | 5335.33333 | AWPM-19-like family protein |
| SSCG64 | Araip.Z2VYZ | B05:4533706..4536889 | 0.07 | 307.96 | 4399.42857 | PREDICTED: ethylene-responsive transcription factor 13-like [Glycine max] |
| SSCG65 | Aradu.KPI4B | A06:110359168..110362750 | 4.95 | 301.2 | 60.8484848 | Gibberellin-regulated family protein |
| SSCG66 | Aradu.HX36X | A08:32778205..32783203 | 0.24 | 300.89 | 1253.70833 | Seed biotin-containing protein SBP65 [Glycine max] |
| SSCG67 | Aradu.TVV1L | A10:6209012..6213515 | 1.76 | 299.09 | 169.9375 | Sugar transporter SWEET |
| SSCG68 | Aradu.G1YNF | A09:114688277..114693267 | 1.22 | 293.18 | 240.311475 | Fatty acid desaturase 2 |
| SSCG69 | Aradu.Y6LUX | A09:1576235..1582427 | 1.13 | 288.63 | 255.424779 | Late embryogenesis abundant protein (LEA) family protein |
| SSCG70 | Aradu.N27YB | A09:70208795..70214429 | 5.56 | 287.19 | 51.6528777 | PREDICTED: uncharacterized protein LOC100802932 isoform X2 [Glycine max] |
| SSCG71 | Aradu.BXD3B | A03:104961788..104965971 | 0.25 | 285.35 | 1141.4 | Oxidoreductase, short-chain dehydrogenase/reductase family protein |
| SSCG72 | Aradu.KQ35F | A02:91130378..91134093 | 0 | 285.2 | - | - |
| SSCG73 | Aradu.Z8JSI | A03:127984231..127991012 | 0.04 | 268.05 | 6701.25 | Flowering locus protein T |
| SSCG74 | Aradu.9S6MI | A06:4648907..4652669 | 0.62 | 267.51 | 431.467742 | PREDICTED: basic 7S globulin [Glycine max] |
| SSCG75 | Aradu.XDS84 | A06:1477479..1481775 | 0.13 | 263.9 | 2030 | Nutrient reservoir, putative |
| SSCG76 | Araip.27I5U | B03:125840273..125843558 | 0.39 | 261.23 | 669.820513 | Gibberellin-regulated protein n = 1 Tax = Medicago truncatula |
| SSCG77 | Araip.FYJ9U | B01:123329168..123333431 | 4.19 | 259.66 | 61.9713604 | Late embryogenesis abundant protein (LEA), putative/LEA protein, putative |
| SSCG78 | Araip.XR8KB | B10:129753362..129759594 | 4.64 | 252.97 | 54.5193966 | PREDICTED: probable 2-Oxoglutarate/Fe(II)-dependent dioxygenase-like [Glycine max] |
| SSCG79 | Aradu.9Z0RX | A02:93238735..93243417 | 0.18 | 249.63 | 1386.83333 | Late embryogenesis abundant protein (LEA), putative |
| SSCG80 | Araip.JTL3L | B02:12132486..12137401 | 0.13 | 249.01 | 1915.46154 | Seed maturation protein |
| SSCG81 | Araip.DK4JW | B08:11762581..11767634 | 0.43 | 239.01 | 555.837209 | Seed biotin-containing protein SBP65 [Glycine max] |
| SSCG82 | Aradu.X3CG0 | A02:59745890..59751117 | 0.03 | 223.67 | 7455.66667 | Nutrient reservoir, putative |
| SSCG83 | Araip.91947 | B01:128237278..128244210 | 0.51 | 220.45 | 432.254902 | Glutamine synthetase 2 |
| SSCG84 | Aradu.XM2MR | A06:101837477..101842574 | 2.15 | 215.63 | 100.293023 | Acyl-[acyl-carrier-protein] desaturase |
| SSCG85 | Araip.S3GXY | B09:142678437..142683138 | 0.63 | 214.14 | 339.904762 | Fatty acid desaturase 2 |
| SSCG86 | Aradu.X9GQ3 | A01:102122128..102125802 | 2.27 | 212.89 | 93.784141 | Early nodulin related |
| SSCG87 | Araip.H2E95 | B09:132437996..132443128 | 0.13 | 212.42 | 1634 | Papain family cysteine protease |
| SSCG88 | Aradu.HYY79 | A10:2715010..2717854 | 0.59 | 209.18 | 354.542373 | Late embryogenesis abundant protein (LEA) group 3 protein |
| SSCG89 | Araip.QXV0R | B03:144454..148756 | 0.03 | 199.72 | 6657.33333 | Cell wall protein EXP3 |
| SSCG90 | Aradu.FPC2C | A09:111265722..111269712 | 0.61 | 187.52 | 307.409836 | Seed maturation protein |
| SSCG91 | Aradu.IZQ3Z | A10:107916715..107927349 | 0.97 | 179.33 | 184.876289 | Annexin 8 |
| SSCG92 | Araip.84L6B | B09:2158523..2159716 | 0.41 | 176.67 | 430.902439 | Late embryogenesis abundant protein (LEA) family protein |
| SSCG93 | Aradu.S2SYE | A08:49306243..49307719 | 0.07 | 176.31 | 2518.71429 | Cell wall protein EXP3 |
| SSCG94 | Aradu.440M4 | A08:37377595..37378358 | 0.21 | 174.86 | 832.666667 | Defensin related |
| SSCG95 | Araip.JYP5G | B03:2187447..2191755 | 1.76 | 173.95 | 98.8352273 | NAD+:PROTEIN (ADP-ribosyl)-transferase |
| SSCG96 | Araip.9C0MU | B04:3455685..3456542 | 0 | 173.83 | - | - |
| SSCG97 | Araip.V3XTL | B04:67447309..67448462 | 0.27 | 172.23 | 637.888889 | Unknown protein |
| SSCG98 | Aradu.80WBV | A04:1154520..1168569 | 1.11 | 160.05 | 144.189189 | Subtilisin-like serine protease 2 |
| SSCG99 | Aradu.UJ6Z9 | A06:99636586..99638673 | 0.36 | 157.81 | 438.361111 | Aldo/keto reductase family oxidoreductase |
| SSCG100 | Aradu.UU57Q | A09:120002036..120004368 | 0.39 | 157.26 | 403.230769 | Papain family cysteine protease |
| SSCG101 | Aradu.XGA9X | A07:72920123..72923135 | 0.14 | 155.35 | 1109.64286 | PREDICTED: transcription factor HBP-1b(c1)-like isoform X2 [Glycine max] |
| SSCG102 | Araip.97QE1 | B04:2178764..2181986 | 0.27 | 150.15 | 556.111111 | Protein of unknown function |
| SSCG103 | Araip.SP2PF | B09:132209230..132210189 | 0.18 | 149.66 | 831.444444 | AWPM-19-like family protein |
| SSCG104 | Aradu.RDK4X | A02:86334489..86339921 | 0.03 | 149.58 | 4986 | PREDICTED: probable pectinesterase/pectinesterase inhibitor 36-like [Glycine max] |
| SSCG105 | Araip.A9IK4 | B04:7236775..7238080 | 0 | 137.8 | - | - |
| SSCG106 | Araip.BIZ4B | B09:2161015..2161580 | 0 | 136.04 | - | - |
| SSCG107 | Araip.WF9GZ | B03:128599514..128604032 | 0.13 | 134.56 | 1035.07692 | Flowering locus protein T |
| SSCG108 | Araip.X6DZU | B02:99317123..99324724 | 0.16 | 134.51 | 840.6875 | PREDICTED: probable pectinesterase/pectinesterase inhibitor 36-like [Glycine max] |

Table 2

GO classification of 337 SSCGs from A. duranensis and A. ipaensis.

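| Ontology | Class | Gene number |
|---|---|---|
| Biological Process | single-organism process | 108 |
| Biological Process | response to stimulus | 44 |
| Biological Process | multicellular organismal process | 16 |
| Biological Process | reproductive process | 10 |
| Biological Process | reproduction | 10 |
| Biological Process | multiorganism process | 6 |
| Biological Process | biological regulation | 37 |
| Biological Process | immune system process | 2 |
| Biological Process | developmental process | 17 |
| Biological Process | signaling | 10 |
| Biological Process | localization | 29 |
| Biological Process | metabolic process | 120 |
| Biological Process | cellular component organization or biogenesis | 10 |
| Biological Process | cellular process | 83 |
| Molecular Function | nutrient reservoir activity | 23 |
| Molecular Function | molecular function regulator | 11 |
| Molecular Function | nucleic acid binding transcription factor activity | 14 |
| Molecular Function | antioxidant activity | 4 |
| Molecular Function | transporter activity | 14 |
| Molecular Function | electron carrier activity | 2 |
| Molecular Function | signal transducer activity | 2 |
| Molecular Function | catalytic activity | 108 |
| Molecular Function | binding | 92 |
| Cellular Component | extracellular region | 6 |
| Cellular Component | membrane part | 29 |
| Cellular Component | cell part | 65 |
| Cellular Component | cell | 65 |
| Cellular Component | membrane | 37 |
| Cellular Component | organelle | 44 |
| Cellular Component | organelle part | 15 |
| Cellular Component | macromolecular complex | 2 |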

Characterization of the top 108 SSCGs from A. duranensis and A. ipaensis

SSPs are usually isolated from genes encoding seed storage proteins and/or other proteins related to seed development; for example, the Brassica napus Napin promoter was isolated from a 2S storage protein gene [37]. This indicates that the characteristics of a gene may reflect the specificity of its promoter. To predict the activity of their promoters, we therefore characterized the top 108 SSCGs. Among them, 96 had putative functions, and 12 had unknown functions. The 96 SSCGs were classified into 14 groups according to their annotations, and 54 of those SSCGs were involved in lipid metabolism and seed maturation or coded for nutrient reservoir proteins, allergens, and seed storage proteins (Fig 1C), which suggested that these top 108 SSCGs might perform functions within peanut seeds.
Fig 1

Characterization of the top 108 SSCGs from A. duranensis and A. ipaensis.

(A) Chromosomal distribution of the 108 SSCGs. The chromosome numbers are shown at the top of each chromosome (black bars). The names on the left of each chromosome correspond to the approximate location of each SSCG. (B) Numbers of SSCGs on each chromosome of A. duranensis and A. ipaensis. (C) Functional classification of the 108 SSCGs.

As shown in Fig 1A, the SSCGs were dispersed across the 10 chromosomes of each genome. In A. duranensis, chromosome A6 contained the greatest number of SSCGs (15), while chromosome A4 contained the fewest SSCGs (1). In A. ipaensis, 13 SSCGs were distributed on chromosome B6, whereas only 3 SSCGs were found on chromosomes B1 and B3 (Fig 1B). Several SSCGs were located on the chromosomes in clusters; for example, 6 SSCGs (SSCG2, SSCG3, SSCG7, SSCG36, SSCG42, SSCG75) were within the 1.26–1.8 Mb region on chromosome A6 (Fig 1A, Table 1); functional prediction revealed that these SSCGs encoded nutrient reservoir proteins (Table 1). SSCG14 and SSCG23, both of which coded for seed linoleate 9S-lipoxygenase, were located at the same locus of chromosome B8. These results suggested that these clustered genes might function in coordination.

In this study, we identified 39 orthologous gene pairs between A. duranensis and A. ipaensis based on phylogenetic relationships (S2 Fig, Table 3), among which 36 orthologous gene pairs were found at syntenic loci on the A. duranensis and A. ipaensis chromosomes (Fig 1A, Table 3). The orthologous genes from A. duranensis and A. ipaensis exhibited similar functions; for example, both SSCG63 (A9) and SSCG103 (B9) encode the AWPM-19-like family protein, and both SSCG87 (B9) and SSCG100 (A9) encode the papain family cysteine protease (Tables 1 and 3). Although the coding sequences of some orthologous gene pairs were highly similar, their promoter sequences were sometimes quite different. For example, SSCG43 (Araip.213GN) and SSCG94 (Aradu.440M4) had identical coding sequences (Table 3), but their promoter sequences were quite different. Whether the promoters of orthologous gene pairs display the same specificity needs to be further determined. The locations of 2 SSCGs in the A genome (SSCG21 and SSCG93) did not correspond to the locations of their orthologous genes in the B genome (SSCG12 and SSCG89). Interestingly, SSCG53, located on chromosome B7, had the same sequence as its orthologous gene, SSCG54, on chromosome B10.
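
The chromosome tallies and the clustering observation can be reproduced from the coordinates in Table 1. The Python sketch below uses a handful of Table 1 entries; the 500 kb gap used to call two neighboring genes "clustered" is an arbitrary threshold chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter

# (ID, chromosome, start, end) copied from Table 1 for a subset of genes.
GENES = [
    ("SSCG36", "A06", 1256618, 1260495),
    ("SSCG7", "A06", 1263038, 1268055),
    ("SSCG2", "A06", 1278068, 1286732),
    ("SSCG42", "A06", 1289026, 1294066),
    ("SSCG75", "A06", 1477479, 1481775),
    ("SSCG3", "A06", 1778238, 1782613),
    ("SSCG14", "B08", 1478391, 1487297),
    ("SSCG23", "B08", 1487051, 1496731),
    ("SSCG98", "A04", 1154520, 1168569),
]


def count_per_chromosome(genes):
    """Number of listed genes on each chromosome."""
    return Counter(chrom for _, chrom, _, _ in genes)


def clusters(genes, max_gap=500_000):
    """Group genes on one chromosome whose start lies within max_gap bp
    of the previous gene's end; return only groups with 2+ members."""
    groups = []
    for gene in sorted(genes, key=lambda g: (g[1], g[2])):
        last = groups[-1][-1] if groups else None
        if last and last[1] == gene[1] and gene[2] - last[3] <= max_gap:
            groups[-1].append(gene)
        else:
            groups.append([gene])
    return [[g[0] for g in group] for group in groups if len(group) > 1]


per_chromosome = count_per_chromosome(GENES)
found_clusters = clusters(GENES)
```

On this subset, the six nutrient reservoir genes on A06 fall into one group spanning roughly 1.26 to 1.78 Mb, and the two lipoxygenase genes on B08 form a second group because their annotated coordinates overlap.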
Table 3

Orthologous gene pairs of the top 108 SSCGs from A. duranensis and A. ipaensis.

| Gene pair | Chromosome | CDS identity (%) | Protein identity (%) |
|---|---|---|---|
| SSCG1-SSCG5 | B06-A06 | 94.26 | 87.76 |
| SSCG2-SSCG9 | A06-B06 | 53.81 | 52.85 |
| SSCG3-SSCG7 | B06-A06 | 90.71 | 97.17 |
| SSCG4-SSCG6 | B09-A09 | 86.98 | 85.62 |
| SSCG10-SSCG11 | A01-B01 | 98.91 | 98.03 |
| SSCG12-SSCG21 | B07-A08 | 87.01 | 80.41 |
| SSCG13-SSCG44 | B06-A06 | 95.29 | 93.35 |
| SSCG15-SSCG20 | B06-A06 | 93.15 | 97.90 |
| SSCG16-SSCG27 | B09-A09 | 98.70 | 77.69 |
| SSCG18-SSCG22 | A07-B07 | 98.08 | 97.58 |
| SSCG19-SSCG24 | B06-A06 | 73.78 | 70.00 |
| SSCG23-SSCG37 | B08-A08 | 96.18 | 96.06 |
| SSCG26-SSCG72 | B02-A02 | 98.91 | 73.76 |
| SSCG28-SSCG34 | A05-B05 | 88.57 | 88.30 |
| SSCG29-SSCG30 | A05-B05 | 93.04 | 90.00 |
| SSCG31-SSCG42 | B06-A06 | 94.72 | 94.57 |
| SSCG33-SSCG50 | B05-A05 | 98.89 | 95.09 |
| SSCG35-SSCG59 | A05-B05 | 96.30 | 96.10 |
| SSCG36-SSCG45 | A06-B06 | 46.80 | 46.05 |
| SSCG38-SSCG46 | A03-B03 | 57.65 | 57.86 |
| SSCG39-SSCG74 | B06-A06 | 90.16 | 88.11 |
| SSCG40-SSCG57 | B03-A03 | 74.79 | 52.84 |
| SSCG41-SSCG75 | B06-A06 | 98.47 | 97.70 |
| SSCG43-SSCG94 | B08-A08 | 100 | 100 |
| SSCG47-SSCG71 | B03-A03 | 91.57 | 90.31 |
| SSCG48-SSCG91 | B10-A10 | 94.12 | 90.37 |
| SSCG49-SSCG62 | B06-A06 | 95.47 | 93.07 |
| SSCG51-SSCG88 | B10-A10 | 98.55 | 87.50 |
| SSCG52-SSCG58 | B06-A06 | 98.84 | 97.30 |
| SSCG53-SSCG54 | B07-B10 | 100 | 100 |
| SSCG56-SSCG90 | B09-A09 | 98.18 | 97.66 |
| SSCG63-SSCG103 | A09-B09 | 97.25 | 98.34 |
| SSCG66-SSCG81 | A08-B08 | 96.92 | 95.01 |
| SSCG68-SSCG85 | A09-B09 | 86.74 | 86.04 |
| SSCG69-SSCG92 | A09-B09 | 97.63 | 96.10 |
| SSCG73-SSCG107 | A03-B03 | 96.80 | 96.59 |
| SSCG87-SSCG100 | B09-A09 | 98.21 | 98.11 |
| SSCG89-SSCG93 | B03-A08 | 98.94 | 98.41 |
| SSCG104-SSCG108 | A02-B02 | 97.16 | 95.03 |
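
The text does not state how the identity percentages in Table 3 were calculated. One common convention, shown in the Python sketch below as an assumption, is to count identical positions over all columns of a pairwise alignment, skipping columns where both sequences have a gap. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical columns in a pairwise alignment ('-' marks a gap).

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return round(100.0 * matches / compared, 2)


# Toy alignment: five identical columns out of six.
toy_identity = percent_identity("ATGC-A", "ATGCTA")
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for gapped alignments, so numbers computed this way need not match Table 3 exactly.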

Expression patterns of the top 108 SSCGs

To confirm the tissue expression specificity of the top 108 SSCGs, we first analyzed their expression profiles using the expression information provided by Clevenger et al. [31]. The heat map results showed that all the top 108 genes were expressed in the seed; most were expressed only in the seed, whereas the rest were preferentially expressed in the seed (Fig 2). The expression patterns of the orthologous genes from the A and B genomes were similar. For example, SSCG12 and SSCG21 were highly expressed during the Pt6, Pt7, Pt8 and Pt10 seed stages but weakly expressed in other tissues, such as mainstem leaves, the reproductive shoot tip, nodule roots, stamens and the aerial gynophore tip. SSCG78 was expressed in the early seed development stages (SeedPt5-7), while SSCG106 was expressed in the late seed development stages (SeedPt7, 8, 10). Their promoters could be used to express genes at different seed development stages. Notably, SSCG1-12 were extremely highly expressed in the seeds, and SSCG1 and SSCG6 in particular were abundantly expressed during all five seed development stages (Fig 2). Functional prediction analysis revealed that these SSCGs encoded nutrient reservoir proteins or allergen proteins (Table 1), whose transcripts are considered to be expressed specifically in mature peanut seed [38,39].
Fig 2

Expression profiles of the top 108 SSCGs in 20 different tissues of A. duranensis and A. ipaensis.

The FPKM data of 20 distinct tissues for the top 108 SSCGs were retrieved from the work of Clevenger et al. [31]. The FPKM value of each gene was log2-transformed and displayed in the form of heat maps by HemI. The color scale in the lower right represents the relative expression level: green represents a low level, and red indicates a high level. Twenty different tissues are shown on top of the heat map. The SSCGs are listed on the right of the heat map.

We further examined the tissue expression specificity of the SSCGs in cultivated peanut via semiquantitative RT-PCR. Because the orthologous gene pairs had similar sequences, each pair was treated as a single gene, and primers were designed based on their shared sequence. As shown in Fig 3, and consistent with the heat map results, most of these 108 SSCGs were expressed specifically and/or preferentially in the seed. Ninety-four of the 108 SSCGs (87%) were expressed exclusively in the seed. Only 14 SSCGs (SSCG13, 25, 41, 44, 51, 52, 58, 70, 75, 83, 84, 86, 88, 98) were also weakly expressed in other tissues, such as the roots, stems, pegs, pod shells and leaves.
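The proportion quoted above follows directly from the list of SSCGs that showed weak expression outside the seed. A short Python check, using only the numbers given in the text:

```python
# SSCGs reported as weakly expressed in nonseed tissues (numbers taken from the text).
weak_elsewhere = [13, 25, 41, 44, 51, 52, 58, 70, 75, 83, 84, 86, 88, 98]

total = 108
seed_only = total - len(weak_elsewhere)
share = round(100 * seed_only / total)

print(f"{seed_only} of {total} SSCGs seed-exclusive ({share}%)")
```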
Fig 3

Semiquantitative RT-PCR analysis of the top 108 SSCGs in different tissues of cultivated peanut.

Orthologous genes were considered a single gene. Expression patterns were detected in the roots (Rt), stems (St), leaves (Lf), pegs (Pg), pod shells (Ps) and seeds (Sd) using the Actin gene as an internal control.

Overall, based on the expression pattern analyses above, the SSCGs described in this study are potential resources for cloning seed-specific and/or seed-preferential promoters.

Cis-acting elements in the promoter regions of the top 108 SSCGs

Gene expression specificity is mediated by cis-elements in the promoter region [40,41]. To identify the regulatory cis-elements in the promoters of the SSCGs, we extracted the 2500 bp sequence upstream of the start codon of each of the top 108 SSCGs and searched it for RY REPEAT (CATGCA) [42] and GCN4 (TGAGTCA) [43,44] motifs, which are commonly located within seed- and/or embryo-specific promoter sequences. Ninety-two promoters contained RY REPEAT motifs, and 33 contained GCN4 motifs (Table 4). Thirty-seven promoters contained three or more RY REPEAT motifs, including five in SSCG28 (Aradu.DWL7L) and SSCG99 (Aradu.UJ6Z9) and six in SSCG74 (Aradu.9S6MI). Twenty-nine promoter sequences contained both motifs. These results suggested that most of the promoters of the top 108 SSCGs are seed specific.
Table 4

Numbers of RY REPEAT and GCN4 elements in the promoter regions of the top 108 SSCGs from A. duranensis and A. ipaensis.

Gene ID | Gene symbol | RY REPEAT | GCN4
SSCG1 | Araip.D61U9 | 1 | 0
SSCG2 | Aradu.F9TAJ | 3 | 0
SSCG3 | Aradu.YGS80 | 1 | 0
SSCG4 | Araip.T82B5 | 3 | 0
SSCG5 | Araip.WQE9Q | 2 | 0
SSCG6 | Aradu.2H0R0 | 3 | 0
SSCG7 | Aradu.YBK6Q | 4 | 0
SSCG8 | Aradu.B98FL | 3 | 0
SSCG9 | Araip.5JB56 | 4 | 1
SSCG10 | Aradu.I3E1J | 3 | 1
SSCG11 | Araip.CF8RS | 2 | 1
SSCG12 | Araip.4G9JR | 2 | 1
SSCG13 | Araip.16S9Q | 4 | 1
SSCG14 | Araip.DH1Z0 | 4 | 1
SSCG15 | Araip.UPW6L | 4 | 0
SSCG16 | Araip.XV8NA | 2 | 0
SSCG17 | Araip.930A9 | 2 | 1
SSCG18 | Aradu.L7CNH | 1 | 0
SSCG19 | Araip.GWR7V | 0 | 0
SSCG20 | Aradu.P54FB | 2 | 0
SSCG21 | Aradu.CPR44 | 1 | 1
SSCG22 | Araip.TR541 | 1 | 0
SSCG23 | Araip.E99Y9 | 3 | 1
SSCG24 | Aradu.A02RY | 0 | 1
SSCG25 | Aradu.8NU6I | 3 | 0
SSCG26 | Araip.GVB7U | 1 | 0
SSCG27 | Aradu.TC8DF | 0 | 1
SSCG28 | Aradu.DWL7L | 5 | 1
SSCG29 | Aradu.YC8MH | 3 | 1
SSCG30 | Araip.MGW36 | 1 | 0
SSCG31 | Araip.SK1EN | 2 | 0
SSCG32 | Aradu.UQE92 | 3 | 0
SSCG33 | Araip.XXN6R | 2 | 0
SSCG34 | Araip.LJX8Z | 3 | 1
SSCG35 | Aradu.F1JZ5 | 1 | 1
SSCG36 | Aradu.1QI16 | 2 | 1
SSCG37 | Aradu.WX5KP | 2 | 1
SSCG38 | Aradu.7S7IW | 2 | 1
SSCG39 | Araip.S2F61 | 4 | 2
SSCG40 | Araip.YX1UI | 1 | 0
SSCG41 | Araip.LRG7E | 3 | 0
SSCG42 | Aradu.G7AM5 | 2 | 0
SSCG43 | Araip.213GN | 3 | 0
SSCG44 | Aradu.I953D | 3 | 0
SSCG45 | Araip.IGC50 | 2 | 1
SSCG46 | Araip.I9427 | 2 | 0
SSCG47 | Araip.C23HG | 2 | 0
SSCG48 | Araip.Z0Q6Q | 2 | 1
SSCG49 | Araip.18621 | 0 | 0
SSCG50 | Aradu.Y7IVD | 3 | 0
SSCG51 | Araip.K8QIE | 0 | 0
SSCG52 | Araip.PTL0N | 2 | 0
SSCG53 | Araip.8Y88R | 2 | 0
SSCG54 | Araip.FWQ8E | 2 | 0
SSCG55 | Aradu.QK6K1 | 3 | 0
SSCG56 | Araip.U0WFW | 0 | 0
SSCG57 | Aradu.QV0LR | 2 | 0
SSCG58 | Aradu.7HS2D | 3 | 0
SSCG59 | Araip.534K5 | 3 | 0
SSCG60 | Araip.YF7VP | 0 | 0
SSCG61 | Araip.55BM4 | 2 | 1
SSCG62 | Aradu.ZQ8HD | 0 | 1
SSCG63 | Aradu.F3KB2 | 2 | 0
SSCG64 | Araip.Z2VYZ | 2 | 0
SSCG65 | Aradu.KPI4B | 1 | 0
SSCG66 | Aradu.HX36X | 3 | 1
SSCG67 | Aradu.TVV1L | 2 | 0
SSCG68 | Aradu.G1YNF | 4 | 0
SSCG69 | Aradu.Y6LUX | 1 | 1
SSCG70 | Aradu.N27YB | 1 | 0
SSCG71 | Aradu.BXD3B | 3 | 0
SSCG72 | Aradu.KQ35F | 2 | 1
SSCG73 | Aradu.Z8JSI | 3 | 0
SSCG74 | Aradu.9S6MI | 6 | 0
SSCG75 | Aradu.XDS84 | 2 | 0
SSCG76 | Araip.27I5U | 2 | 0
SSCG77 | Araip.FYJ9U | 4 | 0
SSCG78 | Araip.XR8KB | 0 | 0
SSCG79 | Aradu.9Z0RX | 0 | 0
SSCG80 | Araip.JTL3L | 0 | 0
SSCG81 | Araip.DK4JW | 0 | 0
SSCG82 | Aradu.X3CG0 | 2 | 0
SSCG83 | Araip.91947 | 3 | 1
SSCG84 | Aradu.XM2MR | 1 | 0
SSCG85 | Araip.S3GXY | 4 | 0
SSCG86 | Aradu.X9GQ3 | 2 | 0
SSCG87 | Araip.H2E95 | 2 | 1
SSCG88 | Aradu.HYY79 | 0 | 0
SSCG89 | Araip.QXV0R | 2 | 1
SSCG90 | Aradu.FPC2C | 0 | 0
SSCG91 | Aradu.IZQ3Z | 2 | 0
SSCG92 | Araip.84L6B | 2 | 0
SSCG93 | Aradu.S2SYE | 1 | 1
SSCG94 | Aradu.440M4 | 2 | 1
SSCG95 | Araip.JYP5G | 1 | 1
SSCG96 | Araip.9C0MU | 1 | 0
SSCG97 | Araip.V3XTL | 1 | 0
SSCG98 | Aradu.80WBV | 0 | 0
SSCG99 | Aradu.UJ6Z9 | 5 | 0
SSCG100 | Aradu.UU57Q | 3 | 0
SSCG101 | Aradu.XGA9X | 2 | 1
SSCG102 | Araip.97QE1 | 3 | 0
SSCG103 | Araip.SP2PF | 2 | 0
SSCG104 | Aradu.RDK4X | 3 | 0
SSCG105 | Araip.A9IK4 | 1 | 0
SSCG106 | Araip.BIZ4B | 2 | 0
SSCG107 | Araip.WF9GZ | 3 | 0
SSCG108 | Araip.X6DZU | 0 | 0
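
Counts like those in Table 4 can be obtained by scanning each promoter sequence for the exact RY REPEAT (CATGCA) and GCN4 (TGAGTCA) motifs. The Python sketch below is one way to do this; it assumes that both strands are searched and that overlapping matches are counted, neither of which is stated explicitly for Table 4, and the example sequence is made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count occurrences of an exact motif on both strands (overlaps allowed)."""
    seq = seq.upper()
    # A set prevents palindromic motifs from being counted twice.
    targets = {motif, reverse_complement(motif)}
    count = 0
    for target in targets:
        start = seq.find(target)
        while start != -1:
            count += 1
            start = seq.find(target, start + 1)
    return count


# Hypothetical promoter fragment: one CATGCA on the plus strand and one
# TGAGTCA on the minus strand (visible as TGACTCA on the plus strand).
example = "AACATGCATTTGACTCAGG"
ry_count = count_motif(example, "CATGCA")
gcn4_count = count_motif(example, "TGAGTCA")
print(ry_count, gcn4_count)
```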

Characterization of an SSP

To verify promoter tissue specificity, we isolated a 2771 bp promoter fragment (Arachis hypogaea Seed-Specific Promoter 29, AHSSP29) from the peanut cultivar ‘Shitouqi’ according to the reference sequence of SSCG29 (Aradu.YC8MH) in its ancestor A. duranensis. SSCG29 encodes a vicilin-like seed storage protein. Several cis-acting elements that commonly exist in SSPs, including one GCN4 motif [43,44], two RY REPEATs [42] and three 2SSEEDPROTBANAPA elements [45], were detected in the AHSSP29 sequence (Table 5). The CaMV 35S promoter in the pBI121 vector was then replaced with AHSSP29 to produce an AHSSP29::GUS construct, which was subsequently transformed into Arabidopsis. GUS histochemical assays revealed staining in all parts of the seed except the testa (Fig 4A–4C). GUS staining was hard to observe in seed wrapped in a testa (Fig 4A), whereas GUS activity was clearly visible in germinating seed that lacked a testa (Fig 4B and 4C). Distinct staining was also observed in the cotyledons and hypocotyls of the seedlings (Fig 4D), which are components of the seed. No GUS activity was detected in the leaves, stems, flowers, roots or siliques at any time during the plant life cycle (Fig 4E–4H). Nontransformed Arabidopsis plants did not display GUS activity in their mature seeds or in any other part of the plant. These results suggested that AHSSP29 is an SSP.
Table 5

Putative cis-acting elements in the AHSSP29 promoter sequence.

Element | Sequence | Location | Putative functions
SEF1 MOTIF | ATATTTAWW | -651 (-), -740 (+) | soybean embryo factor 1, found in the 5'-upstream region of the β-conglycinin gene
SEF4 MOTIF | RTTTTTR | -384 (-), -906 (-), -1727 (-), -1923 (+), -2463 (-), -2490 (-), -2541 (+), -2569 (+) | found in soybean 5'-upstream region of the β-conglycinin gene
EBOXBNNAPA | CANNTG | -30 (+), -103 (+), -120 (+), -169 (+), -1652 (+) | E-box of the napA storage protein gene of Brassica napus
RY REPEAT | CATGCA | -83 (+), -1782 (+) | required for seed-specific expression
GCN4 | TGAGTCA | -443 (-) | required for endosperm-specific expression
2SSEEDPROTBANAPA | CAAACAC | -165 (-), -1069 (+), -2401 (-) | conserved in many storage protein gene promoters; important for high activity of the napA promoter
CANBNNAPA | CNAACAC | -165 (-), -1069 (+), -1881 (+), -2401 (-), -2698 (+) | embryo- and endosperm-specific transcription of the napin (storage protein) gene
CAAT-Box | CAAT | -254 (+) | common cis-acting element in promoter and enhancer regions
TATA-Box | TATATA | -67 (+) | core promoter element near -30 of the transcription start site

The symbol ‘+’ or ‘-’ in parentheses represents the DNA strand in which the element is situated.

The negative number indicates the location of elements within AHSSP29.
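
Several elements in Table 5 are written with degenerate IUPAC codes (for example W = A or T, R = A or G, N = any base). The Python sketch below shows one way such elements could be located on both strands and reported with a strand symbol and a negative coordinate. The coordinate convention used here (the last base of the fragment is -1, and a hit is reported at its leftmost plus-strand base) is an assumption for illustration; the table does not define its convention in that detail, and the test sequence is invented.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_element(promoter, motif):
    """Return sorted (position, strand) hits for a degenerate motif.

    Assumption: positions are negative offsets from the 3' end of the
    fragment, so the last base of the promoter is -1.
    """
    promoter = promoter.upper()
    length = len(promoter)
    # The lookahead allows overlapping matches to be reported.
    pattern = re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")
    hits = []
    for strand, seq in (("+", promoter), ("-", reverse_complement(promoter))):
        for match in pattern.finditer(seq):
            start = match.start()
            if strand == "-":
                # Map the minus-strand hit back to its leftmost plus-strand base.
                start = length - (match.start() + len(motif))
            hits.append((start - length, strand))
    return sorted(hits)


plus_hits = find_element("AAATATTTATTGG", "ATATTTAWW")   # SEF1 motif, plus strand
minus_hits = find_element("CCAATAAATATTT", "ATATTTAWW")  # same motif, minus strand
print(plus_hits, minus_hits)
```

Note that palindromic elements such as CANNTG match at the same location on both strands, so a scan of this kind reports them twice unless the duplicates are merged.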

Fig 4

Functional characterization of the putative promoter AHSSP29 in transgenic Arabidopsis.

(A) Mature seed wrapped in a testa. (B-C) Germinating seed without a testa. (D) Young seedlings with two true leaves. (E) Adult plant. (F) Stem and flower of adult plants. (G-H) Siliques.


Discussion

SSPs are valuable tools for the genetic engineering of seed, especially for seed bioreactor research. Peanut seeds are ideal bioreactors for the production of foreign recombinant proteins and other nutrient metabolites. However, only a few seed-specific and/or seed-preferential promoters have been identified from peanut [17-19,46]. Expressing multiple foreign genes using the same promoter is ill-advised [14-16]. Therefore, additional SSPs are urgently needed. In this study, we established an effective method for the genome-scale mining of SSCGs via comparative transcriptome sequencing of a mixture of nonseed tissues and seed tissue. A total of 337 SSCGs were identified, and 108 SSCGs in A. duranensis and A. ipaensis were further characterized. At least 94 SSCGs were confirmed via semiquantitative RT-PCR to be expressed specifically in the seed in cultivated peanut, and the rest were preferentially expressed in the seed. This study provides a valuable resource for seed-specific and/or seed-preferential promoter cloning. Among the 108 identified SSCGs, most functioned in seed development or encoded allergen proteins or storage proteins (Fig 1C, Table 1). For example, SSCG1-7 and SSCG9, which encoded allergen proteins, were homologous genes and were extremely highly expressed according to their FPKM values (Table 1), heat map results (Fig 2) and semiquantitative RT-PCR analysis (Fig 3). Peanut allergen proteins have been reported to be expressed exclusively in the seed [39] and to account for a considerable amount of the total seed protein in peanut [47]. This is consistent with the abundant expression of SSCG1-7 and SSCG9 in the peanut seed. These results indicated that these SSCGs were expressed specifically in the seed, and the most abundantly expressed SSCGs were the focus of our subsequent promoter cloning.
Studies have shown that several cis-acting elements in promoter sequences are responsible for mediating gene expression specificity. For example, the cis-acting elements RY REPEAT and GCN4 are conserved among many SSPs [42,43]. These cis-acting elements were also present in the promoters of most of the SSCGs in this study, which implied that most of these promoters might drive gene expression in a seed-specific manner. Several promoters of these SSCGs have been characterized as SSPs. For example, the promoter of an SSCG5 paralogous gene, which encodes an allergen protein, was isolated and characterized as an SSP [19]. Together with the novel SSP AHSSP29 of SSCG29 (Aradu.YC8MH) identified in this study, which contained two RY REPEAT elements and one GCN4 element, these results indicate that the SSCG mining strategy used in this study is effective and accurate. Once these promoters are isolated and characterized, they could be widely used for allergen reduction via gene editing technologies and for other research on seed quality improvement.
Geng et al. [48] introduced a method for tissue-specific promoter cloning by comparing expression levels among three tissues: leaves, roots, and seeds. A total of 316 seed-specific candidate transcript assembly contigs (TACs) were identified. In addition, 64.6% of selected TACs were expressed exclusively in the seed and not in the leaves, stems, or roots [48]. However, to date, no SSPs have been identified based on these data, which may be attributed to insufficient transcriptome data and the lack of reference genome information. In our study, only two samples were chosen for transcriptome sequencing: seeds from different development stages and a mixture of six nonseed tissues (roots, stems, leaves, flowers, pegs, and pod shells). It is much less expensive to sequence the transcriptome of a nonseed tissue mixture than to sequence each individual tissue. Moreover, it is simpler and more accurate to screen SSCGs by comparing two samples than by comparing numerous samples. Consequently, 337 SSCGs were identified, and 87% of the top 108 SSCGs were expressed exclusively in the seed and not in the five measured tissues (roots, stems, leaves, pegs, and pod shells). These results indicated that including additional tissues in the nonseed sample is necessary when comparing gene expression against seed samples.
This SSCG information, such as the gene symbols, can be obtained conveniently from Table 1 and S2 Table. Researchers can easily download SSCGs of interest from the PeanutBase website according to this information. With the decreasing cost of transcriptome sequencing and the release of the peanut genome, mining tissue-specific genes from peanut via comparative transcriptome sequencing has become a robust approach. For example, contamination with aflatoxin, which is produced in infected peanut seeds by Aspergillus flavus, is one of the major problems in peanut production. Given that peanut pericarps are barriers against A. flavus, pericarp-specific promoters are a good choice for expressing A. flavus-resistant genes specifically in the pericarp to prevent aflatoxin contamination. Pericarp-specific promoters could be identified by the strategy presented in this study.
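
The two-sample comparison described in this discussion can be sketched as a simple filter over per-gene FPKM values from the seed sample and the pooled nonseed sample, followed by ranking on seed FPKM. The cutoff values, function name and gene names in the Python sketch below are hypothetical and chosen only for illustration; they are not the thresholds used in this study.

```python
def screen_candidates(expression, min_seed_fpkm=10.0, max_nonseed_fpkm=1.0):
    """Return candidate genes ranked by seed FPKM, highest first.

    expression maps gene -> (seed_fpkm, nonseed_mix_fpkm).
    Both cutoffs are illustrative assumptions, not values from the paper.
    """
    hits = [
        (gene, seed)
        for gene, (seed, nonseed) in expression.items()
        if seed >= min_seed_fpkm and nonseed <= max_nonseed_fpkm
    ]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return [gene for gene, _ in hits]


# Hypothetical genes and FPKM values (seed sample, pooled nonseed sample).
expression = {
    "geneA": (5200.0, 0.3),   # abundant in seed, essentially absent elsewhere
    "geneB": (85.0, 0.0),     # seed specific but less abundant
    "geneC": (300.0, 250.0),  # expressed everywhere, so rejected
    "geneD": (2.0, 0.0),      # too weakly expressed in seed, so rejected
}

candidates = screen_candidates(expression)
print(candidates)
```

Because the nonseed tissues are pooled into one sample, a single comparison per gene is sufficient; the trade-off is that expression in a tissue missing from the pool cannot be excluded.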

Conclusions

We identified 337 SSCGs by comparative RNA sequencing (RNA-seq) between seed and nonseed tissues. The top 108 SSCGs, according to their FPKM, were characterized, among which 94 were expressed specifically in the seed, and 14 were preferentially expressed in the seed. In addition, a novel SSP, AHSSP29, was functionally characterized. The strategy presented in this study could facilitate the future exploration of tissue-specific promoters in other crop species. Additionally, the SSCGs identified in this work could be widely applied for SSP cloning by other researchers.

GO annotation of 337 SSCGs identified from A. duranensis and A. ipaensis.

The Y-axis represents the number of genes in a category. (JPG)

Phylogenetic relationships of the 108 SSCGs.

The phylogenetic tree was constructed with MEGA 6.0 using the NJ method with 1000 bootstrap replicates based on a multiple alignment of 108 SSCGs from A. duranensis and A. ipaensis. (JPG)

Summary of the sequence data from Illumina sequencing.

(DOCX)

List of SSCGs (SSCG109-337) identified from A. duranensis and A. ipaensis by comparative transcriptome sequencing.

(DOCX)

Primers used in this study.

(DOCX)
  46 in total

Review 1.  When more is better: multigene engineering in plants.

Authors:  Shaista Naqvi; Gemma Farré; Georgina Sanahuja; Teresa Capell; Changfu Zhu; Paul Christou
Journal:  Trends Plant Sci       Date:  2009-10-21       Impact factor: 18.313

2.  Plant cis-acting regulatory DNA elements (PLACE) database: 1999.

Authors:  K Higo; Y Ugawa; M Iwamoto; T Korenaga
Journal:  Nucleic Acids Res       Date:  1999-01-01       Impact factor: 16.971

3.  Identification of cis-regulatory elements required for endosperm expression of the rice storage protein glutelin gene GluB-1.

Authors:  H Washida; C Y Wu; A Suzuki; U Yamanouchi; T Akihama; K Harada; F Takaiwa
Journal:  Plant Mol Biol       Date:  1999-05       Impact factor: 4.076

4.  Comparative transcriptome analysis reveals significant differences in gene expression and signalling pathways between developmental and dark/starvation-induced senescence in Arabidopsis.

Authors:  Vicky Buchanan-Wollaston; Tania Page; Elizabeth Harrison; Emily Breeze; Pyung Ok Lim; Hong Gil Nam; Ji-Feng Lin; Shu-Hsing Wu; Jodi Swidzinski; Kimitsune Ishizaki; Christopher J Leaver
Journal:  Plant J       Date:  2005-05       Impact factor: 6.417

5.  Disruption of an overlapping E-box/ABRE motif abolished high transcription of the napA storage-protein promoter in transgenic Brassica napus seeds.

Authors:  K Stålberg; M Ellerstöm; I Ezcurra; S Ablov; L Rask
Journal:  Planta       Date:  1996       Impact factor: 4.116

6.  GUS fusions: beta-glucuronidase as a sensitive and versatile gene fusion marker in higher plants.

Authors:  R A Jefferson; T A Kavanagh; M W Bevan
Journal:  EMBO J       Date:  1987-12-20       Impact factor: 11.598

7.  TILLING for allergen reduction and improvement of quality traits in peanut (Arachis hypogaea L.).

Authors:  Joseph E Knoll; M Laura Ramos; Yajuan Zeng; C Corley Holbrook; Marjorie Chow; Sixue Chen; Soheila Maleki; Anjanabha Bhattacharya; Peggy Ozias-Akins
Journal:  BMC Plant Biol       Date:  2011-05-12       Impact factor: 4.215

8.  Cloning and Characterization of 5' Flanking Regulatory Sequences of AhLEC1B Gene from Arachis Hypogaea L.

Authors:  Guiying Tang; Pingli Xu; Wei Liu; Zhanji Liu; Lei Shan
Journal:  PLoS One       Date:  2015-10-01       Impact factor: 3.240

9.  Seed-Specific Expression of the Arabidopsis AtMAP18 Gene Increases both Lysine and Total Protein Content in Maize.

Authors:  Yujie Chang; Erli Shen; Liuying Wen; Jingjuan Yu; Dengyun Zhu; Qian Zhao
Journal:  PLoS One       Date:  2015-11-18       Impact factor: 3.240

10.  Engineering of 'Purple Embryo Maize' with a multigene expression system derived from a bidirectional promoter and self-cleaving 2A peptides.

Authors:  Xiaoqing Liu; Wenzhu Yang; Bona Mu; Suzhen Li; Ye Li; Xiaojin Zhou; Chunyi Zhang; Yunliu Fan; Rumei Chen
Journal:  Plant Biotechnol J       Date:  2018-03-12       Impact factor: 9.803

