Literature DB >> 29321580

Genome-wide comparative analysis of papain-like cysteine protease family genes in castor bean and physic nut.

Zhi Zou1, Qixing Huang2, Guishui Xie3, Lifu Yang3.   

Abstract

Papain-like cysteine proteases (PLCPs) are a class of proteolytic enzymes involved in many plant processes. Compared with the extensive research in Arabidopsis thaliana, little is known in castor bean (Ricinus communis) and physic nut (Jatropha curcas), two Euphorbiaceous plants without any recent whole-genome duplication. In this study, a total of 26 or 23 PLCP genes were identified from the genomes of castor bean and physic nut respectively, which can be divided into nine subfamilies based on the phylogenetic analysis: RD21, CEP, XCP, XBCP3, THI, SAG12, RD19, ALP and CTB. Although most of them harbor orthologs in Arabidopsis, several members in subfamilies RD21, CEP, XBCP3 and SAG12 form new groups or subgroups as observed in other species, suggesting specific gene loss occurred in Arabidopsis. Recent gene duplicates were also identified in these two species, but they are limited to the SAG12 subfamily and were all derived from local duplication. Expression profiling revealed diverse patterns of different family members over various tissues. Furthermore, the evolution characteristics of PLCP genes were also compared and discussed. Our findings provide a useful reference to characterize PLCP genes and investigate the family evolution in Euphorbiaceae and species beyond.

Entities:  

Mesh:

Substances:

Year:  2018        PMID: 29321580      PMCID: PMC5762910          DOI: 10.1038/s41598-017-18760-6

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Castor bean (Ricinus communis L., 2n = 20) and physic nut (Jatropha curcas L., 2n = 22) are two economically important species that belong to the spurge family, Euphorbiaceae[1,2]. Castor bean, a perennial shrub of the monotypic Ricinus genus, is indigenous to Africa. The oil produced in castor seeds, mainly composed of the unusual hydroxylated fatty acid ricinoleic acid, is widely used for industrial, medicinal and cosmetic purposes, having prompted its domestication in many tropical, subtropical and warm temperate regions around the world[3]. Physic nut, also known Barbados nut, purging nut or jatropha, is a semi-evergreen shrub or small tree originated from central America, and now is widely cultivated in tropical and subtropical regions around the world[4]. Physic nut is a potential energy plant with the high oil content (up to 50%) in seeds and the fossil fuel-like oil composition that can be easily processed into bio-diesel[5]. The draft genome of castor bean was reported in 2010, which consists of 25,878 scaffolds spanning approximate 400 Mb[6]. The genome size of physic nut was estimated to be 350 Mb and two genome assemblies have been available[7,8]. The more complete assembly is about 320 Mb consisting of 23,125 scaffolds, and the number of putative protein-encoding genes of 27,172 is slightly smaller than 31,221 in castor bean[6,8]. Most importantly, except for the ancient so-called γ whole-genome duplication (WGD) event shared by core eudicots including Arabidopsis thaliana and poplar (Populus trichocarpa), comparative genomics analysis indicated that both castor bean and physic nut didn’t experience additional recent WGD[6,8,9]. From this perspective, analysis of certain gene families in castor bean and physic nut may provide insights into lineage-specific evolution in high plants especially in core eudicots. Papain-like cysteine proteases (PLCPs) are a class of proteolytic enzymes that are associated with plant growth, development, protein storage and mobilization, organ senescence, abscission, seed germination, immunity and stress response[10-14]. Featuring a catalytic cysteine as a nucleophile during proteolysis, PLCPs are classed as the family C1A of clan CA and thus are also known as C1A cysteine proteases[15]. PLCPs are produced as preproproteases which usually include a signal peptide, an auto-inhibitory pro-domain and a mature protease domain[16]. The signal peptide ensures that the proprotease enters the endomembrane system, whereas the pro-domain prevents premature activation of the protease. Thereby, the protease precursors are usually inactive or weakly active. To become active, PLCPs need to be processed either by self-processing or with the aid of processing enzymes, which depends on the pH, the action of other proteases and protease inhibitors, and the cellular or extracellular environment[10,17]. In animals, PLCPs are often called cathepsins, and PLCPs in plants are named cathepsin L-, B-, H-, or F-like based on sequence similarity. Furthermore, the L-like can be subclassed into five phylogenetic subgroups (i.e. A–E)[18]. Although several properties of individual PLCPs have been reported in a wide range of plant species[19-21], the genome-wide analysis is still limited to several species such as Arabidopsis, poplar and rubber (Hevea brasiliensis) which were proven to have undergone two or one recent doubling events respectively[2,16,22-25]. In Arabidopsis, 31 PLCPs were recently divided into nine subfamilies based on the phylogenetic analysis[17]: Subfamily 9 (CTB3-like or CTB) contains three cathepsin B-like PLCPs; Subfamily 8 (ALP-like or ALP) contains two cathepsin H-like PLCPs with the vacuolar-targeting NPIR motif at the N-terminal; Subfamily 7 (RD19A-like or RD19) contains four cathepsin F-like PLCPs; Subfamily 6 (SAG12-like or SAG12) contains six members; both Subfamily 5 (THI1-like or THI) and Subfamily 4 (XBCP3-like or XBCP3) contain a single member; Subfamily 3 (XCP2-like or XCP) contains two members; Subfamily 2 (CEP1-like or CEP) contains three members with the endoplasmic reticulum (ER)-localizing KDEL motif at the C-terminal; Subfamily 1 (RD21A-like or RD21) contains nine members (see Supplementary Table S1). In this study, genome-wide identification of castor bean and physic nut PLCP family genes was carried out. Their gene structures were manually curated through aligning transcriptome data to the gene-encoding scaffolds. Furthermore, the sequence feature, evolutionary relationship and expression pattern were also investigated and compared.

Results

Characterization of 26 PLCP genes in castor bean

The initial search of the castor bean genome resulted in 28 loci putatively encoding PLCP homologs. All of them were predicted by the genome annotation[6], however, two loci (i.e. 29900.t000066 and 29675.t000004) were shown to encode pseudogenes and were not further analyzed. The remaining 26 loci are distributed across 16 scaffolds. Most scaffolds were found to harbor a single PLCP gene, however, six of them were shown to contain more than one, i.e., scaffold30170 (4), scaffold29646 (3), scaffold29900 (3), scaffold30131 (2), scaffold28962 (2) and scaffold29910 (2) (Table 1).
Table 1

List of 26 RcPLCP genes identified in this study.

Gene nameLocus IDProtein IDScaffoldPredicted positionIdentified positionEST hitsASa AAMW (KDa) pI GRAVYiPSORTb At_orthologc Jc_orthologc
RcRD21A 30170.t00024330170.m013831scaffold301701214556–12118091214722–121131616Yes46952.115.39−0.492SAtRD21AJcRD21A
RcRD21B 29801.t00006929801.m003124scaffold29801415594–412915415851–4124541047152.185.71−0.470SJcRD21B
RcRD21C 29970.t00000229970.m000973scaffold2997017319–2102117149–21809038342.966.56−0.419MAtRDL1JcRD21C
RcCEP1 30147.t00009730147.m013826scaffold301472830277–28286422830593–28280563Yes36040.115.97−0.571SAtCEP1JcCEP1
RcCEP2 29929.t00028829929.m004785scaffold299291643832–16420901643874–16416860Yes35940.395.87−0.603SJcCEP2
RcXCP1 30162.t00004630162.m001301scaffold301621780890–17796331781121–1779118034939.015.44−0.390SAtXCP1JcXCP1
RcXCP2 30170.t00021330170.m013801scaffold301704386293–43877334386238–4387922134939.035.25−0.412SAtXCP2JcXCP2
RcXBCP3 30170.t00052430170.m014112scaffold301702901779–28995952901979–2898207146651.597.43−0.296SAtXBCP3JcXBCP3
RcXBCP3L 29381.t00000129381.m000072scaffold293817721–117837678–121601350155.855.05−0.432SJcXBCP3L
RcTHI1 29646.t00005729646.m001109scaffold29646344102–345635343762–3459571Yes34738.635.48−0.439SAtTHI1JcTHI1
RcSAG12H1 30131.t00040830131.m007257scaffold301312504598–25057662504200–2506430036241.116.17−0.412SAtSAG12JcSAG12H3
RcSAG12H2 28962.t00001728962.m000448scaffold2896292991–9410192944–94101034037.205.22−0.441SAtSAG12JcSAG12H4
RcSAG12H3 28962.t00001828962.m000449scaffold2896296412–9752296181–977350Yes34037.465.16−0.424SAtSAG12JcSAG12H4
RcSAG12H4 29646.t00003329646.m001085scaffold29646207629–208856207526–209015034938.559.33−0.347SAtSAG12
RcSAG12H5 29646.t00003429646.m001086scaffold29646211653–212894211496–213036034238.068.59−0.417SAtSAG12
RcSAG12H6 29900.t00006529900.m001603scaffold29900407069–405926407069–405639034438.125.13−0.428SAtSAG12JcSAG12H7
RcSAG12H7 29910.t00001529910.m000924scaffold29910208698–206791208876–206709034137.414.86−0.458SAtSAG12JcSAG12H8
RcSAG12H8 29910.t00001429910.m000923scaffold29910204533–202640204533–202640034237.404.71−0.439SAtSAG12JcSAG12H8
RcPAP1 29900.t00007829900.m001616scaffold29900487805–488928487718–489114034037.646.33−0.325SAtPAP1
RcPAP2 29900.t00007729900.m001615scaffold29900483849–485660483819–485847134337.894.87−0.343SAtPAP1
RcPAP3 29827.t00014529827.m002672scaffold29827836744–835493836744–835443034237.915.07−0.371MAtPAP1
RcRD19A 30131.t00024930131.m007098scaffold301311514779–15164611514766–15169038Yes37341.075.95−0.312SAtRD19AJcRD19A
RcRD19B 30170.t00053430170.m014122scaffold301702964658–29662042964578–296652868Yes36640.385.68−0.249SAtRD19CJcRD19B
RcRD19C 28462.t00000428462.m000130scaffold2846246401–4880246261–4931217Yes38141.735.87−0.037SAtRD19DJcRD19C
RcALP1 29739.t00019329739.m003757scaffold297391197717–12001741197581–12006663Yes35839.215.88−0.185SAtALPJcALP1
RcCTB1 30076.t00007430076.m004510scaffold30076399770–402553399629–4031622Yes35939.845.90−0.200SAtCTB2JcCTB1

a“Yes” represents genes containing alternative splicing isoforms; b“S, M and C” represent signal peptide, mitochondrial targeting peptide or chloroplast transit peptide, respectively; cThe best ortholog hit.

List of 26 RcPLCP genes identified in this study. a“Yes” represents genes containing alternative splicing isoforms; b“S, M and C” represent signal peptide, mitochondrial targeting peptide or chloroplast transit peptide, respectively; cThe best ortholog hit. Except for RcCEP1 (GenBank accession number AF050756)[26], homology analysis showed that no full-length cDNA sequences of the other 25 RcPLCP genes were reported in any public database (as of Dec 2016). Nevertheless, 13 members had EST (expressed sequence tag) hits in GenBank and RcRD19B was found to harbor the maximal hit of 68 ESTs. Moreover, the expression of other genes was supported by RNA sequencing reads derived from various tissue transcriptomes, i.e. leaf, flower, endosperm and seed[27-29]. Except for RcSAG12H8, the transcription regions of all other RcPLCP genes were successfully extended based on the read alignment (Table 1). Since the gene models released in castor bean were the result of an automatic annotation[6], an expert revision of their gene structures was conducted via mapping ESTs and RNA sequencing reads to the identified scaffolds. Interesting, six out of the 26 annotated gene models were proved to be inaccurate. The locus 29970.t000002 (RcRD21C) was predicted to encode 366 residues (29970.m000973), and it represents only the 3′ sequence of the gene which encodes 383 residues (see Supplementary File S1). The locus 30162.t000046 (RcXCP1) was predicted to contain three introns putatively encoding 324 residues (30162.m001301), however, hundreds of RNA sequencing reads indicated that the “GAAA” sequence in the first exon was absent from the genome assembly. Thereby, this locus promises to harbor two introns encoding 349 residues (see Supplementary File S2). The locus 30170.t000524 (RcXBCP3) was predicted to contain four introns putatively encoding 422 residues (30170.m014112), however, read alignment indicated that this locus harbors five introns putatively encoding 466 residues (see Supplementary File S3). The locus 29381.t000001 (RcXBCP3L) was predicted to encode 417 residues (29381.m000072), however, read alignment indicated that this locus encodes 501 residues (see Supplementary File S4). The locus 29827.t000145 (RcPAP3) was predicted to contain two introns putatively encoding 321 residues (29827.m002672), however, read alignment indicated that this locus harbors one intron putatively encoding 342 residues (see Supplementary File S5). The locus 30076.t000074 (RcCTB1) was predicted to encode 376 residues (30076.m004510), however, read alignment indicated that this locus encodes 359 residues (see Supplementary File S6). Additionally, 10 genes (i.e. RcRD21A, RcCEP1, RcCEP2, RcTHI1, RcSAG12H3, RcRD19A, RcRD19B, RcRD19C, RcALP1 and RcCTB1) were shown to have alternative splicing isoforms (Table 1). Four gene pairs (i.e. paralogs) can be defined as tandem duplicates for their close organization on same scaffolds and high sequence identity, i.e., 97.4% between RcSAG12H7 (29910.t000015) and RcSAG12H8 (29910.t000014), 96.8% between RcSAG12H2 (28962.t000017) and RcSAG12H3 (28962.t000018), 87.5% between RcSAG12H4 (29646.t000033) and RcSAG12H5 (29646.t000034), and 74.1% between RcPAP1 (29900.t000078) and RcPAP2 (29900.t000077). However, whether RcPAP3 is a proximal duplicate of RcPAP1 or RcPAP2 still needs to be confirmed, since the 25,878 assembled scaffolds have not been anchored to the chromosomes yet[6].

Characterization of 23 PLCP genes in physic nut

After discarding four pseudogenes (i.e. JCGZ_22119, JCGZ_05109 and two unpredicted loci on scaffold170), a total of 23 PLCP-encoding loci were identified from the physic nut genome. Among them, 22 loci were predicted by the automatic genome annotation[8], whereas one more locus putatively encoding a SAG12 subfamily member was identified from scaffold684 (i.e. JcSAG12H8, see Supplementary File S7). These loci are distributed across 17 scaffolds. Like observed in castor bean, most scaffolds contain one PLCP gene, and four scaffolds encoding more than one are as follows: scaffold684 (4), scaffold341 (2), scaffold159 (2) and scaffold872 (2) (Table 2). When taking the linkage map with 1208 markers[8] into account, these scaffolds can be further anchored to nine chromosomes (Chrs), i.e., Chr2 (scaffold84 and scaffold96), Chr3 (scaffold26 and scaffold684), Chr4 (scaffold5, scaffold46, scaffold159 and scaffold221), Chr5 (scaffold3 and scaffold328), Chr7 (scaffold341 and scaffold502), Chr8 (scaffold392), Chr9 (scaffold872), Chr10 (scaffold7 and scaffold464) and Chr11 (scaffold211). The distribution of JcPLCP genes looks uneven: Chromosomes 3 and 4 contain the most of five PLCP genes, followed by Chromosome 7 with three genes (Fig. 1).
Table 2

List of 23 JcPLCP genes identified in this study.

Gene nameLocus IDScaffoldPredicted positionIdentified positionChrEST hitsASa AAMW (KDa) pI GRAVYiPSORTb At_orthologc Rc_orthologc
JcRD21A JCGZ_16099scaffold463722788–37258873726134–372238145346651.795.28−0.515SAtRD21ARcRD21A
JcRD21B JCGZ_22120scaffold7104053–105002103886–10650810647552.685.39−0.504SRcRD21B
JcRD21C JCGZ_12447scaffold3411511831–15148971511754–151528773Yes36641.125.40−0.372SAtRDL1RcRD21C
JcCEP1 JCGZ_11373scaffold328482830–484642484817–48266653036040.175.69−0.552SAtCEP1RcCEP1
JcCEP2 JCGZ_17869scaffold5022577993–25791432579634–257780773635839.946.19−0.548SRcCEP2
JcXCP1 JCGZ_10746scaffold3258651–259875258620–26001950Yes34939.055.59−0.361SAtXCP1RcXCP1
JcXCP2 JCGZ_08041scaffold221310092–311414311442–3099514035039.185.40−0.367SAtXCP2RcXCP2
JcXBCP3 JCGZ_04495scaffold1591388654–13912241391430–138820140Yes44148.826.06−0.333SAtXBCP3RcXBCP3
JcXBCP3L JCGZ_21572scaffold6842356094–23583802355980–23586693352458.045.24−0.369CRcXBCP3L
JcTHI1 JCGZ_12228scaffold34182305–8344683709–8170876Yes34738.948.43−0.615SAtTHI1RcTHI1
JcSAG12H1 JCGZ_09604scaffold26600732–601907602051–6005463034538.267.99−0.438SAtSAG12
JcSAG12H2 JCGZ_21557scaffold6842264217–22646722264714–22632273034538.408.56−0.377SAtSAG12
JcSAG12H3 JCGZ_24483scaffold84187704–188806189021–1873892033937.605.94−0.424SAtSAG12RcSAG12H1
JcSAG12H4 JCGZ_17185scaffold5199542–200754199432–2018964034037.425.13−0.413SAtSAG12RcSAG12H2
JcSAG12H5 JCGZ_25371scaffold872469319–470468472278–4736499034037.595.01−0.424SAtSAG12RcSAG12H2
JcSAG12H6 JCGZ_25372scaffold872472278–473548469201–4704689034037.286.90−0.426SAtSAG12RcSAG12H2
JcSAG12H7 JCGZ_21549scaffold6842211946–22130742213483–22114233034437.984.73−0.372SAtSAG12RcSAG12H6
JcSAG12H8 scaffold6842215711–22143173034237.484.97−0.413SAtSAG12RcSAG12H7
JcRD19A JCGZ_26761scaffold96231461–233265233346–23098826737040.565.95−0.246SAtRD19ARcRD19A
JcRD19B JCGZ_04503scaffold1591432232–14342631432160–143461541136840.925.81−0.338SAtRD19CRcRD19B
JcRD19C JCGZ_16165scaffold464314963–316717316756–314735101138342.106.03−0.106SAtRD19DRcRD19C
JcALP1 JCGZ_07488scaffold2112991934–29947692994891–29917021151Yes35839.445.88−0.147SAtALPRcALP1
JcCTB1 JCGZ_14145scaffold3921649834–16530161653334–16494098735839.716.07−0.230SAtCTB2RcCTB1

a“Yes” represents genes containing alternative splicing isoforms; b“S, M and C” represent signal peptide, mitochondrial targeting peptide or chloroplast transit peptide, respectively; cThe best ortholog hit.

Figure 1

Chromosomal distribution of 23 JcPLCP genes. The eleven linkage groups or chromosomes were constructed with 1208 DNA markers, where the chromosome number is indicated at the top.

List of 23 JcPLCP genes identified in this study. a“Yes” represents genes containing alternative splicing isoforms; b“S, M and C” represent signal peptide, mitochondrial targeting peptide or chloroplast transit peptide, respectively; cThe best ortholog hit. Chromosomal distribution of 23 JcPLCP genes. The eleven linkage groups or chromosomes were constructed with 1208 DNA markers, where the chromosome number is indicated at the top. As of Dec 2016, although no full-length cDNA sequences were available for all physic nut PLCP family genes, 12 members were found to have EST hits in GenBank and JcRD19A harbored the maximum of 67 hits. Moreover, the expression of other members was supported by RNA sequencing reads derived from transcriptomes of various tissues including callus, root, leaf, inflorescence meristem, flower, embryo and seed[7,8,30-34]. Based on the read alignment, the transcription regions of all JcPLCP genes were extended and seven predicted gene models were optimized (Table 2). The locus JCGZ_22120 (JcRD21B) was predicted to encode 242 residues, and it represents only the 5′ sequence of the gene which encodes 471 residues (see Supplementary File S8). The locus JCGZ_17869 (JcCEP2) was predicted to encode 269 residues, and it represents only the 3′ sequence of the gene which encodes 358 residues (see Supplementary File S9). The locus JCGZ_21572 (JcXBCP3L) was predicted to encode 508 residues, and it represents only the 3′ sequence of the gene which encodes 524 residues (see Supplementary File S10). The locus JCGZ_12228 (JcTHI1) was predicted to encode 315 residues, and it represents only the 3′ sequence of the gene which encodes 347 residues (see Supplementary File S11). The locus JCGZ_09604 (JcSAG12H1) was predicted to encode 311 residues, and it represents only the 3′ sequence of the gene which encodes 345 residues (see Supplementary File S12). The locus JCGZ_21557 (JcSAG12H2) was predicted to encode 155 residues, and it represents only the 5′ sequence of the gene which encodes 345 residues (see Supplementary File S13). The locus JCGZ_21549 (JcSAG12H7) was predicted to encode 324 residues, and it represents only the 3′ sequence of the gene which encodes 344 residues (see Supplementary File S14). Additionally, five genes (i.e. JcRD21C, JcXCP1, JcXBCP3, JcTHI1 and JcALP1) were shown to have alternative splicing isoforms (Table 2). Several gene pairs were shown to exhibit high sequence identity, i.e., 97.8% between JcSAG12H1 and JcSAG12H2, 88.6% between JcSAG12H5 and JcSAG12H6, 76.4% between JcSAG12H7 and JcSAG12H8, 67.2% between JcSAG12H8 and JcSAG12H1, 66.8% between JcSAG12H8 and JcSAG12H2, 65.4% between JcSAG12H7 and JcSAG12H1, 65.2% between JcSAG12H7 and JcSAG12H2. JcSAG12H2/7/8 and JcSAG12H5/6 can be defined as tandem duplicates for their adjacent organization on same scaffolds, whereas JcSAG12H1 can be defined as the recent proximal duplicate of JcSAG12H2 for their distribution on two distinct scaffolds of Chromosome 3 (Table 2 and Fig. 1).

Phylogenetic analysis of castor bean and physic nut PLCPs

According to the reciprocal BLASTP analysis, 26 RcPLCPs have 20 or 14 orthologous groups (OGs) in physic nut and Arabidopsis, respectively (Table 1), and 23 JcPLCPs have 19 or 13 OGs in castor bean and Arabidopsis, respectively (Table 2), suggesting gene-specific duplication and/or loss occurred. To reveal the evolutionary relationship, we constructed a phylogenetic tree using 80 PLCPs from castor bean, physic nut as well as the extensively studied Arabidopsis. Results showed that castor bean and physic nut PLCPs can be divided into nine subfamilies as described in Arabidopsis, i.e., RD21, CEP, XCP, XBCP3, THI, SAG12, RD19, ALP and CTB (Figs 2 and 3). Furthermore, RcRD21B/JcRD21B, RcCEP2/JcCEP2 and two XBCP3 members (named RcXBCP3L/JcXBCP3L), which have no orthologs in Arabidopsis, were found to form new groups (Tables 1 and 2, and Fig. 2).
Figure 2

Phylogenetic analysis of castor bean, physic nut and Arabidopsis PLCPs. Sequence alignment and construction of the phylogenetic tree were performed using MUSCLE or MEGA6, respectively. The subfamily names are indicated next to their cluster and the distance scale denotes the number of amino acid substitutions per site.

Figure 3

Distribution of castor, physic nut and Arabidopsis PLCP genes in subfamilies.

Phylogenetic analysis of castor bean, physic nut and Arabidopsis PLCPs. Sequence alignment and construction of the phylogenetic tree were performed using MUSCLE or MEGA6, respectively. The subfamily names are indicated next to their cluster and the distance scale denotes the number of amino acid substitutions per site. Distribution of castor, physic nut and Arabidopsis PLCP genes in subfamilies. As shown in Fig. 2, a large number of AtPLCPs were grouped in pairs, reflecting the occurrence of two recent WGD events[22]. These gene pairs are widely distributed in different subfamilies, only excluding subfamilies XBCP3 and THI with a single member. In contrast, few gene pairs were found in castor bean and physic nut, which are limited to the SAG12 subfamily. In Arabidopsis, the SAG12 subfamily is composed of six members, which can be further divided into two groups named SAG12 and PAP. The PAP group was also present in castor bean but absent from physic nut, suggesting specific gene loss in the latter. Compared with Arabidopsis that contains a single SAG12 group member, both castor bean and physic nut contain eight members, which were shown to form four subgroups (Fig. 2).

Gene structure, sequence feature and conserved motifs

The exon-intron structure of castor bean and physic nut PLCP genes was investigated based on optimized gene models, which are supported by ESTs and/or RNA sequencing reads. As shown in Fig. 4B, these genes harbor at least one intron, varying from one to ten as observed in Arabidopsis. Although genes in different subfamilies were shown to harbor distinct exon-intron structures, the structure is usually conserved within the subfamily and between orthologs across three compared species. Without any exception, genes in the ALP subfamily all contain seven introns. Except for AtSAG12 containing two introns, other members in subfamilies SAG12 and THI feature one intron. Most genes in subfamilies CEP, XCP and RD19 contain three introns, whereas RcXCP1, JcXCP1, AtCEP1, AtCEP2, AtCEP3, AtRD19A and AtRD19B contain two introns instead. Genes in the XBCP3 subfamily usually contain four introns, while RcXBCP3 contains five introns instead. Genes in the CTB subfamily usually contain ten introns, however, AtCTB1 harbors nine introns instead. Compared with the ORF (open reading frame) length (1023–1506 bp with the average of 1119 bp in castor bean, 1020–1575 bp with the average of 1135 bp in physic nut, and 1026–1392 bp with the average of 1118 bp in Arabidopsis), the gene size (from start to stop codons) of each gene is relatively more variant (1111–4315 bp with the average of 1933 bp in castor bean, 1103–3183 bp with the average of 1860 bp in physic nut, and 1137–2471 bp with the average of 1603 bp in Arabidopsis) (Fig. 4B and Supplementary Table S1).
Figure 4

Structural features of castor and physic nut PLCP genes. (A) An unrooted phylogenetic tree constructed using MEGA6. (B) A graphic representation of exon-intron structures displayed using GSDS. (C) Distribution of 15 conserved motifs.

Structural features of castor and physic nut PLCP genes. (A) An unrooted phylogenetic tree constructed using MEGA6. (B) A graphic representation of exon-intron structures displayed using GSDS. (C) Distribution of 15 conserved motifs. Sequence analysis showed that the deduced PLCP proteins were predicted to harbor an average MW (molecular weight) of 41 kDa and pI (isoelectric point) value of 4.7. Since the predicted GRAVY (grand average of hydropathicity) values were all shown to be less than 0 (varying from −0.04 to −0.62), these proteins are more likely to be hydrophilic. According to the subcellular localization analysis, a hydrophobic signal peptide was also found at the N-terminal of each protein (Tables 1 and 2, and Supplementary Table S1), where JcALP1 and RcALP1 include the NPIR motif for the vacuolar localization as observed in AtALP[35]. Except for JcCEP2 that harbors a RDEL motif at the C-terminal, RcCEP1, RcCEP2 and JcCEP1 contain a KDEL motif for the ER retention[26,36,37]. Motif compositions among different Rc/JcPLCPs were also investigated and the results were shown in Fig. 3 and Supplementary Fig. S1. Among the 15 motifs identified using MEME, Motifs 1–11 and 13 are broadly distributed. Motif 7 includes the ERFNIN consensus sequence. This motif as well as Motifs 10 and 6 are characterized as the well-studied Inhibitor_I29 (PF08246), which is the core of the auto-inhibitory pro-domain[10,16]. Motifs 1, 4, 5, 9, 15, 2, 11, 3 and 8 are characterized as the Peptidase_C1 domain (PF00112), where Motifs 1, 11 and 3 include the Cys, His or Asn active site respectively[10,11]. Motifs 14 and 15 are also part of the Peptidase_C1 domain, where Motif 14 is specific to the RD19 subfamily and Motif 15 is only found in subfamilies SAG12, XBCP3, RD19 and ALP. Motif 13 is the link of the Inhibitor_I29 domain and the Peptidase_C1 domain, which was shown to contain the cleavage site for generation of a mature enzyme[21,38]. Motif 12, which is limited to RD21 and XBCP3 subfamilies, is characterized as the well-studied GRAN domain (PF00396) (Fig. 3C).

Expression patterns of RcPLCP genes in various tissues

Transcriptional profiling revealed that 26 RcPLCP genes were expressed in at least one of the tested tissues or developing stages of a certain tissue, i.e., 20 in leaf, 23 in male flower, 19 in endosperm II/III, 16 in endosperm V/VI, 21 in developing seed and 17 in germinating seed. According to the FPKM value, the total transcripts of RcPLCP genes were most abundant in male flower, followed by germinating seed (Group I); moderate in endosperm II/III, leaf and developing seed (Group II, occupying 21–27% of Group I); and, considerably low in endosperm V/VI (Group III, occupying 6–7% of Group I). In male flower, the unique member of the THI subfamily contributes the major transcripts, occupying about 72% of the total PLCP transcripts; by contrast, the second highly abundant RD21 subfamily occupies only 10%. In leaf, subfamilies RD19 and RD21 occupy about 69% of the total PLCP transcripts. In endosperm and seed, the CEP subfamily usually contributes the major PLCP transcripts, though the RD19 subfamily plays a more important role in developing seed. In endosperm V/VI, the CEP subfamily occupies about 88% of the total PLCP transcripts. In endosperm II/III, the CEP subfamily occupies about 57% of the total PLCP transcripts, and the less abundant subfamilies RD19 and RD21 occupy about 19% or 13% respectively. In germinating seed, the CEP subfamily occupies about 51% of the total PLCP transcripts, and the less abundant subfamilies RD21 and RD19 occupy about 25% or 11% respectively. In developing seed, three highly abundant subfamilies RD19, CEP and RD21 occupy about 35%, 28% or 17% of the total PLCP transcripts respectively (Fig. 5).
Figure 5

Transcriptional profiling of 26 RcPLCP genes over various tissues. Color scale denotes FPKM normalized log10 transformed counts.

Transcriptional profiling of 26 RcPLCP genes over various tissues. Color scale denotes FPKM normalized log10 transformed counts. Although the transcript level is diverse, most RcPLCP genes were shown to express in all examined tissues. RcCEP1 represents the most expressed gene in endosperm II/III, endosperm V/VI, developing seed and germinating seed, occupying about 46%, 41%, 43% or 27% of the total PLCP transcripts in each sample respectively; by contrast, its transcript level in leaf is extremely low. Compared with RcCEP1, the transcript level of RcCEP2 is comparable in endosperm V/VI, but is relatively lower in other tissues. RcTHI1 represents the most expressed gene in flower; RcRD19B and RcRD21A represent the most expressed genes in leaf, occupying about 29% or 25% of the total PLCP transcripts respectively. Nevertheless, the expression of RcRD19B was not detected in endosperm V/VI; RcSAG12H5, RcSAG12H6 and RcPAP1 seem to be flower-specific; and, RcPAP3 seems to be leaf-specific (Fig. 5). Based on the expression pattern across various tissues, 26 RcPLCP genes were clustered into four groups: Group 1 prefers to express in germinating seed, including 2 CEPs (RcCEP1 and RcCEP2), 2 RD21s (RcRD21B and RcRD21A), RcRD19A, RcALP1, RcCTB1 and RcXBCP3; Group 2 prefers to express in developing seed and endosperm, including RcRD19C, 2 SAG12s (RcSAG12H7 and RcSAG12H8); Group 3 prefers to express in male flower, including RcTHI1, 7 SAG12s (RcSAG12H1, RcSAG12H2, RcSAG12H3, RcSAG12H4, RcSAG12H5, RcSAG12H6 and RcPAP1); and, Group 4 prefers to express in leaf, including 2 XCPs (RcXCP1 and RcXCP2), 2 SAG12s (RcPAP2 and RcPAP3), RcRD21C, RcXBCP3L and RcRD19B.

Expression patterns of JcPLCP genes in various tissues

As shown in Fig. 6, transcriptional profiling supported the expression of all JcPLCP genes in at least one of the examined tissues, i.e., 21 in root, 19 in flower bud, 23 in seed, 21 in leafage and 18 in mature leaf. The total transcripts were most abundant in leafage (Group I); moderate in seed, mature leaf and root (Group II, occupying 37–47% of Group I); and, considerably low in flower bud (Group III, occupying about 13% of Group I). In most tissues, RD19 and RD21 subfamilies contribute the major PLCP transcripts, e.g., 72% in leafage, 62% in flower, and 54% in root. By contrast, subfamilies RD19 and ALP occupy 55% of the total PLCP transcripts in mature leaf, and the CEP subfamily occupies 55% of the total PLCP transcripts in seed. Several important JcPLCP genes were also identified for a certain tissue. JcRD19A presents the most expressed gene in flower bud and root, both occupying about 30% of the total PLCP transcripts. JcCEP1 presents the most expressed gene in seed, occupying about 55% of the total PLCP transcripts, though its expression level is relatively low in flower bud, leafage and mature leaf. In leafage, the transcript level of JcRD21A, JcRD19A, JcRD19B and JcALP1 is considerably high, where JcALP1 also presents the most expressed gene in mature leaf (Fig. 6). These genes were also clustered into four groups based on the tissue-specific expression pattern: Group 1 prefers to express in leafage, including 2 RD21s (JcRD21A and RcRD21C), 2 RD19s (JcRD19A and RcRD19B), 2 XBCP3 (JcXBCP3 and JcXBCP3L), 2 SAG12 (JcSAG12H1 and JcSAG12H2), JcALP1 and JcCTB1; Group 2 prefers to express in root (JcXCP1) or mature leaf (JcXCP2); Group 3 prefers to express in flower bud, including JcTHI1, 2 SAG12s (JcSAG12H4 and JcSAG12H7); and, Group 4 prefers to express in seed, including 2 CEP (JcXCP1 and JcXCP2), 4 SAG12s (JcSAG12H3, JcSAG12H5, JcSAG12H6, and JcSAG12H8), JcRD21B and JcRD19C.
Figure 6

Transcriptional profiling of 23 JcPLCP genes over various tissues. Color scale denotes FPKM normalized log10 transformed counts.

Transcriptional profiling of 23 JcPLCP genes over various tissues. Color scale denotes FPKM normalized log10 transformed counts.

Discussion

Small number but high diversity of PLCP family genes in castor bean and physic nut

WGDs are widespread and play an important role in the origin and diversification of the angiosperms[39,40]. Arabidopsis, an annual herb with a relatively short life cycle and small size, services as a popular model species for the study of plant biology and genetics. The relatively small diploid genome (approximately 135 Mb) made it the first plant to be sequenced, completed in December of the year 2000[41]. However, analysis of the Arabidopsis genome has revealed several unexpected secrets. During the last 120 million years, Arabidopsis was shown to have experienced three WGDs known as γ, β and α[22,42]. The γ event occurred at about 117 million years ago (Mya)[9], which is shared by all core eudicots, e.g. castor bean, physic nut, rubber, cassava (Manihot esculenta), poplar, papaya (Carica papaya), cacao (Theobroma cacao) and grapevine (Vitis vinifera)[6,8,24,25,43-47]. The β event occurred at 61–65 Mya, shortly after its divergence with its close species papaya[44], which is shared by the Cleomaceae plants[48]. The α event is Brassicaceae-specific, occurred within a window of 23–50 Mya[22,49]. Following these WGD events, the ancestral A. thaliana genome was hugely rearranged and gene copies have been massively lost, and almost half of the genome was lost since its divergence with A. lyrata at about 10–13 Mya[50]. The genome-wide analysis indicated that the Arabidopsis PLCP gene family is comprised of 31 members that can be divided into nine subfamilies based on sequence similarity[17]. As shown in Fig. 2, a relatively high number of Arabidopsis PLCP gene pairs were identified in most subfamilies. The 18 duplicates were shown to be resulted from different modes of gene duplication, i.e., γ (2), β (2), α (5), tandem (5), proximal (1) and transposed (3)[51] (see Supplementary Table S1). Despite containing more or comparable protein-coding loci than Arabidopsis (i.e. 27,416 in TAIR10), our genome-wide survey revealed that castor bean (i.e. 31,221) and physic nut (i.e. 27,172) encode relatively less PLCP genes, i.e. 26 or 23, respectively. The number occupies 0.08% of the total loci in both species, which is relatively smaller than 0.11% in Arabidopsis. The expression of all these genes was supported by available EST and/or RNA sequencing reads, suggesting that they may have function in these two species. Moreover, all PLCP genes in physic nut can be anchored to nine out of the 11 chromosomes based on available DNA markers[8]. Based on the phylogenetic analysis, these genes can be assigned to nine previously described subfamilies (i.e. RD21, CEP, XCP, XBCP3, THI, SAG12, RD19, ALP and CTB)[17]. Except for the SAG12 subfamily, one-to-one orthologous relationships were found between castor bean and physic nut, and conserved synteny between these two species can be clearly observed. Despite the castor bean genome is highly fragmented, we are able to anchor 25 out of 26 RcPLCP genes to eight physic nut chromosomes based on the synteny analysis (Supplementary Fig. S2). By contrast, gene-specific expansion was observed in the SAG12 subfamily, which can be further divided into two groups (i.e., SAG12 and PAP). The PAP group was shown to be lost in physic nut, but highly expanded in castor bean (i.e. 3 members) as observed in Arabidopsis (i.e. 5 members)[17]. The SAG12 group is comprised of a single gene with two introns in Arabidopsis, whereas eight members with a single intron were found in both castor bean and physic nut. As shown in Fig. 2, the SAG12 group is obviously clustered into four subgroups: Subgroup 1 (Ia) includes JcSAG12H1–2 and AtSAG12; Subgroup 2 (Ib) includes JcSAG12H3–6 and RcSAG12H1–3; Subgroup 3 (Ic) includes RcSAG12H4–5; and Subgroup 4 (Id) includes JcSAG12H7–8 and RcSAG12H6–8. Since members of all four subgroups can be found in other plant species (see below), castor bean and physic nut promise to have lost Ic or Ia subgroup members, respectively; and, the unique AtSAG12 is more likely to be the result of massive gene loss after WGDs. As for other subfamilies, most members in castor bean and physic nut were shown to harbor one to three orthologs in Arabidopsis, however, the orthologs of RcRD21B/JcRD21B, RcCEP2/JcCEP2 and RcXBCP3L/JcXBCP3L have also been lost in Arabidopsis. Thereby, it is probably safe to say that the ancestral Euphorbiaceae genome contained 20 PLCP family genes, i.e., three RD21s, two CEPs, two XCPs, two XBCP3s, one THI, five SAG12s, three RD19s, one ALP and one CTB.

Evolution of the PLCP gene family in castor bean and physic nut

As discussed above, the PLCP family genes in castor bean and physic nut promise to evolve from 20 ancestors, and gene-specific expansion and/or loss was shown to be restricted to the SAG12 subfamily. Although the exon-intron pattern between orthologs is highly conserved, RcXBCP3 has obtained an additional intron close to the 3′ untranslated region (Fig. 4). As expected, the sequence length and nucleotide substitution of introns are relatively more variable than that in exons between orthologs. Nevertheless, as shown in Fig. 2, the origin and evolution of subfamily members still needs to be resolved. The available genomes of several representative plants allow us to discuss this issue. When taking advantage of the castor bean, physic nut and Arabidopsis PLCP genes to trace their orthologs in these plants, we are able to find one RD21 and three CTBs in a single celled green alga, Chlamydomonas reinhardtii [52]; one RD21, one RD19, one ALP and one CTB in spikemoss (Selaginella moellendorffii), an ancient vascular species first appeared at about 400 Mya[53]; one RD21, one CEP, two XCP, one XBCP, four SAG12s, one RD19, one RD19, one ALP and one CTB in Amborella trichopoda, a sister species to all other flowering plants[54]; two RD21s, one CEP, two XCPs, one XBCP3, four SAG12s, two RD19s, one RD19, one ALP and one CTB in rice (Oryza sativa), a model species of monocotyledons[55]; one RD21, one CEP, two XCPs, two XBCP3 (including one XBCP3L), one THI, four SAG12s, one RD19, one RD19, one ALP and one CTB in Aquilegia coerulea, a basal species of most eudicot clade[56]. As shown in Fig. 2, the RD21 subfamily contains five OGs: Group I includes RcRD21A, JcRD21A, AtRD21A and AtRD21B; Group II includes RcRD21B and JcRD21B; Group III includes RcRD21C, JcRD21C and AtRDL1; Group IV includes AtRD21C, AtRDL2 and AtRDL3; and, Group V includes AtRDL4, AtRDL5 and AtRDL6 (Supplementary Table S2). Group I members are widely distributed, which can be traced back to C. reinhardtii [52]. The divergence of other groups is more likely to occur in the common ancestor of core eudicots, which was proven to experience the whole-genome triplication γ event[9]. Plant species not having undergone any recent WGD were found to contain one RD21A ortholog, one RD21B ortholog and one RD21C ortholog, e.g. papaya, cacao and grapevine[43-45]. And species such as poplar and cassava that have experienced one recent WGD[24,47] contain one or two orthologs for RD21A, RD21B and RD21C, respectively. However, Group II is more likely to be lost in Brassicaceae, e.g. A. thaliana, A. lyrata, Brassica rapa and B. oleracea [41,50,57,58]. In contrast, Groups IV and V were shown to be restricted to Brassicaceous plants probably resulted from the β event or fast evolution, though gene-specific expansion and/or loss were found in B. rapa and B. oleracea (Supplementary Table S2). The CTB subfamily can also be traced back to C. reinhardtii [52], and a single member was found in most plant species, especially those not having experienced recent WGDs. In contrast, gene expansion was found in C. reinhardtii, poplar, cassava and Brassicaceous plants, resulted from recent WGDs and local duplication[24,38,45,50]. In A. thaliana, AtCTB3 was produced from AtCTB2 via the α WGD, which is shared by A. lyrata, B. rapa and B. oleracea; AtCTB1 was produced from AtCTB2 via tandem duplication, which is only shared by A. lyrata [41,50,57,58] (Supplementary Tables S1 and 2). The CEP subfamily contains two OGs: Group I includes RcCEP1, JcCEP1, AtCEP1, AtCEP2 and AtCEP3; and, Group II includes RcCEP2 and JcCEP2. Group I members are relatively primitive, which can be traced back to A. trichopoda [54]. This group was shown to be highly expanded through WGD and local duplication in Brassicaceous plants. By contrast, Group II is more likely to be derived from the γ event, and gene-specific loss occurred in Brassicaceous plants[41,50,57,58] (Supplementary Table S2). The XCP subfamily also includes two OGs: Group I includes RcXCP1, JcXCP1 and AtXCP1; and, Group II includes RcXCP2, JcXCP2 and AtXCP2. Like the CEP subfamily, Group I of the XCP subfamily can also be traced back to A. trichopoda [54], while Group II is more likely to be resulted from the γ event (Supplementary Tables S1 and 2). The XBCP3 subfamily contains two OGs: Group I includes RcXBCP3, JcXBCP3 and AtXBCP3, which can be traced back to the ancestral angiosperm; and, Group II includes RcXBCP3L and JcXBCP3L, which can be traced back to A. coerulea, though specific gene loss occurred in Brassicaceous plants (Supplementary Table S2). Interesting, Group II is highly expanded in poplar, resulted from the recent WGD and local duplication[24] (Supplementary Table S2). The THI subfamily usually contains a single member, which can be traced back to the ancestor of eudicots. However, specific gene loss was found in poplar[24] (Supplementary Table S2). The RD19 subfamily contains three OGs: Group I includes RcRD19A, JcRD19A, AtRD19A and AtRD19B; Group II includes RcRD19B, JcRD19B and AtRD19C; and, Group III includes RcRD19C, JcRD19C and AtRD19D. Groups I and III can be traced back to spikemoss[53], whereas Group II is more likely to be resulted from the γ event (Supplementary Tables S1 and 2). In grapevine, Group II is highly expanded through local duplication[43]. The SAG12 subfamily contains two main groups, i.e. SAG12 and PAP. The origin of the PAP group is still not clear, since it was only found in castor bean, cassava as well as Brassicaceous plants (Supplementary Table S2), which belong to two distinct plant families. In Brassicaceae, the PAP group is highly expanded via the α WGD, tandem duplication and transposed duplication[51]. The SAG12 group can be further divided into four orthologous subgroups: Ia can be traced back to A. trichopoda, and has been lost in castor bean, papaya, cacao and grapevine; Ib is more likely to appear in the common ancestor of core eudicots along with the γ event, and has been lost in Brassicaceous plants; Ic may also appear along with the γ event, and has been lost in physic nut, cacao as well as Brassicaceous plants; and, Id is more likely to be Euphorbiaceae-specific[14,21] (Supplementary Table S2). In addition to gene copies and exon-intron structures, expression divergence was also observed between orthologs/paralogs. The transcript level of several OGs such as RcRD21A/JcRD21A/AtRD21A, RcRD19A/JcRD19A/AtRD19A, RcRD19B/JcRD19B/ AtRD19C, RcALP1/JcALP1/AtALP and RcCTB1/JcCTB1/AtCTB3 is highly abundant in leaf, flower and seed. In contrast, the paralogs of AtRD21A, AtRD19A, AtALP and AtCTB3, i.e., AtRD21B, AtRD19B, AtALP2 and AtCTB1/AtCTB2, are considerably less expressed, though they are also constitutively expressed in these tissues[59]. As for two OGs of subfamilies CEP and XCP, which were generated along with the γ event, expression divergence is even more obvious. Among them, RcCEP1/JcCEP1 and RcXCP2/JcXCP2 have become the dominated isoforms in all tested tissues (Figs 5 and 6). Similar expression pattern can also be observed for their orthologs in Arabidopsis, e.g. AtXCP1 and AtXCP2 [59]. It is worth noting that, RcTHI1 represents the most abundant gene in the male flower of castor bean, which is in accord with the expression pattern of its ortholog in Arabidopsis (i.e. AtTHI1); by contrast, the expression level of its ortholog in physic nut (i.e. JcTHI1) is considerably low in flower bud (Figs 5 and 6).

Conclusions

The first genome-wide analysis of PLCP family genes in castor bean and physic nut was performed in the present study, resulting in 26 or 23 members respectively. The phylogenetic analysis assigned them into nine subfamilies. Novel groups or subgroups, which are absent from Arabidopsis, were identified in RD21, CEP, XBCP3 and SAG12 subfamilies. Their orthologs are widely distributed in core eudicots, suggesting gene-specific loss occurred in Arabidopsis and other Brassicaceous plants. Moreover, the evolution characteristics of castor bean and physic nut PLCP family genes were also compared and discussed. Our findings provide a useful reference to characterize PLCP genes and analyze the family evolution in Euphorbiaceous plants and other species.

Materials and Methods

Identification and manual curation of PLCP genes in castor bean and physic nut

The genomic data for castor bean[6] and physic nut[8] were downloaded from Phytozome v11[60] or NCBI (http://www.ncbi.nlm.nih.gov/), respectively. 31 Arabidopsis PLCPs obtained from TAIR10[61] were used as queries for the homologous search. Sequences with an E-value of less than 1e−5 in the TBLASTN search[62] were collected, and the positive genomic sequences were predicted using GeneMark.hmm[63]. The predicted gene models were further confirmed with transcriptome data available in NCBI, including cDNAs, ESTs and RNA sequencing reads. The presence of the Peptidase_C1 domain in deduced proteins was checked using SMART[64]. Gene expression annotation and definition of tandem/proximal duplications were performed as described before[14,46]. The alternative splicing (AS) isoforms were identified using Cufflinks (v2.2.1)[65]. The BRH (Best Reciprocal Hits) method[66] was used to define orthologs across species, and the systematic name was assigned based on the best Arabidopsis ortholog.

Multiple sequence alignment and phylogenetic analysis

Sequence alignment of full-length PLCPs was conducted using MUSCLE[67]. The unrooted phylogenetic tree based on the alignment was constructed using MEGA6[68] with the maximum likelihood method, where the bootstrap was set to 1,000 replicates. Gene structures were displayed using GSDS (http://gsds.cbi.pku.edu.cn/).

Analysis of sequence feature and conserved motifs

Protein properties were determined using the online tool ProtParam (http://web.expasy.org/protparam/) and subcellular localization was predicted using iPSORT (http://ipsort.hgc.jp/). Conserved motifs were analyzed using the MEME Suite (http://meme-suite.org/tools/meme) as described before[14].

Gene expression analysis

Global gene expression profiles over various tissues were examined based on Illumina RNA sequencing data available in NCBI SRA, i.e., root (SRX750579), leafage (SRX750580), mature leaf (SRX1097498), flower bud (SRX1037655) and seed (SRX750581) in physic nut; expanding leaf (ERX021378), male developing flower (ERX021379), developing endosperm II/III (ERX021375), developing endosperm V/VI (ERX021376), developing seed (SRX485027) and germinating seed (ERX021377) in castor bean. Read mapping was performed using Bowtie 2[69], and the FPKM (fragments per kilobase of exon per million fragments mapped)[70] was used to determine the gene expression level as described before[46]. Supplementary information
  69 in total

1.  Aquilegia: a new model for plant development, ecology, and evolution.

Authors:  Elena M Kramer
Journal:  Annu Rev Plant Biol       Date:  2009       Impact factor: 26.379

2.  Integrated genome sequence and linkage map of physic nut (Jatropha curcas L.), a biodiesel plant.

Authors:  Pingzhi Wu; Changpin Zhou; Shifeng Cheng; Zhenying Wu; Wenjia Lu; Jinli Han; Yanbo Chen; Yan Chen; Peixiang Ni; Ying Wang; Xun Xu; Ying Huang; Chi Song; Zhiwen Wang; Nan Shi; Xudong Zhang; Xiaohua Fang; Qing Yang; Huawu Jiang; Yaping Chen; Meiru Li; Ying Wang; Fan Chen; Jun Wang; Guojiang Wu
Journal:  Plant J       Date:  2015-03       Impact factor: 6.417

3.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

Review 4.  C1A cysteine protease-cystatin interactions in leaf senescence.

Authors:  Mercedes Díaz-Mendoza; Blanca Velasco-Arroyo; Pablo González-Melendi; Manuel Martínez; Isabel Díaz
Journal:  J Exp Bot       Date:  2014-03-05       Impact factor: 6.992

5.  Polyploidy and angiosperm diversification.

Authors:  Douglas E Soltis; Victor A Albert; Jim Leebens-Mack; Charles D Bell; Andrew H Paterson; Chunfang Zheng; David Sankoff; Claude W Depamphilis; P Kerr Wall; Pamela S Soltis
Journal:  Am J Bot       Date:  2009-01       Impact factor: 3.844

6.  Molecular characterization of a senescence-associated gene encoding cysteine proteinase and its gene expression during leaf senescence in sweet potato.

Authors:  Guan-Hong Chen; Lin-Tzu Huang; Mee-Ngan Yap; Ruey-Hua Lee; Yih-Jong Huang; Ming-Chih Cheng; Shu-Chen Grace Chen
Journal:  Plant Cell Physiol       Date:  2002-09       Impact factor: 4.927

7.  The genome of the mesopolyploid crop species Brassica rapa.

Authors:  Xiaowu Wang; Hanzhong Wang; Jun Wang; Rifei Sun; Jian Wu; Shengyi Liu; Yinqi Bai; Jeong-Hwan Mun; Ian Bancroft; Feng Cheng; Sanwen Huang; Xixiang Li; Wei Hua; Junyi Wang; Xiyin Wang; Michael Freeling; J Chris Pires; Andrew H Paterson; Boulos Chalhoub; Bo Wang; Alice Hayward; Andrew G Sharpe; Beom-Seok Park; Bernd Weisshaar; Binghang Liu; Bo Li; Bo Liu; Chaobo Tong; Chi Song; Christopher Duran; Chunfang Peng; Chunyu Geng; Chushin Koh; Chuyu Lin; David Edwards; Desheng Mu; Di Shen; Eleni Soumpourou; Fei Li; Fiona Fraser; Gavin Conant; Gilles Lassalle; Graham J King; Guusje Bonnema; Haibao Tang; Haiping Wang; Harry Belcram; Heling Zhou; Hideki Hirakawa; Hiroshi Abe; Hui Guo; Hui Wang; Huizhe Jin; Isobel A P Parkin; Jacqueline Batley; Jeong-Sun Kim; Jérémy Just; Jianwen Li; Jiaohui Xu; Jie Deng; Jin A Kim; Jingping Li; Jingyin Yu; Jinling Meng; Jinpeng Wang; Jiumeng Min; Julie Poulain; Jun Wang; Katsunori Hatakeyama; Kui Wu; Li Wang; Lu Fang; Martin Trick; Matthew G Links; Meixia Zhao; Mina Jin; Nirala Ramchiary; Nizar Drou; Paul J Berkman; Qingle Cai; Quanfei Huang; Ruiqiang Li; Satoshi Tabata; Shifeng Cheng; Shu Zhang; Shujiang Zhang; Shunmou Huang; Shusei Sato; Silong Sun; Soo-Jin Kwon; Su-Ryun Choi; Tae-Ho Lee; Wei Fan; Xiang Zhao; Xu Tan; Xun Xu; Yan Wang; Yang Qiu; Ye Yin; Yingrui Li; Yongchen Du; Yongcui Liao; Yongpyo Lim; Yoshihiro Narusaka; Yupeng Wang; Zhenyi Wang; Zhenyu Li; Zhiwen Wang; Zhiyong Xiong; Zhonghua Zhang
Journal:  Nat Genet       Date:  2011-08-28       Impact factor: 38.330

8.  SMART: recent updates, new developments and status in 2015.

Authors:  Ivica Letunic; Tobias Doerks; Peer Bork
Journal:  Nucleic Acids Res       Date:  2014-10-09       Impact factor: 16.971

9.  Survey of the rubber tree genome reveals a high number of cysteine protease-encoding genes homologous to Arabidopsis SAG12.

Authors:  Zhi Zou; Jianting Liu; Lifu Yang; Guishui Xie
Journal:  PLoS One       Date:  2017-02-06       Impact factor: 3.240

10.  Global analysis of transcriptome responses and gene expression profiles to cold stress of Jatropha curcas L.

Authors:  Haibo Wang; Zhurong Zou; Shasha Wang; Ming Gong
Journal:  PLoS One       Date:  2013-12-09       Impact factor: 3.240

View more
  4 in total

1.  Identification and functional characterization of safflower cysteine protease 1 as negative regulator in response to low-temperature stress in transgenic Arabidopsis.

Authors:  Yanxi Lv; Jianyi Zhang; Naveed Ahmad; Hong Yingqi; Youbao Li; Nan Wang; Liu Xiuming; Yao Na; Xiaokun Li
Journal:  Planta       Date:  2022-04-21       Impact factor: 4.116

2.  Genome-wide identification and comparative evolutionary analysis of the Dof transcription factor family in physic nut and castor bean.

Authors:  Zhi Zou; Xicai Zhang
Journal:  PeerJ       Date:  2019-02-05       Impact factor: 2.984

3.  Genome-Wide Identification of Papain-Like Cysteine Proteases in Gossypium hirsutum and Functional Characterization in Response to Verticillium dahliae.

Authors:  Shuqin Zhang; Zhongping Xu; Heng Sun; Longqing Sun; Muhammad Shaban; Xiyan Yang; Longfu Zhu
Journal:  Front Plant Sci       Date:  2019-02-20       Impact factor: 5.753

4.  Light-harvesting chlorophyll a/b-binding protein-coding genes in jatropha and the comparison with castor, cassava and arabidopsis.

Authors:  Yongguo Zhao; Hua Kong; Yunling Guo; Zhi Zou
Journal:  PeerJ       Date:  2020-01-28       Impact factor: 2.984

  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.