Gene Structures, Evolution, Classification and Expression Profiles of the Aquaporin Gene Family in Castor Bean (Ricinus communis L.).

Zhi Zou1, Jun Gong1, Qixing Huang2, Yeyong Mo1, Lifu Yang1, Guishui Xie1.   

Abstract

Aquaporins (AQPs) are a class of integral membrane proteins that facilitate the passive transport of water and other small solutes across biological membranes. Castor bean (Ricinus communis L., Euphorbiaceae), an important non-edible oilseed crop, is widely cultivated for industrial, medicinal and cosmetic purposes. Its recently available genome provides an opportunity to analyze specific gene families. In this study, a total of 37 full-length AQP genes were identified from the castor bean genome and assigned to five subfamilies on the basis of sequence similarities: 10 plasma membrane intrinsic proteins (PIPs), 9 tonoplast intrinsic proteins (TIPs), 8 NOD26-like intrinsic proteins (NIPs), 6 X intrinsic proteins (XIPs) and 4 small basic intrinsic proteins (SIPs). Functional prediction based on the analysis of the aromatic/arginine (ar/R) selectivity filter, Froger's positions and specificity-determining positions (SDPs) showed remarkable differences in substrate specificity among subfamilies. Homology analysis supported the expression of all 37 RcAQP genes in at least one of the examined tissues, e.g., root, leaf, flower, seed and endosperm. Furthermore, global expression profiles based on deep transcriptome sequencing data revealed diverse expression patterns among various tissues. The current study presents the first genome-wide analysis of the AQP gene family in castor bean. Results obtained from this study provide valuable information for future functional analysis and utilization.

Year:  2015        PMID: 26509832      PMCID: PMC4625025          DOI: 10.1371/journal.pone.0141022

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Aquaporins (AQPs) are a special class of integral membrane proteins that belong to the ancient major intrinsic protein (MIP) superfamily [1,2]. Although they first attracted considerable interest for their high permeability to water, increasing evidence has shown that some of them also transport certain small molecules, e.g., glycerol, urea, boric acid, silicic acid, ammonia (NH3), carbon dioxide (CO2) and hydrogen peroxide (H2O2) [3,4]. Since the first AQP gene was reported in human erythrocytes [5], homologs have been identified over the past three decades in all types of organisms, including eubacteria, archaea, fungi, animals and plants [2,4]. The AQPs are characterized by six transmembrane helices (TM1–TM6) connected by five loops (i.e. LA–LE), as well as two highly conserved NPA motifs located at the N-termini of two half helices (i.e. HB and HE) in LB and LE. The NPA motifs, which create an electrostatic repulsion of protons and act as a size barrier, form one selectivity region of the pore. A second region, the aromatic/arginine (ar/R) selectivity filter (i.e. H2 in TM2, H5 in TM5, LE1 and LE2 in LE), renders the pore constriction site diverse in both size and hydrophobicity and determines the substrate specificity [6-8]. Based on statistical analysis, Froger et al. proposed five conserved amino acid residues (called Froger’s positions, P1–P5) for discriminating glycerol-transporting aquaglyceroporins (GLPs) from water-conducting AQPs: GLPs usually feature an aromatic residue at P1, an acidic residue at P2, a basic residue at P3, and a proline followed by a nonaromatic residue at P4 and P5, as in Y108-D207-K211-P236-L237 of the Escherichia coli glycerol facilitator GlpF, in contrast to A103-S190-A194-F208-W209 in the pure water channel AqpZ [9].
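The rule proposed by Froger et al. can be written as a simple check on the five residues. The following is only a minimal sketch: the function name and the residue classes (F/W/Y as aromatic, D/E as acidic, K/R/H as basic) are assumptions made for illustration and are not taken from the original study.

```python
# Residue classes assumed for this sketch (one-letter amino acid codes).
AROMATIC = set("FWY")
ACIDIC = set("DE")
BASIC = set("KRH")


def looks_like_glp(p1, p2, p3, p4, p5):
    """Return True if residues P1-P5 follow the GLP-like pattern:
    aromatic, acidic, basic, proline, nonaromatic."""
    return (
        p1 in AROMATIC
        and p2 in ACIDIC
        and p3 in BASIC
        and p4 == "P"
        and p5 not in AROMATIC
    )


# Residues quoted in the text for E. coli GlpF and AqpZ.
glpf = ("Y", "D", "K", "P", "L")
aqpz = ("A", "S", "A", "F", "W")

print(looks_like_glp(*glpf))  # True
print(looks_like_glp(*aqpz))  # False
```

The pattern describes what GLPs usually show, so a failed check is a hint about substrate preference rather than proof of it.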
More recently, based on the analysis of structurally resolved and/or functionally characterized AQPs, nine specificity-determining positions (SDPs) for non-aqua substrates, i.e., urea, boric acid, silicic acid, NH3, CO2 and H2O2, were also predicted for each group [10]. Surveys at the global genome level indicate that the AQPs in terrestrial plants are especially abundant and diverse [11-15]. Based on sequence similarities, plant AQPs are divided into seven main subfamilies, i.e., plasma membrane intrinsic proteins (PIPs), tonoplast intrinsic proteins (TIPs), NOD26-like intrinsic proteins (NIPs), small basic intrinsic proteins (SIPs), uncategorized X intrinsic proteins (XIPs), GlpF-like intrinsic proteins (GIPs) and hybrid intrinsic proteins (HIPs) [16-19]. The first four subfamilies are widely distributed, whereas GIPs and HIPs are found only in algae and moss, and XIPs in moss and several dicots including poplar (Populus trichocarpa) [11-20]. Castor bean (Ricinus communis L., 2n = 20) is an annual or perennial shrub that belongs to the Euphorbiaceae family. The castor oil extracted from the seeds is an important raw material used for industrial, medicinal and cosmetic purposes [21]. Although it originated in Africa, castor bean is now cultivated in many tropical, subtropical and warm temperate regions around the world, especially under unfavorable conditions such as barren soil, drought and salt stress [22-24]. The recent completion of its draft genome and the available transcriptome datasets provide an opportunity to analyze specific gene families in castor bean [25-32]. In the present study, a genome-wide search was performed to identify the castor bean AQP (RcAQP) genes. Furthermore, functional prediction was performed based on the ar/R filter, Froger’s positions and SDPs [6,7,9], and the expression profiles were examined using deep transcriptome sequencing data.
Results obtained from this study provide global information in understanding the molecular basis of the AQP gene family in castor bean.

Methods

Identification of RcAQP Genes

The genome sequences of castor bean [25] were downloaded from Phytozome v9.1 (http://www.phytozome.net/), whereas the nucleotides, Sanger ESTs (expressed sequence tags) and raw RNA sequencing reads were downloaded from NCBI (http://www.ncbi.nlm.nih.gov/). The amino acid sequences of Arabidopsis (Arabidopsis thaliana) and poplar AQPs obtained from Phytozome v9.1 (the accession numbers are available in S1 Table) were used as queries to search for castor bean AQP homologs. Sequences with an E-value of less than 1E-5 in the tBlastn search [33] were selected for further analysis. The predicted gene models were further validated with cDNAs, ESTs and raw RNA sequencing reads derived from various tissues/organs such as root, stem, leaf, flower, seed, embryo, endosperm and callus [25-32]. Gene structures were displayed using GSDS [34]. Homology searches for nucleotides, ESTs or RNA sequencing reads were performed using Blastn [33], and sequences with a similarity of more than 98% were taken into account.
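The two cutoffs described above (E-value below 1E-5 for tBlastn hits, more than 98% similarity for Blastn hits) can be applied to BLAST results in a few lines. This is a hedged sketch only: it assumes the hits were saved in BLAST+ tabular format with the 12 default columns, it uses the percent-identity column as a stand-in for "similarity", and the function name and example hit lines are hypothetical.

```python
def filter_hits(lines, max_evalue=1e-5, min_identity=None):
    """Keep BLAST tabular hits below an E-value cutoff.

    Each line is assumed to hold the 12 default tab-separated columns:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    kept = []
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident = float(fields[2])
        evalue = float(fields[10])
        if evalue >= max_evalue:
            continue  # not strictly below the cutoff
        if min_identity is not None and pident <= min_identity:
            continue  # identity must exceed the threshold
        kept.append((fields[0], fields[1], pident, evalue))
    return kept


# Hypothetical hit lines, for illustration only.
example = [
    "queryA\tscaffold_X\t85.0\t280\t42\t0\t1\t280\t100\t939\t1e-150\t500",
    "queryA\tscaffold_Y\t30.0\t50\t35\t0\t1\t50\t10\t159\t0.5\t25",
]
print(filter_hits(example))
```

Passing `min_identity=98.0` reproduces the stricter threshold used for the nucleotide, EST and read searches.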

Sequence Alignments and Phylogenetic Analysis

Multiple sequence alignments of deduced amino acid sequences were performed with ClustalX [35], and the unrooted phylogenetic tree was constructed by the maximum likelihood method using MEGA 6.0 [36]. The reliability of branches in the resulting trees was assessed with 1,000 bootstrap resamplings. Classification of AQPs into subfamilies and subgroups was done as described before [16,19].
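For readers unfamiliar with bootstrap support, the resampling step can be sketched as follows. This illustrates the general idea only (alignment columns are drawn with replacement to build each pseudo-replicate) and is not a description of how MEGA implements it; the function name and the toy alignment are hypothetical.

```python
import random


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate by sampling alignment columns
    with replacement; every sequence keeps its original length."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


# Toy alignment of three equal-length sequences (illustrative only).
toy = ["MSKEV", "MAKEV", "MSKDI"]
rng = random.Random(0)
replicates = [bootstrap_columns(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

A tree is then inferred from each replicate, and the support for a branch is the fraction of replicate trees that contain it.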

Structural Features of Putative RcAQPs

Biochemical features of RcAQPs were determined using ProtParam (http://web.expasy.org/protparam/). The protein subcellular localization was predicted using WoLF PSORT [37] and Plant-mPLoc [38]. The transmembrane regions were detected using TOPCONS [39]. Functional prediction was performed on the basis of dual NPA motifs, ar/R filter (H2, H5, LE1 and LE2), Froger’s positions (P1–5) and specificity-determining positions (SDP1–9) from alignments with the structure resolved Spinacia oleracea PIP2;1 and functionally characterized AQPs [8-10].

Gene Expression Analyses

To analyze the global expression profiles of RcAQP genes among different tissues or developmental stages, RNA sequencing data of leaf, flower, endosperm (II/III, V/VI) and seed described before [30] were examined: the expanding true leaves, appearing after the first cotyledons and leaf-pair, represent the leaf tissue; the male flower tissue includes pollen and anthers but excludes sepals; the germinating seed tissue was obtained by soaking dry seeds in running water overnight followed by germination in the dark for 3 days; and the endosperm tissue includes two representative stages termed stages II/III (endosperm free-nuclear stage) and V/VI (onset of cellular endosperm development). The clean reads were obtained by removing adaptor sequences, adaptor-only reads, reads with an “N” rate larger than 10% (“N” representing ambiguous bases) and low quality reads containing more than 50% bases with Q-value≤5. Then, the clean reads were mapped to the 37 identified RcAQP genes (cDNA) and released transcripts using Bowtie 2 [40], and the mapped reads were counted. The abundance of each transcript can be measured by its read count. To remove technical biases inherent in the sequencing approach, the widely used RPKM (reads per kilobase per million mapped reads) method [41], which corrects for the length of the RNA species and the sequencing depth of a sample, was adopted for the expression annotation. Unless otherwise stated, the tools used in this study were run with default parameters.
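The read-cleaning thresholds and the RPKM normalization described above can be sketched as follows. The function names are hypothetical, and the sketch assumes quality values are already decoded to integers; it is not the pipeline used in the study. RPKM is computed as 10^9 × C / (N × L), where C is the number of reads mapped to a transcript, N the total number of mapped reads in the sample and L the transcript length in bp.

```python
def is_clean_read(seq, quals, max_n_frac=0.10, low_q=5, max_low_frac=0.50):
    """Return True if a read passes both filters: an "N" rate of at
    most 10% and at most 50% of bases with quality <= 5."""
    if not seq or len(seq) != len(quals):
        return False
    n_frac = seq.upper().count("N") / len(seq)
    low_frac = sum(q <= low_q for q in quals) / len(quals)
    return n_frac <= max_n_frac and low_frac <= max_low_frac


def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return 1e9 * read_count / (total_mapped_reads * transcript_length_bp)


# 1,000 reads on a 1,000-bp transcript in a library of 1 million mapped reads.
print(rpkm(1000, 1000, 1_000_000))  # 1000.0
```

Dividing by transcript length and library size is what makes values comparable between genes of different lengths and between samples of different sequencing depth.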

Results

Identification and Classification of RcAQP Genes

Via a comprehensive homology analysis, a total of 37 loci putatively encoding AQP-like genes were identified from the castor bean genome, corresponding to 36 loci in the genome annotation (Table 1) [25]. Among them, the locus 28747.t000001 was predicted to encode 243 residues (28747.m000131) with a gene length of 11054 bp (deficient in part of the 3’ sequence); however, EST and read mapping supported the existence of two genes, denoted RcXIP2;1 and RcXIP3;1, both of which harbor one intron and putatively encode 306 and 304 residues, respectively (see S1 File). In addition, although most predicted gene models were validated with ESTs and RNA sequencing reads, three loci (i.e. 28962.t000006, 30101.t000004 and 29816.t000013) seem not to be properly annotated. In Phytozome v9.1, the locus 28962.t000006 was predicted to harbor two introns and encode 198 residues (28962.m000437), which is shorter than any other PIP subfamily member; however, two nucleotide sequences (accession numbers HB466472 and HB466473) [42] and three ESTs (accession numbers EE257493, EE260412 and EE258867) suggested that this locus harbors four introns and encodes 270 or fewer residues depending on the alternative splicing isoform (at least four) (see S2 File), and the longest transcript, consistent with other PIPs, was selected for further analyses. The locus 30101.t000004 was predicted to encode 203 residues (30101.m000372); however, EST mapping indicated that a 143-bp coding sequence close to the second intron is absent from the assembled genome, which was further validated with PCR amplification and sequencing (see S3 File). The locus 29816.t000013 was predicted to harbor seven introns and encode 367 residues (29816.m000676), which is considerably longer than any other NIP; however, thousands of RNA sequencing reads indicated that this locus harbors only four introns and putatively encodes 269 residues (see S4 File).
Although the current draft genome of castor bean comprises 25,763 scaffolds that are not anchored to the 10 chromosomes [25], we still observed that 8 scaffolds harbor two AQP-encoding loci, whereas the other 21 scaffolds contain only one (Table 1).
Table 1

List of the 37 RcAQP genes identified in this study.

Name | Phytozome position | Locus ID | Transcript ID | Identified position | Gene length (bp, from start to stop codons) | CDS length (bp) | Intron no. | EST hits in GenBank | Alternative splicing | Phytozome ID of AtAQP ortholog | Phytozome ID of PtAQP ortholog
RcPIP1;1 | 29969: 73673–75042 | 29969.t000006 | 29969.m000266 | 29969: 73604–75299 | 1370 | 867 | 3 | 0 | - | AT4G00430 | Pt_0010s19930
RcPIP1;2 | 29669: 103956–102272 | 29669.t000017 | 29669.m000808 | 29669: 104366–101932 | 1304 | 864 | 3 | 18 | - | AT4G00430 | Pt_0003s12870
RcPIP1;3 | 29669: 109461–107998 | 29669.t000018 | 29669.m000809 | 29669: 109461–107553 | 1243 | 867 | 3 | 14 | - | AT4G00430 | Pt_0003s12870
RcPIP1;4 | 30190: 2676316–2679326 | 30190.t000465 | 30190.m011229 | 30190: 2676127–2679732 | 2648 | 864 | 3 | 109 | Yes | AT4G00430 | Pt_0003s12870
RcPIP1;5 | 30174: 1747792–1749693 | 30174.t000011 | 30174.m008614 | 30174: 1747452–1750093 | 1712 | 861 | 3 | 10 | - | AT4G00430 | Pt_0008s06580
RcPIP2;1 | 30078: 847025–848540 | 30078.t000130 | 30078.m002337 | 30078: 846884–849450 | 1185 | 867 | 3 | 155 | Yes | AT2G37170 | Pt_0008s03950
RcPIP2;2 | 27516: 79785–82924 | 27516.t000007 | 27516.m000174 | 27516: 79695–83007 | 2665 | 852 | 3 | 14 | Yes | AT2G37170 | Pt_0010s22950
RcPIP2;3 | 28076: 21501–23698 | 28076.t000002 | 28076.m000411 | 28076: 21259–23855 | 1939 | 864 | 3 | 2 | - | AT5G60660 | Pt_0009s01940
RcPIP2;4 | 29869: 700906–699306 | 29869.t000059 | 29869.m001194 | 29869: 701298–698970 | 1221 | 843 | 3 | 35 | Yes | AT2G16850 | Pt_0004s18240
RcPIP2;5 | 28962: 25218–23278 | 28962.t000006 | 28962.m000437 | 28962: 25344–23278 | 1654 | 813 | 4 | 3 | Yes | AT4G35100 | Pt_0005s11110
RcTIP1;1 | 30078: 302469–303701 | 30078.t000047 | 30078.m002254 | 30078: 302469–304194 | 871 | 756 | 1 | 307 | - | AT2G36830 | Pt_0006s12350
RcTIP1;2 | 28180: 101873–103187 | 28180.t000017 | 28180.m000392 | 28180: 101668–103298 | 1149 | 759 | 2 | 5 | - | AT4G01470 | Pt_0001s24200
RcTIP1;3 | 29788: 135388–136721 | 29788.t000021 | 29788.m000338 | 29788: 135388–137224 | 1070 | 759 | 2 | 60 | - | AT4G01470 | Pt_0001s24200
RcTIP1;4 | 29589: 301876–300570 | 29589.t000040 | 29589.m001261 | 29589: 301877–300492 | 1129 | 759 | 1 | 7 | - | AT3G26520 | Pt_0008s05050
RcTIP2;1 | 30146: 893208–891669 | 30146.t000119 | 30146.m003542 | 30146: 893065–891669 | 1300 | 747 | 2 | 47 | Yes | AT3G16240 | Pt_0003s04930
RcTIP2;2 | 30101: 54727–56622 | 30101.t000004 | 30101.m000372 | 30101: 54578–56981 | 1059 | 753 | 2 | 196 | Yes | AT3G16240 | Pt_0001s15700
RcTIP3;1 | 29681: 356771–358102 | 29681.t000071 | 29681.m001366 | 29681: 356430–358852 | 994 | 768 | 2 | 68 | Yes | AT1G17810 | Pt_0017s03540
RcTIP4;1 | 29794: 784441–783217 | 29794.t000118 | 29794.m003419 | 29794: 784479–783045 | 1017 | 744 | 2 | 1 | - | AT2G25810 | Pt_0006s25620
RcTIP5;1 | 30147: 1084776–1086016 | 30147.t000502 | 30147.m014231 | 30147: 1083709–1086062 | 1241 | 759 | 2 | 0 | - | AT3G47440 | Pt_0001s00690
RcNIP1;1 | 30026: 383318–385396 | 30026.t000052 | 30026.m001488 | 30026: 382992–385763 | 2079 | 816 | 4 | 1 | - | AT4G18910 | Pt_0004s06160
RcNIP2;1 | 27860: 4197–6794 | 27860.t000001 | 27860.m000040 | 27860: 3885–7450 | 2598 | 894 | 4 | 0 | - | - | Pt_0017s11960
RcNIP3;1 | 29908: 1225139–1223830 | 29908.t000084 | 29908.m006033 | 29908: 1225139–1222967 | 1310 | 849 | 4 | 0 | - | - | Pt_0002s09740
RcNIP4;1 | 29816: 132753–129974 | 29816.t000013 | 29816.m000676 | 29816: 132884–130426 | 1425 | 810 | 4 | 0 | - | AT5G37810 | Pt_0010s12330
RcNIP4;2 | 28827: 15568–18585 | 28827.t000002 | 28827.m000171 | 28827: 15568–18585 | 3018 | 759 | 4 | 0 | - | AT5G37820 | Pt_0017s03060
RcNIP5;1 | 30068: 792510–798491 | 30068.t000130 | 30068.m002640 | 30068: 792230–798545 | 4934 | 897 | 3 | 62 | - | AT4G10380 | Pt_0001s45920
RcNIP6;1 | 29588: 109897–113370 | 29588.t000015 | 29588.m000860 | 29588: 109896–113370 | 3320 | 927 | 4 | 1 | - | AT1G80760 | Pt_0003s17930
RcNIP7;1 | 29844: 994348–995605 | 29844.t000176 | 29844.m003330 | 29844: 993874–996212 | 1258 | 897 | 4 | 0 | - | AT3G06100 | Pt_0008s20750
RcXIP1;1 | 28929: 27007–26856 | 28929.t000003 | 28929.m000055 | 28929: 27059–25434 | 1195 | 930 | 1 | 20 | - | - | Pt_0009s13090
RcXIP1;2 | 28846: 5121–3392 | 28846.t000001 | 28846.m000048 | 28846: 5267–3386 | 1346 | 912 | 1 | 12 | - | - | Pt_0009s13090
RcXIP1;3 | 28846: 37430–36553 | 28846.t000002 | 28846.m000049 | 28846: 38157–36376 | 858 | 858 | 0 | 3 | - | - | Pt_0009s13090
RcXIP1;4 | 28929: 16870–16144 | 28929.t000002 | 28929.m000054 | 28929: 16870–16144 | 727 | 627 | 1 | 0 | - | - | Pt_0009s13090
RcXIP2;1 | 28747: 53725–42672 | 28747.t000001 | 28747.m000131 | 28747: 54568–52108 | 1302 | 921 | 2 | 18 | Yes | - | Pt_0009s13110
RcXIP3;1 |  |  |  | 28747: 44459–42629 | 1827 | 915 | 1 | 0 | - | - | Pt_0004s17430
RcSIP1;1 | 29950: 345324–340111 | 29950.t000061 | 29950.m001177 | 29950: 345535–339806 | 4822 | 720 | 2 | 4 | - | AT3G04090 | Pt_0019s04640
RcSIP1;2 | 30010: 75574–74870 | 30010.t000008 | 30010.m000658 | 30010: 76324–74304 | 705 | 705 | 0 | 0 | - | AT3G04090 | Pt_0014s15250
RcSIP1;3 | 30010: 77831–77112 | 30010.t000009 | 30010.m000659 | 30010: 78002–76765 | 720 | 720 | 0 | 0 | - | AT3G04090 | Pt_0014s15250
RcSIP2;1 | 30045: 458397–454436 | 30045.t000019 | 30045.m000487 | 30045: 458594–454223 | 3962 | 723 | 2 | 0 | - | AT3G56950 | Pt_0016s02560

To analyze the evolutionary relationships and putative functions of the RcAQPs, an unrooted phylogenetic tree was constructed from the deduced amino acid sequences of RcAQPs together with those from two model plant species, Arabidopsis (AtAQPs) [16] and poplar (PtAQPs) [19] (Fig 1). According to the phylogenetic analysis, the identified RcAQPs were grouped into five subfamilies, i.e., PIP (10), TIP (9), NIP (8), XIP (6) and SIP (4) (Table 1 and Fig 1). Following the nomenclature of Arabidopsis and poplar [16,19], the RcPIP subfamily was further divided into two phylogenetic subgroups (5 RcPIP1s and 5 RcPIP2s), the RcTIP subfamily into five subgroups (4 RcTIP1s, 2 RcTIP2s, 1 RcTIP3, 1 RcTIP4 and 1 RcTIP5), the RcNIP subfamily into seven subgroups (1 RcNIP1, 1 RcNIP2, 1 RcNIP3, 2 RcNIP4s, 1 RcNIP5, 1 RcNIP6 and 1 RcNIP7), the RcSIP subfamily into two subgroups (3 RcSIP1s and 1 RcSIP2) and the RcXIP subfamily into three subgroups (4 RcXIP1s, 1 RcXIP2 and 1 RcXIP3) (Fig 1). Although the closest homologs of RcNIP2;1 and RcNIP3;1 are not AtNIP2;1 or AtNIP3;1, their counterparts in poplar were identified; they were therefore named following the nomenclature of poplar [19]. As shown in Fig 1, several RcAQPs clustered together apart from their homologs in Arabidopsis and poplar, i.e., RcPIP1;2/RcPIP1;3/RcPIP1;4, RcPIP2;1/RcPIP2;2, RcSIP1;2/RcSIP1;3, RcXIP1;1/RcXIP1;2/RcXIP1;3/RcXIP1;4. Most of these genes form tandem pairs on the same scaffold, i.e., RcPIP1;2/RcPIP1;3 on scaffold29669, RcSIP1;2/RcSIP1;3 on scaffold30010, RcXIP1;1/RcXIP1;4 on scaffold28929 and RcXIP1;2/RcXIP1;3 on scaffold28846. Without exception, the two genes of each of these four pairs lie in the same orientation (foot-to-head order), and the intergenic spacers are about 3.1 kb, 0.5 kb, 8.5 kb and 31.1 kb, respectively (Table 1).
Homology analysis indicated that the 37 RcAQPs have 30 counterparts in poplar, whereas only 29 of them have orthologs in Arabidopsis, corresponding to 20 AtAQPs (Table 1 and Fig 1).
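The intergenic spacers quoted for the four tandem pairs can be approximated from the identified positions in Table 1. The sketch below is illustrative: the function name is hypothetical, and it simply takes the distance between the nearest ends of the two gene intervals on the same scaffold, which may differ slightly from the way the authors measured the spacers.

```python
# Identified positions (as listed in Table 1) for the four tandem pairs.
PAIRS = {
    "RcPIP1;2/RcPIP1;3": ((104366, 101932), (109461, 107553)),
    "RcSIP1;2/RcSIP1;3": ((76324, 74304), (78002, 76765)),
    "RcXIP1;1/RcXIP1;4": ((27059, 25434), (16870, 16144)),
    "RcXIP1;2/RcXIP1;3": ((5267, 3386), (38157, 36376)),
}


def spacer_bp(gene_a, gene_b):
    """Distance between the nearest ends of two non-overlapping genes."""
    first, second = sorted((sorted(gene_a), sorted(gene_b)))
    return second[0] - first[1]


for pair, (a, b) in PAIRS.items():
    print(pair, round(spacer_bp(a, b) / 1000, 1), "kb")
```

The resulting distances (3187, 441, 8564 and 31109 bp) are close to the spacer sizes reported in the text.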
Fig 1

Phylogenetic analysis of deduced amino acid sequences of the 37 RcAQPs with Arabidopsis and poplar homologs.

Deduced amino acid sequences were aligned using ClustalX and the phylogenetic tree was constructed using bootstrap maximum likelihood tree (1000 replicates) method and MEGA6 software. The distance scale denotes the number of amino acid substitutions per site. The name of each subfamily and subgroup is indicated next to the corresponding group. Species and accession numbers are listed in Table 1 and S1 Table.

Sequence alignments indicated that the complete cDNA sequences of 18 RcAQP genes (i.e., RcPIP1;1–RcPIP1;5, RcPIP2;1–RcPIP2;5, RcTIP1;1–RcTIP1;3, RcTIP2;1, RcTIP4;1, RcNIP1;1, RcNIP5;1 and RcNIP6;1) were mentioned in three patents and one academic paper [42-45]. Except for RcPIP1;1, RcPIP2;1 and RcTIP1;1, whose water transport activity and expression pattern along the hypocotyl axis were investigated [45], the expression profiles and precise roles of the other RcAQP genes have not yet been reported. Homology search showed that 25 RcAQP genes had EST hits in the NCBI GenBank database (as of Oct 2014), and RcTIP1;1 matched the maximum number of 304 ESTs. Furthermore, read alignments against RNA sequencing data of root, leaf, flower, seed and endosperm supported the expression of the other 12 RcAQP genes. In addition, alternative splicing isoforms existing in 9 RcAQP-encoding loci were supported by Sanger ESTs (Table 1).

Analysis of Exon-Intron Structure

The exon-intron structures of the 37 RcAQP genes were analyzed based on the optimized gene models. Although the ORF (open reading frame) lengths of the genes are similar (627–930 bp), the gene sizes (from start to stop codons) differ considerably (705–4934 bp) (Table 1 and Fig 2). The introns of RcAQP genes have an average length of about 380 bp, with a minimum of 46 bp in RcPIP2;5 and a maximum of 3360 bp in RcNIP5;1. Genes in different subfamilies harbor distinct exon-intron structures. RcPIP subfamily members feature three introns (74–537 bp, 90–1401 bp and 68–141 bp, respectively), except for RcPIP2;5, which harbors one additional small intron close to its 5’ terminus. Most RcTIPs contain two introns (117–401 bp and 81–429 bp, respectively); in contrast, RcTIP1;1 and RcTIP1;4 contain only one. Most RcNIPs harbor four introns (91–343 bp, 82–910 bp, 86–1469 bp and 91–650 bp, respectively); however, RcNIP5;1 contains three instead. Two out of three RcSIP1s harbor no introns; in contrast, RcSIP1;1 and RcSIP2;1 contain two. Most RcXIPs contain one intron except for RcXIP1;4 and RcXIP2;1 that contain zero or two, respectively (Table 1 and Fig 2).
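The reported average intron length can be cross-checked from Table 1, because the gene length (start to stop codon) minus the CDS length equals the combined length of a gene's introns. The following sketch is illustrative: the values are the gene length, CDS length and intron number as read from Table 1, and the variable and function names are hypothetical.

```python
# name: (gene length bp, CDS length bp, intron count), as read from Table 1.
GENES = {
    "RcPIP1;1": (1370, 867, 3), "RcPIP1;2": (1304, 864, 3),
    "RcPIP1;3": (1243, 867, 3), "RcPIP1;4": (2648, 864, 3),
    "RcPIP1;5": (1712, 861, 3), "RcPIP2;1": (1185, 867, 3),
    "RcPIP2;2": (2665, 852, 3), "RcPIP2;3": (1939, 864, 3),
    "RcPIP2;4": (1221, 843, 3), "RcPIP2;5": (1654, 813, 4),
    "RcTIP1;1": (871, 756, 1), "RcTIP1;2": (1149, 759, 2),
    "RcTIP1;3": (1070, 759, 2), "RcTIP1;4": (1129, 759, 1),
    "RcTIP2;1": (1300, 747, 2), "RcTIP2;2": (1059, 753, 2),
    "RcTIP3;1": (994, 768, 2), "RcTIP4;1": (1017, 744, 2),
    "RcTIP5;1": (1241, 759, 2), "RcNIP1;1": (2079, 816, 4),
    "RcNIP2;1": (2598, 894, 4), "RcNIP3;1": (1310, 849, 4),
    "RcNIP4;1": (1425, 810, 4), "RcNIP4;2": (3018, 759, 4),
    "RcNIP5;1": (4934, 897, 3), "RcNIP6;1": (3320, 927, 4),
    "RcNIP7;1": (1258, 897, 4), "RcXIP1;1": (1195, 930, 1),
    "RcXIP1;2": (1346, 912, 1), "RcXIP1;3": (858, 858, 0),
    "RcXIP1;4": (727, 627, 1), "RcXIP2;1": (1302, 921, 2),
    "RcXIP3;1": (1827, 915, 1), "RcSIP1;1": (4822, 720, 2),
    "RcSIP1;2": (705, 705, 0), "RcSIP1;3": (720, 720, 0),
    "RcSIP2;1": (3962, 723, 2),
}


def intron_summary(genes):
    """Return total intron bp, intron count and mean intron length."""
    total_bp = sum(gene - cds for gene, cds, _ in genes.values())
    n_introns = sum(n for _, _, n in genes.values())
    return total_bp, n_introns, total_bp / n_introns


total_bp, n_introns, mean_bp = intron_summary(GENES)
print(total_bp, n_introns, round(mean_bp))  # 33931 88 386
```

The mean of roughly 386 bp over 88 introns agrees with the "about 380 bp" stated in the text.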
Fig 2

Exon-intron structures of the 37 RcAQP genes.

The graphic representation of the gene models is displayed using GSDS.


Structural Features of RcAQPs

Sequence analysis showed that the 37 deduced RcAQPs consist of 208–309 amino acids, with a theoretical molecular weight of 22.49–33.22 kDa and a pI value of 4.93–9.97. Homology analysis of these deduced proteins revealed a high sequence diversity existing within and between the five subfamilies. The sequence similarity of 65.2–99.7% was found within RcPIPs, 57.3–93.3% within RcTIPs, 37.6–90.6% within RcXIPs, 33.8–74.6% within RcNIPs, and 39.4–75.1% within RcSIPs. RcPIPs share the highest sequence similarity of 37.1–45.1% with RcTIPs, 25.6–42.1% with RcXIPs, 22.8–34.9% with RcNIPs, and the lowest of 23.7–30.1% with RcSIPs. RcTIPs show 27.2–42.8%, 20.4–36.4% and 27.4–37.3% sequence similarity with RcNIPs, RcXIPs and RcSIPs, respectively. RcNIPs share a similarity of 16.6–34.5% and 19.3–30.7% with RcXIPs and RcSIPs, whereas RcSIPs share the similarity of 17.7–27.9% with RcXIPs (see S2 Table). Topological analysis showed that almost all RcAQPs were predicted to harbor six TMs except for RcXIP1;4 containing only four, which is consistent with the results from multiple alignments with structure proven AQPs (Table 2 and S5 File). The subcellular localization of each RcAQP was also predicted (Table 2). RcPIPs with an average pI value of 7.89 and RcNIPs with an average pI value of 8.44 are localized to plasma membranes. RcTIPs with an average pI value of 5.63 are mainly localized to vacuoles, though several members (RcTIP3;1 and RcTIP4;1) were mispredicted to target the cytosol by WoLF PSORT. RcSIPs (with an average pI value of 9.79) and RcXIPs (with an average pI value of 7.63) are mainly predicted to target the plasma membrane by Plant-mPLoc, in contrast, the predicted localizations by WoLF PSORT are diverse, including the plasma membrane, chloroplast, vacuole, peroxisome and cytosol.
In addition, based on the multiple alignments with structure/function characterized AQPs, the conserved residues typical of dual NPA motifs, the ar/R selectivity filter, five Froger’s positions and nine SDPs were also identified (Tables 2 and 3, and S6 File).
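The subfamily-average pI values quoted above follow directly from the per-protein pI values in Table 2. A minimal sketch of that arithmetic is given below; the dictionary and variable names are hypothetical, and the values are copied from the pI column of Table 2.

```python
from statistics import mean

# pI values per subfamily, in the row order of Table 2.
PI = {
    "PIP": [7.00, 7.69, 7.65, 8.30, 8.83, 8.21, 8.29, 7.62, 8.99, 6.29],
    "TIP": [5.91, 5.35, 4.94, 5.13, 5.06, 4.93, 6.49, 6.12, 6.71],
    "NIP": [9.26, 9.02, 6.21, 8.59, 8.44, 8.88, 8.60, 8.54],
    "XIP": [6.92, 6.91, 7.94, 7.57, 8.22, 8.21],
    "SIP": [9.63, 9.97, 9.74, 9.82],
}

# Average pI per subfamily, rounded to two decimals as in the text.
averages = {name: round(mean(values), 2) for name, values in PI.items()}
print(averages)
```

The result reproduces the averages given in the text: 7.89 (PIP), 5.63 (TIP), 8.44 (NIP), 7.63 (XIP) and 9.79 (SIP).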
Table 2

Structural and subcellular localization analysis of the RcAQPs.

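Name | Len | Mw (kDa) | pI | TM a | Loc b | Loc c | Ar/R H2 | Ar/R H5 | Ar/R LE1 | Ar/R LE2 | NPA motif LB | NPA motif LE | Froger’s P1 | P2 | P3 | P4 | P5
RcPIP1;1 | 288 | 30.77 | 7.00 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | E | S | A | F | W
RcPIP1;2 | 287 | 30.69 | 7.69 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | E | S | A | F | W
RcPIP1;3 | 288 | 30.78 | 7.65 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | E | S | A | F | W
RcPIP1;4 | 287 | 30.67 | 8.30 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | E | S | A | F | W
RcPIP1;5 | 286 | 30.71 | 8.83 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | Q | S | A | F | W
RcPIP2;1 | 288 | 30.66 | 8.21 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | Q | S | A | F | W
RcPIP2;2 | 283 | 30.30 | 8.29 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | Q | S | A | F | W
RcPIP2;3 | 287 | 30.60 | 7.62 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | Q | S | A | F | W
RcPIP2;4 | 280 | 29.71 | 8.99 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | M | S | A | F | W
RcPIP2;5 | 270 | 28.56 | 6.29 | 6 | Plas | Plas | F | H | T | R | NPA | NPA | M | S | A | F | W
RcTIP1;1 | 251 | 25.90 | 5.91 | 6 | Vacu | Vacu | H | I | A | V | NPA | NPA | T | S | A | Y | W
RcTIP1;2 | 252 | 25.92 | 5.35 | 6 | Vacu | Vacu | H | I | A | V | NPA | NPA | T | S | A | Y | W
RcTIP1;3 | 252 | 25.80 | 4.94 | 6 | Vacu | Vacu | H | I | A | V | NPA | NPA | T | S | A | Y | W
RcTIP1;4 | 252 | 26.26 | 5.13 | 6 | Vacu | Vacu | H | I | A | V | NPA | NPA | T | S | A | Y | W
RcTIP2;1 | 248 | 25.03 | 5.06 | 6 | Vacu | Vacu | H | I | G | R | NPA | NPA | T | S | A | Y | W
RcTIP2;2 | 250 | 25.22 | 4.93 | 6 | Vacu | Vacu | H | I | G | R | NPA | NPA | T | S | A | Y | W
RcTIP3;1 | 255 | 27.11 | 6.49 | 6 | Cyto | Vacu | H | I | A | R | NPA | NPA | T | A | A | Y | W
RcTIP4;1 | 247 | 26.15 | 6.12 | 6 | Cyto | Vacu | H | I | A | R | NPA | NPA | T | S | A | Y | W
RcTIP5;1 | 252 | 25.92 | 6.71 | 6 | Vacu | Vacu | N | V | G | C | NPA | NPA | A | A | A | Y | W
RcNIP1;1 | 271 | 28.86 | 9.26 | 6 | Plas | Plas | W | V | A | R | NPA | NPA | F | S | A | Y | I
RcNIP2;1 | 297 | 31.49 | 9.02 | 6 | Plas | Plas | G | S | G | R | NPA | NPA | L | T | A | Y | I
RcNIP3;1 | 282 | 30.65 | 6.21 | 6 | Plas | Plas | W | A | A | R | NPA | NPA | F | S | A | F | I
RcNIP4;1 | 269 | 28.94 | 8.59 | 6 | Plas | Plas | W | V | A | R | NPA | NPA | F | T | A | Y | M
RcNIP4;2 | 252 | 26.39 | 8.44 | 6 | Plas | Plas | W | V | A | R | NPA | NPA | V | S | A | Y | I
RcNIP5;1 | 298 | 30.95 | 8.88 | 6 | Plas | Plas | A | I | G | R | NPS | NPV | F | T | P | Y | L
RcNIP6;1 | 308 | 31.85 | 8.60 | 6 | Plas | Plas | T | I | A | R | NPS | NPV | F | T | A | Y | L
RcNIP7;1 | 298 | 31.57 | 8.54 | 6 | Plas | Plas | A | V | G | R | NPA | NPA | Y | S | A | Y | I
RcXIP1;1 | 309 | 33.22 | 6.92 | 6 | Cyto | Plas | V | F | V | R | SPT | NPA | M | C | A | F | W
RcXIP1;2 | 303 | 32.55 | 6.91 | 6 | Chlo | Plas | V | F | V | R | SPT | NPA | M | C | A | F | W
RcXIP1;3 | 285 | 30.80 | 7.94 | 6 | Plas | Plas | V | F | V | R | SPA | NPA | M | C | V | F | W
RcXIP1;4 | 208 | 22.49 | 7.57 | 4 | Pero | Plas | V | - | - | - | SPV | - | M | - | - | - | -
RcXIP2;1 | 306 | 32.22 | 8.22 | 6 | Plas | Plas | I | F | V | R | NPV | NPA | V | C | A | F | W
RcXIP3;1 | 304 | 32.36 | 8.21 | 6 | Plas | Plas | V | Y | A | R | NPI | NPA | V | C | A | F | W
RcSIP1;1 | 239 | 25.92 | 9.63 | 6 | Plas | Plas | V | V | P | N | NPT | NPA | E | A | A | Y | W
RcSIP1;2 | 234 | 25.24 | 9.97 | 6 | Vacu | Plas/Vacu | A | A | T | N | NPN | NPA | Q | A | A | Y | W
RcSIP1;3 | 239 | 25.62 | 9.74 | 6 | Plas | Plas/Vacu | A | A | P | N | NPA | NPA | Q | V | A | Y | W
RcSIP2;1 | 240 | 26.05 | 9.82 | 6 | Chlo | Plas/Vacu | S | H | G | S | NPL | NPA | I | V | A | Y | W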

Ar/R: aromatic/arginine; Chlo: chloroplast; Cyto: cytosol; ER: endoplasmic reticulum; H2: transmembrane helix 2; H5: transmembrane helix 5; LE: loop E; Loc: subcellular localization; NPA: Asn-Pro-Ala; Plas: plasma membrane; TM: transmembrane helix; Vacu: vacuolar membrane.

a Representing the numbers of transmembrane helices predicted by TOPCONS.

b Best possible subcellular localization prediction by WoLF PSORT.

c Best possible subcellular localization prediction by Plant-mPLoc.

Table 3

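Summary of typical SDPs and those identified in the RcAQPs.

Aquaporin | SDP1 | SDP2 | SDP3 | SDP4 | SDP5 | SDP6 | SDP7 | SDP8 | SDP9
Typical NH3 transporter | F/T | K/L/N/V | F/T | V/L/T | A | D/S | A/H/L | E/P/S | A/R/T
RcTIP2;2 | T | K | T | V | A | S | A | P | S*
RcNIP1;1 | F | K | F | T | A | D | L | E | T
Typical boric acid transporter | T/V | I/V | H/I | P | E | I/L | I/L/T | A/T | A/G/K/P
RcPIP1;1 | T | I | H | P | E | L | L | T | P
RcPIP1;2 | T | I | H | P | E | L | L | T | P
RcPIP1;3 | T | I | H | P | E | L | L | T | P
RcPIP1;4 | T | I | H | P | E | L | L | T | P
RcPIP1;5 | T | I | H | P | E | L | L | T | P
RcNIP5;1 | T | I | H | P | E | L | L | A | P
Typical CO2 transporter | I/L/V | I | C | A | I/V | D | W | D | W
RcPIP1;1 | V | M* | C | A | I | D | W | D | W
RcPIP1;3 | L | M* | C | A | I | D | W | D | W
RcPIP1;4 | L | M* | C | A | I | D | W | D | W
RcPIP1;5 | I | M* | C | A | V | D | W | D | W
RcPIP2;2 | I | M* | C | A | V | D | W | D | W
RcPIP2;4 | V | M* | C | A | V | D | W | D | W
Typical H2O2 transporter | A/S | A/G | L/V | A/F/L/T/V | I/L/V | H/I/L/Q | F/Y | A/V | P
RcPIP1;1 | A | G | V | L | I | H | F | V | P
RcPIP1;2 | A | G | V | F | I | H | F | V | P
RcPIP1;3 | A | G | V | F | I | H | F | V | P
RcPIP1;4 | A | G | V | F | I | H | F | V | P
RcPIP1;5 | A | G | V | F | I | H | F | V | P
RcPIP2;1 | A | G | V | F | I | Q | F | V | P
RcPIP2;2 | A | G | V | F | I | Q | F | V | P
RcPIP2;3 | A | G | V | F | I | Q | F | V | P
RcPIP2;4 | A | G | V | F | I | H | F | V | P
RcPIP2;5 | A | G | V | F | I | H | F | V | P
RcTIP5;1 | S | A | L | A | I | Q | Y | V | P
RcNIP2;1 | A | A | L | L | V | I | Y | V | P
RcNIP3;1 | S | A | L | L | I | L | F | V | P
RcNIP4;2 | S | A | L | V | V | L | Y | V | P
RcNIP5;1 | S | A | L | V | V | L | Y | V | P
RcXIP1;1 | A | G | L | V | S* | H | F | V | P
RcXIP1;2 | A | G | L | V | S* | H | F | V | P
RcXIP1;3 | A | A | L | V | S* | H | F | V | P
Typical silicic acid transporter | C/S | F/Y | A/E/L | H/R/Y | G | K/N/T | R | E/S/T | A/K/P/T
RcNIP2;1 | S | F | V* | H | G | N | R | T | Q*
Typical urea transporter | H | P | F/I/L/T | A/C/F/L | L/M | A/G/P | G/S | G/S | N
RcPIP1;1 | H | P | F | F | L | P | G | G | N
RcPIP1;2 | H | P | F | F | L | P | G | G | N
RcPIP1;3 | H | P | F | F | L | P | G | G | N
RcPIP1;4 | H | P | F | F | L | P | G | G | N
RcPIP1;5 | H | P | F | L | L | P | G | G | N
RcPIP2;1 | H | P | L | F | L | P | G | G | N
RcPIP2;2 | H | P | F | F | L | P | G | G | N
RcPIP2;3 | H | P | F | F | L | P | G | G | N
RcPIP2;4 | H | P | F | F | L | P | G | G | N
RcPIP2;5 | H | P | F | L | L | P | G | G | N
RcTIP1;1 | H | P | F | F | L | A | G | S | N
RcTIP1;2 | H | P | F | F | L | A | G | S | N
RcTIP1;3 | H | P | F | F | L | A | G | S | N
RcTIP1;4 | H | P | F | F | L | A | G | S | N
RcTIP2;1 | H | P | F | A | L | P | G | S | N
RcTIP2;2 | H | P | F | A | L | P | G | S | N
RcTIP3;1 | H | P | F | L | L | P | G | S | N
RcTIP4;1 | H | P | L | L | L | P | G | S | N
RcTIP5;1 | H | P | F | A | L | P | G | S | N
RcNIP1;1 | H | P | L | A | L | P | G | S | N
RcNIP2;1 | H | P | T | A | M | P | G | S | N
RcNIP3;1 | H | P | I | A | L | P | G | S | N
RcNIP4;1 | H | P | I | A | L | P | G | S | N
RcNIP4;2 | H | P | I | A | L | P | G | S | N
RcNIP5;1 | H | P | I | A | L | P | G | S | N
RcNIP6;1 | H | P | I | A | L | E* | G | S | N
RcNIP7;1 | H | P | I | A | M | P | G | S | N

a The SDP residues in the castor bean AQPs differing from the typical SDPs determined in this study are marked with an asterisk (*).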
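Comparing a protein's nine SDP residues with a typical pattern from Table 3 amounts to a position-by-position membership test. The sketch below shows one way to do this; the function names and the space-separated pattern strings are assumptions made for illustration, with the patterns and residues copied from Table 3.

```python
def parse_pattern(pattern):
    """Turn 'C/S F/Y ...' into a list of allowed-residue sets."""
    return [set(position.split("/")) for position in pattern.split()]


def mismatched_sdps(residues, pattern):
    """Return the 1-based SDP positions where a residue is not allowed."""
    allowed = parse_pattern(pattern)
    if len(residues) != len(allowed):
        raise ValueError("expected one residue per SDP position")
    return [
        index + 1
        for index, (residue, options) in enumerate(zip(residues, allowed))
        if residue not in options
    ]


# Typical patterns from Table 3.
SILICIC_ACID = "C/S F/Y A/E/L H/R/Y G K/N/T R E/S/T A/K/P/T"
NH3 = "F/T K/L/N/V F/T V/L/T A D/S A/H/L E/P/S A/R/T"

print(mismatched_sdps("SFVHGNRTQ", SILICIC_ACID))  # RcNIP2;1 -> [3, 9]
print(mismatched_sdps("TKTVASAPS", NH3))           # RcTIP2;2 -> [9]
print(mismatched_sdps("FKFTADLET", NH3))           # RcNIP1;1 -> []
```

The positions returned for RcNIP2;1 and RcTIP2;2 are the ones flagged in Table 3 as differing from the typical SDPs.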


RcPIP Subfamily

All RcPIPs have similar sequence lengths; however, RcPIP2s (270–288 residues) can be distinguished from RcPIP1s (286–288 residues) by their relatively shorter N-terminal and longer C-terminal sequences (see S5 File). The five RcPIP1s share sequence similarities of 90.7–99.7%, whereas the similarities among the five RcPIP2s are 73.3–94.5%. Between RcPIP1 and RcPIP2 members, sequence similarities of 59.1–65.9% were observed (see S1 Table). The dual NPA motifs, the ar/R filter (F-H-T-R) and four of the five Froger’s positions are highly conserved in RcPIPs (Table 2). In contrast, the P1 position is more variable, with an E, Q or M residue (Table 2). In addition, two phosphorylation sites corresponding to S115 and S274 in Spinacia oleracea PIP2;1 [8] are invariable in RcPIP2s, and the former is highly conserved in all RcPIPs, RcTIPs and RcXIPs, and in most RcSIP1s, except for the S→T substitution in several members (see S5 File), implying their regulation by phosphorylation.

RcTIP Subfamily

RcTIPs consist of 247–255 residues. Those belonging to RcTIP1s (4 members, 251–252 residues) share 84.6–93.3% sequence similarities, whereas two RcTIP2s (248–250 residues) exhibit a similarity of 87.2% (Table 2 and S1 Table). Members of the RcTIP1 subgroup exhibit similarities of 71.5–76.6%, 70.0–72.7%, 67.1–69.7% and 60.7–63.0% with those of the RcTIP2, RcTIP3, RcTIP4 and RcTIP5 subgroups, respectively. RcTIP2s share similarities of 69.1–71.0%, 68.3–70.1% and 65.5–65.9% with RcTIP3, RcTIP4 and RcTIP5, respectively. RcTIP3;1 shares similarities of 63.4% and 60.1% with RcTIP4;1 and RcTIP5;1, respectively, whereas RcTIP4;1 shares 57.3% sequence similarity with RcTIP5;1 (see S1 Table). Dual NPA motifs and three Froger’s positions (i.e. P3, P4 and P5) are highly conserved in RcTIPs (Table 2); in contrast, residue substitutions were observed at the P1 and P2 positions: the usual T is replaced by A in RcTIP5;1 at the P1 position, and the S is replaced by A in RcTIP3;1 and RcTIP5;1 at the P2 position. Of the ar/R filter, the usual H residue at the H2 position and I at the H5 position are replaced by N and V in RcTIP5;1, respectively; at the LE1 position, members of the RcTIP1, RcTIP3 and RcTIP4 subgroups favor an A, whereas RcTIP2 and RcTIP5 members favor a G residue; residues at the LE2 position are more variable, including V, R or C (Table 2).

RcNIP Subfamily

RcNIPs consist of 252–308 residues (Table 2). Except for the RcNIP4 subgroup, which contains two members, each of the other six subgroups harbors a single one. The highest sequence similarity of 74.6% was observed between RcNIP5;1 and RcNIP6;1, followed by 61.7% between RcNIP1;1 and RcNIP4;2 and 60.0% between RcNIP1;1 and RcNIP3;1, and the lowest similarity of 33.8% was found between RcNIP4;1 and RcNIP7;1. Although both RcNIP4;1 and RcNIP4;2 harbor the same ar/R filter (W-V-A-R) (Table 2) and were clustered together (48.2% similarity) as shown in Fig 1, RcNIP4;2 exhibits higher sequence similarities with members of the other RcNIP subgroups (42.5–61.7%) than RcNIP4;1 does (33.8–46.3%). The RcNIPs harbor typical dual NPA motifs except for RcNIP5;1 and RcNIP6;1 with NPS-NPV (Table 2 and S5 File). Except for the LE2 position, RcNIPs are highly variable in the ar/R filter and Froger’s positions in comparison to other subfamilies: W/A/G/T at the H2 position, V/I/A/S at the H5 position, A/G at the LE1 position, F/L/V/Y at the P1 position, S/T at the P2 position, A/P at the P3 position, Y/F at the P4 position and I/L/M at the P5 position (Table 2 and S5 File). In addition, one CDPK phosphorylation site corresponding to S262 in Glycine max NOD26 [46] was also found in the C-terminus of RcNIP1;1 and RcNIP4;1 (see S5 File).

RcSIP Subfamily

The RcSIP subfamily has only four members in two subgroups, which consist of 234–240 residues (Table 2). RcSIP2;1 shares sequence similarities of 39.4–41.0% with RcSIP1s. Within the RcSIP1 subgroup, the highest sequence similarity of 75.1% was observed between RcSIP1;2 and RcSIP1;3, whereas the lowest similarity of 62.4% was observed between RcSIP1;1 and RcSIP1;2 (see S5 File). In RcSIPs, the residues of the second NPA motif (NPA) and the P3, P4 and P5 Froger’s positions (A-Y-W) are highly conserved, whereas other positions are more variable, i.e., NPA/NPN/NPT/NPL in the first NPA motif, A/V/S-A/V/H-T/P/G-N/S at the ar/R filter and Q/E/I-A/V at the P1 and P2 Froger’s positions (Table 2).

RcXIP Subfamily

RcXIPs vary from 208 to 309 residues in length (Table 2). With the exception of the RcXIP1 subgroup, which contains four members, the other two subgroups harbor only one member each. RcXIP2;1 shares 37.6–45.2% sequence similarity with RcXIP1s, which is considerably lower than that between RcXIP2;1 and RcXIP3;1 (76.2%), and slightly lower than that between RcXIP3;1 and RcXIP1s (39.5–47.1%) (see S2 Table). Among the four RcXIP1s, the highest sequence similarity (90.6%) was observed between RcXIP1;1 and RcXIP1;2, and the lowest (42.8%) between RcXIP1;3 and RcXIP1;4. Compared with other RcXIPs, RcXIP1;4 is relatively short (only 208 residues). Further sequence alignments indicated that RcXIP1;4 harbors only the first NPA motif and the H2 and P1 positions (Table 2 and S5 File). In RcXIP1;1, RcXIP1;2, RcXIP1;3, RcXIP2;1 and RcXIP3;1, the second NPA motif and the LE2, P2, P4 and P5 positions are highly conserved, whereas other positions are variable, i.e., SPT/SPA/NPV/NPI in the first NPA motif, V/I at the H2 position, F/Y at the H5 position, V/A at the LE1 position, M/V at the P1 position and A/V at the P3 position (Table 2). In addition, like most XIPs [11,47], two highly conserved C residues in the LG/AGC motif of LC and the NPARC motif of LE were also found in RcXIPs, except for RcXIP1;4, which lacks LE (see S5 File).

Expression Profiles of RcAQP Genes

To gain more information on the role of RcAQP genes in castor bean, RNA sequencing data of leaf, flower, endosperm (II/III, V/VI) and seed were investigated. Results showed that all 37 RcAQP genes were detected in at least one of the examined tissues, i.e., 35 in flower, 34 in leaf, 33 in seed, 31 in stage II/III endosperm and 29 in stage V/VI endosperm. According to the RPKM annotation, the leaf tissue was shown to harbor the most transcripts, followed by the seed and flower tissues, and the endosperm has the fewest. Although the flower tissue harbors the most expressed AQP genes, its total expression level is only 0.46-fold and 0.71-fold that of the leaf and seed tissues, respectively. In leaf, seed and flower tissues, the PIP subfamily contributes the most transcripts; in contrast, the TIP subfamily contributes the most in the endosperm. In the leaf tissue, six RcPIP members (in order: RcPIP1;4, RcPIP2;4, RcPIP2;2, RcPIP1;2, RcPIP2;3 and RcPIP1;3) account for about 98% of the total PIP transcripts, whereas RcTIP1;1 and RcTIP2;1 account for about 85% of the total TIP transcripts. In the seed tissue, five RcPIP members (in order: RcPIP2;4, RcPIP1;2, RcPIP2;1, RcPIP1;3 and RcPIP1;4) account for about 91% of the total PIP transcripts, whereas RcTIP2;1 accounts for more than 78% of the total TIP transcripts. In the flower tissue, six RcPIP members (in order: RcPIP2;4, RcPIP1;2, RcPIP1;3, RcPIP2;3, RcPIP1;4 and RcPIP2;2) account for about 89% of the total PIP transcripts, whereas RcTIP1;2 accounts for more than 63% of the total TIP transcripts. In the stage II/III endosperm, RcPIP1;4, RcPIP2;4 and RcPIP2;2 account for about 67% of the total PIP transcripts, whereas RcTIP3;1 and RcTIP1;1 account for about 82% of the total TIP transcripts. Compared with the other tissues and the other developmental stage (i.e., endosperm II/III), the expression level of RcAQP genes is considerably lower in the stage V/VI endosperm; nevertheless, RcTIP3;1 accounts for more than 95% of the total TIP transcripts, or 69% of the total AQP transcripts. Although most genes were shown to be constitutively expressed in the examined tissues, three genes seem to be tissue-specific, i.e., RcNIP4;2 in flower, RcXIP1;4 in leaf and RcXIP3;1 in seed. Castor bean AQP isoforms seem to play different roles in various tissues. For example, RcPIP2;4 is the most abundant transcript in both the flower and seed tissues, whereas RcPIP1;4 and RcTIP3;1 were shown to be the most highly expressed in the leaf and endosperm tissues, respectively, suggesting their crucial roles in these tissues. In addition, the transcripts of RcNIP5;1, RcNIP7;1 and RcXIP1;1 were also shown to be relatively abundant in seed, flower and leaf, respectively (Fig 3). According to the cluster analysis shown in Fig 3, more similar expression patterns of RcAQP genes were observed between leaf and seed, and between the two stages of endosperm; two groups with distinct expression levels were also observed, where the more abundant group includes eight PIPs (i.e., RcPIP2;2, RcPIP1;2, RcPIP1;3, RcPIP2;3, RcPIP2;4, RcPIP1;4, RcPIP2;1 and RcPIP1;5), five TIPs (i.e., RcTIP1;1, RcTIP2;1, RcTIP4;1, RcTIP1;2 and RcTIP3;1), two NIPs (i.e., RcNIP1;1 and RcNIP5;1) and four SIPs (i.e., RcSIP1;2, RcSIP1;3, RcSIP2;1 and RcSIP1;1).
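The per-tissue figures in this paragraph (number of detected genes, share of a subfamily's transcripts held by a few members, and fold differences in total expression) are simple summaries of a gene-by-tissue RPKM table. The sketch below shows one way such summaries could be computed. The RPKM values are invented toy numbers rather than the study's data, the helper names are hypothetical, and treating any RPKM above zero as "detected" is an assumption.

```python
from collections import defaultdict


def subfamily_of(gene):
    """Return the subfamily code from a gene name, e.g. 'RcPIP2;4' -> 'PIP'."""
    return gene[2:5]


def summarize_tissue(rpkm_by_gene):
    """Count detected genes and total the RPKM overall and per subfamily."""
    by_subfamily = defaultdict(float)
    for gene, value in rpkm_by_gene.items():
        by_subfamily[subfamily_of(gene)] += value
    return {
        # Assumption: a gene counts as detected when its RPKM is above zero.
        "n_detected": sum(1 for v in rpkm_by_gene.values() if v > 0),
        "total": sum(rpkm_by_gene.values()),
        "by_subfamily": dict(by_subfamily),
    }


def percent_share(rpkm_by_gene, genes, subfamily):
    """Percentage of a subfamily's total RPKM contributed by the listed genes."""
    denominator = sum(
        v for g, v in rpkm_by_gene.items() if subfamily_of(g) == subfamily
    )
    if denominator == 0:
        return 0.0
    numerator = sum(rpkm_by_gene.get(g, 0.0) for g in genes)
    return 100.0 * numerator / denominator


# Invented toy values, for illustration only (not data from the study).
toy_leaf = {"RcPIP1;4": 60.0, "RcPIP2;4": 30.0, "RcPIP2;1": 10.0,
            "RcTIP1;1": 40.0, "RcTIP2;2": 0.0}
toy_flower = {"RcPIP1;4": 20.0, "RcPIP2;4": 40.0, "RcPIP2;1": 5.0,
              "RcTIP1;1": 5.0, "RcTIP2;2": 0.0}

leaf = summarize_tissue(toy_leaf)
flower = summarize_tissue(toy_flower)
top_pip_share = percent_share(toy_leaf, ["RcPIP1;4", "RcPIP2;4"], "PIP")
flower_vs_leaf = flower["total"] / leaf["total"]
print(leaf["n_detected"], top_pip_share, flower_vs_leaf)
```

The fold value in the last line is a plain ratio of tissue totals, the same form as the 0.46-fold and 0.71-fold comparisons reported for flower against leaf and seed.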
Fig 3

Expression profiles of the 37 RcAQP genes in leaf, flower, endosperm II/III, endosperm V/VI and seed.

Color scale represents RPKM normalized log10 transformed counts where green indicates low expression and red indicates high expression.
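For readers unfamiliar with the units in the color scale, the following sketch computes RPKM (reads per kilobase of transcript per million mapped reads) from raw counts using its standard definition and then applies a log10 transform. The pseudocount added before the logarithm is an assumption made here to handle zero values; the paper does not state how zeros were treated. The function names and example numbers are hypothetical.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log10_rpkm(value, pseudocount=1.0):
    """log10 transform with a pseudocount (an assumption, to avoid log of zero)."""
    return math.log10(value + pseudocount)


# Hypothetical example: 500 reads on a 1,000 bp transcript, 10 million mapped reads.
example_rpkm = rpkm(500, 1000, 10_000_000)
example_log = log10_rpkm(99.0)
print(example_rpkm, example_log)
```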


Discussion

Small Number but High Diversity of Castor Bean AQP Genes

Compared with animals and microbes, AQPs are particularly abundant and diverse in land plants. To date, a high number of homologs have been identified from several plant species, i.e., 19 from Selaginella moellendorffii [18], 23 from Physcomitrella patens [9], 23 from Vitis vinifera [20], 33 from Oryza sativa [48], 35 from A. thaliana [16], 36 from Zea mays [49], 41 from Solanum tuberosum [14], 47 from Solanum lycopersicum [13], 55 from P. trichocarpa [19], 66 from Glycine max [15], and 71 from Gossypium hirsutum [12]. In contrast, the characterization of castor bean AQPs is still in its infancy. In the present study, a total of 37 AQP genes were identified from castor bean by mining the genome and transcriptome datasets. A previous review also reported the presence of 37 RcAQP genes in castor bean; however, that result depended solely on the automatic genome annotation, and only 34 of the 37 RcAQP genes identified in this study were mentioned [50]. Although that paper focused on the analysis of TIPs, only seven RcTIP genes were described, whereas RcTIP2;2 (30101.m000372) and RcTIP5;1 (30147.m014231) were missed. Instead, three transcripts (i.e., 28962.m000435, 28962.m000436 and 30170.m014271) were misannotated as PIPs. In addition, four RcXIP1 genes were also misannotated as PIPs (S3 Table). The number of castor bean AQP family members is comparable to that of Arabidopsis and maize but smaller than that of potato, tomato, poplar, soybean and cotton. Since the AQP genes in the model plants Arabidopsis and poplar are well characterized [16,19,47,51], their deduced proteins were added to the phylogenetic analysis of RcAQPs, which assigned the 37 RcAQPs to five subfamilies. Compared with Arabidopsis, which lacks XIPs, castor bean contains six XIPs and one more SIP but fewer members of PIPs, TIPs and NIPs. In contrast, the number of members in the five castor bean subfamilies was shown to be relatively small compared with that of poplar (Fig 4).
With the exception of the XIP subfamily, the further classification of RcAQP subfamilies into subgroups is consistent with Arabidopsis, i.e., 2 PIP subgroups, 5 TIP subgroups, 7 NIP subgroups and 2 SIP subgroups. However, it should be noted that, as shown in Fig 1, the classification of AtNIP2;1 and AtNIP3;1 was not well resolved. AtNIP2;1 shares its highest similarity of 67.0% with AtNIP1;2 in Arabidopsis, 64.7% with RcNIP1;1 in castor bean, and 62.0% with PtNIP1;2 in poplar. Given that it has the same ar/R filter (W-V-A-R) and clusters more closely with the NIP1 subgroup, we recommend classifying AtNIP2;1 in the NIP1 subgroup; accordingly, Arabidopsis retains no NIP2s of the kind seen in castor bean (RcNIP2;1) and poplar (PtNIP2;1), which harbor a G-S-G-R filter. In the case of AtNIP3;1, although it was clustered with the NIP4 subgroup, its closest homolog is AtNIP1;2 (61.7%) in Arabidopsis, RcNIP1;1 (61.6%) in castor bean, or PtNIP1;1 (60.3%) in poplar; thus, AtNIP3;1 can also be designated an NIP1 member. In addition, according to the phylogenetic analysis and sequence similarity, we propose to rename PtNIP1;5, PtNIP1;3, PtNIP1;4, PtNIP3;3, PtNIP3;4, PtNIP3;1, PtNIP3;2 and PtNIP3;5 as PtNIP3;1, PtNIP4;1, PtNIP4;2, PtNIP5;1, PtNIP5;2, PtNIP6;1, PtNIP6;2 and PtNIP7;1, respectively (see S1 Table). According to the nomenclature based on the ar/R filter that divides NIPs into subgroups NIP I, NIP II and NIP III [52,53], RcNIP1;1, RcNIP3;1 and the two RcNIP4s can also be assigned to the subgroup NIP I and three members (i.e., RcNIP5;1, RcNIP6;1 and RcNIP7;1) to the subgroup NIP II, whereas RcNIP2;1 forms a new subgroup termed NIP III, as observed in rice and maize [53]. Since no XIP homolog was found in the Arabidopsis genome, the nomenclature proposed by Lopez et al. [19] for poplar was adopted to divide RcXIPs into three subgroups. In addition to being supported by high bootstrap values, XIP1s are characterized by the ar/R filter of V-F-V-R, XIP2s by I-F-V-R, and XIP3s by V-Y-A-R.
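The XIP subgroup signatures named at the end of this paragraph can be expressed as a simple lookup on the four ar/R filter residues, taken in the order H2, H5, LE1, LE2 used in the Results. This is only an illustrative sketch of that rule: the function and dictionary names are hypothetical, and the actual subgroup assignment in the study rests on the phylogenetic analysis, with the filter serving as a supporting characteristic.

```python
# ar/R filter signatures (H2-H5-LE1-LE2) as quoted in the text for each subgroup.
XIP_AR_R_FILTERS = {
    "V-F-V-R": "XIP1",
    "I-F-V-R": "XIP2",
    "V-Y-A-R": "XIP3",
}


def classify_xip_by_filter(h2, h5, le1, le2):
    """Return the XIP subgroup matching the ar/R filter, or None if no match."""
    signature = "-".join(residue.upper() for residue in (h2, h5, le1, le2))
    return XIP_AR_R_FILTERS.get(signature)


print(classify_xip_by_filter("V", "F", "V", "R"))
print(classify_xip_by_filter("v", "y", "a", "r"))
# A NIP-type filter such as W-V-A-R matches none of the XIP signatures.
print(classify_xip_by_filter("W", "V", "A", "R"))
```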
Fig 4

Distribution of the 37 RcAQP genes and their Arabidopsis and poplar homologs in subgroups.

As seen in Arabidopsis and poplar, gene pairs were also observed in the castor bean AQP gene family, though their number is considerably smaller (Fig 1). For example, five AtPIP1s were clustered together apart from PIP1s of castor bean and poplar; RcPIP1;2, RcPIP1;3 and RcPIP1;4 were clustered with PtPIP1;3; RcPIP2;1 and RcPIP2;2 were clustered with PtPIP2;3 and PtPIP2;4; four RcXIP1s were clustered with two PtXIP1s; RcSIP1;2 and RcSIP1;3 were clustered with PtSIP1;3 and PtSIP1;4. It is well established that poplar underwent one whole-genome triplication event (designated γ) and one doubling event, whereas Arabidopsis underwent the same γ event and two independent doubling events [54-59]. These duplication events are the main contributors to gene expansion in these two plant species. Nevertheless, in contrast to poplar, the Arabidopsis genome encodes relatively fewer AQP genes due to massive gene loss and chromosomal rearrangement after genome duplications [54,60]. According to comparative genomics analyses, the γ duplication occurred approximately 117 million years ago, shortly before the origin of core eudicots [61]. As a core eudicot plant, castor bean was shown to have undergone only the shared whole-genome γ duplication [25]. Since most gene pairs tend to be clustered on the same scaffolds, tandem duplication is likely the main force behind their expansion.

Subcellular Localization and Functional Inference of RcAQPs

In comparison to non-plants, plant AQPs exhibit a broader subcellular localization, including the plasma membrane, vacuole, endoplasmic reticulum (ER), Golgi apparatus, mitochondrion and chloroplast, corresponding to the high degree of compartmentalization of plant cells [3,62]. The predicted subcellular localizations of RcAQPs included the plasma membrane, vacuole, chloroplast, peroxisome and cytosol. As observed in other plants and suggested by their names [62], basic RcPIPs and acidic RcTIPs are localized to plasma membranes and vacuoles, respectively. All RcNIPs were predicted to target the plasma membrane, though their homologs in other organisms were determined to localize to the plasma membrane, ER or peribacteroid membrane of root nodules [63-65]. In contrast to the diverse localizations predicted by WoLF PSORT, Plant-mPLoc predicted all XIPs to localize to the plasma membrane, which is consistent with experimental results [66]. RcSIP2;1 was predicted to target the ER, as reported in Arabidopsis and grapevine [67,68]; in contrast, the three RcSIP1s were predicted to localize to the plasma membrane, vacuole and chloroplast. Therefore, further investigations are required on the subcellular localization of RcAQPs. Although plant AQPs were first known for their high water permeability when expressed in Xenopus oocytes or yeast cells, increasing evidence has shown that some of them also participate in the transport of other small molecules such as glycerol, urea, boric acid, silicic acid, NH3, CO2 and H2O2 [2]. As shown in Table 3, most RcAQPs exhibit AqpZ-like Froger’s positions, which favor water permeability. In contrast, NIP subfamily members possess mixed key residues of GlpF for P1 and P5, and AqpZ for P2–P4. Given the glycerol permease activity of soybean NOD26 and Arabidopsis NIPs [69,70], RcNIPs are likely to transport glycerol and may play roles in oil formation/translocation.
In addition to high permeability to water, plant PIP subfamily members were reported to transport urea, boric acid, CO2 and H2O2 [71-78]. As shown in Table 3, all RcPIPs harbor the F-H-T-R ar/R filter as observed in AqpZ, which has an extremely narrow and hydrophilic pore (diameter 2.8 Å) [72], suggesting their high water permeability. Based on the SDP analysis proposed by Hove and Bhave [10], all RcPIPs represent urea-type SDPs; RcPIP1s represent boric acid-type SDPs; all RcPIPs represent H2O2-type SDPs, supporting their similar functionality. In addition, RcPIP1;1, RcPIP1;3, RcPIP1;4, RcPIP1;5, RcPIP2;2 and RcPIP2;4 seem to represent novel CO2-type SDPs (I/L/V-M-C-A-I/V-D-W-D-W) with the substitution of I for M at SDP2. Although highly variable in the ar/R filter, plant TIPs were shown to transport water as efficiently as PIPs [79]. Additionally, they are also permeable to urea, NH3 and H2O2 [80,81]. As shown in Table 3, all RcTIPs represent urea-type SDPs, whereas RcTIP5;1 represents H2O2-type SDPs, indicating similar functionality. Compared with typical NH3 SDPs, RcTIP2;2 seems to represent novel SDPs (T-K-T-V-A-S-A-P-S) with the substitution of S for A/R/T at SDP9. In addition to glycerol and water, plant NIPs have been found to transport urea, boric acid, silicic acid, NH3 and H2O2 [63-65,80-82]. As shown in Table 3, RcNIP5;1 is predicted to be a transporter of urea, boric acid and H2O2; RcNIP1;1 is predicted to be a urea and NH3 transporter; RcNIP3;1 and RcNIP4;2 are potential urea and H2O2 transporters; RcNIP4;1 and RcNIP7;1 are potential urea transporters; RcNIP2;1 is a potential urea and H2O2 transporter. In addition, compared with typical silicic acid SDPs, RcNIP2;1 seems to represent novel SDP types with the substitution of V for A/E/L at SDP3 or Q for A/K/P/T at SDP9, which is similar to that of GmNIP2;1 and GmNIP2;2 (S-Y-E-R-G-N-R-T-P) [83].
RcNIP6;1 may also be a urea transporter representing novel SDPs (H-P-I-A-L-E-G-S-N) with the substitution of E for A/G/P at SDP6. As a recently identified AQP subfamily, plant XIPs were shown to transport water, glycerol, urea, boric acid and H2O2 [17,53]. According to phylogenetic relationships, XIPs are split into two independent clusters termed XIP-A and XIP-B, where XIP-A includes only the XIP1 subgroup and XIP-B contains at least four subgroups, i.e., XIP2, XIP3, XIP4, and XIP5 [17]. Consistent with poplar XIPs (2 XIP1s, 1 XIP2 and 3 XIP3s), the six RcXIPs can be assigned to subgroups XIP1 (4), XIP2 (1) and XIP3 (1) (Fig 4). When expressed in Xenopus oocytes, PtXIP2;1 and PtXIP3;3 were shown to transport water, whereas the other PtXIPs did not. Although it remains unclear why PtXIP1s, PtXIP3;1 and PtXIP3;2 cannot transport water, the close homologs of PtXIP1s in tobacco (Nicotiana tabacum) and potato were also reported to have undetectable water permeability. In contrast, Solanaceae XIPs showed high permeability to glycerol [66]. Therefore, although they exhibit AqpZ-like Froger’s positions, all RcXIPs are likely to transport glycerol. Meanwhile, RcXIP2;1 and RcXIP3;1 are probably capable of transporting water. As shown in Table 3, RcXIP1;1, RcXIP1;2 and RcXIP1;3 may be H2O2 transporters representing novel SDPs with the substitution of S for I/L/V at SDP5.
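The SDP comparisons above amount to checking a nine-residue signature, position by position, against the residues allowed at each position for a given substrate. The sketch below illustrates that check using the RcTIP2;2 case (S at SDP9, where typical NH3 SDPs have A/R/T). The text does not list the typical NH3 residues at SDP1 to SDP8, so the reference pattern here simply copies the observed residues at those positions; that pattern is therefore an assumption for illustration only, and the function names are hypothetical.

```python
def parse_pattern(pattern):
    """Split a pattern such as 'T-K-A/R/T' into one set of allowed residues per position."""
    return [set(position.split("/")) for position in pattern.split("-")]


def sdp_substitutions(signature, pattern):
    """Return (position, residue) pairs where the signature departs from the pattern.

    Positions are 1-based to match the SDP1..SDP9 naming used in the text.
    """
    residues = signature.split("-")
    allowed = parse_pattern(pattern)
    if len(residues) != len(allowed):
        raise ValueError("signature and pattern must have the same number of positions")
    return [
        (index, residue)
        for index, (residue, ok) in enumerate(zip(residues, allowed), start=1)
        if residue not in ok
    ]


# Observed RcTIP2;2 signature, as quoted in the text.
rctip22_signature = "T-K-T-V-A-S-A-P-S"
# Assumed reference: SDP1-SDP8 copied from the observed signature (not given in
# the text); only the A/R/T alternatives at SDP9 come from the source.
assumed_nh3_pattern = "T-K-T-V-A-S-A-P-A/R/T"

substitutions = sdp_substitutions(rctip22_signature, assumed_nh3_pattern)
print(substitutions)
```

A signature with a single departure from the reference, as here, corresponds to what the text calls a novel SDP type; whether such a variant actually retains the transport activity would need experimental confirmation.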

Distinct Expression Profiles of RcAQP Isoforms in Various Tissues

Water is essential for all life on earth. As in other organisms, plant growth and development depend on water uptake and transport across cellular membranes and tissues. Water stress is therefore a major factor that decreases plant growth and productivity. One important response of plant cells to water stress is the regulation of AQPs [84-87]. Although plant AQPs were reported to be regulated by posttranslational modifications (e.g., phosphorylation, methylation and glycosylation), gating, heteromerization and cellular trafficking [88,89], transcriptional regulation still acts as the key mechanism. Since the organ specificity of AQP expression may be closely related to the physiological function of each organ, we took advantage of deep transcriptome sequencing (also known as RNA-Seq, an approach to transcriptome profiling through sequencing the total cDNA) data to survey the expression profiles of RcAQP genes from a global view. Results showed that the transcripts of RcPIP and RcTIP subfamily members are highly abundant in all examined tissues, which is consistent with observations in other plant species such as maize, Arabidopsis, tomato and potato [13,14,49,51]. Considering that PIPs and TIPs are highly permeable to water [79,90], their high abundance indicates crucial roles in the intracellular, cellular, organ and whole-plant water balance of castor bean. RcPIP1;4, RcPIP2;4 and RcTIP1;1 were considerably abundant in leaves and are likely to play key roles in leaf hydraulics. RcPIP2;4, RcPIP1;2 and RcTIP1;2 are more likely to control the flower water balance, given their high abundance in this tissue. The highly abundant RcPIP2;4 and RcTIP2;1 are likely to govern the seed water balance, and RcTIP3;1 is likely to control the water balance of endosperms (at both stage II/III and stage V/VI).
Compared with other tissues or developmental stages, RcTIP1;1 is expressed more in leaf and endosperm II/III; in contrast, its closest homolog in Arabidopsis (AtTIP1;1, encoding a transporter of water, urea and H2O2) was reported to be highly expressed in vascular tissues of root, stem, cauline leaf and flower but not in the apical meristem [79,91-95]. Although it lacks a strictly organ-specific expression pattern, RcTIP1;2 is preferentially expressed in flowers; by contrast, its closest Arabidopsis homolog (AtTIP1;3, encoding a transporter of water and urea) was shown to be expressed at the highest level in mature pollen, at a moderate level in flower, at a very low level in inflorescence and at non-detectable levels in any other tissue [93,96]. Although the castor bean genome encodes two TIP2s, these two genes exhibit distinct expression profiles. Transcripts of RcTIP2;2 were observed, albeit at very low levels, in leaf, seed, flower and endosperm II/III, but were not detectable in endosperm V/VI. In contrast, RcTIP2;1 is expressed in all examined tissues and the transcript level is considerably high in seed and leaf. In Arabidopsis, the closest homolog of RcTIP2;1 is AtTIP2;1, encoding a transporter of water, urea and NH3, which was shown to be mainly expressed in flower, shoot, and stem, and to a lower extent in roots [93,97-99]. AtTIP3;1, an ortholog of RcTIP3;1, was reported to be a seed- and embryo-specific AQP gene [100]; in contrast, RcTIP3;1 was preferentially expressed in the endosperm of developing seeds and expressed at considerably lower levels in germinating seed. In addition, it is noteworthy that three genes encoding putative non-aqua transporters (i.e., RcNIP5;1, RcNIP7;1 and RcXIP1;1) were shown to be highly abundant in certain tissues. Compared with other tissues tested, the transcript level of RcNIP5;1 is considerably high in seed. Like castor bean, the Arabidopsis genome encodes a unique NIP5 member (AtNIP5;1), which was shown to transport boric acid and arsenite as well as water [65,101].
Expression analyses indicated that AtNIP5;1 is mainly expressed in root epidermal, cortical, and endodermal cells and the transcript is upregulated in response to B deprivation [65]. RcNIP7;1 can be regarded as a flower-specific gene, because its expression level in leaf is extremely low. Similarly, AtNIP7;1, the ortholog of RcNIP7;1 in Arabidopsis, was also found to be specifically expressed in anther, encoding a less efficient boric acid transporter in comparison to AtNIP5;1 and AtNIP6;1 [102]. RcXIP1;1, a member of a less characterized AQP subfamily not found in Arabidopsis [19,47,66], was considerably abundant in leaf tissue; thus, further investigation of its function is of special interest. These results indicate that distinct AQP evolution occurred after the divergence of castor bean from Arabidopsis, and that the functional characterization of AQPs in specific biological processes needs to be extended beyond model plant species such as Arabidopsis.

Conclusions

To our knowledge, this is the first genome-wide study of the castor bean AQP gene family. Using a systematic nomenclature, 37 RcAQPs were assigned to five subfamilies based on sequence similarity and phylogenetic relationships with their Arabidopsis and poplar counterparts. Furthermore, their structural and functional properties were investigated using bioinformatics tools. The global expression profiles of these 37 RcAQP genes were examined with deep transcriptome sequencing. Putative transporters of water, glycerol, urea, boric acid, silicic acid, NH3, CO2 and H2O2 were also predicted and discussed. The RcAQP genes identified in this study represent an important resource for future functional analysis and utilization.

S1 File. The gene model for RcXIP2;1 and RcXIP3;1. (PDF)

S2 File. The gene model for RcPIP2;5. (PDF)

S3 File. The gene model for RcTIP2;2. (PDF)

S4 File. The gene model for RcNIP4;1. (PDF)

S5 File. Alignment of deduced amino acid sequences of castor bean AQPs with structure determined spinach PIP2;1. (PDF)

S6 File. SDP analysis of castor bean AQPs from alignments with amino acid sequences of AQPs transporting non-aqua substrates. (PDF)

S1 Table. List of the Phytozome accession numbers of the AQP genes identified in Arabidopsis (35) and poplar (55). (XLSX)

S2 Table. Percent similarity within and between five subfamilies of the RcAQPs. (XLSX)

S3 Table. Comparison of the 37 RcAQP genes identified in this study with previous annotations. (XLSX)