Literature DB >> 30842789

Genome-Wide Identification, Sequence Variation, and Expression of the Glycerol-3-Phosphate Acyltransferase (GPAT) Gene Family in Gossypium.

Yupeng Cui1, Jianjiang Ma1, Guoyuan Liu1, Nuohan Wang1, Wenfeng Pei1, Man Wu1, Xingli Li1, Jinfa Zhang2, Jiwen Yu1.   

Abstract

Cotton is an economically important crop grown for natural fiber and seed oil production. Cottonseed oil ranks third after soybean oil and colza oil in terms of edible oilseed tonnage worldwide. Glycerol-3-phosphate acyltransferase (GPAT) genes encode enzymes involved in triacylglycerol biosynthesis in plants. In the present study, 85 predicted GPAT genes were identified from the published genome data in Gossypium. Among them, 14, 16, 28, and 27 GPAT homologs were identified in G. raimondii, G. arboreum, G. hirsutum, and G. barbadense, respectively. Phylogenetic analysis revealed that a total of 108 GPAT genes from cotton, Arabidopsis and cacao could be classified into three groups. Furthermore, through comparison, the gene structure analyses indicated that GPAT genes from the same group were highly conserved between Arabidopsis and cotton. Segmental duplication could be the major driver for GPAT gene family expansion in the four cotton species above. Expression patterns of GhGPAT genes were diverse in different tissues. Most GhGPAT genes were induced or suppressed after salt or cold stress in Upland cotton. Eight GhGPAT genes were co-localized with oil and protein quantitative trait locus (QTL) regions. Thirty-two single nucleotide polymorphisms (SNPs) were detected from 12 GhGPAT genes, sixteen of which in nine GhGPAT genes were classified as synonymous, and sixteen SNPs in ten GhGPAT genes non-synonymous. Two SNP markers of the GhGPAT16 and GhGPAT26 genes were significantly correlated with cotton oil content in one of the three field tests. This study shed lights on the molecular evolutionary properties of GPAT genes in cotton, and provided reference for improvement of cotton response to abiotic stress and the genetic improvement of cotton oil content.

Entities:  

Keywords:  Gossypium spp.; abiotic stress; gene expression patterns; glycerol-3-phosphate acyltransferase (GPAT); oil content

Year:  2019        PMID: 30842789      PMCID: PMC6391866          DOI: 10.3389/fgene.2019.00116

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


Introduction

The majority of fatty acids are incorporated into membrane glycerolipids or triacylglycerol in a plant cells. The initial and committed step of glycerolipid biosynthesis is catalyzed by glycerol-3-phosphate acyltransferase, which catalyzes the transfer of a fatty acyl moiety from acyl-CoA, dicarboxylic acid-CoA or acyl-acyl carrier protein to the sn-1 or sn-2 position of glycerol 3-phosphate, resulting in the formation of lysophosphatic acid (LPA) (Zheng et al., 2003; Singer et al., 2016; Waschburger et al., 2018). There are plastid, endoplasmic reticulum (ER) and mitochondrial GPATs in plants (Gidda et al., 2009; Chen et al., 2011). Plastid GPATs participate in the prokaryotic biosynthesis of phospholipids, sulfolipids, and galactolipids (Ohlrogge and Browse, 1995), ER and mitochondrial GPATs are mainly involved in the synthesis of cutin, suberin, and storage lipids (Yang et al., 2012). The alignment of GPAT amino acid sequences from evolutionarily diverse organisms has revealed that all GPATs contain at least four highly conserved motifs, which are essential for both acyltransferase activity and the binding of glycerol-3-phosphate substrate (Lewin et al., 1999). The GPAT genes have been identified by genome-wide analyses in Arabidopsis, mammals and other plant species (Zheng et al., 2003; Gidda et al., 2009; Bertolesi et al., 2012; Yang et al., 2012; Waschburger et al., 2018), and fall within three different classes. Previous studies showed that GPAT gene is related to the fertility, oil content of plant, seed development and stress tolerance (Liu et al., 2013; Payá-Milans et al., 2016). In Arabidopsis, ten GPAT genes have been identified (Zheng et al., 2003; Gidda et al., 2009). AtATS1 functions in the production of major phospholipids (Singer et al., 2016). AtGPAT1 plays an essential role in pollen fertility, the disruption of AtGPAT1 affects tapetal differentiation and causes most microspores to abort before reaching maturity, and atgpat1/atgpat6 double mutants reduced the seed setting rate of Arabidopsis (Zheng et al., 2003). AtGPAT4, AtGPAT5, and AtGPAT8 are involved in the production of cutin or suberin, atgpat5 mutant could decrease suberin content in seed coat and root of Arabidopsis (Beisson et al., 2007), atgpat4/atgpat8 double mutants could reduce cutin biosynthesis in Arabidopsis, while overexpression of AtGPAT4 or AtGPAT8 in Arabidopsis could increase the content of C16 and C18 cutin monomers in leaves and stems by 80% (Li et al., 2007). AtGPAT9 is known to be essential for seed triacylglycerol (TAG) biosynthesis (Gidda et al., 2009), knockdown of AtGPAT9 can reduce seed oil content and an atgpat9 mutant causes both male and female gametophytic lethality phenotypes (Shockey et al., 2016). Furthermore, AtGPAT9 plays a role in the biosynthesis of polar lipids and TAG in developing leaves, as well as lipid droplet production in developing pollen grains of Arabidopsis (Singer et al., 2016). Since the first identified plant GPAT gene was reported to function in tapetum differentiation and male fertility in Arabidopsis (Zheng et al., 2003), many studies have reported that GPAT genes are involved in adverse environmental conditions, such as cold, salt, and heat stress, indicating their potential roles in tolerance to abiotic stresses. The overexpression of the AtATS1 gene has been reported to improve the tolerance of tobacco seedlings to chilling stress, while the transformation of a squash GPAT gene into tobacco resulted in decreased chilling tolerance (Murata and Tasaka, 1992). In rice, the overexpression of AtATS1 gene increased the unsaturation of fatty acids in phosphatidylglycerol (PG) and improved the photosynthetic rates and growth at low temperatures (Yokoi et al., 1998; Ariizumi et al., 2002). The chloroplast GPAT gene from tomato (LeGPAT) was enhanced by chilling temperature (4°C) and suppressed by high temperature (40°C) in tomato leaves, and overexpression of the LeGPAT enhanced chilling tolerance by increasing the content of unsaturated fatty acids in PG (Sui et al., 2007b). Similarly, the overexpression of chloroplast LeGPAT in tomato enhanced chilling (Sui et al., 2007a) and salt tolerance (Sun et al., 2010), and the overexpression of Suaeda salsa GPAT gene (SsGPAT) in Arabidopsis enhanced salt tolerance (Sui et al., 2017). However, antisense-mediated depletion of chloroplast LeGPAT gene alleviated heat stress damage in transgenic tomato plants (Sui et al., 2010). Many other similar studies on GPAT genes and their functions have been reported (Payá-Milans et al., 2016; Misra et al., 2017; Xie et al., 2018). However, little is known about this gene family in cotton, especially their potential roles in lipid or oil synthesis or under salt and cold stress in Upland cotton. Cotton (Gossypium) is the most important natural fiber-producing plant species for the textile industry, and according to the Foreign Agricultural Service, cotton is also the fifth largest vegetable oil-producing crop after soybeans, rapeseed, sunflower and peanuts in the world (https://apps.fas.usda.gov/psdonline/circulars/oilseeds.pdf). There are approximately 50 species of Gossypium including 45 diploid (2n = 2x = 26) and 5 tetraploid (2n = 4x = 52) species. Approximately 1–2 million years ago, diploid cotton species resembling G. arboreum (A2) and G. raimondii (D5) hybridized and underwent polyploidization, giving rise to tetraploid cotton species (Cronn et al., 1999; Paterson et al., 2012; Li et al., 2014). The recent progress in the whole-genome sequencing in G. raimondii (Paterson et al., 2012; Wang et al., 2012), G. arboreum (Li et al., 2014), G. hirsutum (Li et al., 2015; Zhang T. et al., 2015), and G. barbadense (Liu et al., 2015; Yuan et al., 2015) has rendered it possible to genome-wide analyses of important gene families including the GPAT gene family in the four species with their genomes sequenced. In this study, all of the GPAT family genes in the four sequenced cotton species were identified and characterized. We further analyzed their phylogenetic tree, gene structure, chromosomal localization, functional diversification and expression profiles of GPAT genes in various tissues and under salt or cold stress. The duplication events of GPAT genes in the four cotton species were also analyzed. Finally, single nucleotide polymorphisms (SNPs) of GhGPAT genes were identified to study their association with cottonseed oil content using high-resolution melting (HRM). The results will contribute to widen our understanding about the gene structures and functions of GPAT family genes in Gossypium, and provide useful information on the relationship between GhGPAT genes and seed oil content in cotton.

Materials and Methods

Sequence Retrial and Identification of GPAT Family Members in Gossypium

The completed genome sequences of G. raimondii (Paterson et al., 2012), G. arboreum (Li et al., 2014), G. hirsutum acc. TM-1 (Zhang T. et al., 2015), and G. barbadense acc. 3-79 (Yuan et al., 2015) were downloaded from the CottonGen website1 (Yu et al., 2014). The genome sequences and the published GPAT proteins of Arabidopsis and cacao (Theobroma cacao) (Table S1) were retrieved from The Arabidopsis Information Resource (TAIR release 10, http://www.arabidopsis.org) and The Cacao Genome Database (https://www.cacaogenomedb.org), respectively. Afterwards, BlastP and tBlastN programs (E-value = 0.00001) were performed to search for the candidate GPAT genes of the four Gossypium species using GPAT protein sequences from Arabidopsis and cacao as queries. Subsequently, the Pfam (http://pfam.sanger.ac.uk/) (Finn et al., 2014), Interproscan (http://www.ebi.ac.uk/Tools/pfa/iprscan/) (Quevillon et al., 2005) and SMART (http://smart.embl-heidelberg.de/) (Letunic et al., 2015) databases were applied to confirm each candidate member of the GPAT family with the PlsC acyltransferase domain of Pfam (PF01553). Furthermore, the theoretical molecular weight (MW) and isoelectric point (pI) of the GPAT proteins were predicted using the online ExPASy tool (http://web.expasy.org/) (Bjellqvist et al., 1994), and subcellular localization was predicted using the online CELLO v2.5 server (http://cello.life.nctu.edu.tw/; Yu et al., 2004).

Phylogenetic Construction

Clustal X v2.0 program (Larkin et al., 2007) was used to perform the multiple sequence alignment with default parameters, followed by manual comparisons and refinements. Subsequently, the Neighbor-Joining (NJ) method was used to construct different phylogenetic trees by MEGA v5.0 software (Tamura et al., 2011) with pairwise deletion option, poisson correction model and uniform rates. Bootstrap test were carried out with 1,000 replicates for evaluating the statistical reliability of phylogenetic tree. Furthermore, Maximum likelihood method was also applied in the tree construction to confirm the consistency with the NJ method.

Chromosomal Localization, Gene Duplication and Genomic Organizations Prediction

All GPAT genes of G. raimondii, G. arboreum, G. hirsutum, and G. barbadense were mapped on the corresponding chromosomes according to their positional information provided in the genome annotation document. The chromosome location images of cotton GPAT genes were portrayed by Mapchart v2.2 software (Voorrips, 2002) and Circos tool (Krzywinski et al., 2009). GPAT gene duplication events were defined according to the length of aligned sequence covered more than 80% of the longer gene, similarity of the aligned regions was bigger than 80%, and only one duplication event was counted for tightly linked genes (Maher et al., 2006; Dong et al., 2016; Cui et al., 2017). Referring to the chromosomal locations of GPAT genes, these GPAT genes could be designated as segmental duplications or tandem duplications. The Gene Structure Display Server (GSDS) tool (Hu et al., 2015) was used to construct the gene structure of GPAT genes by comparing the predicted GPAT coding sequences with their corresponding genomic sequences.

Selective Pressure Analysis of Duplicated Gene Pairs

All the full-length gene sequence of the GPAT duplicated gene pairs of G. raimondii, G. arboreum, G. hirsutum, and G. barbadense were firstly aligned by Clustal X v2.0 program, and then the synonymous substitution (Ks) and non-synonymous substitution (Ka) were calculated using the DnaSP v5.0 software (Rozas et al., 2003). Finally, the Ka/Ks ratio was used to estimate the selection pressure for each gene pair.

RNA-Seq Data Analysis and Gene Expression Heatmap

The raw RNA-Seq data of G. hirsutum acc. TM-1 was downloaded from the NCBI under the accession number SRA: PRJNA248163 (Zhang T. et al., 2015). TopHat and cufflinks were used for mapping reads and analyzing gene expression levels, and fragments per kilobase million (FPKM) values were used to normalize gene expression levels (Trapnell et al., 2013). The expression level of GhGPAT family genes at different developmental stages were measured by extracting their respective data from the total expression matrix according to the identified GhGPAT ID. The expression levels of ten AtGPAT genes were extracted form publicly Arabidopsis microarray expression datasets by the BioArray Resource (BAR) Expression Profiler tool (http://bar.utoronto.ca/affydb/cgi-bin/affy_db_exprss_browser_in.cgi) (Toufighi et al., 2005). The heatmap for expression profiles were generated with MultiExperiment Viewer (MeV) software.

Cis-Acting Elements Searching in the Promoters of GhGPATs

To identify the cis-elements in the promoter sequences of the 28 GPAT family genes in G. hirsutum, the BLASTN program was applied for searching G. hirsutum genome using the GhGPAT genes as query sequences. A 2-kb of genomic sequences upstream of the start codon of each GhGPAT gene was chosen, and then each sequence was submitted to the PlantCARE database (Lescot et al., 2002) (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) to identify the cis-elements.

Plant Materials and Abiotic Stress Treatment

Cotton seedlings of G. hirsutum L. acc TM-1 were grown in a temperature-controlled chamber at 28 ± 2°C under 16 h light/ 8 h dark condition. Seedlings at the 3-true leaf stage were exposed to the following different treatments. For low temperature, seedlings were grown in a temperature-controlled chamber at 4°C for 0, 12, and 24 h. For salt treatment, the seedlings were grown in Hoagland nutrient solution with additional 0, 150, 200, and 300 mM NaCl for 24 h, which represented the control condition, slight stress, moderate stress, and severe stress, respectively. After different treatments, the roots, stems, cotyledons and leaves were harvested at each time point of each treatment. Three biological replicates were conducted for each time point. All collected samples were immediately frozen with liquid nitrogen and then stored at −80°C for RNA isolation.

Quantitative Reverse Transcriptase-Polymerase Chain Reaction (qRT-PCR) Analysis

Total RNA of all samples was extracted using an RNAprep Pure Plant kit (Tiangen, Beijing, China), and approximately 1 μg RNA was used for the synthesis of first-strand cDNAs with PrimerScript 1st Strand cDNA synthesis kit (TaKaRa, Dalian, China) according to the manufacturer protocol. The gene-specific primers were designed, according to the CDSs of cotton GPAT genes, using Primer v5.0 software. The specific primers used are listed in Table S2, and GhUBQ7 was used as an internal control to normalize all data. The qRT-PCR was strictly conducted with SYBR premix Ex Taq Kit (TakaRa) following the manufacturer's instruction using ABI 7500 real-time PCR System (Applied Biosystems, Foster City, CA, USA). Each sample was conducted with three biological replicates and three technical replicates. The relative expression levels of GhGPAT genes were calculated using the 2−ΔΔCt method (Livak and Schmittgen, 2001).

Identification of Single Nucleotide Polymorphisms (SNPs) for GPAT Genes and Statistical Analysis With Cottonseed Oil

Single-nucleotide polymorphism (SNPs) for GPAT genes were identified by the RNA-seq datasets of CRI 36 and Hai 7124, in which CRI 36 is a currently commercial cultivar of G. hirsutum and Hai 7124 is a non-commercial G. barbadence, and SOAPsnp software was used to scan SNPs based on the assembled contigs (Li et al., 2009). To validate SNP markers, genomic DNA was extracted from an interspecific backcross inbred line (BIL) population of 180 lines, as well as the two parents using a quick CTAB method (Zhang and Stewart, 2000). In this study, the BIL population was developed from a cross between Upland cotton CRI 36 and G. barbadense Hai 7124 through one generation of backcrossing using CRI 36 as the recurrent parent followed by seven generations of selfing. Putative SNPs were identified to design primers using Primer v5.0 software (Table S2) and analyzed their association with cottonseed oil content using high-resolution melting (HRM). The HRM reaction mixture and PCR followed the protocols (Ma et al., 2016). The SNP markers were coded as “1” for genotypes of CRI 36, “3” for genotypes of Hai 7124 and “2” for the heterozygotes, and used for a simple correlation analysis with the cottonseed oil content in the BIL population by SPPS software (IBM, New York, USA).

Results

Genome-Wide Identification of GPAT Genes in Four Cotton Species

To identify all the GPAT genes in cotton, we performed a BLASTP search against the diploid cotton (G. raimondii and G. arboreum) and tetraploid cotton (G. hirsutum and G. barbadense) protein databases using the GPAT sequences of Arabidopsis (10) and cacao (13) as queries (Table S1). All potential cotton proteins were then subjected to domain analysis using the Pfam, Interproscan and SMART databases to further confirm the presence of the GPAT protein domain (PF01553). Eventually, a total of 14, 16, 28, and 27 GPAT members were identified in G. raimondii, G. arboreum, G. hirsutum and G. barbadense, respectively. These GPAT genes were named consecutively from GrGPAT1-GrGPAT14, GaGPAT1-GaGPAT16, GhGPAT1-GaGPAT28, and GbGPAT1-GaGPAT27 in the four cotton species, respectively, according to the order of their chromosomal locations (Table 1). Eighty of the 85 identified GPAT genes encode proteins ranging between 377 and 543 amino acids (AAs), except for 5 genes with lengths of less than 377 AAs or more than 543 AAs (i.e., GrGPAT3 encodes a protein of 261 AAs, and GhGPAT20, GbGPAT2, GbGPAT16, and GbGPAT20 encode proteins of 258, 319, 740, and 298 AAs, respectively). The predicted Mw of the GPAT proteins in the four cotton species ranged from 28.78 to 83.20 kDa, and the theoretical pI varied between 5.49 and 9.63. The protein subcellular localization of GPAT in cotton showed that 68 of 85 GPAT proteins were predicted to be located in the plasma membrane, while the rest were predicted to be located in the chloroplast or cytoplasm. We further analyzed the orthologous and paralogous GPAT genes in the four cotton species (Table S3). We identified 18 orthologous gene pairs between G. hirsutum and G. barbadense. Compared with the G. raimondii, 13, 12, and 10 orthologous gene pairs were identified for G. arboreum, G. hirsutum, and G. barbadense, respectively. Compared with the G. arboreum, 10 and 9 orthologous gene pairs were identified for G. hirsutum and G. barbadense, respectively. In addition, 5, 6, 13 and 11 paralogous gene pairs were identified in G. raimondii, G. arboreum, G. hirsutum, and G. barbadense, respectively.
Table 1

The information of GPAT gene family in four Gossypium species.

Gene nameGene identifierGenomics positionCDSAA sizeMw (kDa)pIPredicted subcellular location
GrGPAT1Gorai.003G185500.1Chr03: 45619381-45625850138646151.78.38Chloroplast
GrGPAT2Gorai.004G145100.1Chr04: 40873197-40875543163254361.389.59Plasma Membrane
GrGPAT3Gorai.004G182200.1Chr04: 49380119-4938777978626129.216.66Chloroplast
GrGPAT4Gorai.005G258500.1Chr05: 63451173-63455894113437743.167.96Plasma Membrane
GrGPAT5Gorai.006G181200.1Chr06: 43856590-43859232151550456.19.31Plasma Membrane
GrGPAT6Gorai.007G033500.1Chr07: 2286486-2290897115538442.865.52Chloroplast
GrGPAT7Gorai.007G100100.1Chr07: 7410329-7415092150350055.749.29Plasma Membrane
GrGPAT8Gorai.008G098000.1Chr08: 27391572-27393956162654161.019.26Plasma Membrane
GrGPAT9Gorai.010G045500.1Chr10: 4807029-4809534155151658.49.56Plasma Membrane
GrGPAT10Gorai.011G267100.1Chr11: 59897897-59899780148549455.279.05Plasma Membrane
GrGPAT11Gorai.011G267200.1Chr11: 59914117-59915636121840545.579.04Plasma Membrane
GrGPAT12Gorai.012G071300.1Chr12: 10531559-10533384146748854.99.54Plasma Membrane
GrGPAT13Gorai.012G112800.1Chr12: 25762804-25765226154851557.389.28Plasma Membrane
GrGPAT14Gorai.012G158300.1Chr12: 33150125-33154932113737843.418.83Plasma Membrane
GaGPAT1Cotton_A_09737chr03: 35875883-35882520134144649.936.69Chloroplast
GaGPAT2Cotton_A_21450chr04: 24644639-24648923146748854.259.26Plasma Membrane
GaGPAT3Cotton_A_20607chr04: 56175430-56177058153351056.149.23Plasma Membrane
GaGPAT4Cotton_A_00592chr05: 8890287-8894119113437743.258.83Plasma Membrane
GaGPAT5Cotton_A_18440chr08: 44876559-44878590154851558.379.63Plasma Membrane
GaGPAT6Cotton_A_30920chr09: 80514225-80515788147649154.879.12Plasma Membrane
GaGPAT7Cotton_A_24431chr09: 85523484-85525099151550455.119.23Plasma Membrane
GaGPAT8Cotton_A_25400chr10: 112459178-112464741138646151.928.1Chloroplast
GaGPAT9Cotton_A_02486chr11: 20915572-20917911150350055.529.28Plasma Membrane
GaGPAT10Cotton_A_39465chr12: 20976325-20978378150650155.869.26Plasma Membrane
GaGPAT11Cotton_A_19822chr12: 45334367-45336407160253360.359.49Plasma Membrane
GaGPAT12Cotton_A_36957chr12: 50316570-50318108146748854.799.45Plasma Membrane
GaGPAT13Cotton_A_25468chr12: 75489147-75491122162654160.989.13Plasma Membrane
GaGPAT14Cotton_A_08860chr12: 90634082-90638451113737843.418.83Plasma Membrane
GaGPAT15Cotton_A_09184chr13: 44533469-44537359115538442.675.49Cytoplasmic
GaGPAT16Cotton_A_24308chr13: 89011942-89013603157852559.139.24Plasma Membrane
GhGPAT1Gh_A03G1851A03: 99577941-99581782113437743.248.6Plasma Membrane
GhGPAT2Gh_A04G0765A04: 51891051-51893104150650155.829.26Plasma Membrane
GhGPAT3Gh_A04G1069A04: 60750094-60754552113737843.418.83Plasma Membrane
GhGPAT4Gh_A05G3037A05: 78024613-78026151146748854.799.51Plasma Membrane
GhGPAT5Gh_A06G0327A06: 4927618-4929887156652159.029.63Plasma Membrane
GhGPAT6Gh_A08G1054A08: 72959792-72961831159953260.179.43Plasma Membrane
GhGPAT7Gh_A08G1388A08: 88694103-88700765134144649.86.52Chloroplast
GhGPAT8Gh_A08G1816A08: 98752308-98754008161753860.729.27Plasma Membrane
GhGPAT9Gh_A09G1528A09: 68529716-68532054150350055.569.28Plasma Membrane
GhGPAT10Gh_A10G1987A10: 97626943-97628506147649154.889.12Plasma Membrane
GhGPAT11Gh_A11G0804A11: 7991963-7996279150350055.839.25Plasma Membrane
GhGPAT12Gh_A12G0824A12: 49192781-49194756162654160.999.13Plasma Membrane
GhGPAT13Gh_D02G2290D02: 66653634-66657584113437743.167.96Plasma Membrane
GhGPAT14Gh_D03G1661D03: 46629378-46634196138646151.698.1Chloroplast
GhGPAT15Gh_D04G0605D04: 10821266-10822804146748854.939.48Plasma Membrane
GhGPAT16Gh_D04G1251D04: 41151043-41153134154851557.419.28Plasma Membrane
GhGPAT17Gh_D04G1675D04: 48923650-48927352108636141.489.02Plasma Membrane
GhGPAT18Gh_D06G0355D06: 4862412-4864432154851558.239.54Plasma Membrane
GhGPAT19Gh_D08G1335D08: 43709349-43711332163254361.419.58Plasma Membrane
GhGPAT20Gh_D08G1336D08: 43711996-4371277277725828.789.35Plasma Membrane
GhGPAT21Gh_D08G1683D08: 52638770-52645877134144649.866.48Chloroplast
GhGPAT22Gh_D09G1546D09: 43059728-43061991150950255.899.23Plasma Membrane
GhGPAT23Gh_D10G2294D10: 61148398-61149971148549455.279.05Plasma Membrane
GhGPAT24Gh_D11G0942D11: 8173864-8178049150350055.749.29Plasma Membrane
GhGPAT25Gh_D12G0839D12: 26672950-266749261626541619.22Plasma Membrane
GhGPAT26Gh_A03G1948scaffold496_A03: 45464-51022138646151.98.1Chloroplast
GhGPAT27Gh_A11G2987scaffold2726_A11: 85575-89619132344048.696.11Chloroplast
GhGPAT28Gh_D11G3539scaffold4571_D11: 33114-37148132344049.016.06Chloroplast
GbGPAT1Gbscaffold10683.8.0A03: 101375090-101379793113437743.248.6Plasma Membrane
GbGPAT2Gbscaffold9797.1.0A04: 63446426-6344995696031936.488.48Plasma Membrane
GbGPAT3Gbscaffold1022.17.0A06: 5684360-5686814154851558.419.63Plasma Membrane
GbGPAT4Gbscaffold127.8.0A08: 76313638-76315690159953260.179.43Plasma Membrane
GbGPAT5Gbscaffold5413.16.0A08: 103987721-103989621161753860.699.19Plasma Membrane
GbGPAT6Gbscaffold31407.2.0A09: 71102212-71105277150350055.619.28Plasma Membrane
GbGPAT7Gbscaffold5583.14.0A10: 103589265-103590767119439744.359.44Plasma Membrane
GbGPAT8Gbscaffold17219.12.0A11: 9067020-9071621150350055.869.25Plasma Membrane
GbGPAT9Gbscaffold6816.2.0A12: 51866337-51868435162654160.969.17Plasma Membrane
GbGPAT10Gbscaffold17185.12.0D02: 70322327-70329230113437743.188.23Plasma Membrane
GbGPAT11Gbscaffold2076.5.0D04: 50119564-50124142113737843.438.83Plasma Membrane
GbGPAT12Gbscaffold2076.7.0D04: 50128492-50133233113737843.438.83Plasma Membrane
GbGPAT13Gbscaffold11060.1.0D06: 4727967-4730037154851558.389.57Plasma Membrane
GbGPAT14Gbscaffold5633.11.0D08: 44543424-44545643163254361.319.55Plasma Membrane
GbGPAT15Gbscaffold15529.6.0D08: 54558468-54565707134144649.866.34Chloroplast
GbGPAT16Gbscaffold5633.10.0D08: 44527931-44530917222374083.29.42Plasma Membrane
GbGPAT17Gbscaffold7965.5.0D09: 45190899-45193553150950255.919.23Plasma Membrane
GbGPAT18Gbscaffold10055.26.0D10: 62642543-62644388148549455.279.12Plasma Membrane
GbGPAT19Gbscaffold1641.10.0D11: 9489598-9494108150350055.779.29Plasma Membrane
GbGPAT20Gbscaffold346.41.0D11: 3136620-315473589729833.655.64Cytoplasmic
GbGPAT21Gbscaffold3997.11.0D12: 26167138-26169961162654160.989.26Plasma Membrane
GbGPAT22Gbscaffold1135.1.0scaffold1135: 10610-20038134144649.86.52Chloroplast
GbGPAT23Gbscaffold14639.21.0scaffold14639_A03: 268323-274714169256364.229.38Chloroplast
GbGPAT24Gbscaffold16283.3.0scaffold16283: 71319-73133146748854.939.48Plasma Membrane
GbGPAT25Gbscaffold3695.4.0scaffold3695: 56371-64196138646151.698.37Chloroplast
GbGPAT26Gbscaffold9785.1.0scaffold9785: 126689-128615138346051.379.35Plasma Membrane
GbGPAT27Gbscaffold9785.2.0scaffold9785: 129516-131971154851557.39.23Plasma Membrane
The information of GPAT gene family in four Gossypium species.

Phylogenetic Analysis of the GPAT Gene Family

To assess the phylogenetic relationships of GPAT genes among the four cotton species (G. raimondii, G. arboreum, G. hirsutum, and G. barbadense) and other species, a comprehensive phylogenetic tree was constructed using a NJ method (Figure 1). Meanwhile, the phylogenetic tree was reconstructed with Maximum likelihood (ML) method, which was almost identical with only minor difference at some branches (Figure S1), suggesting that the two methods were largely consistent with each other. As shown in Figure 1, GPAT genes could be divided into three groups, named the Group I, Group II, and Group III. Group III has the most members (74), followed by Group I (19), and Group II (15). Compared to Arabidopsis, GPAT genes in the four cotton species showed a closer relationship with those from cacao, since they were clustered closely to each other in the phylogenetic tree. However, the number of GPAT genes in cotton and cacao were not similar within the groups, and one cacao gene corresponding to one to three homologous genes of G. raimondii and G. arboreum. For example, in Group I, TcGPAT12 had three orthlogous in G. raimondii and G. arboreum each, whereas in Group II, TcGPAT2 had two orthologous genes in both G. raimondii and G. arboreum. Furthermore, in Group III, TcGPAT4 had only one orthologous gene in G. raimondii and G. arboreum, respectively. In addition, the GPAT genes from the four cotton species showed a closer relationship than that from Arabidopsis and cacao (Figure 1). From the perspective of evolution, each GPAT gene in G. raimondii corresponds to one orthologous gene in G. arboreum and to two orthologous genes in tetraploid G. hirsutum and G. barbadense, since the diploid genomes of G. raimondii and G. arboreum are the progenitors for tetraploid G. hirsutum and G. barbadense (Paterson et al., 2012). In this study, most GPAT genes from G. hirsutum and from G. barbadense showed a one-to-one correspondence with those from the diploid progenitors (G. raimondii and G. arboreum). The inconsistencies were that the GrGPAT2 gene was duplicated in G. arboreum, G. hirsutum, and G. barbadense, implying that this gene underwent an independent duplication event in G. arboreum after its divergence from G. raimondii, even this duplication also existed after the formation of tetraploid cotton. In addition, GPAT genes from A subgenome and D subgenome of G. hirsutum showed a bias to those from G. arboreum and G. raimondii.
Figure 1

Phylogenetic relationships of GPAT genes from Gossypium, Arabidopsis, and cacao. Phylogenetic analysis was performed using the neighbor-joining method with 1,000 replicates. The GPAT genes from G. raimondii, G. arboreum, G. hirsutum, G. barbadense, Arabidopsis, and cacao were marked with the red, yellow, black, blue, green, and magenta dots, respectively. The branches of each group were indicated in a specific color.

Phylogenetic relationships of GPAT genes from Gossypium, Arabidopsis, and cacao. Phylogenetic analysis was performed using the neighbor-joining method with 1,000 replicates. The GPAT genes from G. raimondii, G. arboreum, G. hirsutum, G. barbadense, Arabidopsis, and cacao were marked with the red, yellow, black, blue, green, and magenta dots, respectively. The branches of each group were indicated in a specific color.

Genomic Structure of GPAT Genes

Gene structure of GPAT genes is closely related to its function, and could reflect the phylogenetic relation among the GPAT genes, together with their phylogenetic analysis. To obtain further insight into the evolutionary relationships among GPAT genes from the four cotton species and Arabidopsis, we compared their gene structures. As shown in Figure 2A, the GPAT genes were divided into three groups as described in Figure 1. The GPAT gene members within groups were mostly the same, although the topologies of the two phylogenetic trees were different. Analysis of Arabidopsis and cotton GPAT gene structure for exon/intron organizations revealed that the number of exons per gene varied from 1 to 12 (Figure 2B), and the genes close to each other in the phylogenetic tree showed highly similar exon numbers. In general, the GPAT genes in the same group had similar exons than other groups. For example, all the members in Group I contained 12 exons except GbGPAT19 that had 11 exons. A host of GPAT genes in Group II possessed 12 exons, except for GhGPAT17 and GbGPAT2, which had 11 and 10 exons, respectively. Most members of Group III had two exons, except for GhGPAT20 that had only one exon, eight genes (GaGPAT2, GrGPAT7, GhGPAT5, GhGPAT11, GhGPAT24, GbGPAT8, GbGPAT14, and GbGPAT20) that had three exons, and three enes (AtGPAT4, AtGPAT8, and GrGPAT11) that had four exons. As excepted, the gene structures of orthologous gene pairs of cotton GPATs were almost identical with minor differences with the exception of GrGPAT6/GbGPAT19, GaGPAT14/GbGPAT2, GaGPAT15/GbGPAT19, GaGPAT5/GhGPAT5, and GhGPAT5/GbGPAT3 (Table S3).
Figure 2

Phylogenetic relationships and gene structure of the GPAT gene family in Gossypium and Arabidopsis. (A) The phylogenetic tree of all GPAT proteins in G. raimondii, G. arboreum, G. hirsutum, G. barbadense, and Arabidopsis was constructed by the neighbor-joining method with 1,000 replicates, (B) The exon-intron structure of GPAT genes in G. raimondii, G. arboreum, G. hirsutum, G. barbadense, and Arabidopsis. Exons are represented by green boxes and introns by black lines.

Phylogenetic relationships and gene structure of the GPAT gene family in Gossypium and Arabidopsis. (A) The phylogenetic tree of all GPAT proteins in G. raimondii, G. arboreum, G. hirsutum, G. barbadense, and Arabidopsis was constructed by the neighbor-joining method with 1,000 replicates, (B) The exon-intron structure of GPAT genes in G. raimondii, G. arboreum, G. hirsutum, G. barbadense, and Arabidopsis. Exons are represented by green boxes and introns by black lines.

Genomic Localization and Duplication of GPAT Genes

GPAT genes were mapped onto chromosomes in G. arboreum, G. raimondii, G. hirsutum, and G. barbadense based on the available genomic information on the four cotton species. Seventy-six of 85 GPATs were assigned to chromosomes (Figure 3), while the remaining 9 GPATs were anchored on unmapped scaffolds. A total of 14 GrGPAT genes were mapped to 9 chromosomes of G. raimondii: chromosome D12 contained three GPAT genes, chromosomes D04, D07, and D11 contained two GPAT genes each, and chromosomes D03, D05, D06, D08, and D10 each contained only one GPAT gene. The 16 GaGPAT genes were distributed on nine chromosomes in G. arboreum: 5 GPAT genes were on chromosome A12 and 2 genes on each of chromosomes A04, A09, and A13, while each of chromosomes A03, A05, A08, A10, and A11 contained only one GPAT gene. A total of 25 GhGPAT genes were mapped to 18 chromosomes of the G. hirsutum genome, not including chromosomes A01, A02, A07, A13, D01, D05, D07, and D13. The remaining three genes, GhGPAT26, GhGPAT27, and GhGPAT28, were positioned on unmapped scaffolds. As illustrated in Figure 3, chromosomes A08, D04, and D08 contained three GPAT genes each, and two GPAT genes were present on chromosome A04, while only one GPAT gene appeared on each of chromosomes A03, A05, A06, A09, A10, A11, A12, D02, D03, D06, D09, D10, D11, and D12. A total of 21 GbGPAT genes were distributed unevenly on 16 of 26 G. barbadense chromosomes, and the number of GPAT genes on each of the 16 chromosomes ranged from 1 to 3.
Figure 3

Genomic localization and gene duplication of GPAT genes in the four cotton species. The chromosomes of G. raimondii, G. arboreum, G. hirsutum, and G. barbadense were filled with red, blue, green, and orange, respectively. The duplicated gene pairs were connected with different color lines. Green, red, and black GPAT genes belonged to Group I, Group II, and Group III, respectively.

Genomic localization and gene duplication of GPAT genes in the four cotton species. The chromosomes of G. raimondii, G. arboreum, G. hirsutum, and G. barbadense were filled with red, blue, green, and orange, respectively. The duplicated gene pairs were connected with different color lines. Green, red, and black GPAT genes belonged to Group I, Group II, and Group III, respectively. Gene duplication events were considered to play an important role in the amplification of gene families. It is reported that the genomes of G. raimondii and G. arboreum have undergone at least two rounds of genome-wide duplication (Paterson et al., 2012; Wang et al., 2012). And all tetraploid cotton species are derived from G. raimondii and G. arboreum (Wendel et al., 2009). Therefore, evolutionary relationships among the four cotton species are close. To elucidate the expanded mechanism of the GPAT gene family, gene duplication event analysis in the four cotton species was performed, and the details of the duplicated gene pairs were listed in Table 2. A total of 5, 6, and 19 duplication events were detected in G. raimondii, G. arboreum, and G. hirsutum, respectively. They were located on individual chromosomes and were mainly attributed to segmental duplication events in the three cotton species (Figure 3). In G. barbadense, 1 pair of tandem duplications and 13 pairs of segmental duplication events were detected (Figure 3, Table 2), but we also concluded that the expansion of the GbGPAT gene family occurred mainly by segmental duplication events rather than tandem events.
Table 2

Ka/Ks analysis for the duplicated GPAT gene pairs from G. raimondii, G. arboreum, G. hirsutum, and G. barbadense.

SpeciesDuplicated gene 1Duplicated gene 2KaKsKa/KsPurifying selectionDuplicate type
G. raimondiiGrGPAT1GrGPAT60.1490.1790.832YesSegmental
GrGPAT4GrGPAT140.120.1280.938YesSegmental
GrGPAT5GrGPAT130.1320.1041.269NoSegmental
GrGPAT6GrGPAT30.1580.1371.153NoSegmental
GrGPAT10GrGPAT120.2110.1881.122NoSegmental
G. arboreumGaGPAT1GaGPAT80.1310.1970.665YesSegmental
GaGPAT1GaGPAT150.1190.2480.48YesSegmental
GaGPAT3GaGPAT70.2051.0240.2YesSegmental
GaGPAT4GaGPAT140.0940.2280.412YesSegmental
GaGPAT6GaGPAT120.140.4950.283YesSegmental
GaGPAT9GaGPAT100.080.3740.214YesSegmental
G. hirsutumGhGPAT1GhGPAT130.0070.0080.875YesSegmental
GhGPAT2GhGPAT90.110.1630.675YesSegmental
GhGPAT2GhGPAT160.0070.0080.875YesSegmental
GhGPAT3GhGPAT10.0890.1150.774YesSegmental
GhGPAT3GhGPAT170.0090.0081.125NoSegmental
GhGPAT4GhGPAT150.0050.0160.313YesSegmental
GhGPAT4GhGPAT230.1740.3110.559YesSegmental
GhGPAT5GhGPAT180.010.0330.303YesSegmental
GhGPAT6GhGPAT190.0170.0161.063NoSegmental
GhGPAT7GhGPAT2100.0080YesSegmental
GhGPAT7GhGPAT270.1220.1740.701YesSegmental
GhGPAT9GhGPAT220.0050.0160.313YesSegmental
GhGPAT10GhGPAT230.030.0360.833YesSegmental
GhGPAT11GhGPAT240.010.0071.429NoSegmental
GhGPAT12GhGPAT250.0160.0180.889YesSegmental
GhGPAT14GhGPAT260.0090.0260.346YesSegmental
GhGPAT17GhGPAT10.0840.1250.672YesSegmental
GhGPAT21GhGPAT280.1260.150.84YesSegmental
GhGPAT27GhGPAT280.0170.0161.063NoSegmental
G. barbadenseGbGPAT1GbGPAT100.0060.0080.75YesSegmental
GbGPAT2GbGPAT110.0080.0150.533YesSegmental
GbGPAT3GbGPAT130.0140.0081.75NoSegmental
GbGPAT4GbGPAT151.8121.5351.18NoSegmental
GbGPAT6GbGPAT170.0110.0150.733YesSegmental
GbGPAT7GbGPAT180.0320.0590.542YesSegmental
GbGPAT7GbGPAT240.2340.1781.315NoSegmental
GbGPAT8GbGPAT201.7053.1030.549YesSegmental
GbGPAT9GbGPAT210.0060.0160.375YesSegmental
GbGPAT11GbGPAT120.0060.0150.4YesTandem
GbGPAT14GbGPAT151.8891.5251.239NoSegmental
GbGPAT16GbGPAT221.8761.5571.205NoSegmental
GbGPAT25GbGPAT230.0080.0240.333YesSegmental
GbGPAT26GbGPAT270.010.0071.429NoSegmental
Ka/Ks analysis for the duplicated GPAT gene pairs from G. raimondii, G. arboreum, G. hirsutum, and G. barbadense. The duplicated gene pairs might have undergone three alternative fates during their evolution, i.e., nonfunctionalization, neofunctionalization and subfunctionalization (Lynch and Conery, 2000). To examine the selective constrains on duplicated GPAT gene pairs in the four cotton species, the Ka/Ks ratio for each pair of duplicated GPAT genes was calculated using the full-length sequences. Generally, Ka/Ks > 1 indicates accelerated evolution with positive selection, Ka/Ks = 1 indicates neuteal selection, while Ka/Ks < 1 indicates the functional constraint with purifying selection. In this study, there are 2, 6, 15, and 8 pairs of duplicated GPATs with a Ka/Ks ratio < 1 in G. raimondii, G. arboreum, G. hirsutum, and G. barbadense, respectively, which suggested that they had experienced purifying selection pressure with limited functional divergence after cotton GPAT genes duplications (Table 2). However, 3, 4, and 6 pairs of duplicated GPATs with a Ka/Ks ratio > 1 were found in G. raimondii, G. hirsutum, and G. barbadense, respectively, suggesting that these cotton GPAT genes had experienced positive selection (Table 2). These observations reflected that the functions of the duplicated GPAT genes in the four cotton species had not diverged much during subsequent evolution, and purifying selection might contribute greatly to the maintenance of function in the GPAT family genes in the four cotton species.

Expression Profiling of GPAT Genes in Upland Cotton and Arabidopsis

Gene expression is the major step toward a gene realizing its biological function. To understand the temporal and spatial expression patterns of upland cotton and Arabidopsis GPAT genes, their expression profiles in different tissues were analyzed using the public expression data. The RNA-seq data of GhGPAT genes from different tissues and developmental stages of G. hirsutum acc. TM-1 (Zhang T. et al., 2015) were analyzed (Figure 4). As shown in Figure 4, GhGPAT genes exhibited different expression patterns in different tissues of TM-1, indicating that GhGPAT genes had multiple biological functions during cotton growth and development. Eleven GhGPAT genes (GhGPAT3, GhGPAT7, GhGPAT11, GhGPAT14, GhGPAT17, GhGPAT21, GhGPAT22, GhGPAT24, GhGPAT26, GhGPAT27, and GhGPAT28) could be detected in all the tested tissues, indicating that these GhGPAT genes were involved in multiple processes during the development of cotton. With the exception of GhGPAT5, GhGPAT6, GhGPAT8, and GhGPAT18, the expression of other genes could be detected in the stems. GhGPAT5 and GhGPAT18 were specifically expressed in 10 DPA fibers, and GhGPAT4, GhGPAT10, GhGPAT15, and GhGPAT23 were expressed in 20 DPA fibers, suggesting that these genes might play important roles in the rapid elongation period and secondary wall development. GhGPAT13 exhibited a high level of accumulation during ovule development (−3 to 35 DPA) and fiber development (5, 10, and 25 DPA), indicating that GhGPAT13 might play important roles in ovule development and fiber development. GhGPA4, GhGPAT5, GhGPAT15, GhGPAT18, GhGPAT19 and GhGPAT20 were not detectable at the different stages of cotyledons, suggesting that those six genes might not be associated with seed germination. There were 13 pairs of paralogous genes among these GhGPAT genes, and 11 pairs clustered together, which suggested that the expression of the paralogous GhGPAT genes might be conserved. In addition, the expression patterns of duplicated gene pairs in G. hirsutum were also investigated in Figure 4. Most duplicated gene pairs such as GhGPAT2/GhGPAT16 and GhGPAT5/GhGPAT18 were similar. However, the expression of GhGPAT1 and GhGPAT13 were strongly divergent, which might be caused by the significant variation in gene regulations after the duplication events.
Figure 4

Expression profiles of GhGPAT genes in different tissues of G. hirsutum acc. TM-1. The RNA-Seq expression profiles of G. hirsutum acc. TM-1 (Zhang T. et al., 2015) were used to identify the expression levels of GhGPAT genes. The color bar represents the expression values in FPKM, and the black color represents there is not expression.

Expression profiles of GhGPAT genes in different tissues of G. hirsutum acc. TM-1. The RNA-Seq expression profiles of G. hirsutum acc. TM-1 (Zhang T. et al., 2015) were used to identify the expression levels of GhGPAT genes. The color bar represents the expression values in FPKM, and the black color represents there is not expression. The expression profiles of Arabidopsis GPAT genes were also analyzed unsing the public microarray data (Toufighi et al., 2005). Similar to GhGPAT genes, the AtGPAT genes also showed different expression patterns in different tissues (Figure S2). The expression of AtGPAT5 showed a relatively high level in the siliques at 6–10 DAF (day after flowering), but lower expression in early developmental stages of silique (4–5 DFA), which suggested that it might contribute to the accumulation of glycerolipid in Arabidopsis siliques. The AtGPAT6 had high exprsssion in sepals, petals and stamens than in other tissues, which indicated that it may play important rols in the Arabidopsis flower development. Additionally, the paralogous gene pairs of Arabidopsis GPAT genes were also investigated, it was showed that AtGPAT4/AtGPAT8 had the simiar expression patterns. However, it was not the case for AtGPAT2/AtGPAT3.

Gene Expression Patterns Under Salt and Cold Stresses

Salt and cold stresses are the two of the main abiotic stresses that limit the productivity and geographical distribution of various crops all over the world. Previous study reported that plant GPAT genes are well known to be involved in responses to salt (Sun et al., 2010; Sui et al., 2017) and cold stresses (Murata and Tasaka, 1992; Ariizumi et al., 2002; Sui et al., 2007b), but the function of GhGPAT genes in cotton response to salt or cold stress remains unknown. The cis-elements could provide critical evidence and understanding about the functions of GPAT genes in cotton. Therefore, the promoter regions up to 2 kb upstream of all 28 GhGPAT genes were used to find their putative cis-elements (Table S4), and eight putative environmental stimulus-responsive cis-elements were identified in GhGPAT genes (Table S5). Although there were no cis-elements responding to salt stress and only one cis-element responding to low temperature in the PLACE database, some cis-elements might respond to multiple environmental stimuli (Higo et al., 1999). The results showed that almost all GhGPAT promoters contained more than three environmental stimulus-responsive cis-elements except for GhGPAT13 (Table S5), indicating that GhGPAT genes may be involved in the response to salt and cold stresses.To validate the participation of GPAT genes in salt or cold stress, gene expression of all GhGPAT genes were conducted in different tissues under salt or cold stress (Figure 5). For the salt stress, the relative expression levels of GhGPAT genes were altered either induction or suppression under different salt treatments (Figure 5A). In the roots, nearly all GhGPAT genes were upregulated under slight salt stress except for GhGPAT17 and GhGPAT21. Most GhGPAT genes were suppressed under severe stress except for GhGPAT8, GhGPAT9, GhGPAT11, GhGPAT18, and GhGPAT22. In the stems, most GhGPAT genes displayed downregulation under different levels of salt stress, while five GhGPAT genes (GhGPAT5, GhGPAT7, GhGPAT8, GhGPAT11, GhGPAT13, and GhGPAT18) showed upregulation under at least one salt treatment. In the leaves, GhGPAT5 and GhGPAT18 were upregulated under different levels of salt stress. Several GhGPAT genes showed downregulation. In the cotyledons, only GhGPAT18 was upregulation under different levels of salt stress, while nearly all GhGPAT genes were downregulation. Furthermore, two duplicated gene pairs, GhGPAT2/GhGPAT16 and GhGPAT27/GhGPAT28, showed similar expression patterns. For the cold stress (Figure 5B), only a few GhGPAT genes were induction, while others was suppression under cold stress. In the roots, five GhGPAT genes (GhGPAT2, GhGPAT6, GhGPAT10, GhGPAT20, and GhGPAT22) displayed upregulated expression under cold stress, while others were suppressed by cold stress. In the stems, GhGPAT18 and GhGPAT24 were initially increased at 12 h of cold stress and then decreased at 24 h of cold stress. In the leaves, the expression of GhGPAT20 showed upregulation under cold stress as compared with other GhGPAT genes. In the cotyledons, most of the GhGPAT genes were suppressed under cold stress except for GhGPAT8, GhGPAT9, and GhGPAT24. GhGPAT8 and GhGPAT9 were upregulated after 12 h of cold treatment, while GhGPAT24 was upregulated at 24 h. In addition, three duplicated gene pairs, GhGPAT5/GhGPAT18, GhGPAT12/GhGPAT25, and GhGPAT27/GhGPAT28 were clustered together with similar expression patterns.
Figure 5

Expression patterns of 28 GhGPAT genes in four representative tissues of G. hirsutum in response to salt stress and cold stress. (A) Expression patterns of 28 GhGPAT genes under salt stress. (B) Expression patterns of 28 GhGPAT genes under cold stress. Slight stress, moderate stress, and severe stress in G. hirsutum were represented by 150, 200, and 300 mM NaCl, respectively. The color bar represents the relative signal intensity values.

Expression patterns of 28 GhGPAT genes in four representative tissues of G. hirsutum in response to salt stress and cold stress. (A) Expression patterns of 28 GhGPAT genes under salt stress. (B) Expression patterns of 28 GhGPAT genes under cold stress. Slight stress, moderate stress, and severe stress in G. hirsutum were represented by 150, 200, and 300 mM NaCl, respectively. The color bar represents the relative signal intensity values.

Co-localization and Sequence Variation of GhGPAT Genes With Quantitative Trait Loci for Seed Oil and Protein Contents

To analyze the relationships of GhGPAT genes with cottonseed oil and protein contents, a co-localization analysis of GhGPAT genes with oil- and protein-related QTLs was conducted (Figure 6). QTLs for oil and protein traits were downloaded from intraspecific G. hirsutum and interspecific G. hirsutum × G. barbadense populations (Said et al., 2015a). The results showed that 5 GhGPAT genes on five chromosomes were co-localized within a 21.7 cM region of cottonseed oil- and protein-related QTLs, including GhGPAT3 on chromosome A04/c4, GhGPAT4 on chromosome A05/c5, GhGPAT7 on chromosome A08/c8, GhGPAT9 on chromosome A09/c9, and GhGPAT15 on chromosome D04/c22. Three GhGPAT genes on one chromosome (GhGPAT19, GhGPAT20, and GhGPAT21 on chromosome D08/c24) were localized within a 35 cM region (Figure 6). The co-localized QTLs include six QTLs (on chromosomes A04, A08, A09 and D04) from intraspecific G. hirsutum population and three QTLs (on chromosomes A05 and D08) from interspecific G. hirsutum × G. barbadense population (Table S6). In intraspecific G. hirsutum population, 4 and 2 QTLs associated with GhGPAT genes were identified for cotton oil and protein, respectively. GhGPAT3 and GhGPAT7 each co-localized with one oil-related QTL, while GhGPAT9 and GhGPAT15 each co-localized with one oil-related QTL and one protein-related QTL. In addition, one oil-related QTL and two protein-related QTLs associated with GhGPAT genes were identified from interspecific G. hirsutum × G. barbadense population. GhGPAT4 only co-localized only with one oil-related QTL on chromosome A05, whereas GhGPAT19, GhGPAT20, and GhGPAT21 co-localized with one oil-related QTL and one protein-related QTL on chromosome D08. However, the causative relationship between the natural variation in GhGPAT genes and oil content may not be obtained from the co-localization of a cottonseed oil-related QTL or a protein-related QTL with GhGPAT genes, since hundreds of genes could be identified from a 20 cM region (Said et al., 2015b), even 35 cM region.
Figure 6

A co-localization analysis of GhGPAT genes with cotton oil and protein quantitative trait loci (QTLs). GhGPAT genes are shown in red, and underlining indicates the GhGPAT genes colocalized with oil or protein content QTLs.

A co-localization analysis of GhGPAT genes with cotton oil and protein quantitative trait loci (QTLs). GhGPAT genes are shown in red, and underlining indicates the GhGPAT genes colocalized with oil or protein content QTLs. To analyze the relationship between the sequence variation in GPAT genes and oil content, the SNPs of GPAT genes were identified from RNA-seq data on CRI 36 and Hai 7124. As shown in Table S7, 32 SNPs in the cDNA sequences of 12 GPAT genes were identified from the CRI 36 and Hai 7124 data compared with the TM-1 genome. Among them, 16 SNPs in 9 GPAT genes were classified as synonymous, and 16 SNPs in 10 GPAT genes were non-synonymous. To detect the association analysis between SNPs (non-synonynous) from GPAT genes and oil content, primers were designed to amplify fragments containing SNPs (Table S2) and analyzed using high-resolution melting (HRM). The 16 SNPs classified as non-synonymous were screened in a BIL population of 180 lines derived from a backcross between CRI 36 and Hai 7124, and correlation analysis was conducted in the BIL population of 180 lines in 2015 AY, 2016AY, and 2016XJ. An SNP of GhGPAT16, named GhGPAT16-1624 (T/C), which changes amino acid 542 from phenylalanine to leucine, was found to be significantly associated with oil content in the BIL population in 2015AY, and an SNP of GhGPAT26, named GhGPAT26-172 (T/C), which changes amino acid 58 from serine to proline, was found to be associated with oil content in 2016AY (Table S8). In the BIL population, the correlation of GhGPAT16-1624 and oil content in 2015AY was significantly negative (−0.262, P < 0.01), and the correlation of GhGPAT26-172 with oil content in 2016AY was significantly positive (0.183, P < 0.05). However, GhGPAT16-1624 and GhGPAT26-172 did not exhibit any association with oil content from the same BIL population in 2016AY, 2016XJ and 2015AY, 2016XJ, respectively. Therefore, the association between the SNP markers of GhGPAT16 and GhGPAT26 genes and oil content remains to be elucidated. These results suggest that the sequence variations in most GhGPAT genes are not associated with natural variations in cotton oil content.

Discussion

Identification, Phylogenetic Analysis and Evolution of GPAT Genes in Cotton

In this study, a total of 14, 16, 28, and 27 GPAT genes were identified from G. raimondii, G. arboreum, G. hirsutum, and G. barbadense, respectively (Table 1). The allotetraploid cotton species originated from an interspecific hybridization event between the progenitors G. raimondii (D5) and G. arboreum (A2) (Cronn et al., 1999; Wendel et al., 2009). Each of the GPAT genes in the diploid cotton species corresponds to two orthologous genes in their allotetraploid derivatives based on the theory of evolution. However, the numbers of GPAT genes in each of G. hirsutum and G. barbadense were slightly less than the sum of those from G. raimondii and G. arboreum, suggesting that whole-genome duplication was the major impetus and was followed by different degrees of gene loss during the evolution of allotetraploid cottons, and this was consistent with previous reported that gene loss was accompanied by the rapid arrangement of genomic sequences after hybridization and chromosome doubling during polyploidization (Paterson et al., 2004). This inconsistency has also been detected in a certain genus among different cotton species (Dong et al., 2017; Li et al., 2017; Pant et al., 2018). The number of GPAT genes in G. arboreum is slightly greater than that in G. raimondii, although the genome of G arboreum was approximately twice larger than that of G. raimondii, and the long terminal repeat (LTR) retrotransposon insertion in G. arboreum may explain the phenomenon (Paterson et al., 2012; Li et al., 2014). The GPAT family genes from the subgenomes of G. hirsutum or G. barbadense were more closely phylogenetically related to those from G. raimondii and G. arboreum, reflecting that the GPAT gene family evolved before the formation of allotetraploid cotton species. Phylogenetic analysis revealed that GhGPAT genes were more closely related to TcGPATs than to AtGPATs (Figure 1), this result was consistent with the evolutionary relationships among the four cotton species, cacao and Arabidopsis. The GPAT genes from cotton, Arabidopsis and cacao were distinctly classified into three groups: Group I, Group II, and Group III (Figure 1), and no additional groups were identified in the tree, which was similar to those in Arabidopsis (Nishida et al., 1993; Gidda et al., 2009) and other plants (Waschburger et al., 2018). The gene structure analysis of Arabidopsis and cotton GPAT genes also supported this classification (Figure 2B). GPAT genes within the same groups showed similar exon numbers, indicating that genes within the same group diverged from a common ancestor. In addition, previous study reported that at least one rice GPAT gene was clusted with Arabidopsis GPAT gene from a phylogenetic tree constructed with 8 Arabidopsis and 16 rice GPAT genes (Horan et al., 2005). Take those results together, it suggested that the functional differences of various GPAT groups from Arabidopsis, rice and cotton have been conserved over 100 million years (Mya).

Duplication of GPAT Genes in Cotton

Previous studies reported that gene duplication was the major reason for amplification of the gene family, including tandem duplication, segmental duplication, transposition events and whole-genome duplication (Blanc and Wolfe, 2004; Flagel and Wendel, 2009). Segmental duplication and translocation have enabled plants to rapidly adapt to new environments (Fraser et al., 2005). In this study, we analyzed gene duplication events to further understand the expansion mechanism of GPAT genes in cotton. Forty-four duplicated gene pairs were identified in the four cotton species, including 5, 6, and 19 segmental duplicated pairs detected in G. raimondii, G. arboreum, and G. hirsutum, respectively, and 1 and 13 pairs of tandem and segmental duplication detected in G. barbadense (Figure 3, Table 2). The results suggested that the expansion of cotton GPAT genes was mainly due to segmental duplication events. The results from Table 2 showed that purifying selection predominated across the duplicated gene pairs in cotton. Purifying selection could eliminate deleterious loss-of-function mutations, enhance fixation and preserve the function of a new duplicated gene at both duplicated loci (Tanaka et al., 2009). After duplication, the coding regions can obtain new regulatory context by insertion or deletion of tissue-specific enhancers or repressors, causing spatial and temporal variations in expression patterns of duplicated genes (Yang et al., 2017). The expression patterns of GhGPAT genes in different tissues of TM-1, showed that most of GhGPAT duplicated gene pairs displayed similar expression patterns, indicating that their biological functions might not be diverged during cotton development and growth, but determining whether these genes had similar functions in different tested tissues would require further studies. However, some GhGPAT duplicated gene pairs also displayed differential expression patterns, and this case also presented in the expression pattern of Arabidopsis GPAT genes (Figure S2), indicating that the biological functions of the duplicated genes might have experienced divergence during the development and growth of cotton and Arabidopsis, which might enable these two plants to rapidly adapt to new environments. Previous studies also reported that some duplicated gene pairs in other gene families, such as the WOX and YABBY gene families, displayed different expression patterns (Yang et al., 2017, 2018). These results indicated that divergence of duplicated gene pairs is a pervasive evolutionary process after duplication.

Functional Divergence of Cotton GPAT Genes Under Salt Stress and Cold Stress

Salt and cold stresses are the serious environmental stresses that limit the productivity and geographical distribution of various crops worldwide. Previous studies revealed that the overexpression of GPAT genes in plants could enhance their salt or cold tolerance (Murata and Tasaka, 1992; Sui et al., 2007a, 2017; Sun et al., 2010). Gene expression patterns can provide useful clues for understanding gene function (Cui et al., 2017). Figure 5A showed that a majority of GhGPAT genes were upregulated in roots under moderate salt stress. In contrast, only a few upregulated GPAT genes were found in leaves compared with roots. This result was consistent with the fact that roots were the first tissues directly responding to salt stress (Guo et al., 2001), and root and leaf tissues were distinctly different in structure and function (Campo et al., 2014). Some GhGPAT genes showed similar expression patterns in the same tissue under salt or cold stress, being either induced or suppressed (Figure 5). For example, GhGPAT8 and GhGPAT11 were induced in roots by salt stress, and GhGPAT2 and GhGPAT22 were also upregulated before 24 h under cold stress. These results indicated that those genes were co-expressed under salt or cold stress. However, some GhGPAT genes in the same tissue also presented different expression patterns under salt stress and under cold stress. For instance, GhGPAT18 was induced in roots under different salt stress conditions, while it was suppressed before 24 h under cold stress. This difference indicated that these genes may perform diverse functions under salt or cold stress. Together, these results implied that the signaling network responding to abiotic stress in plants was complicated (Zhang L. et al., 2015). In addition, some GhGPAT gene pairs presented a high degree of functional divergence under salt. GhGPAT5 was downregulated in roots after severe salt stress, while its duplicated gene, GhGPAT18, was upregulated. This evidence supported the assertion that the expression divergence between duplicated gene pairs was the first step in functional divergence, which could increase the likelihood of their retention in the genome (Zhang, 2003).

Conclusions

In this study, a total of 85 GPAT genes based on the genome information of G. raimondii, G. arboreum, G. hirsutum, and G. barbadense were identified. The family could be divided into three groups based on the phylogenetic tree and gene structures. Segmental duplication events were the main contributors to GPAT gene family expansion in G. raimondii, G. arboreum, G. hirsutum, and G. barbadense. Moreover, duplicated gene pairs of GhGPATs in upland cotton might be experienced functional divergence, since their expression patterns were different in different tissues. some GhGPAT genes are likely to be involved in salt or cold response. Eight GhGPAT genes were co-localized with oil and protein content QTLs, and two SNP markers were genetically associated with cottonseed oil content in one interspecific BIL population (in one of three field tests). Our results provide a solid foundation for further understanding about involvement of cotton GPAT genes in the natural variation of cottonseed oil content, and also could provide the candidate genes for function studies.

Author Contributions

JZ and JY conceived and designed the study. YC and JM performed the experiments and analyzed the data. NW and GL prepared the figures. WP, MW, and XL analyzed and interpreted the data. YC, JZ, and JY prepared the manuscript. JZ participated in the design of the experiments, wrote part of the manuscript, and performed a critical review for intellectual content. All authors have read, edited, and approved the current version of the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  68 in total

1.  Duplicated genes evolve independently after polyploid formation in cotton.

Authors:  R C Cronn; R L Small; J F Wendel
Journal:  Proc Natl Acad Sci U S A       Date:  1999-12-07       Impact factor: 11.205

2.  Analysis of amino acid motifs diagnostic for the sn-glycerol-3-phosphate acyltransferase reaction.

Authors:  T M Lewin; P Wang; R A Coleman
Journal:  Biochemistry       Date:  1999-05-04       Impact factor: 3.162

3.  The evolutionary fate and consequences of duplicate genes.

Authors:  M Lynch; J S Conery
Journal:  Science       Date:  2000-11-10       Impact factor: 47.728

4.  PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences.

Authors:  Magali Lescot; Patrice Déhais; Gert Thijs; Kathleen Marchal; Yves Moreau; Yves Van de Peer; Pierre Rouzé; Stephane Rombauts
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

5.  Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method.

Authors:  K J Livak; T D Schmittgen
Journal:  Methods       Date:  2001-12       Impact factor: 3.608

6.  MapChart: software for the graphical presentation of linkage maps and QTLs.

Authors:  R E Voorrips
Journal:  J Hered       Date:  2002 Jan-Feb       Impact factor: 2.645

7.  DnaSP, DNA polymorphism analyses by the coalescent and other methods.

Authors:  Julio Rozas; Juan C Sánchez-DelBarrio; Xavier Messeguer; Ricardo Rozas
Journal:  Bioinformatics       Date:  2003-12-12       Impact factor: 6.937

8.  An increase in unsaturation of fatty acids in phosphatidylglycerol from leaves improves the rates of photosynthesis and growth at low temperatures in transgenic rice seedlings.

Authors:  Tohru Ariizumi; Sachie Kishitani; Rie Inatsugi; Ikuo Nishida; Norio Murata; Kinya Toriyama
Journal:  Plant Cell Physiol       Date:  2002-07       Impact factor: 4.927

9.  Molecular characterization of functional domains in the protein kinase SOS2 that is required for plant salt tolerance.

Authors:  Y Guo; U Halfter; M Ishitani; J K Zhu
Journal:  Plant Cell       Date:  2001-06       Impact factor: 11.277

10.  Arabidopsis AtGPAT1, a member of the membrane-bound glycerol-3-phosphate acyltransferase gene family, is essential for tapetum differentiation and male fertility.

Authors:  Zhifu Zheng; Qun Xia; Melanie Dauk; Wenyun Shen; Gopalan Selvaraj; Jitao Zou
Journal:  Plant Cell       Date:  2003-08       Impact factor: 11.277

View more
  6 in total

1.  Identification and analysis of oil candidate genes reveals the molecular basis of cottonseed oil accumulation in Gossypium hirsutum L.

Authors:  Zhibin Zhang; Juwu Gong; Zhen Zhang; Wankui Gong; Junwen Li; Yuzhen Shi; Aiying Liu; Qun Ge; Jingtao Pan; Senmiao Fan; Xiaoying Deng; Shaoqi Li; Quanjia Chen; Youlu Yuan; Haihong Shang
Journal:  Theor Appl Genet       Date:  2021-10-29       Impact factor: 5.699

2.  GhGPAT12/25 Are Essential for the Formation of Anther Cuticle and Pollen Exine in Cotton (Gossypium hirsutum L.).

Authors:  Meng Zhang; Hengling Wei; Pengbo Hao; Aimin Wu; Qiang Ma; Jingjing Zhang; Hantao Wang; Xiaokang Fu; Liang Ma; Jianhua Lu; Shuxun Yu
Journal:  Front Plant Sci       Date:  2021-05-13       Impact factor: 5.753

3.  Identification and Analysis of the GASR Gene Family in Common Wheat (Triticum aestivum L.) and Characterization of TaGASR34, a Gene Associated With Seed Dormancy and Germination.

Authors:  Xinran Cheng; Shengxing Wang; Dongmei Xu; Xue Liu; Xinyu Li; Weiwei Xiao; Jiajia Cao; Hao Jiang; Xiaoyu Min; Jianfeng Wang; Haiping Zhang; Cheng Chang; Jie Lu; Chuanxi Ma
Journal:  Front Genet       Date:  2019-10-18       Impact factor: 4.599

4.  Genome-Wide Characterization and Analysis of CIPK Gene Family in Two Cultivated Allopolyploid Cotton Species: Sequence Variation, Association with Seed Oil Content, and the Role of GhCIPK6.

Authors:  Yupeng Cui; Ying Su; Junjuan Wang; Bing Jia; Man Wu; Wenfeng Pei; Jinfa Zhang; Jiwen Yu
Journal:  Int J Mol Sci       Date:  2020-01-29       Impact factor: 5.923

5.  Genome-wide analysis of glycerol-3-phosphate O-acyltransferase gene family and functional characterization of two cutin group GPATs in Brassica napus.

Authors:  Jingxue Wang; Sanjay Kumar Singh; Siyu Geng; Shanshan Zhang; Ling Yuan
Journal:  Planta       Date:  2020-04-03       Impact factor: 4.116

6.  Genomics-assisted prediction of salt and alkali tolerances and functional marker development in apple rootstocks.

Authors:  Jing Liu; Fei Shen; Yao Xiao; Hongcheng Fang; Changpeng Qiu; Wei Li; Ting Wu; Xuefeng Xu; Yi Wang; Xinzhong Zhang; Zhenhai Han
Journal:  BMC Genomics       Date:  2020-08-10       Impact factor: 3.969

  6 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.