
Cotton Late Embryogenesis Abundant (LEA2) Genes Promote Root Growth and Confer Drought Stress Tolerance in Transgenic Arabidopsis thaliana.

Richard Odongo Magwanga1,2, Pu Lu1, Joy Nyangasi Kirungu1, Qi Dong1, Yangguang Hu1, Zhongli Zhou1, Xiaoyan Cai1, Xingxing Wang1, Yuqing Hou1, Kunbo Wang3, Fang Liu3.   

Abstract

Late embryogenesis abundant (LEA) proteins play key roles in plant drought tolerance. In this study, 157, 85 and 89 candidate LEA2 proteins were identified in G. hirsutum, G. arboreum and G. raimondii, respectively. The LEA2 genes were classified into six groups, designated groups 1 to 6. Phylogenetic tree analysis revealed orthologous gene pairs within the cotton genomes. The cotton-specific LEA2 motifs identified were E, R and D, in addition to the Y, K and S motifs. The genes were distributed on all chromosomes. LEA2s were highly enriched in non-polar, aliphatic amino acid residues, with leucine the most abundant at 9.1%. The miRNAs ghr-miR827a/b/c/d and ghr-miR164, which targeted many genes, are known to be drought stress responsive. Various stress-responsive regulatory elements were detected: the ABA-responsive element (ABRE), the drought-responsive element (DRE/CRT), MYBS and the low-temperature-responsive element (LTRE). Most genes were highly expressed in leaves and roots, the primary organs affected by water deficit. The expression levels were much higher in G. tomentosum than in G. hirsutum; the tolerant genotype had a higher capacity to induce more of the LEA2 genes. Overexpression of the transformed gene Cot_AD24498 showed that the LEA2 genes are involved in promoting root growth, which in turn confers drought stress tolerance. We therefore infer that, among the LEA2 genes, Cot_AD24498, CotAD_20020, CotAD_21924 and CotAD_59405 could be the candidate genes with profound functions under drought stress in upland cotton. The transformed Arabidopsis plants showed higher tolerance to drought stress than the wild types. Under drought stress, the leaves of the transformed lines showed a significant increase in the accumulation of the antioxidants catalase (CAT), peroxidase (POD) and superoxide dismutase (SOD), increased root length, and a significant reduction in the concentrations of the oxidants hydrogen peroxide (H2O2) and malondialdehyde (MDA).
This study provides a comprehensive analysis of LEA2 proteins in cotton and thus forms a primary foundation for breeders to utilize these genes in developing drought-tolerant genotypes.
Copyright © 2018 Magwanga et al.

Keywords:  Antioxidants; Drought stress; Expression analysis; LEA2 proteins; Oxidants; Transgenic plant; miRNAs

Year:  2018        PMID: 29934376      PMCID: PMC6071604          DOI: 10.1534/g3.118.200423

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Drought stress is one of the major abiotic stress factors, with deleterious effects on plant growth and development (Sofia ). With ever-changing environmental conditions and erratic precipitation, plant production is projected to decline further, so that meeting the demands and needs of the growing population will be a challenge in the near future (Tilman ). Because plants are sessile, the effects of the various abiotic stresses are enormous and threaten their existence (Rejeb ). Plants have developed various coping strategies for continued survival under these extreme conditions, one of which is the induction of various transcriptome factors (TFs) that boost their tolerance level (Xiong and Ishitani 2006). One of the transcriptome factors (TFs) with a functional role under various abiotic stress conditions is a member of the late embryogenesis abundant (LEA) proteins (Rodriguez-Salazar ). LEA proteins are grouped into eight (8) subfamilies, named LEA1, LEA2, LEA3, LEA4, LEA5, LEA6, seed maturation proteins (SMPs) and dehydrins (Battaglia and Covarrubias 2013). In several genome-wide identification studies, the LEA2 proteins have been found to be the most abundant among all the LEA protein families (Yang and Xia 2011). LEA2 proteins are members of the larger late embryogenesis abundant (LEA) protein family (Hundertmark and Hincha 2008). As the name suggests, this group of proteins is found in large quantities in seeds at the late stages of embryo development (Dure ). Even though LEA proteins are synonymous with seeds, a number of them have been detected in other plant tissues, such as the vegetative tissues (de Nazaré Monteiro Costa ). The distribution of LEA proteins is not restricted to plants; they have also been found in animals (Denekamp ) and in bacteria (Espelund ).
The LEA protein families share a common structural architecture: high hydrophilicity, a low proportion of cysteine (Cys) and tryptophan (Trp) residues, and high contents of arginine (Arg), lysine (Lys), glutamate (Glu), alanine (Ala), threonine (Thr) and glycine (Gly). Because of these shared features, LEA proteins are mainly referred to as hydrophilins, with a hydrophilicity index of more than 1 and a glycine (Gly) content of more than 6% (Battaglia ). LEA proteins have been positively correlated with several abiotic stresses and have been found to confer tolerance in plants such as Brassica napus (Dalal ), rice (He ) and Fagus sylvatica (Jiménez ). For instance, overexpression of the Arabidopsis LEA gene AtLEA3 has been found to enhance tolerance to drought and salinity stresses (Zhao ). Overexpression of a rice LEA gene, OsLEA3-1, was found to confer drought tolerance (Xiao ). Similarly, the LEA gene HVA1 from barley was found to confer dehydration tolerance in transgenic rice (Babu ). In addition, SiLEA14, a novel gene, was found to be highly expressed in the roots of foxtail millet under drought conditions (Wang ). However, the precise roles of LEA proteins are still not well understood. A number of proposals have been made to explain the possible roles of LEA proteins in plants during water deficit, such as enzyme protection (Hand ), molecular shielding (Furuki ), hydration buffering (Hundertmark ) and membrane interactions (Olvera-Carrillo ). To date, a number of studies have examined the distribution and characterization of LEA proteins in various plants, for instance Arabidopsis (Hundertmark and Hincha 2008), Brassica napus (Dalal ) and watermelon (Celik Altunoglu ), among others. Despite the significance of the LEA genes, little has been done to investigate their putative role in drought stress tolerance in cotton.
Cotton (Gossypium hirsutum) is an economically important fiber and oil crop cultivated in many tropical and subtropical areas of the world, where it is constantly exposed to a range of abiotic stresses, including drought, extreme temperature and high salinity (Mahajan ). The completion and publication of the draft genome sequences of upland cotton G. hirsutum (Li ), Gossypium arboreum (Li ) and Gossypium raimondii (Wang ) has provided a valuable tool for elucidating the transcriptome factors (TFs) in cotton genomes. Little information is available about the LEA2 subfamily in upland cotton. Therefore, in this study we identified and characterized the LEA2 genes in three cotton genomes and transformed a novel LEA2 gene, Cot_AD24498, into Arabidopsis thaliana. We then investigated the expression levels of the transformed gene in both the transgenic lines and the wild type (WT) under drought stress.

Materials and methods

Identification, Sequence Analysis, Phylogenetic Tree Analysis and Subcellular Location Prediction of The LEA2 Proteins In Cotton

LEA2 protein sequences for the tetraploid (AD) genome of G. hirsutum were downloaded from the Cotton Research Institute website (http://mascotton.njau.edu.cn). The LEA2 protein sequences for the A genome of G. arboreum were downloaded from the Beijing Genome Institute database (https://www.bgi.com/), and those for the D genome of G. raimondii were obtained from Phytozome (http://www.phytozome.net/). The conserved domain of the LEA2 protein (PF03168) was downloaded from Pfam protein families (http://pfam.xfam.org). The hidden Markov model (HMM) profile of the LEA2 protein was used as the query in an HMMER search (http://hmmer.janelia.org/) (Finn ) against the G. hirsutum, G. raimondii and G. arboreum protein sequences. The amino acid sequences were analyzed for the presence of the LEA2 protein domain with the ScanProsite tool (http://prosite.expasy.org/scanprosite/) and the SMART program (http://smart.embl-heidelberg.de/). The LEA2 proteins from the three cotton genomes, together with the LEA2 proteins from Arabidopsis (http://www.arabidopsis.org/) and rice (http://rice.plantbiology.msu.edu/index.shtml), were used to investigate the evolutionary history and patterns of orthology or paralogy among the LEA2 proteins. Multiple sequence alignments of all the LEA2 proteins were done with Clustal Omega, and a phylogenetic tree was constructed in MEGA 7.0 using default parameters as described by Higgins et al. (Higgins ). The physiochemical characteristics of all the LEA2 proteins obtained were determined with an online ExPASy Server tool (http://www.web.xpasy.org/compute_pi/). In addition, the subcellular location of all the upland cotton LEA2 proteins was predicted with Wolfpsort (https://www.wolfpsort.hgc.jp/) (Horton ).
The subcellular prediction results were further validated with two other online tools, the TargetP1.1 server (Emanuelsson ) and Protein Prowler Subcellular Localization Predictor version 1.2 (http://www.bioinf.scmb.uq.edu.au/pprowler_webapp_1-2/) (Bodén and Hawkins 2005).

Analysis of promoter regions, chromosomal locations and miRNA target prediction of LEA2 genes

To identify drought stress-responsive cis-acting regulatory elements in the LEA2 promoter regions, the 1 kb regions upstream and downstream of the translation start site of each LEA2 gene were analyzed using the PLACE database (http://www.dna.affrc.go.jp/place/signalscan.html) (Higo ). The physical location, in base pairs (bp), of each LEA2 gene was determined by BLASTN searches against the local database. Mapchart software (https://www.wur.nl/en/show/Mapchart.htm) (Voorrips 2002) was used to plot the gene loci on the G. hirsutum, G. arboreum and G. raimondii chromosomes. Finally, we analyzed the miRNAs targeting the LEA2 genes by submitting the coding sequences (CDS) of all the LEA2 genes to the psRNATarget database (http://plantgrn.noble.org/psRNATarget/).
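As a rough illustration of the promoter scan described above, the sketch below takes a window around the translation start site and counts a few stress-related core motifs on both strands. The motif strings (ACGTG for ABRE, RCCGAC for DRE/CRT, CCGAC for LTRE), the function names and the toy sequence are assumptions made for illustration; the study itself relied on the PLACE database, whose motif set is not reproduced here.

```python
import re

# Assumed core consensus strings, for illustration only (not taken from the paper).
MOTIFS = {
    "ABRE": "ACGTG",
    "DRE/CRT": "[AG]CCGAC",
    "LTRE": "CCGAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def flank_window(start_site, flank=1000):
    """Return 1-based (start, end) covering `flank` bp on each side of the start site."""
    return max(1, start_site - flank), start_site + flank


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motifs(seq):
    """Count motif hits on both strands, allowing overlapping matches."""
    seq = seq.upper()
    strands = (seq, reverse_complement(seq))
    counts = {}
    for name, pattern in MOTIFS.items():
        regex = re.compile(f"(?=({pattern}))")  # lookahead keeps overlapping hits
        counts[name] = sum(len(regex.findall(s)) for s in strands)
    return counts


# Hypothetical toy promoter fragment.
print(flank_window(5000))
print(count_motifs("TTACGTGTTGCCGACTT"))
```

Because the assumed LTRE core (CCGAC) is contained in the DRE/CRT core (RCCGAC), every DRE/CRT hit in this sketch is also counted as an LTRE hit.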

Expression analysis of LEA2 genes and determination of the gene to be transformed

qRT-PCR analysis was used to determine the expression changes of the LEA2 genes in response to drought stress in the two parental lines used. The upland elite cultivar, G. hirsutum, is known to be drought sensitive, while the wild tetraploid cotton, G. tomentosum, is drought tolerant (Zheng ). The two cotton genotypes were subjected to drought stress for 14 days. Samples for RNA extraction were obtained from the leaves, stem and roots at 0, 7 and 14 days of stress exposure. All samples were taken in three biological replicates from both control and treated seedlings. To select the best set of LEA2 genes for qRT-PCR validation, we relied on RNA-sequencing data profiled under drought stress. The RNA-sequencing data were downloaded from the Cotton Research Institute website (http://mascotton.njau.edu.cn/html/Data). RNAs were reverse transcribed to first-strand cDNA using TransCript-All-in-One-First-Strand cDNA synthesis Super Mix for qPCR (TransGen, Beijing, China). Fluorescent quantitative primers were designed for the selected genes (24 up-regulated and 24 down-regulated genes) using Primer Premier 5 (Supplemental Table S1). The Actin gene served as a reference. The synthesized cDNA was pre-incubated at 95° for 15 sec, followed by 40 cycles of denaturation at 95° for 5 sec and extension at 60° for 34 sec. The fluorescence quantitative assay was used to analyze the expression level of the LEA2 genes in root, leaf and stem tissues of the cotton plants, and the expression changes in G. hirsutum and G. tomentosum under drought stress. The assay was designed with three replicates, and the results were analyzed with the double delta Ct method.
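The double delta Ct calculation mentioned above can be sketched as follows. This is a minimal illustration that assumes roughly 100% amplification efficiency (so that one cycle corresponds to a two-fold change); the function name and the Ct values are hypothetical and are not data from the study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method."""
    # Normalize the target gene to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Hypothetical Ct values: target gene vs. a reference gene such as Actin.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold)
```

A value above 1 indicates up-regulation under stress relative to the control, and a value below 1 indicates down-regulation.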

Transformation and Screening of Novel gene Cot_AD24498 (LEA2) in the Model Plant Arabidopsis thaliana (Ecotype Columbia-0) Lines

The gene was transformed into the model plant A. thaliana ecotype Columbia-0 (Col-0). The upland cotton G. hirsutum accession CRI-12 (G09091801–2) was used to confirm the presence of the Cot_AD24498 gene in various tissues. The pWM101-35S:Cot_AD24498 (LEA2) construct in Agrobacterium tumefaciens GV3101 was confirmed with gene-specific primers, the Cot_AD24498 forward primer (5′CGGATCCATGTCGGTAAAAGAGTGCGGC3′) and the Cot_AD24498 reverse primer (5′GGTCGACTTACACGCTAACACTGCATCT3′), synthesized by Invitrogen, Beijing, China. Arabidopsis wild-type (WT) plants were transformed by the floral dip method (Clough and Bent 1998). The infiltration medium was mainly composed of 4.3 g/l, sucrose 50 g/l (5%), 2-(4-morpholino) ethane sulfonic acid (MES) 0.5 g/l, Silwet-77 200 µl/l (0.02%) and 6-benzylaminopurine (6-BA) 0.01 mg/l, at pH 5.7. Transformed lines of A. thaliana were selected by germinating seeds on 50% (0.5) MS (PhytoTechnology Laboratories, Lenexa, USA) containing 50 mg/l hygromycin B (Roche Diagnostics GmbH, Mannheim, Germany). The plated seeds were held at 4° for three (3) days to optimize germination, after which the seedlings were transferred to an Arabidopsis growth room set at 16 hr light and 8 hr dark. After 7 days on selection medium, at the three-true-leaf stage, the seedlings were transplanted into small plastic containers filled with vermiculite and humus in equal ratios. The T0 seedlings were grown to set seed, and the seeds obtained were the T1 generation. The T1 seeds were germinated on the selective antibiotic medium, and single-copy lines were identified by a 3:1 segregation ratio of the antibiotic-selectable marker. Seeds (T2) of the lines segregating 3:1 were again germinated on the antibiotic-selective medium, and only lines in which 100% of the seedlings survived selection were used to develop the T3 generation.
The T3 homozygous progeny were bred from the T2 population after real-time quantitative reverse transcription polymerase chain reaction (qRT-PCR) screening. Three of the eight successfully transformed overexpression lines (L2, L3 and L4) were selected using the Cot_AD24498 (LEA2) forward primer (5′CGAACATCCATCCCTCCAAC3′) and the Cot_AD24498 (LEA2) reverse primer (5′ATCATCAAGAAAACCGACCC3′), with total complementary DNA (cDNA) as template. The phenotypic investigations were carried out in the T3 homozygous generation.
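One common way to judge whether a line fits the 3:1 segregation ratio described above is a chi-square goodness-of-fit test. The paper does not state which test was applied, so the sketch below is an assumption: it uses the standard critical value of 3.841 (one degree of freedom, 5% significance level), and the seedling counts are hypothetical.

```python
CRITICAL_VALUE_DF1 = 3.841  # chi-square critical value, df = 1, alpha = 0.05


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_single_copy(resistant, sensitive):
    """True when the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_VALUE_DF1


# Hypothetical counts of hygromycin-resistant and sensitive T1-derived seedlings.
print(fits_single_copy(78, 22))
print(fits_single_copy(95, 5))
```

A line that passes this check is consistent with a single-locus insertion; a line with far more resistant seedlings than expected may carry more than one copy.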

qRT-PCR Analysis of the Expression of Drought-Responsive Genes in Transgenic Arabidopsis

We assessed the action of the transformed gene in the transgenic lines and the wild type of the model plant A. thaliana by carrying out expression analysis of two drought-responsive genes: the ABRE-binding factor 4 (ABF4) gene, with forward sequence 5′AACAACTTAGGAGGTGGTGGTCAT3′ and reverse sequence 5′TGTAGCAGCTGGCGCAGAAGTCAT3′, and the responsive to desiccation 29A (RD29A) gene, with forward sequence 5′TGAAAGGAGGAGGAGGAATGGTTGG3′ and reverse sequence 5′ACAAAACACACATAAACATCCAAAGT3′. Total RNA was isolated from four-week-old transgenic Arabidopsis seedlings and the wild type (Columbia ecotype) grown under normal conditions (CK) and under 15% PEG6000 treatment for 4 days. RNA extraction and real-time RT-PCR (qRT-PCR) analysis were carried out as described in the section "Expression analysis of LEA2 genes and determination of the gene to be transformed", with cotton Actin2 (forward sequence 5′ATCCTCCGTCTTGACCTTG3′ and reverse sequence 5′TGTCCGTCAGGCAACTCAT3′) as the reference gene.

Quantification of oxidant and antioxidants in transgenic lines and the wild type

When plants are exposed to any form of stress, drastic changes occur at both the molecular and cellular levels to enable them to tolerate the stress factors (Gill ). Reactive oxygen species (ROS) are oxidants produced continuously by respiring cells, and plants have an elaborate mechanism to keep their level within nontoxic limits; when stresses such as drought set in, the ROS equilibrium shifts, leading to excessive production. In this work, we evaluated the levels of various oxidants and antioxidants in the transgenic lines (L1, L2 and L3) compared with the wild type under drought stress. Catalase (CAT), superoxide dismutase (SOD), peroxidase (POD), malondialdehyde (MDA) and hydrogen peroxide (H2O2) levels were quantified according to the method described by Bartosz (Bartosz 2005). Seeds of the transgenic lines and the wild type were grown on 0.5 MS for eight (8) days, then transferred to small conical containers filled with a 1:1 mixture of vermiculite and sand and grown for 21 days. After 21 days, water was completely withheld from the drought-treated plants for 8 days, while the control plants were watered normally. Leaf samples were then harvested for antioxidant and oxidant determination after the 8 days of stress exposure. The samples were obtained in triplicate, each representing a biological repeat.

Availability of Data Statement

The authors affirm that all the data supporting the conclusions of this research work are represented fully within the manuscript and its supplementary files. Supplemental material available at Figshare: https://doi.org/10.25387/g3.6626849.

Results and discussion

LEA2 protein encoding genes in the cotton genome and other plants

To identify the LEA2 proteins in the three cotton genomes, we used the hidden Markov model (HMM) profile of the Pfam LEA2 domain, PF03168, as the query to search the three cotton genome sequence databases. Based on the Pfam domain search, we obtained 200 LEA2 genes in G. hirsutum (AD genome), 101 LEA2 genes in G. raimondii (D genome) and 110 LEA2 genes in G. arboreum (A genome). To verify the genes obtained for the three cotton genomes, we carried out a manual search through SMART (http://smart.embl.de/smart/) and the PFAM database (http://pfam.xfam.org) to confirm the presence of the LEA2 domain. After removing redundant sequences and those that had no functional domain or lacked the LEA2 domain, we obtained 157, 85 and 89 LEA2 proteins in G. hirsutum, G. arboreum and G. raimondii, respectively. The confirmed LEA2 proteins in the three cotton genomes were further analyzed for their functional domain attributes with the conserved domain database (CDD) tool hosted at NCBI. The results showed that the LEA2 proteins were members of the c112118 super family, with E values ranging from 0 to 0.008 (Supplementary Table S2), and all contained a transmembrane domain (Supplementary Table S3). The association of the LEA2s with a transmembrane domain could explain why LEA proteins are found in high concentrations in seeds at the late stages of seed development, possibly to help maintain the stability of the cell membrane in the dehydrated state. Similar results have been reported for some genes that enhance drought and salt tolerance; for example, the Salicornia brachiata SNARE-like superfamily protein (SbSLSP) has been reported to be localized in the plasma membrane (Singh ).
LEA2 proteins could be playing an integral role in maintaining a non-lethal level of reactive oxygen species (ROS homeostasis) in order to minimize oxidative damage to cellular membranes and macromolecules. In addition, LEA2s could be playing roles similar to those of the aquaporins, the water channel proteins responsible for regulating water movement through channels such as plasmodesmata and xylem vessels (Buckley 2015). Aquaporins (AQPs) have been associated with salt and drought stress tolerance in plants, and they share a similar functional domain with LEAs, being basically membrane proteins (Li ). The numbers of LEA2 genes found in G. arboreum, G. raimondii and G. hirsutum were higher than the numbers recorded in other plants. The entire repertoire of LEA proteins across the 8 LEA families outlined in (Hundertmark and Hincha 2008) has been found to be 34 in rice (Wang ), 30 in Chinese plum (Du ), 27 in tomato (Cao and Li 2014), 53 in poplar (Lan ) and 29 in potato (Charfeddine ), far below the individual numbers of LEA2 genes in the three cotton genomes. The abundance of LEA2 genes in cotton could be due to their unique characteristic of being more hydrophobic than LEA2 proteins from other species, and/or they could have evolved much later than other transcriptome factors. The genome size of plants and animals is constant, and a high abundance of a particular gene family indicates its integral role in enhancing the survival of the plant. Under ever-changing environmental conditions, plants constantly face harsh environments and are disadvantaged by their sessile nature. Plants therefore survive these extreme conditions by increasing the number of stress tolerance genes or by integrating more complex gene interactions that initiate adaptive response mechanisms aimed at increased tolerance (Avramova 2015).

Phylogenetic analyses of LEA2 proteins in G. hirsutum, G. arboreum and G. raimondii

Phylogenetic tree analysis provides valuable knowledge on the lines of evolutionary descent of different genes or proteins from a common ancestor. Since its inception, it has remained a powerful tool for structuring classifications and biological diversity and for providing insight into events that occurred during gene evolution (Gregory 2008). In this study a total of 157, 85 and 89 LEA2 proteins were identified from G. hirsutum, G. arboreum and G. raimondii, respectively (Table 1). All the LEA2 proteins were aligned in ClustalW, and the tree was constructed by the neighbor joining (NJ) method. The LEA2 proteins from upland cotton, G. arboreum, G. raimondii, A. thaliana, T. cacao and G. max were analyzed. A. thaliana, T. cacao and G. max were included in the analysis of the cotton LEA2s because Theobroma cacao shares ancestral origins with cotton, while A. thaliana and G. max have undergone whole genome duplication similar to cotton. The resulting phylogenetic tree showed that the cotton LEA genes tend to cluster together. Based on the clustering pattern, the LEA2 genes were sub-divided into 6 groups: group 1 with three sub-groups, group 2, group 3 with two sub-groups, group 4, group 5 and group 6 with 5 sub-groups. Groups 1, 2, 4 and 5 consisted entirely of LEA2 proteins from the three cotton genomes.
Table 1

The identified LEA2 genes and their nomenclatural description

In this work | Hundertmark & Hincha (2008) | G. hirsutum | G. arboreum | G. raimondii | V. vinifera | B. napus | G. max | Arabidopsis
LEA2 | LEA_2 | 157 | 85 | 89 | 1 | 4 | 5 | 3
The LEA2s seem to have evolved later than the other LEA genes. In the analysis of the LEA genes in sweet orange, LEA2 had the most members among the 8 LEA families (Muniz Pedrosa ), and this observation has been replicated in a number of plants. More than half of the phylogenetic tree was covered by cotton LEA2 proteins, with no LEA2s from the other plants used in the analysis. Although Theobroma cacao is evolutionarily related to cotton, only a few of its LEA proteins clustered with those of cotton, while the majority of the Theobroma cacao LEA2 proteins clustered together. The LEA2 proteins from A. thaliana clustered with the cotton LEA2s in groups 3 and 6 (3-2 and 6-1), while Glycine max LEA2 proteins were predominantly found in group 6-1 (Figure 1). No orthologous gene pairs were detected between the cotton LEA2 genes and those of any of the other plants used. All the orthologous gene pairs occurred between G. hirsutum and G. arboreum, G. hirsutum and G. raimondii, and G. arboreum and G. raimondii. Interestingly, even the LEA2 proteins of Theobroma cacao, which is evolutionarily related to Gossypium species, clustered among themselves.
Figure 1

Phylogenetic relationship of LEA2 genes in three cotton species with Arabidopsis, T. cacao and G. max. Neighbor-joining phylogeny of 157 genes for G. hirsutum, 85 genes for G. arboreum, 89 genes for G. raimondii, 9 genes for T. cacao, 5 G. max and 3 Arabidopsis LEA protein sequences, as constructed by MEGA7.0.

The abundance of LEA2s in plants can be explained by their being the last members of the LEA genes to evolve and/or by duplication. Upland cotton is a tetraploid, having emerged through whole genome duplication (WGD) between the two diploid cottons of the A and D genomes. A high number of LEA2 genes has also been observed in Arabidopsis (Hundertmark and Hincha 2008). Therefore, we could infer that LEA2 proteins might have evolved after species divergence, and that the presence of orthologous genes in the cotton genomes could be due to the whole genome duplication event coupled with chromosome rearrangement. It is generally assumed that orthologous genes have the same biological functions in different species (Tatusov 1997), and that duplication makes room for paralogous gene pairs to evolve new functions (Ohno 1970). The LEA2 genes could form functionally oriented ortholog groups, consisting of orthologous pairs that play the same biological role in the three cotton genomes.

Physio-chemical analysis, subcellular localization and amino acid composition of the LEA2 genes in upland cotton

In the analysis of the physio-chemical properties of the LEA2 genes in upland cotton, the LEA2 proteins had varied molecular formulae but a similar elemental composition of carbon (C), hydrogen (H), oxygen (O), nitrogen (N) and sulfur (S) in varying proportions. Molecular weights ranged from 11.5384 to 73.5831 kD, pI values from 4.63 to 10.35, aliphatic index from 19.78 to 65.4, instability index from 6.91 to 63.52, protein lengths from 100 to 661 aa, and the grand average of hydropathy (GRAVY) values from 0.574 to 1.04. The GRAVY values showed that almost all the LEA2s are hydrophobic proteins. The hydrophobic nature of proteins is integral to their biological functions, as it allows the proteins to fold spontaneously into the complex three-dimensional structures that are significant for biological activity (Gosline ). Hydrophobicity drives the removal of nonpolar amino acids from the solvent and their burial in the core of the protein. This attribute is common among the aquaporins (AQPs), the water channel proteins, which are highly hydrophobic and known to have a functional role in water and salt stress tolerance in plants (Sreedharan ). In the subcellular localization prediction, 10 different sites were detected, with the majority of the LEA2 proteins (73 genes) predicted to localize within the chloroplast. In further analysis with TargetP and Pprowler, more than 70% of the genes were found to be associated with the secretory pathway and the chloroplast (Table 2 and Supplementary Table S4). The high number of these genes in the chloroplast is consistent with a significant role in drought stress, since the chloroplast plays a central role in the plant response to stress (Gläßer ). The connections between different stress responses and organellar signaling pathways, such as reactive oxygen species, emanate from the chloroplast (Kmiecik ).
Chloroplasts, being semi-autonomous organelles, provide a complex communication channel that allows effective coordination of gene expression, since most plastid-localized proteins are nuclear-encoded, thus ensuring effective functioning of overall cellular metabolism (Pfannschmidt ). In addition to photosynthesis, numerous vital cellular processes such as aromatic amino acid, fatty acid and carotenoid biosynthesis and the sulfate assimilation pathway are harbored within the chloroplast, and these processes are known to be key factors in the plant response to stress. The chloroplast acts as a sensor of abiotic stress and initiates different cell functions in response to the stress factor, enhancing the adaptability of the plant to environmental stress (Mittler 2006). Substantial numbers of LEA2 genes were also found to be localized within the cytoplasm, nucleus and mitochondrion, with 24, 20 and 16 genes respectively, which provides further evidence of the importance of these genes in enhancing drought tolerance in cotton. The following cell structures contained low numbers of LEA2 genes: endoplasmic reticulum (E.R) with 3, extracellular structures with 5, Golgi body with 6, plasma membrane with 4 and vacuole with 3 genes. The subcellular localization results agree with previous findings in sweet orange, in which the highest proportions of LEA2 genes were localized within the cytoplasm and chloroplast, accounting for 35.7% and 30.9% of the total LEA2 genes, while others were found to target the endoplasmic reticulum (E.R) and mitochondrion (Muniz Pedrosa ). Similarly, the abiotic stress related plasma membrane protein 3 (PMP3) genes, which encode small hydrophobic polypeptides with high sequence similarity and have been functionally characterized as responsive to salt, drought, cold and abscisic acid, have been found to localize to the nucleus, cytoplasm and cell membrane (Fu ).
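The GRAVY and aliphatic index values discussed above are simple functions of amino acid composition. The sketch below shows how they are conventionally computed, assuming the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index; the study obtained its values from an online ExPASy tool, so this is an illustration of the definitions rather than the authors' pipeline, and the test peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def residue_percent(seq, residue):
    """Mole percent of one residue in a protein sequence."""
    seq = seq.upper()
    return 100.0 * seq.count(residue) / len(seq)


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Ikai's aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    return (residue_percent(seq, "A")
            + 2.9 * residue_percent(seq, "V")
            + 3.9 * (residue_percent(seq, "I") + residue_percent(seq, "L")))


# Hypothetical short peptide, for illustration only.
peptide = "MLLAVKIL"
print(round(gravy(peptide), 3), round(aliphatic_index(peptide), 1))
```

A positive GRAVY value indicates a hydrophobic protein on this scale, and a negative value a hydrophilic one.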
Table 2

Physiochemical properties of LEA2 gene in upland cotton, G. hirsutum, subcellular location prediction and chromosome position

Gene IdMolecular FormulaAtoms NumbersInstability IndexAliphatic IndexGravyLength (Aa)PlMw (Aa)Chr NoSub Cellular Localization
WolfpsortTargetPProwler
CotAD_ 00275C2550H4266N832O1061S220892949.2424.580.8242741029834.66Dt09_chr23chloSsp
CotAD_ 00465C2809H4694N922O1183S186979438.6827.50.7043041033689.28Dt09_chr23chloCsp
CotAD_ 00799C3119H5215N1021O1297S1961084842.1431.890.776337938982.02scaffold26.1golgCsp
CotAD_ 00808C2114H3538N688O893S149738238.4925.510.6982261026011.22scaffold26.1cyto_sp
CotAD_ 01033C1868H3118N616O781S132651537.5727.690.749202922587.14Dt10_ch20chloSsp
CotAD_ 01298C1996H3326N664O833S142696135.2927.790.7542181024021.4Dt10_ch20cyto_other
CotAD_ 01321C2138H3550N724O880S189748148.525.620.8552381026020.28Dt10_ch20cytoSsp
CotAD_ 01385C2253H3753N751O944S189789053.422.960.757247727497.03Dt09_chr23cytoSsp
CotAD_ 01700C2382H3972N790O976S223834352.1225.760.914260928399.83Dt09_chr23cyto_sp
CotAD_ 02652C2022H3396N646O835S184708363.5225.470.8982121023764.43Dt09_chr23mitoSsp
CotAD_ 03037C2465H4132N796O1011S239864354.6925.190.943262928472.57Dt05_chr19cytoSsp
CotAD_ 03649C2938H4904N970O1220S2321026442.0227.070.8113201035345.6At_chr09cytoSsp
CotAD_ 03784C1076H1792N358O453S53373226.7131.740.644116713537.66Dt07_chr16chlo_other
CotAD_ 05724C1834H3065N601O771S128639946.8526.710.7191971022442.51At_chr09chlo_sp
CotAD_ 05725C2229H3732N724O935S169778950.425.760.7552381027552.78At_chr09nucl_sp
CotAD_ 06037C1893H3159N625O802S134661345.2224.240.6682051022125.81Dt13_ch18chlo_sp
CotAD_ 07087C1926H3222N628O819S106670143.928.430.6222061022853.64At_chr02plas_other
CotAD_ 08181C1864H3110N616O780S135650543.7126.870.745202922460.02Dt09_chr23cytoSsp
CotAD_ 08350C1894H3182N604O790S142661249.0627.910.802198522266.98scaffold190.1chlo_sp
CotAD_ 08837C2300H3853N745O961S220807955.2920.730.825245926376.34scaffold280.1golgSsp
CotAD_ 09578C2381H3970N790O977S223834150.4625.380.905260928406.84At_chr09chlo_sp
CotAD_ 09685C2306H3847N763O928S220806461.1729.71.0242511027153.8Dt09_chr23chlo_sp
CotAD_ 09732C2198H3688N706O923S164767947.0726.140.755232925906.5Dt09_chr23chloCsp
CotAD_ 10376C2568H4293N841O1038S271901160.0525.861.0332771030152.74Dt01_chr15chloSsp
CotAD_ 11658C2438H4075N799O1007S165848434.8631.990.8232631029835.19Dt08_chr24cyto_sp
CotAD_ 11875C1627H2717N535O682S94565533.7130.960.706175720070.28scaffold42.1chloSsp
CotAD_ 11876C1942H3245N637O798S180680250.0125.510.9042091023563.32scaffold42.1chlo_other
CotAD_ 11878C2121H3552N688O886S165741255.9726.240.7852261025841.73scaffold42.1chloSsp
CotAD_ 11879C1215H2031N397O519S61422341.3228.350.5741291015037.05scaffold42.1chloSsp
CotAD_ 12375C1765H2948N580O727S157617761.325.950.879190921328.78At_chr09chlo_other
CotAD_ 13115C1791H2994N586O760S122625339.0724.830.659192920770.35Dt08_chr24extr_sp
CotAD_ 13584C2310H3858N760O957S190807546.5926.650.8322501028048.83Dt06_chr25golgSsp
CotAD_ 13827C3342H5592N1090O1370S2991169355.0727.480.922360840945.87Dt12_ch26E.R._sp
CotAD_ 14147C2022H3396N646O838S180708261.9925.160.8712121023855.54At_chr07mitoSsp
CotAD_ 15892C2861H4789N931O1209S186997640.4727.230.688307834741.21Dt12_ch26chlo_sp
CotAD_ 16731C2370H3954N784O980S202829047.4626.090.8452581028519.44Dt09_chr23chloSsp
CotAD_ 17044C1387H2309N463O581S100484043.0226.250.725151516422.87At_chr07cyto_other
CotAD_ 17045C2199H3654N742O907S185768748.7126.490.8382191023930.18At_chr07cyto_other
CotAD_ 17062C2047H3416N676O852S170716150.5625.070.8022441027393.16At_chr07chloSsp
CotAD_ 17101C1958H3277N637O811S177686053.5724.410.86222925294.09At_chr06mito_sp
CotAD_ 17102C2435H4063N805O1008S182849341.9829.020.8172091023661.48At_chr06nucl_sp
CotAD_ 17103C2213H3709N715O930S187775461.4622.720.767265730299.29At_chr06mito_sp
CotAD_ 17649C1849H3077N619O759S131643537.2231.770.843235926726.9At_chr10chloSsp
CotAD_ 18210C1850H3079N619O757S134643935.5732.090.8652031022501.33scaffold377.1cyto_other
CotAD_ 18233C1630H2729N529O675S118568140.5929.980.8222031022406.26scaffold377.1chlo_other
CotAD_ 18546C2571H4299N841O1038S270901958.8126.341.041731019695.85Dt09_chr23chlo_sp
CotAD_ 18729C1990H3320N658O828S137693343.3229.420.7722771030227.97scaffold336.1chloSsp
CotAD_ 19078C1684H2807N559O714S128589245.2522.260.6692161024007.7At_chr12nuclSsp
CotAD_ 19107C2766H4629N901O1165S184964542.1627.590.71183920031.24At_chr12chlo_other
CotAD_ 19205C941H1570N310O394S56327135.6530.190.703297733395.7At_chr12chlo_sp
CotAD_ 19213C1704H2853N553O707S109592645.832.120.7931001011538.35At_chr10chlo_sp
CotAD_ 19214C2114H3541N685O887S125735235.7630.890.719181920628.72At_chr10nuclCother
CotAD_ 19375C2310H3858N760O958S187807346.3226.780.823225925956.2Dt11_ch21golgSsp
CotAD_ 20020C1807H3029N583O761S123630336.8427.190.7172501027947.68At_chr06mitoSsp
CotAD_ 20308C2201H3658N742O909S184769446.2926.350.831911021054.44Dt06_chr25chlo_sp
CotAD_ 21731C2426H4054N796O986S230849260.2327.710.9752441027381.21Dt05_chr19nuclSsp
CotAD_ 21924C1845H3069N619O756S138642738.3130.960.862621028411.4Dt11_ch21nuclSsp
CotAD_ 23646C2458H4115N799O1036S200860853.522.840.7382041021921.93Dt07_chr16nucl_other
CotAD_ 24019C1624H2711N535O680S94564433.7131.140.7112031022391.06Dt06_chr25mitoSsp
CotAD_ 24497C1941H3243N637O796S181679850.125.830.916263929247.79Dt10_ch20chloSsp
CotAD_ 24499C2118H3546N688O883S170740556.2725.950.801175820026.25scaffold238.1chlo_sp
CotAD_ 25271C2240H3751N727O937S188784348.67240.792091023559.33scaffold238.1nuclSsp
CotAD_ 26038C1695H2826N562O718S127592845.5322.860.673226925852.71scaffold238.1chlo_sp
CotAD_ 26981C1423H2384N460O593S106496651.3427.730.792741029936.66At_chr09chloCsp
CotAD_ 27453C2034H3390N676O861S160712144.8221.960.6862391026994.13scaffold477.1mito_sp
CotAD_ 27789C2367H3951N781O998S201829853.321.310.731184920135.39scaffold699.1E.R._sp
CotAD_ 28249C2260H3788N730O947S140786533.9930.490.736150916764.6At_chr09nucl_sp
CotAD_ 28252C2177H3646N706O916S180762548.7722.870.752222924982.77At_chr07mitoSsp
CotAD_ 28872C1387H2306N466O578S109484648.9125.430.764257926949.97Dt03_chr17nucl_sp
CotAD_ 29279C1875H3141N607O784S137654447.6727.440.7693051034588.47Dt13_ch18chlo_other
CotAD_ 31344C2277H3795N757O932S181794242.4730.20.887101611711.01scaffold1346.1chloSsp
CotAD_ 31535C2944H4916N970O1223S2311028441.327.170.809240827649.86At_chr05vacuSsp
CotAD_ 31536C2047H3416N676O854S171716452.0224.330.789210923875.63scaffold1346.1plasSsp
CotAD_ 31537C1956H3273N637O809S177685254.1324.720.8682541027558.52scaffold1841.1nucl_sp
CotAD_ 31780C2649H4422N874O1100S195924040.0128.670.7993101034525.38Dt08_chr24chlo_sp
CotAD_ 31782C1944H3258N628O812S139678146.1428.270.774210823638.39Dt09_chr23chloSsp
CotAD_ 31860C4139H6916N1360O1727S3381448044.8925.260.7952061022839.69scaffold257.1cyto_sp
CotAD_ 31906C1914H3198N628O804S148669247.4924.60.7392321026256.38scaffold769.1cytoCsp
CotAD_ 31936C2627H4393N859O1089S219918747.9326.370.838152516462.97Dt01_chr15mitoSsp
CotAD_ 32487C1940H3238N640O815S167680051.1921.790.7533051033718.76At_chr11mito_sp
CotAD_ 32645C1845H3066N622O771S148645242.7924.350.753199922785.41Dt06_chr25chloSsp
CotAD_ 32847C1752H2928N574O730S100608439.4932.870.7452491027707.74At_chr09extrSsp
CotAD_ 33143C3449H5767N1129O1433S2461202446.4729.550.83051034544.43Dt02_chr14chloSsp
CotAD_ 33144C1970H3298N640O818S163688954.1226.180.83240927655.92Dt05_chr19chlo_sp
CotAD_ 34476C2374H3959N787O982S206830847.9125.480.8443201035579.84Dt09_chr23cyto_sp
CotAD_ 34798C2925H4884N964O1214S2451023251.6925.780.826222925253.03Dt06_chr25nuclSsp
CotAD_ 35069C2296H3827N763O944S159798948.4932.190.842091023628.4Dt06_chr25chloSsp
CotAD_ 35091C2037H3411N661O855S133709742.1728.830.728288732755.52Dt06_chr25extr_sp
CotAD_ 35514C1704H2853N553O708S110592846.7831.580.785206623420.27Dt05_chr19mitoSsp
CotAD_ 36328C1970H3298N640O819S162688953.1326.020.821450549131.5scaffold821.1chloCother
CotAD_ 36446C1628H2725N529O673S119567444.2530.170.8332311024949.39Dt08_chr24chlo_other
CotAD_ 36583C2954H4936N970O1224S2341031840.3627.690.829206922761.2scaffold821.1chlo_sp
CotAD_ 37776C1843H3062N622O768S149644446.5724.840.769202922357.93Dt09_chr23chloSsp
CotAD_ 37888C2554H4274N832O1063S219894250.7724.70.8232831031410.18At_chr08chloSsp
CotAD_ 38978C2819H4711N925O1184S205984442.126.220.7342101022644.27Dt08_chr24nuclSsp
CotAD_ 39064C969H1623N313O399S81338556.0927.650.8742101023699.74Dt01_chr15chloSsp
CotAD_ 39719C1971H3300N640O818S160688954.826.80.83191620961.07Dt01_chr15nuclSsp
CotAD_ 40324C2364H3954N772O959S228827754.4927.790.9952041021780.76At_chr07plas_sp
CotAD_ 41569C2875H4808N940O1188S2441005550.5526.760.8622081022559.45At_chr13chlo_sp
CotAD_ 41571C1947H3252N640O803S171681359.7626.020.872701030627.54Dt09_chr23chlo_sp
CotAD_ 41925C1928H3226N628O816S110670846.1329.070.656188921941.4scaffold1231.1nucl_other
CotAD_ 42599C2794H4661N925O1169S206975543.3826.650.7523731043118.75scaffold1231.1cyto_other
CotAD_ 44357C2819H4711N925O1183S209984744.41260.743210923874.6scaffold1088.1cytoCsp
CotAD_ 45324C2259H3786N730O944S141786034.6931.040.7542561028431.93Dt11_ch21chloSsp
CotAD_ 46873C2117H3529N703O871S205742556.1423.680.8942591028603.52At_chr09vacuSsp
CotAD_ 47322C1862H3106N616O776S139649943.1427.20.7732201024666.72At_chr03chloSsp
CotAD_ 47454C1973H3304N640O818S176691153.2124.610.854661673583.12scaffold1851.1cyskSsp
CotAD_ 47495C1754H2923N583O719S178615755.0123.060.9223181035234.15Dt07_chr16chloSsp
CotAD_ 47749C1922H3208N634O818S131671342.7823.890.636251927769.63Dt07_chr16chloSsp
CotAD_ 48050C2571H4320N820O1053S198896250.0832.150.921217924968.87Dt10_ch20mito_sp
CotAD_ 48069C2356H3932N778O994S159821943.8326.420.6891811020577.73Dt10_ch20extrSsp
CotAD_ 48336C2036H3400N670O835S177711847.9627.690.9211923479.93Dt04_chr22nuclSsp
CotAD_ 48753C6218H10441N1993O2614S4482171447.8127.020.752210923676.69At_chr06mito_sp
CotAD_ 48769C1998H3351N643O829S165698656.4126.680.8433041033675.21At_chr09nucl_sp
CotAD_ 49818C2811H4698N922O1186S183980036.9627.390.691317535274.16scaffold2616.1cytoSsp
CotAD_ 53045C2922H4881N961O1224S1731016137.1730.970.72206822650.27Dt10_ch20cytoSsp
CotAD_ 53263C1938H3246N628O811S135675844.0928.270.7562511027168.81At_chr09chlo_other
CotAD_ 53981C2316H3867N763O933S219809861.729.831.021247727715.29scaffold3326.1mito_sp
CotAD_ 54337C2251H3749N751O943S189788354.6222.960.757152516453.02At_chr07chlo_sp
CotAD_ 55224C1390H2312N466O579S109485650.6125.650.7682101023769.83Dt03_chr17mitoSsp
CotAD_ 56356C1954H3266N640O822S101678333.9732.130.6771731019737.98At_chr09chlo_other
CotAD_ 56696C1963H3275N649O822S113682233.2931.220.712131023750.48Dt03_chr17nuclSsp
CotAD_ 58358C1600H2547N445O483S11508661.1965.42091023626.51Dt12_ch26chloSsp
CotAD_ 59405C1936H3233N637O793S189678854.4124.720.933201035457.72Dt05_chr19chlo_sp
CotAD_ 60279C2316H3879N751O968S220813453.8620.830.82247926619.63scaffold2414.1chloSsp
CotAD_ 60435C2292H3819N763O938S163797549.5932.720.8692511027952.81At_chr01chloSsp
CotAD_ 60617C1977H3312N640O820S177692654.1524.450.8552101023780.9Dt01_chr15mitoSsp
CotAD_ 61173C1964H3271N655O821S137684838.4927.720.7392151024043At_chr04chlo_other
CotAD_ 61391C1753H2921N583O718S179615454.123.060.928191620884.97Dt01_chr15chloSsp
CotAD_ 62996C2926H4886N964O1214S2451023551.5825.880.8283181035356.25At_chr01nuclSsp
CotAD_ 63174C3526H5909N1141O1460S2811231744.1328.180.853771041228.93scaffold3177.1E.R.Ssp
CotAD_ 64004C2020H3371N667O833S157704848.1129.020.8452191023825.02Dt07_chr16chlo_sp
CotAD_ 64120C2001H3336N664O837S143698133.3627.190.7432181024050.43At_chr12chloSother
CotAD_ 64346C1963H3284N640O817S168687252.9224.610.818210923572.5Dt06_chr25chlo_other
CotAD_ 64347C2142H3567N715O883S231753859.9819.780.901235926111.93Dt06_chr25plas_sp
CotAD_ 64657C2431H4064N796O990S225850659.0427.960.9612621028516.58At_chr11vacu_sp
CotAD_ 65119C1908H3186N628O800S147666942.9325.080.747206922733.19Dt08_chr24golgSsp
CotAD_ 65370C1019H1668N278O359S3332761.44493261036098.18scaffold3528.1chloSsp
CotAD_ 66245C4148H6934N1360O1732S3371451145.3825.260.792450548836.2Dt08_chr24chloCsp
CotAD_ 66538C1991H3337N643O823S168696259.1626.990.8662111023424.96At_chr04chloSsp
CotAD_ 66551C2086H3485N685O872S114724218.6632.80.72225925226.24scaffold3976.1cyto_sp
CotAD_ 66774C1993H3326N658O830S137694442.9329.270.7682161024090.84Dt08_chr24chloSsp
CotAD_ 66775C2066H3445N685O872S139720732.1226.210.6822251025078.29Dt08_chr24chlo_other
CotAD_ 67823C2035H3392N676O841S191713553.5323.440.8612221023928.26At_chr08cytoSsp
CotAD_ 68063C2031H3396N664O856S167711450.8622.360.733218923245.72At_chr03cyto_sp
CotAD_ 68189C1936H3242N628O808S135674944.7328.910.772206722579.21At_chr10chloSsp
CotAD_ 69737C1966H3281N649O821S117683432.8331.380.7322131023867.69scaffold2095.1chloSsp
CotAD_ 69738C1956H3270N640O824S101679132.3131.820.6692101023893.04scaffold2095.1chloSsp
CotAD_ 70003C1807H3029N583O761S12063006.9127.710.7131911020942.44At_chr12cyto_sp
CotAD_ 70190C3927H6552N1300O1658S2171365430.6630.050.661430548185.02scaffold4817.1cyto_other
CotAD_ 70192C1226H2050N400O509S77426234.4531.910.776130514420.49scaffold4817.1nuclCother
CotAD_ 71431C1743H2916N568O719S152609846.4626.330.8741861020579.98Dt05_chr19extrCsp
CotAD_ 72458C1788H2988N586O760S119624139.9624.830.6441921020613.31scaffold3083.1cysk_sp
CotAD_ 72913C2901H4845N955O1214S1731008838.3731.060.726315535071.89scaffold4398.1cysk_other
CotAD_ 73966C2955H4938N970O1228S2301032141.0627.380.8093201035484.73At_chr12chloSsp
CotAD_ 74713C1998H3351N643O829S165698656.4126.680.843211923479.93Dt08_chr24golgSsp
CotAD_ 76129C1937H3235N637O793S190679254.4124.720.9352091023626.51At_chr12chlo_sp
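The localization tallies discussed around Table 2 can be reproduced from the table rows themselves. The following is a minimal sketch, assuming each flattened row ends with the three prediction fields in the header order (WoLF PSORT code, a single-character TargetP code, then the Prowler code). The regular expression and the function name `parse_localization` are illustrative assumptions, not part of the original analysis; the sample rows are copied from Table 2.

```python
import re
from collections import Counter

# Assumed layout of the row tail: <WoLF PSORT code><TargetP code><Prowler code>
TAIL = re.compile(
    r"(chlo|cyto|nucl|mito|golg|plas|extr|vacu|cysk|E\.R\.)([A-Z_])(sp|other)$"
)

# A few rows copied verbatim from Table 2.
rows = [
    "CotAD_ 00275C2550H4266N832O1061S220892949.2424.580.8242741029834.66Dt09_chr23chloSsp",
    "CotAD_ 00465C2809H4694N922O1183S186979438.6827.50.7043041033689.28Dt09_chr23chloCsp",
    "CotAD_ 00799C3119H5215N1021O1297S1961084842.1431.890.776337938982.02scaffold26.1golgCsp",
    "CotAD_ 00808C2114H3538N688O893S149738238.4925.510.6982261026011.22scaffold26.1cyto_sp",
    "CotAD_ 13827C3342H5592N1090O1370S2991169355.0727.480.922360840945.87Dt12_ch26E.R._sp",
    "CotAD_ 19214C2114H3541N685O887S125735235.7630.890.719181920628.72At_chr10nuclCother",
]

def parse_localization(row):
    """Return (wolfpsort, targetp, prowler) codes taken from the end of a row."""
    match = TAIL.search(row.strip())
    if match is None:
        raise ValueError(f"no localization fields found in: {row!r}")
    return match.groups()

# Tally the WoLF PSORT compartment codes across the sample rows.
wolfpsort_counts = Counter(parse_localization(row)[0] for row in rows)
for compartment, n in wolfpsort_counts.most_common():
    print(compartment, n)
```

Running the same tally over all rows of Table 2 would give the per-compartment gene counts; only six rows are used here to keep the sketch short.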
The cellular compartmentalization of stress-related genes is fundamental to their functional role (Osman ). The LEA2 proteins present in the chloroplast could be responsible for maintaining osmotic balance and suppressing reactive oxygen species (ROS) production in the guard cells (Wang ), while those present in the membrane could be responsible for protecting membrane integrity (Guo ). In addition, the LEA2 proteins localized in channeling or transporter organelles, such as the endoplasmic reticulum, are likely to aid in ion sequestration (Porcel ). Based on various findings, the LEA protein families are known to have a universal structure, with varying proportions of the various amino acids (Hong-bo ). To verify the hydrophobic property that distinguishes the LEA2 proteins, we examined their amino acid composition and found that the LEA2s are rich in non-polar aliphatic amino acid residues: leucine had the highest proportion (9.2%), followed by valine (8.2%), isoleucine (6.3%), alanine (5.9%) and proline (5.7%). The high proportions of non-polar residues indicated that the LEA2 proteins are mainly embedded within the membrane; in water-soluble proteins, non-polar amino acids are found in the center while polar amino acids are found at the surface (Petukhov ). 
The second most abundant group was the polar, non-charged residues: serine (8.9%), threonine (6.4%), cysteine (1.9%), methionine (2.2%), asparagine (5.0%) and glutamine (3.4%). Polar residues have been found to be predominant among stress-related proteins such as the heat shock proteins (HSPs) (Wang ). The presence of the polar residues therefore indicated that the LEA2 proteins could be responsible for coating cellular macromolecules with a cohesive water layer, in turn protecting the membrane and membrane-bound multiprotein complexes from unfolding and aggregation under drought stress.
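The residue proportions reported above come from a straightforward composition count. As a minimal sketch, the following computes per-residue and per-class percentages for a protein sequence, using the two residue groupings exactly as the text lists them. The function names and the toy sequence are hypothetical and are not taken from the cotton LEA2 data.

```python
from collections import Counter

# Residue classes as grouped in the text above.
NONPOLAR_ALIPHATIC = set("LVIAP")   # leucine, valine, isoleucine, alanine, proline
POLAR_UNCHARGED = set("STCMNQ")     # serine, threonine, cysteine, methionine, asparagine, glutamine

def residue_percentages(seq):
    """Percentage of each amino acid in a one-letter protein sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}

def class_percentage(seq, residues):
    """Percentage of the sequence that belongs to the given residue class."""
    seq = seq.upper()
    return 100.0 * sum(1 for aa in seq if aa in residues) / len(seq)

# Hypothetical toy sequence, for illustration only.
toy = "MLLVAISTPNQL"
print(residue_percentages(toy)["L"])                 # share of leucine
print(class_percentage(toy, NONPOLAR_ALIPHATIC))     # non-polar aliphatic share
print(class_percentage(toy, POLAR_UNCHARGED))        # polar, non-charged share
```

Averaging `residue_percentages` over all proteins of a family would give family-level figures comparable to those quoted in the text.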

Genomic organization and motif detection of LEA2 proteins in cotton

The exon-intron structure of all 157 LEA2 genes was analyzed using the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/). A large percentage of the LEA2 genes and their exons were highly conserved within each group (Supplementary Figure S1). Most of the LEA2 genes were intronless: 114 genes, accounting for about 73% of the LEA2s, lacked introns. The existence of introns in a genome is argued to impose an enormous burden on the host (Wahl ). The burden arises because introns require a spliceosome, which is among the largest molecular complexes in the cell, comprising 5 small nuclear RNAs and more than 150 proteins (Wahl ). Intron transcription is costly in terms of time and energy (Lane and Martin 2010). Because plants are exposed to various stresses, the energy demand for survival is relatively high, so gene action within the plant has to function under a conserved energy demand threshold (Timperio ). A plant under stress has to survive the effects of excessive production of reactive oxygen species (ROS) and malondialdehyde (MDA) and of low levels of peroxidase (POD) activity; therefore most of the genes responsible for stress tolerance either lack introns or possess significantly reduced numbers of introns within their gene structure (Jeffares ). Because transcription of intron-laden genes requires considerable time and energy, introns are hypothesized to have a deleterious effect on gene expression (Calderwood ). Conserved motifs in the 157 LEA2 proteins were identified with the online tool MEME (http://meme-suite.org/) (Supplementary Figure S1). The motifs identified in the LEA2 proteins were between 14 and 112 amino acids long; similar results, with conserved motif lengths between 11 and 164 amino acids, were obtained for the cotton MYB proteins (He ). 
The similarity in motif lengths to those of the MYBs supported a possible role of the LEA2s in the response to water stress, which includes the regulation of stomatal movement, the control of suberin and cuticular wax synthesis and the regulation of flower development (He ). Most of the LEA2 proteins had distinctive motifs, which are valuable for their identification. The common motifs identified for the cotton LEA proteins were motif 1 (FFVLFSVFSLILWGASRPQKPKITMKSIKFENFKIQAGSDFSGVPTDMITMNSTVKMTYRNTATFFGVHVTSTPLDLSYSQJTIASG), motif 2 (WLVFRPKKPKFSLQSVTVYAL), motif 3 (NFQVTVTARNPNKRIGIYYD), motif 5 (TVKNPNFGSFKYDNSTVSVNYRGKVVGEA) and motif 14 (RRRSCCCCCCLWTLJ) (Supplementary Figure S2). The number of conserved motifs in each LEA2 protein varied between 1 and 7. The majority of close members in the phylogenetic tree exhibited common motif compositions, which suggested functional similarity within the same subgroup. The alignment of the LEA2 proteins showed various segments, namely the Y-segment, K-segment and S-segment (Supplementary Figure S3), which have been previously described in dehydrins (Hanin ). Other unique segments identified were the E, R and D segments. The K-segment has been found to form an amphipathic α-helix (Monera ). The K-segment assumes an α-helical structure identical to the class A2 amphipathic α-helices found mainly in apolipoproteins, which facilitate the transport of water-insoluble lipids in plasma, and in α-synucleins (Rorat 2006). A change in the conformation of a protein in turn leads to functional change (Dyson and Wright 2005). Drought stress alters the ambient microenvironment of proteins, leading to conformational and functional changes (Mahdieh ). The amphipathic α-helices have the ability to interact with the dehydrated surfaces of various other proteins and biomembranes (Cornell and Taneva 2006). 
The binding of dehydrins to the dehydrated surfaces of other proteins enhances the formation of amphipathic α-helices, which protect those proteins from further loss of water. The presence of the K-segment in LEA2 pointed to a significant role for these proteins in plants during drought stress. It has been suggested that the protective role of the LEA proteins is due to their ability to form α-helices, which enables them to interact with other proteins and/or biomembranes (Koag 2003). Kovacs et al. (Kovacs ) reported the protective activities of two dehydrin proteins isolated from A. thaliana, early response to dehydration 10 (ERD10) and early response to dehydration 14 (ERD14), against thermal inactivation of alcohol dehydrogenase and thermal aggregation of citrate synthase.
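The intron statistics in this section follow directly from the exon annotation of each gene. The sketch below shows, under simple assumptions, how intron counts and the intronless fraction could be derived from exon coordinates, and checks the arithmetic for the reported 114 intronless genes out of 157. The gene names and coordinates are hypothetical; only the two totals come from the text.

```python
def intron_count(exons):
    """Number of introns for a gene given its list of (start, end) exons.

    A gene with n exons has n - 1 introns; a single-exon gene is intronless.
    """
    return max(len(exons) - 1, 0)

def intronless_summary(gene_exons):
    """Return (number of intronless genes, fraction of all genes)."""
    intronless = sum(1 for exons in gene_exons.values() if intron_count(exons) == 0)
    return intronless, intronless / len(gene_exons)

# Hypothetical annotation: gene -> list of exon (start, end) coordinates.
example = {
    "geneA": [(100, 900)],                  # single exon, intronless
    "geneB": [(50, 300), (420, 800)],       # one intron
    "geneC": [(10, 640)],                   # single exon, intronless
}
print(intronless_summary(example))

# Arithmetic for the figures reported in the text: 114 of 157 genes.
reported_percent = 100.0 * 114 / 157
print(round(reported_percent, 1))
```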

Chromosomal location and duplication events of cotton LEA2 genes

A gene’s location on a chromosome plays a significant role in shaping how an organism’s traits vary and evolve (Lazazzera and Hughes 2015). Chromosomes hold thousands of genes, with some situated in the middle of their linear structure and others at either end (Bickmore and Van Steensel 2013). To understand the distribution and mapping positions of the LEA2 genes, each gene was mapped onto the A, D and AD cotton chromosomes by homology search against the full-length G. arboreum (A genome), G. raimondii (D genome) and G. hirsutum (AD genome) assemblies. The LEA2 genes mapped to all 26 chromosomes in G. hirsutum, 13 chromosomes in G. arboreum and 12 chromosomes in G. raimondii. In the diploid cottons, G. arboreum and G. raimondii, the gene distribution pattern was almost identical to that of the tetraploid cotton (Supplementary Figure S4). On chromosome 9 of G. arboreum and its homologous chromosome in G. raimondii, a significant level of gene loss was observed: chr09 of G. arboreum contained only a single gene, compared to 10 genes in the homologous chr09. More interestingly, there was total gene loss on chr13 of G. raimondii. The lack of LEA2 genes on chr13 of G. raimondii can only be accounted for by gene loss or gene deletion, since LEA genes are otherwise found on every chromosome. The occurrence of LEA2 genes across the chromosomes indicated that the genes are widely distributed over the entire cotton genome. However, the density of these loci varied across the 26 chromosomes of upland cotton and the 13 chromosomes of the A and D diploid cottons. The largest numbers of genes were located on chromosomes At09 (chr09) and Dt09 (chr23), with 12 and 14 genes respectively, followed by Dt08 (chr24) with 10 genes, Dt06 (chr25) with 9 genes, and At07 and At12 with 12 genes each. 
The lowest numbers of loci ranged from 1 to 5 genes, with chromosomes At02, At05, At09, Dt02 (chr14) and Dt04 (chr22) each having a single gene (Supplementary Figure S5). A total of 39 genes could not be mapped to chromosomes and were grouped as scaffold. The distribution of the genes on the chromosomes was uneven. In general, the central sections of the chromosomes carried fewer LEA2 genes, and relatively high densities of upland cotton LEA2s were observed in the top and bottom sections of most chromosomes. A similar clustering pattern was observed for the GrMYB genes, most of which were clumped in either the upper or lower regions of the chromosomes (He ). As noted above, a gene’s chromosomal location shapes how an organism’s traits vary and evolve (Sexton and Cavalli 2015). It has been found that evolution is less a function of what a physical trait is and more a function of where the genes that affect that trait are located in the genome (Sexton and Cavalli 2015). The distribution of this subset of LEA genes across the whole cotton genome pointed to a significant role for these genes within the plant. The main cause of gene expansion in a genome is either segmental or tandem duplication (Cannon ). Two or more genes located on the same chromosome, one following the other, indicate a tandem duplication event, while gene duplication across different chromosomes is designated a segmental duplication event (Yu ). In the present study, the clusters formed by the LEA2 genes explained the mechanism behind their expansion in cotton. Most of the duplicated gene pairs were between G. hirsutum and its ancestors, G. arboreum (53) and G. raimondii (11) (Table 3). The tetraploid cotton, G. hirsutum, evolved through whole genome duplication, which resulted in polyploid cotton. 
The Ka/Ks values ranged from 0 to 2.17333, with an average of 0.4238. The majority of the gene pairs had Ka/Ks values of less than 1, which indicated that the LEA2 genes have been influenced extensively by purifying selection during their evolution.
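The tandem versus segmental distinction described above can be expressed as a simple rule on gene positions. The following is a minimal sketch, assuming each gene is annotated with its chromosome and its rank (order) along that chromosome. The gene names, positions and the `max_rank_gap` threshold are hypothetical choices for illustration; the third category is not defined in the text and is included only so that every pair receives a label.

```python
def classify_duplication(gene_a, gene_b, location, max_rank_gap=1):
    """Label a duplicated gene pair using the rule described in the text.

    location maps gene -> (chromosome, rank of the gene along the chromosome).
    """
    chrom_a, rank_a = location[gene_a]
    chrom_b, rank_b = location[gene_b]
    if chrom_a != chrom_b:
        return "segmental"                      # copies on different chromosomes
    if abs(rank_a - rank_b) <= max_rank_gap:
        return "tandem"                         # neighbours on the same chromosome
    return "same chromosome, not adjacent"      # outside the two cases in the text

# Hypothetical gene positions, for illustration only.
location = {
    "geneA": ("chr01", 10),
    "geneB": ("chr01", 11),
    "geneC": ("chr05", 3),
    "geneD": ("chr01", 40),
}
print(classify_duplication("geneA", "geneB", location))
print(classify_duplication("geneA", "geneC", location))
print(classify_duplication("geneA", "geneD", location))
```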
Table 3

Gene duplication, synonymous (Ks), nonsynonymous (Ka) and Ka/Ks values calculated for paralogous LEA2 gene pairs in cotton genome

Gene type | Paralogous gene pairs (gene 1, gene 2) | Length (aa) | Ka | Ks | Ka/Ks | Negative/purifying selection | P-value (Fisher)
LEA2CotAD_59405CotAD_7612962700.006540YES0
LEA2CotAD_20020Cotton_A_0184575000.005680YES0
LEA2CotAD_19078Cotton_A_2317264800.006720YES0
LEA2CotAD_08181Cotton_A_2754360600.006970YES0
LEA2CotAD_48976Cotton_A_2977966000.006420YES0
LEA2CotAD_35514Gorai.010G176400.154300.008220YES0
LEA2CotAD_31536Cotton_A_134706270.002110.033730.06246YES0.00360292
LEA2CotAD_37888Cotton_A_086639600.043780.558390.07841YES1.73E-37
LEA2CotAD_03649CotAD_378889600.045220.541420.08352YES9.32E-36
LEA2CotAD_03649Cotton_A_144789600.045920.529720.08668YES3.29E-35
LEA2CotAD_03649CotAD_739669600.045970.5270.08723YES4.70E-35
LEA2CotAD_17102CotAD_315366270.004220.033650.12547YES0.0107355
LEA2CotAD_44941Gorai.005G203000.17200.001750.013680.12779YES0.0998325
LEA2CotAD_08181CotAD_465506060.006540.049750.1315YES0.00250188
LEA2CotAD_17101Cotton_A_134696660.001950.013180.14805YES0.121749
LEA2CotAD_09578Cotton_A_021967800.09030.599440.15064YES7.07E-24
LEA2CotAD_35069CotAD_629969540.005510.036430.15116YES0.0017334
LEA2CotAD_59405Cotton_A_403636270.006360.040160.15842YES0.00848415
LEA2CotAD_17045Cotton_A_143546570.002010.012620.15958YES0.13409
LEA2CotAD_09685CotAD_539817530.007110.043860.16211YES0.00252472
LEA2CotAD_01700Cotton_A_021967800.099920.589860.16939YES8.68E-22
LEA2CotAD_17062CotAD_217317320.007190.041610.17276YES0.00506705
LEA2CotAD_35069Cotton_A_243569540.005510.031780.17329YES0.00508945
LEA2CotAD_10376Cotton_A_056258310.006450.034440.18723YES0.00723285
LEA2CotAD_21924Cotton_A_189197860.010280.052190.19697YES0.00026749
LEA2CotAD_31535Gorai.006G150200.16660.003910.019810.19743YES0.082505
LEA2CotAD_25271Cotton_A_146764050.006470.032340.20023YES0.085476
LEA2CotAD_09685Cotton_A_054447530.00890.043870.20282YES0.00516244
LEA2CotAD_46888Cotton_A_095965730.009220.04530.20351YES0.0147038
LEA2CotAD_08181Gorai.009G305100.16060.004350.021030.20672YES0.090366
LEA2CotAD_19078CotAD_667746480.010090.048420.20844YES0.00834864
LEA2CotAD_32487Cotton_A_132406300.004250.019170.22185YES0.103356
LEA2CotAD_23118CotAD_7406112150.016110.068820.23405YES5.00E-05
LEA2CotAD_36328CotAD_643466300.017770.075640.23489YES0.000973496
LEA2CotAD_32847CotAD_390646120.011060.04610.23994YES0.0153075
LEA2CotAD_46873CotAD_606176300.008350.034520.24185YES0.0372109
LEA2CotAD_46873Cotton_A_096156300.008350.034520.24185YES0.0372109
LEA2CotAD_18546CotAD_377765190.010160.041950.24212YES0.0375368
LEA2CotAD_19375Cotton_A_064356750.013450.055410.24268YES0.00759106
LEA2CotAD_46888CotAD_613915730.013870.053130.26111YES0.0175133
LEA2CotAD_23118Cotton_A_3811712150.016110.060770.26514YES0.000321992
LEA2CotAD_19214Cotton_A_308895430.002370.00830.28598YES0.347253
LEA2CotAD_31535Cotton_A_134696660.013770.047180.2919YES0.0234164
LEA2CotAD_21924CotAD_646577860.013730.046930.29247YES0.0120925
LEA2CotAD_31140Cotton_A_159987470.001740.00580.30099YES0.356655
LEA2CotAD_30219Cotton_A_324955970.011050.036260.30482YES0.0618481
LEA2CotAD_46873Gorai.001G124400.16300.002080.006740.30909YES0.361889
LEA2CotAD_46888Gorai.001G122700.15730.00460.01480.31039YES0.238274
LEA2CotAD_28252CotAD_532634920.013560.042850.31656YES0.069282
LEA2CotAD_14147Cotton_A_023706360.004160.013120.3169YES0.244174
LEA2CotAD_23646Cotton_A_273006090.042490.131350.32348YES0.000630664
LEA2CotAD_09578Cotton_A_070367800.003420.010370.33004YES0.256013
LEA2CotAD_17045CotAD_640046570.022470.065230.34445YES0.0157104
LEA2CotAD_37888CotAD_739669600.015280.04420.34576YES0.0157353
LEA2CotAD_37888Cotton_A_144789600.012470.035280.3534YES0.0321315
LEA2CotAD_23646Gorai.006G199800.16090.042490.114110.37237YES0.00460089
LEA2CotAD_17062Cotton_A_143707320.00990.026480.37402YES0.0618224
LEA2CotAD_02652Cotton_A_023706360.012560.033110.37934YES0.101339
LEA2CotAD_19214CotAD_355145430.009550.025090.38065YES0.19023
LEA2CotAD_21731Cotton_A_143707320.008990.023540.38192YES0.138838
LEA2CotAD_13584Cotton_A_018457500.008780.022940.38264YES0.139381
LEA2CotAD_17101CotAD_315356660.015760.040260.3915YES0.077316
LEA2CotAD_35091CotAD_604357530.030160.076890.39221YES0.0144267
LEA2CotAD_20308CotAD_700035730.009150.022910.39918YES0.206152
LEA2CotAD_50359CotAD_665386330.016770.040940.40958YES0.0891624
LEA2CotAD_01700Cotton_A_070367800.015510.037010.41916YES0.0752732
LEA2CotAD_02652CotAD_141476360.008350.019740.42281YES0.226532
LEA2CotAD_35513Cotton_A_308906510.021930.051020.42978YES0.0738291
LEA2CotAD_35514Cotton_A_308895430.007160.016590.43135YES0.312651
LEA2CotAD_28872Gorai.005G203000.17200.012330.027620.44641YES0.170613
LEA2CotAD_56699Cotton_A_385346390.020210.044930.44988YES0.106618
LEA2CotAD_01700CotAD_095787800.012020.026340.45642YES0.151105
LEA2CotAD_40972Cotton_A_296595910.966592.07090.46675YES0.00123143
LEA2CotAD_40972CotAD_389785910.960252.041930.47026YES0.00125135
LEA2CotAD_17101Gorai.006G150200.16660.019770.040180.49197YES0.209339
LEA2CotAD_50359Cotton_A_335486330.016780.033880.49528YES0.175709
LEA2CotAD_74713Cotton_A_335486330.016780.033880.49528YES0.175709
LEA2CotAD_03649CotAD_313449600.011030.022140.49798YES0.177084
LEA2CotAD_13584CotAD_200207500.008780.017110.51348YES0.287642
LEA2CotAD_13115Cotton_A_310595760.02070.03790.54625YES0.312514
LEA2CotAD_19214Gorai.010G176400.15430.009550.016640.57418YES0.403293
LEA2CotAD_20308Cotton_A_176255730.013750.022960.59881YES0.347235
LEA2CotAD_25271CotAD_487694050.006470.010630.6094YES0.539117
LEA2CotAD_12681Cotton_A_082124320.031210.049280.63319YES0.35887
LEA2CotAD_19623CotAD_369992820.032960.047710.69081YES0.631725
LEA2CotAD_23646Cotton_A_272826090.025870.037380.69204YES0.542393
LEA2Cotton_A_13471CotAD_17103180.942.323971.113230.77822YES837
LEA2CotAD_53438CotAD_681896180.023410.028980.80786YES0.519399
LEA2CotAD_56696Cotton_A_385356300.018380.022690.80979YES0.670475
LEA2CotAD_44941Cotton_A_179867200.012330.013690.90056YES0.874489
LEA2CotAD_22539Cotton_A_251954081.232651.241120.99317YES1
LEA2CotAD_28872CotAD_449417200.01410.013691.03042NO0.900519
LEA2CotAD_17103Cotton_A_134718372.587122.323971.11323NO0.778217
LEA2CotAD_13827Cotton_A_1864511042.120921.896531.11832NO0.642563
LEA2CotAD_10044Cotton_A_0947319020.002740.002281.20458NO0.731531
LEA2Cotton_A_31083CotAD_350699392.27481.838581.23726NO0.447623
LEA2CotAD_30219Gorai.006G104100.15970.008840.007071.25015NO0.743557
LEA2CotAD_03649Cotton_A_086639600.005490.004371.25606NO0.744588
LEA2CotAD_11658Cotton_A_404997890.023090.017511.3191NO0.985982
LEA2CotAD_12375CotAD_424085972.420621.682881.43838NO0.288342
LEA2CotAD_35091Cotton_A_243716993.503091.611862.17333NO0.036477
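The selection call in Table 3 follows from the Ka/Ks ratio of each pair. As a minimal sketch, the function below recomputes the ratio and applies the threshold used in the text (a ratio below 1 indicates purifying selection); labelling ratios above 1 as positive selection is the conventional interpretation and is an assumption beyond what the table states. The three gene pairs and their Ka and Ks values are read from Table 3; the function name is hypothetical.

```python
def selection_type(ka, ks):
    """Return (Ka/Ks ratio, selection label) for one gene pair."""
    if ks == 0:
        return None, "undefined (Ks = 0)"
    ratio = ka / ks
    if ratio < 1:
        return ratio, "purifying"
    if ratio > 1:
        return ratio, "positive"
    return ratio, "neutral"

# (gene 1, gene 2, Ka, Ks) read from Table 3.
pairs = [
    ("CotAD_59405", "CotAD_76129", 0.0, 0.00654),
    ("CotAD_37888", "Cotton_A_08663", 0.04378, 0.55839),
    ("CotAD_35091", "Cotton_A_24371", 3.50309, 1.61186),
]

for gene1, gene2, ka, ks in pairs:
    ratio, label = selection_type(ka, ks)
    print(gene1, gene2, round(ratio, 5), label)
```

Small differences from the tabulated ratios in the last decimal place are expected, because the Ka and Ks values in the table are themselves rounded.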

Cis element prediction in LEA2 proteins

Transcription factors (TFs) and cis-acting regulatory elements contained in stress-responsive promoter regions function not only as molecular switches for gene expression, but also as terminal points of signal transduction in the signaling processes (Yamaguchi-Shinozaki and Shinozaki 2005). The cis-regulatory elements are located upstream of genes and function as binding sites for TFs, which play essential roles in determining the tissue-specific or stress-responsive expression patterns of the genes (Yamaguchi-Shinozaki and Shinozaki 2005). To better understand the potential roles of the LEA2 genes, the 1000 bp regions upstream of the transcriptional start sites were extracted and used to identify cis-regulatory elements and other important motifs. Abiotic stress-related cis-elements were found in the putative promoters of the LEA2 genes in upland cotton, G. hirsutum (Figure 2 and Supplementary Table S5). For instance, MYBCORE is known to have a functional role in drought response and in the regulation of flavonoid biosynthesis (Solano ). ABRELATERD1, an ABRE-like sequence, and ACGTATERD1 are responsive to dehydration (Simpson ); ACGTATERD1 is associated with early responsiveness to dehydration (Simpson ). The presence of these stress-related promoter elements strongly supported a possible role of the upland cotton LEA2 proteins in enhancing drought tolerance in cotton. The high proportion of cis-elements in the LEA2 gene promoters could explain why genes encoding LEA proteins are highly expressed under abiotic stress, as was found in the root tissues of Arabidopsis under drought stress (Dalal ; Candat ).
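The promoter analysis described above has two steps: extracting the 1000 bp upstream of each start site and counting cis-element motifs within it. The sketch below illustrates both steps on a toy sequence. It is not the PLACE pipeline used in the study: the function names are hypothetical, only one strand is scanned, and the motif strings (ACGTG for ABRELATERD1, ACGT for ACGTATERD1, CNGTTR for MYBCORE) are the consensus sequences as commonly cited for these PLACE entries and should be verified against the database before use.

```python
import re

# IUPAC nucleotide codes used by the motifs below, expressed as regex classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def upstream_region(chrom_seq, tss, strand, length=1000):
    """Sequence upstream of a start site, returned in the gene's orientation.

    tss is the 0-based index of the start site on the forward strand.
    """
    if strand == "+":
        return chrom_seq[max(tss - length, 0):tss]
    # Minus strand: upstream lies at higher forward-strand coordinates.
    return reverse_complement(chrom_seq[tss + 1:tss + 1 + length])

def count_motif(seq, iupac_motif):
    """Count (possibly overlapping) occurrences of an IUPAC motif on one strand."""
    pattern = "".join(IUPAC[base] for base in iupac_motif)
    return len(re.findall(f"(?={pattern})", seq))

# Assumed consensus sequences; verify against PLACE before relying on them.
MOTIFS = {"ABRELATERD1": "ACGTG", "ACGTATERD1": "ACGT", "MYBCORE": "CNGTTR"}

# Toy promoter sequence, for illustration only.
promoter = "TTACGTGCCAGTTAGG"
counts = {name: count_motif(promoter, motif) for name, motif in MOTIFS.items()}
print(counts)
```

Averaging such counts over all promoters would give per-element averages of the kind summarized in Figure 2.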
Figure 2

Average number of the cis-elements in promoter region of upland cotton G. hirsutum LEA2 genes. The cis-elements were analyzed in the 1 kb upstream promoter region of translation start site using the PLACE database.


Prediction of LEA genes targeted by miRNAs

Drought is a recurring climate feature in most parts of the world (Kang ). Because plants are sessile, they have developed their own defense systems to cope with perennial and erratic adverse climatic conditions (Bartwal ). One of the defense mechanisms plants use against drought stress is the reprogramming of gene expression by microRNAs (Ferdous ). MicroRNAs (miRNAs) are small noncoding RNAs approximately 22 nucleotides in length. The miRNAs are mainly involved in the regulation of genes at the post-transcriptional level in a range of organisms (Grivna ). Large groups of small RNAs have been reported as regulators of plant adaptation to abiotic stresses (Xie ). To obtain more information on the functions of the LEA2 genes, we predicted the miRNA targets among the LEA2 genes using psRNATarget, as has been applied to other functional genes in cotton (Dai and Zhao 2011). Out of the 157 upland cotton LEA2 genes, 63 genes were found to be targeted by 48 miRNAs, representing 40% of all the LEA2 genes (Supplementary Table S6). The genes targeted by the largest numbers of miRNAs were the following. CotAD_00799 was targeted by ghr-miR2948-5p, ghr-miR7492a, ghr-miR7492b, ghr-miR7492c, ghr-miR7494 and ghr-miR7510b. CotAD_19205 was targeted by ghr-miR390a, ghr-miR390b, ghr-miR390c, ghr-miR7492a, ghr-miR7492b and ghr-miR7492c. CotAD_31936 was targeted by ghr-miR7492a, ghr-miR7492b, ghr-miR7492c, ghr-miR827a, ghr-miR827b and ghr-miR827c. CotAD_32487 was targeted by ghr-miR156a, ghr-miR156b, ghr-miR156d, ghr-miR7507 and ghr-miR7509. CotAD_33143 was targeted by ghr-miR2948-5p, ghr-miR482a, ghr-miR7492a, ghr-miR7492b, ghr-miR7492c and ghr-miR7510b. CotAD_41925 was targeted by ghr-miR396a, ghr-miR396b, ghr-miR7492a, ghr-miR7492b, ghr-miR7492c, ghr-miR827a, ghr-miR827b and ghr-miR827c. The rest of the genes were targeted by between 1 and 5 miRNAs. 
The high number of miRNAs targeting LEA2 genes could have a direct or indirect correlation with tolerance to abiotic stress, particularly drought. Some miRNAs targeted several genes each: ghr-miR164 (4 genes), ghr-miR2949a-3p (4 genes), ghr-miR2950 (8 genes), ghr-miR7492a (10 genes), ghr-miR7492b (10 genes), ghr-miR7492c (10 genes), ghr-miR7504a (5 genes), ghr-miR7507 (5 genes), ghr-miR7510a (6 genes), ghr-miR7510b (10 genes), ghr-miR827b (4 genes) and ghr-miR827c (4 genes). It has been found that miRNAs might play a role in the response to drought and salinity stresses by targeting a series of stress-related genes. Plant-specific transcription factors such as the NAC gene family have varied functional roles in plant growth and development (Pereira-Santana ), and myeloblastosis (MYB) transcription factors are highly correlated with various stress factors (Ambawat ). The detection of some of the LEA2 genes being targeted by specific miRNAs linked to mitogen-activated protein kinase (MAPK), NAC (NAM, ATAF1/2 and CUC2) and MYB genes provides a stronger indication of the significant contribution of the LEA2s to enhancing drought tolerance in plants. Post-transcriptional processes mediated by micro/small RNAs have been linked to the response to water deficit. Plant miRNAs are involved in a wide array of processes, including but not limited to stress response, nutrient limitation, development, pattern formation, flowering time, hormone regulation, and even self-regulation of the miRNA biogenesis pathway (Yamaguchi-Shinozaki and Shinozaki 2005). It is important to note that most miRNA target genes encode transcription factors, which places miRNAs at the focal point of gene regulatory networks. Moreover, the availability of a genome-wide characterization of cotton miRNA genes enabled us to predict the miRNA targets involved in drought response.
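
The tallies reported above (miRNAs per gene, genes per miRNA, and the share of targeted genes) can be derived from a table of miRNA-gene pairs. The following is a minimal sketch, assuming the psRNATarget output has already been reduced to such pairs. The `targets` dictionary holds only the pairs named in the text for three genes, and all variable names are illustrative rather than part of the original analysis.

```python
from collections import defaultdict

# miRNAs listed in the text for three of the most heavily targeted genes.
# The complete set of pairs is in Supplementary Table S6 and is not reproduced here.
targets = {
    "CotAD_00799": ["ghr-miR2948-5p", "ghr-miR7492a", "ghr-miR7492b",
                    "ghr-miR7492c", "ghr-miR7494", "ghr-miR7510b"],
    "CotAD_32487": ["ghr-miR156a", "ghr-miR156b", "ghr-miR156d",
                    "ghr-miR7507", "ghr-miR7509"],
    "CotAD_41925": ["ghr-miR396a", "ghr-miR396b", "ghr-miR7492a",
                    "ghr-miR7492b", "ghr-miR7492c", "ghr-miR827a",
                    "ghr-miR827b", "ghr-miR827c"],
}

# Flatten to (miRNA, gene) pairs, the shape a prediction table would have.
pairs = [(mirna, gene) for gene, mirnas in targets.items() for mirna in mirnas]

mirnas_per_gene = defaultdict(set)
genes_per_mirna = defaultdict(set)
for mirna, gene in pairs:
    mirnas_per_gene[gene].add(mirna)
    genes_per_mirna[mirna].add(gene)

# Share of LEA2 genes with at least one predicted miRNA (counts from the text).
targeted_genes, total_genes = 63, 157
percent_targeted = 100 * targeted_genes / total_genes

for gene, mirnas in sorted(mirnas_per_gene.items()):
    print(gene, len(mirnas))
print(f"{percent_targeted:.1f}% of LEA2 genes targeted")
```

Using sets means a pair that appears twice in the prediction table is counted once, which matters if the same miRNA has more than one predicted site in a gene.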

Expression Patterns of LEA2 Genes in Different Tissues of Upland Cotton as Determined Through RNA Sequencing

Analysis of the RNA expression profile provides an indicator of the functional role of genes in the plant. We therefore carried out RNA expression analysis (RPKM > 1) in various tissues of the cotton plant. Out of the 157 LEA2 genes in upland cotton, G. hirsutum, 117 (75%) showed differential expression in various tissues, such as the leaves, roots, stem, petal, pistil, stamen, torus and calycle (Figure 3). Based on their expression profiles, the genes were clustered into three broad groups. The 29 members of group 1 were highly up-regulated under drought and salt conditions. Under salt and drought stress, CotAD_33321, CotAD_41571, CotAD_11876, CotAD_24498 and CotAD_59405 showed the highest expression levels. Similarly, CotAD_11876, CotAD_24498 and CotAD_59405 were significantly up-regulated in all the tissues tested. A total of 23 genes were highly up-regulated in 5 tissues, which provides strong evidence of the functional role of the LEA2 genes in enhancing stress tolerance in plants. The majority of the analyzed genes showed relatively low expression levels in the root tissues, but CotAD_11876, CotAD_59405 and CotAD_24498 exhibited significantly higher expression levels, with expression values of more than 2. Notably, the genes that were moderately up-regulated in the roots were significantly up-regulated in the calyx. The up-regulation of these genes in the reproductive tissues could indicate a functional role in fiber development.
Figure 3

Expression profile analysis of LEA2 genes in 5 upland cotton tissues. The LEA2 genes expressed (RPKM > 1) in leaf, stem, root, calyx and petal were represented according to their tissue specificity: (A): LEA2 genes RNA seq. expression profile under drought and salt stress. (B): LEA2 expression in the 8 different tissues and (C): Venn diagram quantification and common genes expressed among the 5 tissues.

In the validation of the expression profile of the LEA2 genes under drought stress, CotAD_24498, CotAD_21924, CotAD_20020 and CotAD_59405 were highly up-regulated in leaf, stem and root tissues. However, the expression levels were much higher in G. tomentosum than in G. hirsutum, suggesting that these could be the key genes.
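
The RPKM > 1 cutoff used above normalizes read counts by transcript length and sequencing depth before deciding whether a gene is expressed. The following is a minimal sketch of that calculation, assuming the standard RPKM definition; the gene names, read counts and library size are hypothetical and are not taken from the study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

# Hypothetical values for illustration only (not data from the study).
total_mapped = 20_000_000
genes = {
    "gene_A": {"reads": 450, "length_bp": 900},
    "gene_B": {"reads": 12, "length_bp": 1200},
}

values = {
    name: rpkm(g["reads"], g["length_bp"], total_mapped)
    for name, g in genes.items()
}

# Keep only genes passing the expression threshold used in the text.
expressed = [name for name, value in values.items() if value > 1]

# Share of expressed LEA2 genes, using the counts reported in the text.
percent_expressed = 100 * 117 / 157
```

With these made-up inputs only `gene_A` passes the cutoff, and the reported counts give roughly 75% of LEA2 genes expressed.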

qRT-PCR Expression profiling of the LEA2 genes in leaf, stem and roots of upland cotton

Based on the results obtained from the RNA sequence data, 48 genes were selected for qRT-PCR validation. Two cotton genotypes were used. The first, G. hirsutum, is an elite cultivar grown widely around the world; it covers more than 90% of the cotton-growing regions in China but is susceptible to drought stress. The second, G. tomentosum, is a wild cotton native to the Hawaiian Islands and is known for its high tolerance to salinity and drought stress. The two genotypes were grown in the greenhouse and, at the three-leaf stage, were exposed to drought for a period of 14 days. The roots, stems and leaves were collected for RNA extraction and qRT-PCR analysis. The qRT-PCR profiling of the various tissues indicated high variability in the transcript abundance of LEA2 genes in upland cotton (Figure 4). In both G. tomentosum and G. hirsutum, the majority of these genes showed relatively high expression in the root and leaf, but not in the stem. Leaves and roots are the main plant organs affected by drought stress (Alexandersson ). The leaf is the site of photosynthesis, and drought stress might cause an excess release of reactive oxygen species (ROS). ROS are toxic to plants, so the genes with high expression in the leaves could perhaps be involved in the detoxification of ROS, thus preventing damage and maintaining the normal functions of the photosynthetic cells. The high osmotic potential generated in the cytoplasm of guard cells during stomatal opening could also lead to the accumulation of LEA2s in leaf tissue. Increased osmotic potential within the guard cells drives mass flow of water into the guard cells, making them turgid and thus opening the stomatal pore; during drought stress, however, the osmotic potential is never offset, which results in dehydration stress on the nucleus.
The increased accumulation of LEA2s within the leaf tissues could serve to maintain structural integrity and protect the membranes from dehydration stress. This finding is consistent with the proposed function of the LEA genes, which is a protective role during abiotic stresses (Nylander ). The roots are the connection point between the water reservoir and the plant. The strong up-regulation of LEA2 genes in the roots indicates that these genes could be involved in maintaining water balance in the roots. It further supports the primary, protective role of LEA genes in plants, since roots are the first plant organs to be affected by drought stress.
Figure 4

Venn diagram of the differential expression of LEA2 genes in different plant tissues. (A) Tissues of G. hirsutum and (B) tissues of G. tomentosum.

Expression profiles of LEA2 genes under drought treatment in G. hirsutum and G. tomentosum

Gene expression profiles provide vital information on the roles played by genes in plants (Movahedi ). To determine the expression pattern of the LEA2 genes in tolerant and non-tolerant cotton genotypes, we carried out qRT-PCR validation of 48 LEA2 genes in leaf, root and stem tissues. The 48 genes were selected based on the RNA sequence expression profile: 24 genes were up-regulated and the other half were down-regulated. The samples for qRT-PCR were collected at 0, 7 and 14 days of stress exposure, with day 0 (control) used as the reference point. More genes were up-regulated in all the tissues of the drought-tolerant genotype, G. tomentosum, than in the drought-sensitive genotype, G. hirsutum (Figure 5). This result indicates that the drought-tolerant genotype has the potential to mobilize more drought-related genes when exposed to drought stress than the less tolerant genotype, hence the higher expression levels. Similar results were obtained for the expression of cold tolerance genes in Arabidopsis genotypes with varying tolerance levels, in which more genes were up-regulated in the cold-tolerant than in the cold-susceptible genotype (Hannah ).
Figure 5

Differential expression of upland cotton LEA2 genes under drought stress. The heat map was visualized using the Mev.exe program (shown as log2 values) for control samples and for samples treated for 7 and 14 days of drought: (i) G. tomentosum and (ii) G. hirsutum. Red: up-regulated; green: down-regulated; black: no expression. The red box indicates the cloned gene.

The up-regulation of LEA2 genes under drought stress could explain their protective role in plant tissues under dehydration stress. For instance, HVA1, a LEA gene from barley (Hordeum vulgare L.), was found to confer drought stress tolerance in transgenic rice (Babu ). Interestingly, some phylogenetic LEA2 gene pairs (orthologous genes) were found to have differential expression patterns in either of the cotton genotypes (Figure 6). For instance, CotAD_71431 and CotAD_51205 exhibited varied expression patterns under drought and salt stress conditions, as evident in the RNA expression analysis. This result suggests that even if these genes cluster in the same clade, they could have developed different biological functions over time. Orthologous genes are genes with a common evolutionary origin that share a high percentage of sequence similarity (Nehrt ). Given the expression patterns of LEA2 genes in different tissues, it would be interesting to functionally characterize these genes in upland cotton, G. hirsutum. The majority of the LEA2 genes showed higher expression levels in leaf and root tissues, which indicates functional conservation of the gene subfamily. The variation in expression between G. hirsutum and G. tomentosum could be due to broad changes in environmental conditions; G. tomentosum exhibits divergence signals that are associated with directionally selected traits and are functionally related to stress responses. These results suggest that stress adaptation in G. tomentosum might have involved the evolution of protein-coding sequences, and thus these genes can be introgressed into elite upland cotton to boost its performance in the face of declining fresh water and precipitation.
Figure 6

Quantitative PCR analysis of the selected LEA2 genes. Abbreviations: 7d, 7 days of stress; 14d, 14 days of stress; Gh, G. hirsutum; Gt, G. tomentosum. Y-axis: relative expression (2^-ΔΔCT). The enclosure indicates the cloned gene.
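
The relative expression on the Y-axis follows the 2^-ΔΔCT method, in which the target gene is first normalized to a reference gene and then to the control time point (day 0 in this study). The following is a minimal sketch of that arithmetic; the Ct values are hypothetical, and the function and argument names are illustrative only.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ΔΔCT method."""
    # ΔCT: target normalized to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # ΔΔCT: treated sample normalized to the control sample.
    delta_delta = delta_treated - delta_control
    return 2 ** -delta_delta

# Hypothetical Ct values (not data from the study).
fold_7d = relative_expression(24.0, 18.0, 26.0, 18.0)   # ΔΔCT = 6 - 8 = -2
fold_ctrl = relative_expression(26.0, 18.0, 26.0, 18.0)  # ΔΔCT = 0
```

A lower Ct in the treated sample means more transcript, so a ΔΔCT of -2 corresponds to a four-fold increase, while the control compared with itself gives a fold change of 1.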

qRT-PCR Analysis of the Transformed Gene in Upland Cotton Tissues

Based on the expression analysis of the LEA2 genes in the various tissues of G. tomentosum (drought tolerant) and G. hirsutum (drought susceptible), we identified a single gene with significant expression in the various tissues and transformed it into the model plant A. thaliana (ecotype Columbia-0). The gene, CotAD_24498, was analyzed in various tissues of upland cotton, G. hirsutum, to determine its relative abundance within the plant. We found that the gene was most abundantly expressed in the reproductive tissues, specifically in the petal and stamen (Figure 7A). In addition, we subjected cotton seedlings at the three-true-leaf stage to drought stress (15% PEG6000). Samples for RNA extraction and qRT-PCR analysis were obtained from leaf, root and stem at 0 h, 3 h, 6 h, 12 h and 24 h post stress treatment. In all three tissues, up-regulation of the gene peaked at 6 h and then declined gradually with increasing time of stress exposure. The gene was significantly more up-regulated in the root than in the leaf and stem tissues (Figure 7B). We successfully generated 9 transgenic lines overexpressing CotAD_24498 (Figure 7C). Of the nine lines, three showed the highest level of overexpression and were used to investigate the potential of the gene in the transgenic lines under drought stress (Figure 7D).
Figure 7

qRT-PCR analysis of the expression of the cloned gene CotAD_24498. (A) Total RNA isolated from various tissues of cotton plants under normal conditions; (B) total RNA extracted from drought-stressed cotton seedlings; (C) polymerase chain reaction (PCR) analysis performed to check integration of the 630 bp coding sequence (CDS) in the transformed T1 generation: 1–10, transgenic lines; 11, positive control (pWM101-CotAD_24498); 12, negative control (wild-type, WT). (D) Transcript expression levels of CotAD_24498 in T2 transgenic lines analyzed through qRT-PCR.

Overexpression of CotAD_24498 in plants promotes root growth and confers drought stress tolerance

Increased primary root growth and overall plant fresh biomass are indicators of tolerance to the various abiotic stresses to which plants are exposed (Verslues ; Jisha ). We sought to investigate the response of the transgenic lines and the wild type to drought stress in terms of primary root elongation and fresh biomass accumulation. Drought stress was imposed by exposing the plants to different concentrations of mannitol (0 mM, 100 mM, 200 mM and 300 mM) for a period of six (6) days. Under drought stress, the transgenic lines performed better than the wild type, with greater primary root growth and a larger increase in fresh biomass. Under osmotic stress, the greatest root length and fresh biomass accumulation were observed at 100 mM mannitol (Figure 8B). The transgenic lines had significantly higher primary root length and fresh biomass accumulation (Figure 8C), an indication that their photosynthetic processes were less impaired by the drought stress than those of the wild type.
Figure 8

Overexpression of CotAD_24498 enhances root growth and drought stress tolerance in Arabidopsis transgenic lines. (A) CotAD_24498-overexpressing and WT plants were grown vertically on 0.5 Murashige and Skoog (MS) medium supplemented with 0, 100, 200 and 300 mM mannitol and incubated for 6 days. (B) Root elongation comparisons on 0.5 MS under normal conditions and osmotic stress for 6 days. The seedlings were scored and photographed 6 days post germination. (C) Quantitative determination of fresh weight biomass of wild-type (WT) and the transgenic lines (L2, L3 and L4) 6 days post germination under normal and drought stress conditions. In (B, C), each experiment was repeated three times. Bars indicate standard error (SE). Different letters indicate significant differences between wild-type and transgenic lines (ANOVA; P < 0.05). CK: normal conditions.
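
The error bars in panels (B) and (C) are standard errors computed over three repeats. The following is a minimal sketch of how a mean and its SE can be obtained for one group; the root lengths are hypothetical values chosen for illustration and are not measurements from the study.

```python
import math
import statistics

def mean_and_se(values):
    """Return the sample mean and the standard error of the mean."""
    mean = statistics.mean(values)
    # SE = sample standard deviation / sqrt(number of replicates)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se

# Hypothetical primary root lengths in cm from three repeats (not study data).
wild_type = [2.0, 2.2, 2.4]
wt_mean, wt_se = mean_and_se(wild_type)
print(f"WT root length: {wt_mean:.2f} ± {wt_se:.2f} cm")
```

Note that `statistics.stdev` is the sample (n - 1) standard deviation, which is the usual choice for a small number of replicates.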

Transcripts Investigation of Drought Stress-Responsive Genes

The root appears to be the most relevant organ for breeding drought stress tolerance (Henry 2013). Underlying the ABA-mediated stress responses is the transcriptional regulation of stress-responsive gene expression (Giraudat ). Numerous genes have been reported to be up-regulated under stress conditions in vegetative tissues. These include the LEA genes, which are expressed abundantly in developing seeds under normal conditions, osmolyte biosynthetic genes, and genes of general cellular metabolism. We checked the expression of two known abiotic stress-responsive genes, ABF4 and RD29A, in the transgenic lines (L2, L3 and L4) and the wild type when the plants were exposed to drought. The stress-responsive genes were highly up-regulated in the transgenic lines compared with the wild type (Figure 9). This result agrees with the qRT-PCR analysis of the various LEA2 genes in tissues from the two cotton genotypes, in which more genes were up-regulated in the tissues of the more tolerant genotype than in those of the less tolerant one. Constitutive expression of RD29A and ABF4 demonstrated enhanced drought tolerance in the transgenic Arabidopsis plants.
Figure 9

Expression levels of drought stress-responsive genes (ABF4 and RD29A) in transgenic lines and wild-type. Arabidopsis ACTIN2 was used as the reference gene. Values are means ± SD. * P < 0.05 as calculated by Student’s t-test.

Oxidants and antioxidant determination in the transgenic lines

To understand the role of the transformed LEA2 gene in the transgenic lines in relation to drought stress, we measured various oxidants and antioxidants in the leaves of the transgenic lines and the wild type. The levels of oxidants were significantly lower in the transgenic lines than in the wild type (Figure 10A-B). When plants are exposed to drought, the level of ROS increases, which results in oxidative stress. MDA concentration provides a measure of the damage caused to membrane lipids by oxidative stress (Jain ). The significant reduction in MDA and H2O2 in the leaf tissues of the transgenic lines showed that the transformed gene had a regulatory role in controlling various biological pathways geared toward detoxification of reactive oxygen species in the cells. In addition, we quantified the levels of the antioxidants SOD, POD and CAT. All three antioxidants were significantly increased in the transgenic lines (L2, L3 and L4) compared to the wild type (Figure 10C-E). The increased levels of the antioxidants showed that the transgenic lines had a higher ability to tolerate drought stress than the wild type. These results are consistent with previous findings, in which drought-stressed wheat plants were found to accumulate higher levels of oxidants (Luna ). More tolerant plant genotypes are able to induce more of the antioxidants, such as CAT, POD and SOD, in order to scavenge the excess ROS and other deleterious molecules released by the cells under stress (Bian and Jiang 2009).
Figure 10

Determination of the oxidants and antioxidants in the transgenic lines under stress conditions. (A) Hydrogen peroxide (H2O2) accumulation in leaves of wild-type (WT) and the transgenic lines (L2, L3 and L4) after 8 days of drought stress; (B) MDA accumulation in leaves of wild-type (WT) and the transgenic lines (L2, L3 and L4) after 8 days of drought stress; (C) catalase (CAT) activity; (D) peroxidase (POD) activity; and (E) superoxide dismutase (SOD) activity. Data are means ± SE calculated from three replicates. Different letters indicate a significant difference between the WT and the transgenic lines (ANOVA; P < 0.05).

Conclusions

In this study, the identification, phylogenetic relationships, miRNA targets, cis-promoter elements, GO functional annotation and exon/intron structures of LEA2 gene family members were evaluated in upland cotton, Gossypium hirsutum, and the tissue expression patterns in two tetraploid cotton species, G. hirsutum (drought sensitive) and G. tomentosum (drought tolerant), were determined under drought stress. The abundance of LEA2 genes and the unique gene structure reported in this work provide a solid foundation for future research to understand the evolution of the LEA2 gene family and the potential functional role of the 157 LEA2 genes in plants under drought stress. Since the discovery of LEA genes, little work has been reported on LEA genes as a whole in upland cotton. The transformation and expression analysis of the transformed LEA2 gene indicated that the LEA2 genes have a profound role in enhancing drought stress tolerance. The transgenic lines L2, L3 and L4 exhibited superior performance compared to the wild type. Their roots were significantly longer than those of the wild type under drought stress; similarly, the levels of oxidants in the leaves were significantly reduced while the antioxidant levels were higher in the leaves of the transgenic lines compared to the wild type. This indicates that the transgenic plants had a higher capacity to regulate oxidative stress than the wild type (WT). The genes could be promoting the growth of root cells under limited water conditions. Primary root growth is linked to drought stress tolerance because it increases the surface area of the roots, improving their ability to absorb whatever little moisture is available. Deep or extensive root growth is a trait known for most xerophytic plants (Brunner ).