Genome-wide identification, structural analysis and new insights into late embryogenesis abundant (LEA) gene family formation pattern in Brassica napus.

Yu Liang1,2, Ziyi Xiong1, Jianxiao Zheng1, Dongyang Xu1, Zeyang Zhu1, Jun Xiang2, Jianping Gan2, Nadia Raboanatahiry1, Yongtai Yin1, Maoteng Li1,2.   

Abstract

Late embryogenesis abundant (LEA) proteins are a large and diverse group of polypeptides that play important roles in desiccation and freezing tolerance in plants. The LEA family has been systematically characterized in some plants but not in Brassica napus. In this study, 108 BnLEA genes were identified in the B. napus genome and classified into eight families based on their conserved domains. Protein sequence alignments revealed an abundance of alanine, lysine and glutamic acid residues in BnLEA proteins. BnLEA genes have few introns (<3) and are distributed unevenly across all 19 chromosomes of B. napus, occurring as gene clusters on chromosomes A9, C2, C4 and C5. More than two-thirds of the BnLEA genes are associated with segmental duplication. Synteny analysis revealed that most LEA genes are conserved, although gene losses or gains were also identified. These results suggest that segmental duplication and whole-genome duplication played a major role in the expansion of the BnLEA gene family. Expression profile analysis indicated that the expression of most BnLEAs was increased in leaves and late-stage seeds. This study presents a comprehensive overview of the LEA gene family in B. napus and provides new insights into the formation of this family.

Year:  2016        PMID: 27072743      PMCID: PMC4829847          DOI: 10.1038/srep24265

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Drought stress is an abiotic environmental state that can affect the morphological, physiological and biochemical characteristics of plants and lead to reductions in crop productivity due to adverse effects on plant growth1. Signaling pathways that are activated in response to drought challenge include ionic and osmotic stress signaling, detoxification signaling, and signaling of cell division coordination2. The expression of many signal transduction genes has been observed; for example, significant drought stress can induce DREB2A over-expression in transgenic Arabidopsis3. The galactinol synthase (GolS) genes of Arabidopsis are induced by drought and play a role in the accumulation of raffinose family oligosaccharides (RFOs), which might act as osmoprotectants in drought stress4. Late embryogenesis abundant (LEA) proteins accumulate during late embryogenesis and contribute to drought tolerance5. In plants, LEA proteins are produced during the last period of seed development concurrent with dehydration. LEA proteins were first observed and studied in late-developing cotton seeds6 and were subsequently identified in many other plants, such as rice, barley, wheat, maize, bean, sunflower7 and Arabidopsis8. LEA proteins have also been identified in other species, such as nematodes9 and chironomids (Polypedilum vanderplanki)10. Subcellular localization analysis has revealed that LEA proteins are mainly located in nuclear regions and the cytoplasm11. Although LEA proteins are mainly observed in plant seeds, they have also been detected in the seedlings, buds and roots of plants78. In contrast to other proteins involved in desiccation tolerance, LEA proteins have no apparent enzymatic activity and likely act as protectants of biomolecules and membranes under stress conditions11. However, some studies have indicated that individual LEA proteins might function as intrinsically disordered proteins to protect enzymes from induced aggregation1213. 
This protection may be due to space-filling by LEA proteins, referred to as the “molecular shield function”, which decreases the rate of collisions between aggregating proteins14. Moreover, LEA proteins contribute to the sequestration of calcium and metal ions, which participate in signaling pathways in plants15. LEA proteins also aid the formation of the glassy state, in which nonreducing sugars accumulate in the cytoplasm of plants during periods of desiccation16. These findings imply that LEA proteins play a role in protecting plants from dehydration. LEA proteins are low-molecular-weight proteins composed of hydrophilic amino acids and are characterized by repeat motifs, structural disorder and high hydrophilicity in their natural forms7817. LEA proteins are classified into at least eight families in the Pfam database based on primary sequence and homology: LEA_1, LEA_2, LEA_3, LEA_4, LEA_5, LEA_6, dehydrin and seed maturation protein (SMP)18. In the LEAPdb database, these proteins are regrouped using a more detailed classification system, with 12 nonredundant classes19. Groups 1–5 are considered the major members17. Group 1 proteins contain a 20-amino-acid motif (GGETRKEQLGEEGYREMGRK) and a high proportion of Gly, Glu and Gln residues20. Group 2 proteins contain a sequence called the K-segment (EKKGIMDKIKEKLPG), which functions as a chaperone to protect proteins that function in cell metabolism21. Group 3 proteins have an 11-amino-acid fragment (TAQAAKEKAGE) with 13 repeats17. Group 4 contains no repeated motif sequences but features a conserved structure at the N-terminus that can form an α-helical structure7. The amino acid residue homology of Group 5 proteins is low, which implies that these proteins are probably involved in seed maturation and dehydration21.
Brassica napus (AACC, 2n = 38) originated from hybridization between Brassica rapa (AA, 2n = 20) and Brassica oleracea (CC, 2n = 18). B. napus is the third-largest oilseed crop in the world.
Quite a few studies have been conducted on different groups of LEA genes in B. napus in recent years222324. The Group 4 LEA protein of B. napus enhances abiotic stress tolerance in both Escherichia coli and transgenic Arabidopsis plants22. The dehydrin genes of Brassica juncea and B. napus are expressed at the late stages of silique development, suggesting that gene expression might be induced by water deficit and low temperatures, conditions that also affect seed germination23. Expression of the B. napus LEA protein gene in Chinese cabbage enhanced its growth ability under salt and drought stress24. Moreover, LEA proteins have been observed in B. napus lines with higher oil contents, suggesting that LEA proteins might contribute to dehydration tolerance during the oil-accumulation period and increased B. napus oil content25. Because B. napus is a hybrid species, its genome contains many duplications as well as inversions and translocations26. Previous studies have mainly focused on the function of different LEA families, and an analysis of the evolution, distribution and origin of the LEA gene family in B. napus has not been reported. In this study, the LEA gene families in B. napus were identified, and the structure, evolution and chromosome location of BnLEAs were analyzed. This study provides a foundation for further studies of the functions of the LEA family in B. napus.

Results

Genome-wide identification of BnLEA gene families in B. napus

The genome-wide identification of LEA gene families in B. napus was based on homology with LEA genes from Arabidopsis identified using the CNS-Genoscope database. A total of 108 LEA genes were identified in the genome of B. napus and named BnLEA1 to BnLEA108 (Table 1). The BnLEA genes were classified into eight families based on their conserved domain structures. The LEA_4, dehydrin and seed maturation protein (SMP) families are the largest (25, 23 and 16 members, respectively) among the families (Fig. 1). The LEA_2 and LEA_3 families include 10 and 13 members, respectively. Each of the other families has fewer than 10 members. LEA genes were also identified in sixteen other species, including both lower and higher plant species (Figure S1). Only two LEA genes were identified in the Bacillariophyta. Vascular plants (except cotton) have more LEA genes than Physcomitrella patens, implying that LEA genes accumulated during the colonization of land by plants27. Interestingly, in nearly half of the species containing LEA genes, the majority belong to the LEA_4 and dehydrin families, consistent with the predominance of the LEA_4 and dehydrin families in B. napus.
Table 1

LEA genes in B. napus genome and their sequence characteristics and subcellular location prediction.

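Name | Gene ID | Family | Chr. | Gene start | Gene end | Gene length (bp) | Protein length (aa) | Mol. Wt. (kDa) | pI | GRAVY | Intron number | Subcellular location (PProwler) | Subcellular location (TargetP)
BnLEA1 | BnaAnng17910D | LEA_2 | Un-R | 18849866 | 18850848 | 983 | 151 | 16.40492 | 4.72 | 0.0748344 | 1 | other | O
BnLEA2 | BnaCnng23520D | LEA_2 | Un-R | 21942981 | 21943895 | 915 | 151 | 16.40891 | 4.72 | 0.086755 | 1 | other | O
BnLEA3 | BnaA10g01410D | LEA_3 | A10 | 749378 | 750171 | 794 | 98 | 10.29859 | 8.03 | −0.228571 | 1 | SP | S
BnLEA4 | BnaC05g01450D | LEA_3 | C5 | 756845 | 757576 | 732 | 98 | 10.25355 | 9.16 | −0.210204 | 1 | SP | S
BnLEA5 | BnaA10g01720D | SMP | A10 | 859776 | 860870 | 1094 | 177 | 18.34329 | 4.55 | −0.380791 | 1 | SP | S
BnLEA6 | BnaC05g01750D | SMP | C5 | 889027 | 890157 | 1010 | 177 | 18.31227 | 4.62 | −0.356497 | 1 | SP | S
BnLEA7 | BnaC07g15380D | Dehydrin | C7 | 21301154 | 21302253 | 1100 | 216 | 4.45179 | 4.95 | −1.388889 | 1 | SP | O
BnLEA8 | BnaAnng29030D | Dehydrin | Un-R | 33214619 | 33215637 | 1019 | 194 | 21.8735 | 5.62 | −1.321134 | 1 | other | O
BnLEA9 | BnaA07g11450D | Dehydrin | A7 | 10640882 | 10642008 | 1127 | 220 | 24.87639 | 5.07 | −1.415455 | 1 | SP | O
BnLEA10 | BnaC05g15780D | Dehydrin | C5 | 9625056 | 9626614 | 1559 | 271 | 31.0062 | 5.09 | −1.494096 | 1 |  | O
BnLEA11 | BnaA09g24240D | LEA_1 | A9 | 17006117 | 17006806 | 690 | 133 | 14.49749 | 9.24 | −0.87218 | 1 | other | S
BnLEA12 | BnaC05g24660D | LEA_1 | C5 | 19159326 | 19160050 | 725 | 132 | 14.41236 | 9.24 | −0.873485 | 1 | other | S
BnLEA13 | BnaC05g24760D | LEA_1 | C5 | 19207889 | 19208613 | 725 | 132 | 14.41236 | 9.24 | −0.873485 | 1 | other | S
BnLEA14 | BnaA08g01460D | LEA_4 | A8 | 1181931 | 1182535 | 605 | 103 | 12.08013 | 11.75 | −1.482524 | 1 | SP | S
BnLEA15 | BnaC03g44340D | Dehydrin | C3 | 29465468 | 29466562 | 1095 | 95 | 10.51938 | 6.7 | −1.892632 | 0 |  | S
BnLEA16 | BnaA02g15750D | LEA_4 | A2 | 9189066 | 9190802 | 1736 | 442 | 48.81765 | 9.13 | −0.399774 | 2 | other | C
BnLEA17 | BnaC02g21020D | LEA_4 | C2 | 17646375 | 17648170 | 1795 | 460 | 50.69263 | 8.95 | −0.407826 | 1 | other | C
BnLEA18 | BnaA02g36030D | Dehydrin | A2-R | 666514 | 667522 | 1009 | 194 | 21.72529 | 5.61 | −1.351546 | 1 | other | S
BnLEA19 | BnaA07g32420D | Dehydrin | A7 | 22475699 | 22476668 | 970 | 195 | 22.02149 | 5.51 | −1.470769 | 1 | SP | S
BnLEA20 | BnaA07g21490D | Dehydrin | A7 | 16630156 | 16631174 | 1019 | 194 | 21.75535 | 5.47 | −1.308247 | 1 | other | O
BnLEA21 | BnaC06g21970D | Dehydrin | C6 | 24121429 | 24122509 | 1081 | 183 | 20.58607 | 5.41 | −1.327869 | 1 | other | O
BnLEA22 | BnaC06g36880D | Dehydrin | C6 | 35172305 | 35173301 | 997 | 199 | 22.43891 | 5.44 | −1.470352 | 1 | SP | S
BnLEA23 | BnaC07g22530D | LEA_4 | C7 | 28947408 | 28949034 | 1627 | 201 | 21.88081 | 5.5 | −0.417413 | 1 | other | O
BnLEA24 | BnaC02g35130D | LEA_4 | C2 | 37914683 | 37915913 | 1231 | 192 | 20.38898 | 8.57 | −0.694792 | 1 | other | O
BnLEA25 | BnaA08g15290D | LEA_4 | A8 | 12745496 | 12747188 | 1693 | 487 | 52.4039 | 5.75 | −0.931417 | 1 | other | 
BnLEA26 | BnaC07g03410D | LEA_4 | C7 | 4585045 | 4586676 | 1632 | 480 | 52.5935 | 6.28 | −0.920417 | 1 | other | 
BnLEA27 | BnaC08g34610D | LEA_6 | C8 | 32717932 | 32718420 | 489 | 82 | 8.50826 | 4.58 | −0.969512 | 0 | other | O
BnLEA28 | BnaA09g42180D | LEA_6 | A9 | 29346244 | 29346775 | 532 | 82 | 8.43311 | 4.72 | −1.054878 | 0 | other | S
BnLEA29 | BnaA04g19670D | LEA_6 | A4 | 15962219 | 15962678 | 459 | 72 | 7.6273 | 5.21 | −1.173611 | 0 | SP | S
BnLEA30 | BnaC03g18750D | LEA_6 | C3 | 9610417 | 9610873 | 456 | 71 | 7.52619 | 5.21 | −1.180282 | 0 | SP | S
BnLEA31 | BnaC04g44060D | LEA_6 | C4 | 44255276 | 44255629 | 354 | 72 | 7.62927 | 5.21 | −1.211111 | 0 | SP | S
BnLEA32 | BnaC04g09820D | LEA_1 | C4 | 7426352 | 7426686 | 335 | 98 | 10.64785 | 8.9 | −1.102041 | 0 | other | O
BnLEA33 | BnaA05g08680D | LEA_1 | A5 | 4813153 | 4813480 | 328 | 98 | 10.64785 | 8.9 | −1.102041 | 0 | other | O
BnLEA34 | BnaA04g05010D | LEA_4 | A4 | 3676235 | 3677724 | 1490 | 451 | 48.5404 | 5.2 | −1.152106 | 2 | other | O
BnLEA35 | BnaA05g07720D | LEA_4 | A5 | 4193502 | 4194949 | 1448 | 410 | 45.0119 | 5.55 | −1.033902 | 2 | SP | S
BnLEA36 | BnaC04g08680D | LEA_4 | C4 | 6503952 | 6505333 | 1382 | 388 | 42.95568 | 5.42 | −1.039691 | 2 | SP | S
BnLEA37 | BnaCnng32920D | LEA_4 | Un-R | 31247747 | 31249402 | 1656 | 452 | 48.75274 | 5.28 | −1.163274 | 2 | other | 
BnLEA38 | BnaA03g18930D | LEA_5 | A3 | 8931994 | 8932678 | 685 | 88 | 9.63038 | 5.88 | −1.659091 | 1 | other | S
BnLEA39 | BnaA04g22520D | LEA_5 | A4 | 17516524 | 17517040 | 517 | 114 | 12.5603 | 9.58 | −1.004386 | 1 | SP | S
BnLEA40 | BnaA05g34530D | LEA_5 | A5-R | 192195 | 192930 | 736 | 84 | 9.21889 | 6.74 | −1.677381 | 1 | other | S
BnLEA41 | BnaC03g22490D | LEA_5 | C3 | 12409877 | 12410478 | 602 | 88 | 9.58735 | 5.88 | −1.598864 | 1 | other | S
BnLEA42 | BnaCnng27950D | LEA_5 | Un-R | 26525490 | 26526251 | 762 | 84 | 9.17983 | 5.69 | −1.597619 | 1 | other | S
BnLEA43 | BnaC03g23770D | LEA_4 | C3 | 13254482 | 13255809 | 1328 | 142 | 14.83843 | 5.29 | −0.544366 | 1 | other | 
BnLEA44 | BnaC04g48420D | LEA_4 | C4 | 47021142 | 47023669 | 2528 | 638 | 68.65427 | 6 | −1.03605 | 2 | other | O
BnLEA45 | BnaC03g23780D | LEA_4 | C3 | 13259116 | 13259900 | 785 | 174 | 18.67633 | 8.02 | −1.181609 | 1 | other | O
BnLEA46 | BnaA03g20620D | LEA_2 | A3 | 9781842 | 9782601 | 760 | 180 | 20.01815 | 4.71 | −0.12 | 0 |  | C
BnLEA47 | BnaC03g24650D | LEA_2 | C3 | 13839768 | 13841416 | 1649 | 321 | 35.38161 | 4.72 | −0.195016 | 0 | other | O
BnLEA48 | BnaC04g49420D | LEA_2 | C4 | 47518499 | 47520741 | 2243 | 188 | 21.01336 | 4.81 | −0.154787 | 0 | SP | S
BnLEA49 | BnaA03g21280D | LEA_2 | A3 | 10108303 | 10109397 | 1095 | 161 | 17.45194 | 4.71 | −0.015528 | 1 | other | C
BnLEA50 | BnaA05g01660D | LEA_2 | A5 | 949251 | 950675 | 1425 | 166 | 17.72534 | 4.81 | 0.0927711 | 1 | other | 
BnLEA51 | BnaC03g25640D | LEA_2 | C3 | 14384786 | 14385909 | 1124 | 161 | 17.45194 | 4.71 | −0.015528 | 1 | other | C
BnLEA52 | BnaC04g00870D | LEA_2 | C4 | 759845 | 760933 | 1089 | 166 | 17.75236 | 4.81 | 0.076506 | 1 | other | C
BnLEA53 | BnaC04g50740D | LEA_2 | C4 | 48258585 | 48259654 | 1070 | 164 | 17.91156 | 4.59 | −0.041463 | 1 | other | C
BnLEA54 | BnaA01g28600D | LEA_4 | A1 | 19885286 | 19886322 | 1037 | 226 | 24.59855 | 9.02 | −1.475221 | 1 | SP | S
BnLEA55 | BnaA05g23860D | LEA_4 | A5 | 17980857 | 17981901 | 1045 | 229 | 24.82188 | 8.81 | −1.421397 | 1 | SP | S
BnLEA56 | BnaC03g39230D | LEA_4 | C3 | 24204689 | 24205706 | 1018 | 196 | 21.04774 | 8.58 | −1.345918 | 1 | SP | S
BnLEA57 | BnaC05g37670D | LEA_4 | C5 | 36601942 | 36603206 | 1265 | 229 | 24.78186 | 8.83 | −1.398253 | 1 | SP | S
BnLEA58 | BnaA03g34560D | LEA_4 | A3 | 16813606 | 16814466 | 861 | 286 | 31.4549 | 6.11 | −1.104545 | 0 | other | O
BnLEA59 | BnaC03g40050D | LEA_4 | C3 | 25026958 | 25027818 | 861 | 286 | 31.56984 | 5.79 | −1.122028 | 0 | other | O
BnLEA60 | BnaC05g35990D | LEA_4 | C5 | 35179665 | 35180572 | 908 | 296 | 32.2595 | 5.56 | −1.106757 | 0 | other | O
BnLEA61 | BnaAnng35040D | LEA_4 | Un-R | 39836438 | 39837624 | 1187 | 297 | 32.33053 | 5.54 | −1.118855 | 0 | other | O
BnLEA62 | BnaA03g36660D | SMP | A3 | 11747905 | 11748957 | 1052 | 239 | 25.44855 | 5.97 | −0.375314 | 2 | SP | S
BnLEA63 | BnaA05g17150D | SMP | A5 | 11902363 | 11903533 | 1050 | 262 | 26.68053 | 4.71 | −0.21374 | 2 |  | S
BnLEA64 | BnaC03g42830D | SMP | C3 | 27572917 | 27573846 | 930 | 254 | 26.29029 | 5.01 | −0.314961 | 2 | SP | S
BnLEA65 | BnaC05g29980D | SMP | C5 | 28994596 | 28995799 | 1204 | 262 | 26.68247 | 4.76 | −0.260687 | 2 | SP | S
BnLEA66 | BnaC05g29930D | SMP | C5 | 28841503 | 28842692 | 1189 | 262 | 26.8617 | 4.71 | −0.275191 | 2 | SP | S
BnLEA67 | BnaA07g12750D | Dehydrin | A7 | 12584077 | 12583156 | 921 | 75 | 8.79009 | 9.4 | −0.618667 | 1 | other | O
BnLEA68 | BnaA09g43150D | Dehydrin | A9 | 29958956 | 29960700 | 1745 | 183 | 19.17891 | 6.67 | −1.020765 | 1 | SP | O
BnLEA69 | BnaA09g31640D | Dehydrin | A9 | 23599068 | 23600339 | 1272 | 136 | 14.04947 | 9.36 | −0.908824 | 1 | SP | S
BnLEA70 | BnaC08g35660D | Dehydrin | C8 | 33390683 | 33391896 | 1213 | 180 | 19.09571 | 6.38 | −1.095556 | 1 | SP | O
BnLEA71 | BnaC08g22390D | Dehydrin | C8 | 25080735 | 25082027 | 1293 | 134 | 13.87329 | 9.19 | −0.823134 | 1 | SP | S
BnLEA72 | BnaA01g19290D | LEA_5 | A1 | 10830022 | 10831119 | 1098 | 153 | 16.8394 | 6.2 | −1.547712 | 1 | other | S
BnLEA73 | BnaC01g23250D | LEA_5 | C1 | 16904412 | 16905363 | 952 | 152 | 16.65512 | 6.02 | −1.559868 | 1 | other | S
BnLEA74 | BnaA04g04540D | LEA_3 | A4 | 3347814 | 3348294 | 481 | 122 | 14.08403 | 8.58 | −0.584426 | 1 | SP | C
BnLEA75 | BnaC04g27000D | LEA_3 | C4 | 28240158 | 28240638 | 481 | 122 | 14.07704 | 6.73 | −0.512295 | 1 | SP | C
BnLEA76 | BnaA02g36510D | LEA_3 | A2-R | 996220 | 997101 | 882 | 93 | 10.09248 | 9.99 | −0.464516 | 1 | SP | C
BnLEA77 | BnaA03g26220D | LEA_3 | A3 | 12840782 | 12841536 | 755 | 94 | 10.09553 | 9.85 | −0.348936 | 1 | SP | C
BnLEA78 | BnaA09g00750D | LEA_3 | A9 | 476938 | 477869 | 932 | 94 | 10.0154 | 9.82 | −0.323404 | 1 | SP | C
BnLEA79 | BnaC02g28140D | LEA_3 | C2 | 26444383 | 26445280 | 898 | 80 | 8.88522 | 9.75 | −0.465 | 1 | SP | 
BnLEA80 | BnaC03g73200D | LEA_3 | C3-R | 1368770 | 1369504 | 735 | 94 | 10.06855 | 9.99 | −0.330851 | 1 | SP | C
BnLEA81 | BnaCnng19220D | LEA_3 | Un-R | 17903980 | 17904866 | 887 | 94 | 9.98334 | 9.82 | −0.298936 | 1 | SP | 
BnLEA82 | BnaA01g18270D | LEA_3 | A1 | 9793769 | 9794619 | 851 | 99 | 10.58201 | 9.72 | −0.4 | 1 | SP | O
BnLEA83 | BnaC01g22200D | LEA_3 | C1 | 15640472 | 15641445 | 974 | 99 | 10.71217 | 9.81 | −0.365657 | 1 |  | O
BnLEA84 | BnaAnng29120D | LEA_3 | Un-R | 38806950 | 38807364 | 414 | 57 | 14.04968 | 5.38 | 0.8298851 | 1 | SP | O
BnLEA85 | BnaA01g10880D | LEA_4 | A1 | 5434921 | 5436380 | 1460 | 253 | 27.8239 | 8.55 | −1.208696 | 2 | SP | S
BnLEA86 | BnaA08g09930D | LEA_4 | A8 | 9347111 | 9348357 | 1247 | 241 | 26.40417 | 5.86 | −1.130705 | 1 | SP | C
BnLEA87 | BnaC03g64470D | LEA_4 | C3 | 53827988 | 53829257 | 1270 | 241 | 26.46122 | 5.56 | −1.134855 | 1 | SP | C
BnLEA88 | BnaCnng20790D | Dehydrin | Un-R | 31264938 | 31265920 | 983 | 245 | 27.81099 | 6.75 | −1.601633 | 1 |  | O
BnLEA89 | BnaA01g05360D | Dehydrin | A1 | 2510784 | 2511384 | 601 | 149 | 15.98646 | 5.5 | −0.9 | 1 | SP | O
BnLEA90 | BnaC01g00220D | Dehydrin | C1 | 66017 | 66890 | 874 | 149 | 15.97744 | 5.89 | −0.90604 | 1 | other | O
BnLEA91 | BnaA10g24180D | LEA_1 | A10 | 15823426 | 15824289 | 864 | 159 | 16.18083 | 8.93 | −0.81761 | 1 | SP | C
BnLEA92 | BnaA03g55670D | LEA_1 | A3-R | 368052 | 368922 | 871 | 159 | 16.22993 | 9.22 | −0.772327 | 1 | other | C
BnLEA93 | BnaC09g48810D | LEA_1 | C9 | 47495664 | 47496551 | 888 | 159 | 16.26397 | 9.3 | −0.81761 | 1 | other | 
BnLEA94 | BnaCnng44320D | LEA_1 | Un-R | 43384275 | 43384918 | 644 | 159 | 16.20089 | 9.43 | −0.784906 | 1 | other | C
BnLEA95 | BnaA06g29020D | SMP | A6 | 19835134 | 19835986 | 853 | 191 | 19.39032 | 4.78 | −0.508901 | 2 |  | 
BnLEA96 | BnaA09g03580D | SMP | A9 | 1813221 | 1814143 | 923 | 191 | 19.58674 | 4.99 | −0.424607 | 2 | SP | S
BnLEA97 | BnaC02g39520D | SMP | C2 | 42399243 | 42400063 | 821 | 184 | 18.88701 | 5.08 | −0.496739 | 2 |  | S
BnLEA98 | BnaC07g27650D | SMP | C7 | 33089319 | 33090177 | 859 | 191 | 19.40434 | 4.76 | −0.496335 | 2 |  | 
BnLEA99 | BnaC09g02960D | SMP | C9 | 1691274 | 1691921 | 648 | 166 | 17.89085 | 5.09 | −0.630723 | 1 | other | O
BnLEA100 | BnaAnng11440D | SMP | Un-R | 12468275 | 12469079 | 805 | 183 | 18.63766 | 4.94 | −0.478689 | 2 | SP | S
BnLEA101 | BnaA02g10350D | SMP | A2 | 5299569 | 5300820 | 1251 | 173 | 18.40747 | 5.7 | −0.372832 | 1 |  | 
BnLEA102 | BnaA03g12360D | SMP | A3 | 5635572 | 5636420 | 848 | 177 | 18.69181 | 5.43 | −0.350847 | 1 |  | C
BnLEA103 | BnaC02g14440D | SMP | C2 | 9880876 | 9882004 | 1128 | 173 | 18.48357 | 5.76 | −0.368786 | 2 |  | 
BnLEA104 | BnaA02g34800D | Dehydrin | A2 | 24758147 | 24759109 | 963 | 190 | 18.62394 | 7.14 | −1.022632 | 1 | SP | S
BnLEA105 | BnaA09g07980D | Dehydrin | A9 | 3870715 | 3871730 | 1016 | 179 | 17.93829 | 7.14 | −1.111173 | 1 | SP | S
BnLEA106 | BnaC02g45160D | Dehydrin | C2-R | 902115 | 903109 | 995 | 178 | 17.65097 | 7.97 | −1.074157 | 2 | SP | S
BnLEA107 | BnaC09g08130D | Dehydrin | C9 | 5162376 | 5163397 | 1022 | 179 | 17.89423 | 7.14 | −1.109497 | 1 | SP | S
BnLEA108 | BnaCnng27850D | Dehydrin | Un-R | 26402274 | 26403289 | 1016 | 179 | 17.93829 | 7.14 | −1.111173 | 1 | SP | S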
Figure 1

Phylogenetic analysis of the B. napus LEA genes.

LEA gene families are distinguished by different colors. The unrooted tree was generated using ClustalW in MEGA6 using the full-length amino acid sequences of the 108 B. napus LEA proteins.

The physicochemical parameters of each BnLEA protein were calculated using ExPASy. Most proteins in the same family have similar parameters. BnLEAs of the dehydrin family contain a greater number of amino acid residues (except for BnLEA15) than the other LEAs. LEA_6 family members all have relatively low molecular masses (Table 1). Approximately two-thirds of the BnLEA proteins have relatively low isoelectric points (pI < 7), including the LEA_2, LEA_5, LEA_6 and SMP families. The remaining proteins, particularly those in the LEA_1 and LEA_3 families, have pI > 7. The grand average of hydropathy (GRAVY) value is defined as the sum of the hydropathy values of all amino acids divided by the protein length (Table 1). LEA_2 proteins are the most hydrophobic, and LEA_5 members are the most hydrophilic; these results are consistent with those of the LEA proteins in Arabidopsis8. Nearly all of the BnLEAs have a GRAVY value <0 and are therefore hydrophilic. Because low hydrophobicity and a large net charge are features of other LEA proteins82829 that allow them to be “completely or partially disordered”, these proteins may form flexible structural elements such as molecular chaperones that contribute to the protection of plants from desiccation3031. TargetP and PProwler were used to predict the subcellular location of the 108 BnLEA proteins; most of the BnLEA proteins were predicted to participate in the secretory pathway (Table S1).
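The GRAVY definition given above (sum of per-residue hydropathy values divided by protein length) can be illustrated with a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale, which is the scale conventionally used for GRAVY; the function name `gravy` is hypothetical, and the example input is the K-segment sequence quoted in the introduction rather than a full BnLEA protein.

```python
# Kyte-Doolittle hydropathy values (assumed scale for GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the grand average of hydropathy of a protein sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    # Sum the hydropathy of every residue, then divide by the length.
    total = sum(KYTE_DOOLITTLE[aa] for aa in residues)
    return total / len(residues)


if __name__ == "__main__":
    # K-segment quoted in the text; a negative value indicates hydrophilicity.
    print(round(gravy("EKKGIMDKIKEKLPG"), 2))
```

A negative result corresponds to the hydrophilic character reported for nearly all BnLEA proteins in Table 1.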

Sequence alignment and phylogenetic analysis of BnLEA genes

To determine the similarity and homology of the BnLEA genes, sequence alignments were performed, and an unrooted phylogenetic tree of the 108 BnLEA genes was constructed (Fig. 1). Gene pairs appear frequently throughout the B. napus genome (Fig. 1). Little similarity was observed among the eight families. The sequences of the BnLEA genes of the LEA_6 family are the most highly conserved (Table S2, Figure S3). By contrast, the BnLEA genes of the LEA_4 family feature only 17.5% consensus positions and almost no identical positions (Table S2, Figure S3). The other families exhibit moderate homology (Table S1, Figure S3). Every family contains conserved regions. The dehydrin, LEA_4, LEA_6 and LEA_1 families each contain three regions of homology. Two such regions were detected in the LEA_2 and LEA_5 families, and four are present in the LEA_3 family (Figure S2). Interestingly, alanine residues are abundant in all LEA families. Lysine and glutamic acid are the second- and third-most abundant residues, respectively, in all BnLEA proteins. Glycine residues are also abundant in the LEA_2, LEA_5, LEA_6, SMP and dehydrin families (Figure S2). These amino acids are also abundant in other LEA proteins and may contribute to the function of LEAs in the protection of many enzymes on the membrane51920. Although the different families exhibit low similarity, they cluster into eight major clades (Fig. 1). As expected, the BnLEA genes of the LEA_3, LEA_2, SMP, LEA_6 and LEA_1 families each cluster into a separate branch. However, BnLEA14, which contains an LEA_4 domain, clusters into another clade, closer to the LEA_5 family (Fig. 1). The BnLEA14 protein may have become more closely related to LEA_5 family proteins over the course of evolution (Fig. 1). The analysis also suggests that although the LEA_1 and LEA_6 families contain different conserved domains, they may have developed a closer relationship during evolution.
Forty sister pairs of genes were identified in the phylogenetic trees with very strong bootstrap support (100%). Another three pairs of genes also had relatively high bootstrap support (90–99%) (Fig. 1). Most of the gene pairs had short branch lengths, suggesting recent divergence (Fig. 1). These findings indicate that gene pairs are relatively common among the 108 LEA genes of B. napus. During evolution, the conserved areas have been preserved, but several variations have also occurred, enabling the division of some genes into subfamilies.
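The residue-abundance observation in this section (alanine, lysine and glutamic acid being the most frequent residues) rests on simple composition counting. The sketch below shows one way such a tally could be made; the function names are hypothetical, and the input is the 11-residue Group 3 motif quoted in the introduction, used only as a stand-in for real BnLEA sequences.

```python
from collections import Counter


def composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each residue in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def top_residues(sequence: str, n: int = 3) -> list[str]:
    """Return the n most frequent residues, most frequent first."""
    return [aa for aa, _ in Counter(sequence.upper()).most_common(n)]


if __name__ == "__main__":
    motif = "TAQAAKEKAGE"  # Group 3 motif quoted in the text
    print(top_residues(motif))
    print(round(composition(motif)["A"], 3))
```

Applied to full-length protein sequences, the same tally would give per-family residue frequencies comparable to those summarized in Figure S2.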

Structural analysis of BnLEA genes

To characterize the structural diversity of the BnLEA genes, the exon-intron organization of each BnLEA gene was analyzed, and some genes from each family were selected for conserved domain and motif analysis. The majority of the LEA genes contain two or three exons, whereas members of the LEA_6 family have only one intron, and 16 BnLEA genes have no introns (Fig. 2). A high proportion of the introns in the BnLEA genes are in phase 0 (located between two complete codons). All members of the LEA_5 family and some members of other families contain phase-1 introns (located between the first and second nucleotides of a codon). Twenty-five phase-2 introns (located between the second and third nucleotides of a codon) were observed. The majority of phase-2 introns were observed in the LEA_3 family (Fig. 2). Most of the closely clustered LEA genes in the same families have similar exon numbers and intron lengths. By examining the exon-intron organization of paralogous pairs of LEA genes that clustered together at the terminal branches of the phylogenetic tree, various exon-intron changes were identified. Six pairs of BnLEA genes exhibit exon-intron gain/loss variations (BnLEA15-BnLEA88, BnLEA16-BnLEA17, BnLEA44-BnLEA45, BnLEA104-BnLEA106, BnLEA103-BnLEA101, BnLEA62-BnLEA64), possibly due to a single intron loss or gain event over the course of evolution32.
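The intron phases described above follow directly from the number of coding bases that precede each intron. A minimal sketch, assuming the coding lengths of the exons are known in transcript order (the function name and the example lengths are hypothetical and not taken from any BnLEA gene model):

```python
def intron_phases(cds_exon_lengths: list[int]) -> list[int]:
    """Return the phase (0, 1 or 2) of each intron in a gene.

    cds_exon_lengths holds the coding length of every exon in order.
    The phase of an intron is the number of coding bases before it,
    modulo 3: 0 means the intron lies between two codons, 1 means it
    lies after the first base of a codon, 2 after the second base.
    """
    phases = []
    coding_bases = 0
    # The last exon is not followed by an intron, so it is skipped.
    for length in cds_exon_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


if __name__ == "__main__":
    # Three exons with 4, 5 and 6 coding bases: two introns.
    print(intron_phases([4, 5, 6]))
```

A single-exon gene yields an empty list, matching the intron-free genes listed with an intron number of 0 in Table 1.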
Figure 2

Exon–intron organization of the BnLEA genes.

Double-sided wedge boxes represent exons, and different colors indicate different LEA gene families. Black lines represent introns, and untranslated regions (UTRs) are indicated by light-gray purple boxes. Numbered marks represent the splicing phases. Phase-0 is not marked. The exon and intron sizes can be estimated using the scale at the bottom.

Because the 108 BnLEA genes did not share high similarity, several typical genes of each family were submitted to MEME for domain or motif structure analysis. Ten motifs were identified as conserved motifs. Motif 1, which was present in every family, encodes a conserved LEA domain, as indicated by the Pfam codes and WebLogo (Fig. 3). Most of the closely related genes in each family exhibit similar motif compositions, suggesting functional similarities within the LEA family. Motif 1 of the LEA_2 family is the longest motif. The LEA_6 family has the lowest number of motifs, only five or six (Fig. 3). These results imply that the composition of the structural motifs varies among different LEA families but is similar within families and that the motifs encoding the LEA domains are conserved.
Figure 3

Motif patterns of different BnLEA families and WebLogo plot of consensus motifs in each BnLEA gene family.

Representative B. napus LEA proteins were selected for alignment, and LEA motifs are shown as motif 1 (light blue box). The lengths of the proteins and motifs can be estimated using the scale at the bottom. The Pfam codes of the LEA motifs of each family are shown.

Chromosomal location and duplication pattern analysis of BnLEA genes

The chromosomal location of the LEA genes was analyzed, and the positions and chromosome locations of 96 BnLEA genes were clearly identified on the 19 chromosomes of B. napus (Table 1, Fig. 4). The number of BnLEA genes varies considerably among the different chromosomes, and chromosomes C3 and A6 contain the greatest (n = 12) and lowest (n = 1) numbers, respectively (Fig. 4). In general, genes belonging to the same family are distributed in different chromosomes to realize full functionality. Interestingly, chromosomes A7, C6 and A8 carry only genes of the dehydrin and LEA_4 families, suggesting that these genes have a tendency to duplicate and evolve more conservatively within one chromosome. High-density LEA gene clusters were identified in certain chromosomal regions, e.g., at the top of chromosomes A9, C2, C4, and C5 and in the middle of chromosome C3 (Fig. 4). Thus, the final chromosomal locations of the LEA genes may be the result of LEA gene duplication patterns.
Figure 4

Distribution of BnLEA gene family members on B. napus chromosomes.

The 96 BnLEA genes for which exact chromosomal information was available in the database were mapped to the 19 B. napus chromosomes. The color of each gene indicates the corresponding family.

Gene family expansion occurs via three mechanisms: tandem duplication, segmental duplication and whole-genome duplication (WGD)33. The progenitor diploid genomes of B. napus are ancient polyploids, and large-scale chromosome rearrangements have occurred since their evolution from a lower chromosome number progenitor34. Duplicated regions of the Arabidopsis genome occur 10 to 14 times within the B. napus genome35. Moreover, chromosomal gene location and homology synteny analyses have revealed that BnLEA genes are closely phylogenetically related to other Brassicaceae species (B. rapa, B. oleracea, Arabidopsis) and that Brassicaceae experienced an extra whole-genome triplication (WGT) event32. Tandem duplications and segmental duplications also played an important role in the model plant Arabidopsis36. We investigated gene duplication events to understand the genome expansion mechanism of the BnLEA gene superfamily in B. napus. Six tandemly duplicated genes (BnLEA43/BnLEA45, BnLEA12/BnLEA13, and BnLEA66/BnLEA65) located on chromosomes C3 and C5 were identified (Fig. 4, Table 1). All 108 BnLEA genes in the Brassica database were reviewed, and the results revealed that nearly two thirds of the BnLEA genes are associated with segmental duplications. Two loci (At1g32560 and At3g22490) have three copies involved in segmental duplications (Table 2). Comparing the distributions of genes around the LEA genes in the genomes of A. thaliana, B. oleracea, B. rapa and B. napus revealed that the synteny of the LEA_1, LEA_2, LEA_3, LEA_4, SMP and dehydrin families is preserved, along with some genes that were lost or duplicated (Figure S4), whereas the synteny of the LEA_5 and LEA_6 families is poor.
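As an illustration of how tandemly duplicated candidates might be flagged from positional data, the sketch below pairs neighbouring genes of the same family that lie close together on one chromosome. This is not the criterion used in the study, which is not stated here: the 200 kb window (`max_gap`) is an arbitrary assumption, the function and type names are hypothetical, and neighbours are judged only among the listed LEA genes rather than against the full genome annotation. Coordinates are the gene start positions from Table 1.

```python
from itertools import groupby
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    family: str
    chrom: str
    start: int


def tandem_candidates(genes: list[Gene], max_gap: int = 200_000) -> list[tuple[str, str]]:
    """Pair adjacent same-family genes on one chromosome within max_gap bp."""
    pairs = []
    ordered = sorted(genes, key=lambda g: (g.chrom, g.start))
    for _, group in groupby(ordered, key=lambda g: g.chrom):
        members = list(group)
        # Compare each gene with its immediate downstream neighbour.
        for left, right in zip(members, members[1:]):
            if left.family == right.family and right.start - left.start <= max_gap:
                pairs.append((left.name, right.name))
    return pairs


GENES = [
    Gene("BnLEA43", "LEA_4", "C3", 13254482),
    Gene("BnLEA45", "LEA_4", "C3", 13259116),
    Gene("BnLEA12", "LEA_1", "C5", 19159326),
    Gene("BnLEA13", "LEA_1", "C5", 19207889),
    Gene("BnLEA66", "SMP", "C5", 28841503),
    Gene("BnLEA65", "SMP", "C5", 28994596),
    Gene("BnLEA57", "LEA_4", "C5", 36601942),
]

if __name__ == "__main__":
    print(tandem_candidates(GENES))
```

With this assumed window, the three pairs reported in the text are recovered from the subset of genes shown; a stricter or looser window would change the result.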
Table 2

Synonymous (Ks) and nonsynonymous (Ka) nucleotide substitution rates for Arabidopsis thaliana and B. napus LEA protein coding loci.

A. thaliana ID | B. napus gene | B. napus ID | LEA family | Ka | Ks | Ka/Ks
One-copy loci
At1g52690 | BnLEA14 | BnaA08g01460D | LEA_4 | 0.7067 | 1.2346 | 0.5724
At1g54410 | BnLEA15 | BnaC03g44340D | Dehydrin | 0.0548 | 0.3803 | 0.1442
At2g03740 | BnLEA23 | BnaC07g22530D | LEA_4 | 0.1925 | 0.3826 | 0.503
At2g03850 | BnLEA24 | BnaC02g35130D | LEA_4 | 0.1862 | 0.4153 | 0.4484
At2g42540 | BnLEA43 | BnaC03g23770D | LEA_4 | 0.1632 | 0.475 | 0.3437
At3g22500 | BnLEA66 | BnaC05g29930D | SMP | 0.0709 | 0.6455 | 0.1098
At4g38410 | BnLEA88 | BnaCnng20790D | Dehydrin | 0.2608 | 0.7055 | 0.3697
At5g53270 | BnLEA103 | BnaC02g14440D | SMP | 0.2038 | 0.688 | 0.2962
Two-copy loci
At1g01470 | BnLEA1 | BnaAnng17910D | LEA_2 | 0.0588 | 0.5582 | 0.1053
At1g01470 | BnLEA2 | BnaCnng23520D | LEA_2 | 0.0683 | 0.5566 | 0.1227
At1g02820 | BnLEA3 | BnaA10g01410D | LEA_3 | 0.1498 | 0.2328 | 0.6434
At1g02820 | BnLEA4 | BnaC05g01450D | LEA_3 | 0.1588 | 0.3306 | 0.4802
At1g03120 | BnLEA5 | BnaA10g01720D | SMP | 0.147 | 0.4696 | 0.3131
At1g03120 | BnLEA6 | BnaC05g01750D | SMP | 0.1408 | 0.4427 | 0.318
At1g20440 | BnLEA7 | BnaC07g15380D | Dehydrin | 0.2334 | 0.6126 | 0.381
At1g20440 | BnLEA8 | BnaAnng29030D | Dehydrin | 0.3024 | 0.7686 | 0.3935
At1g20450 | BnLEA9 | BnaA07g11450D | Dehydrin | 0.1357 | 0.6826 | 0.1988
At1g20450 | BnLEA10 | BnaC05g15780D | Dehydrin | 0.1605 | 0.7191 | 0.2232
At1g72100 | BnLEA16 | BnaA02g15750D | LEA_4 | 0.1119 | 0.6431 | 0.1739
At1g72100 | BnLEA17 | BnaC02g21020D | LEA_4 | 0.11 | 0.6099 | 0.1803
At2g18340 | BnLEA25 | BnaA08g15290D | LEA_4 | 0.3728 | 3.4912 | 0.1068
At2g18340 | BnLEA26 | BnaC07g03410D | LEA_4 | 0.2374 | 1.2281 | 0.1933
At2g23120 | BnLEA27 | BnaC08g34610D | LEA_6 | 0.2586 | 0.6854 | 0.3772
At2g23120 | BnLEA28 | BnaA09g42180D | LEA_6 | 0.2666 | 0.7245 | 0.368
At2g35300 | BnLEA32 | BnaC04g09820D | LEA_1 | 0.0817 | 0.4368 | 0.1871
At2g35300 | BnLEA33 | BnaA05g08680D | LEA_1 | 0.0795 | 0.4442 | 0.179
At2g42560 | BnLEA44 | BnaC04g48420D | LEA_4 | 0.369 | 1.0387 | 0.3553
At2g42560 | BnLEA45 | BnaC03g23780D | LEA_4 | 0.4797 | 0.9653 | 0.4969
At3g51810 | BnLEA72 | BnaA01g19290D | LEA_5 | 0.0435 | 0.3313 | 0.1312
At3g51810 | BnLEA73 | BnaC01g23250D | LEA_5 | 0.0316 | 0.3791 | 0.0834
At3g53770 | BnLEA74 | BnaA04g04540D | LEA_3 | 0.1791 | 0.617 | 0.2903
At3g53770 | BnLEA75 | BnaC04g27000D | LEA_3 | 0.1745 | 0.5878 | 0.297
At4g39130 | BnLEA89 | BnaA01g05360D | Dehydrin | 0.174 | 0.5601 | 0.3107
At4g39130 | BnLEA90 | BnaC01g00220D | Dehydrin | 0.177 | 0.4855 | 0.3647
At5g53260 | BnLEA101 | BnaA02g10350D | SMP | 0.1625 | 0.6361 | 0.2555
At5g53260 | BnLEA102 | BnaA03g12360D | SMP | 0.1581 | 0.6627 | 0.2386
Three-copy loci
At1g32560 | BnLEA11 | BnaA09g24240D | LEA_1 | 0.1466 | 0.6935 | 0.2114
At1g32560 | BnLEA12 | BnaC05g24660D | LEA_1 | 0.1548 | 0.6626 | 0.2337
At1g32560 | BnLEA13 | BnaC05g24760D | LEA_1 | 0.1548 | 0.6626 | 0.2337
At2g33690 | BnLEA29 | BnaA04g19670D | LEA_6 | 0.1035 | 0.4756 | 0.2177
At2g33690 | BnLEA30 | BnaC03g18750D | LEA_6 | 0.1012 | 0.3467 | 0.292
At2g33690 | BnLEA31 | BnaC04g44060D | LEA_6 | 0.1083 | 0.2579 | 0.4199
At2g44060 | BnLEA46 | BnaA03g20620D | LEA_2 | 0.0282 | 0.744 | 0.0379
At2g44060 | BnLEA47 | BnaC03g24650D | LEA_2 | 0.0308 | 0.7655 | 0.0403
At2g44060 | BnLEA48 | BnaC04g49420D | LEA_2 | 0.0347 | 1.9341 | 0.0179
At4g15910 | BnLEA82 | BnaA01g18270D | LEA_3 | 0.0722 | 0.4218 | 0.1712
At4g15910 | BnLEA83 | BnaC01g22200D | LEA_3 | 0.0981 | 0.4633 | 0.2117
At4g15910 | BnLEA84 | BnaAnng29120D | LEA_3 | 0.2112 | 0.5351 | 0.3947
At4g21020 | BnLEA85 | BnaA01g10880D | LEA_4 | 0.1763 | 0.853 | 0.2067
At4g21020 | BnLEA86 | BnaA08g09930D | LEA_4 | 0.1087 | 0.684 | 0.1576
At4g21020 | BnLEA87 | BnaC03g64470D | LEA_4 | 0.1155 | 0.6272 | 0.1841
Four-copy loci
At2g36640 | BnLEA34 | BnaA04g05010D | LEA_4 | 0.2487 | 2.1063 | 0.1181
At2g36640 | BnLEA35 | BnaA05g07720D | LEA_4 | 0.1621 | 0.7767 | 0.2088
At2g36640 | BnLEA36 | BnaC04g08680D | LEA_4 | 0.1667 | 0.813 | 0.205
At2g36640 | BnLEA37 | BnaCnng32920D | LEA_4 | 0.2532 | 2.0584 | 0.123
At3g15670 | BnLEA54 | BnaA01g28600D | LEA_4 | 0.069 | 0.5519 | 0.1251
At3g15670 | BnLEA55 | BnaA05g23860D | LEA_4 | 0.057 | 0.4961 | 0.1148
At3g15670 | BnLEA56 | BnaC03g39230D | LEA_4 | 0.0911 | 0.6166 | 0.1477
At3g15670 | BnLEA57 | BnaC05g37670D | LEA_4 | 0.0691 | 0.4481 | 0.1542
At3g17520 | BnLEA58 | BnaA03g34560D | LEA_4 | 0.1528 | 0.6935 | 0.2203
At3g17520 | BnLEA59 | BnaC03g40050D | LEA_4 | 0.1613 | 0.7164 | 0.2252
At3g17520 | BnLEA60 | BnaC05g35990D | LEA_4 | 0.1158 | 0.6424 | 0.1802
At3g17520 | BnLEA61 | BnaAnng35040D | LEA_4 | 0.1353 | 0.6593 | 0.2052
At3g22490 | BnLEA62 | BnaA03g36660D | SMP | 0.2646 | 0.8118 | 0.3259
At3g22490 | BnLEA63 | BnaA05g17150D | SMP | 0.0539 | 0.5809 | 0.0928
At3g22490 | BnLEA64 | BnaC03g42830D | SMP | 0.1226 | 0.5713 | 0.2146
At3g22490 | BnLEA65 | BnaC05g29980D | SMP | 0.0518 | 0.6093 | 0.085
At5g06760 | BnLEA91 | BnaA10g24180D | LEA_1 | 0.0781 | 0.4787 | 0.1632
At5g06760 | BnLEA92 | BnaA03g55670D | LEA_1 | 0.0814 | 0.3857 | 0.2111
At5g06760 | BnLEA93 | BnaC09g48810D | LEA_1 | 0.086 | 0.4548 | 0.1891
At5g06760 | BnLEA94 | BnaCnng44320D | LEA_1 | 0.0735 | 0.3518 | 0.2089
Five-copy loci
At1g76180 | BnLEA18 | BnaA02g36030D | Dehydrin | 0.1486 | 0.8539 | 0.1741
At1g76180 | BnLEA19 | BnaA07g32420D | Dehydrin | 0.1058 | 0.6962 | 0.152
At1g76180 | BnLEA20 | BnaA07g21490D | Dehydrin | 0.1351 | 0.5254 | 0.2571
At1g76180 | BnLEA21 | BnaC06g21970D | Dehydrin | 0.2061 | 0.5737 | 0.3592
At1g76180 | BnLEA22 | BnaC06g36880D | Dehydrin | 0.1059 | 0.7429 | 0.1425
At2g40170 | BnLEA38 | BnaA03g18930D | LEA_5 | 0.1214 | 0.5316 | 0.2283
At2g40170 | BnLEA39 | BnaA04g22520D | LEA_5 | 0.1706 | 0.5374 | 0.3174
At2g40170 | BnLEA40 | BnaA05g34530D | LEA_5 | 0.0701 | 0.4639 | 0.1512
At2g40170 | BnLEA41 | BnaC03g22490D | LEA_5 | 0.1237 | 0.5491 | 0.2253
At2g40170 | BnLEA42 | BnaCnng27950D | LEA_5 | 0.0699 | 0.4718 | 0.1481
At2g46140 | BnLEA49 | BnaA03g21280D | LEA_2 | 0.0615 | 0.2822 | 0.218
At2g46140 | BnLEA50 | BnaA05g01660D | LEA_2 | 0.051 | 0.2381 | 0.214
At2g46140 | BnLEA51 | BnaC03g25640D | LEA_2 | 0.0615 | 0.3077 | 0.2
At2g46140 | BnLEA52 | BnaC04g00870D | LEA_2 | 0.0539 | 0.2264 | 0.2381
At2g46140 | BnLEA53 | BnaC04g50740D | LEA_2 | 0.071 | 0.3512 | 0.2022
At3g50980 | BnLEA67 | BnaA07g12750D | Dehydrin | 0.8386 | 1.8538 | 0.4524
At3g50980 | BnLEA68 | BnaA09g43150D | Dehydrin | 0.6922 | 2.0064 | 0.345
At3g50980 | BnLEA69 | BnaA09g31640D | Dehydrin | 0.1638 | 0.5798 | 0.2825
At3g50980 | BnLEA70 | BnaC08g35660D | Dehydrin | 0.7028 | 1.8344 | 0.3831
At3g50980 | BnLEA71 | BnaC08g22390D | Dehydrin | 0.1687 | 0.5994 | 0.2814
At5g66400 | BnLEA104 | BnaA02g34800D | Dehydrin | 0.1177 | 0.7369 | 0.1597
At5g66400 | BnLEA105 | BnaA09g07980D | Dehydrin | 0.1106 | 0.6754 | 0.1638
At5g66400 | BnLEA106 | BnaC02g45160D | Dehydrin | 0.106 | 0.7618 | 0.1391
At5g66400 | BnLEA107 | BnaC09g08130D | Dehydrin | 0.1088 | 0.685 | 0.1588
At5g66400 | BnLEA108 | BnaCnng27850D | Dehydrin | 0.1106 | 0.6754 | 0.1638
Six-copy loci
At4g02380 | BnLEA76 | BnaA02g36510D | LEA_3 | 0.1184 | 0.3385 | 0.3497
At4g02380 | BnLEA77 | BnaA03g26220D | LEA_3 | 0.0637 | 0.1885 | 0.3382
At4g02380 | BnLEA78 | BnaA09g00750D | LEA_3 | 0.1519 | 0.3104 | 0.4894
At4g02380 | BnLEA79 | BnaC02g28140D | LEA_3 | 0.1249 | 0.3385 | 0.369
At4g02380 | BnLEA80 | BnaC03g73200D | LEA_3 | 0.0519 | 0.1865 | 0.2784
At4g02380 | BnLEA81 | BnaCnng19220D | LEA_3 | 0.1592 | 0.3073 | 0.518
At5g27980 | BnLEA95 | BnaA06g29020D | SMP | 0.1238 | 0.5246 | 0.236
At5g27980 | BnLEA96 | BnaA09g03580D | SMP | 0.1225 | 0.6594 | 0.1858
At5g27980 | BnLEA97 | BnaC02g39520D | SMP | 0.1027 | 0.5201 | 0.1974
At5g27980 | BnLEA98 | BnaC07g27650D | SMP | 0.1172 | 0.5252 | 0.2231
At5g27980 | BnLEA99 | BnaC09g02960D | SMP | 0.4517 | 0.8193 | 0.5513
At5g27980 | BnLEA100 | BnaAnng11440D | SMP | 0.1217 | 0.6119 | 0.1988
Furthermore, the synteny maps of the genes in the clusters located on chromosomes A9 and C4 revealed the process of gene expansion and clustering (Fig. 5). On chromosome C4, genes from two A. thaliana chromosomes (chromosomes 2 and 3) were linked to B. oleracea genes and were accompanied by gene expansion in B. napus (Fig. 5A). Nearly half of the analyzed genes contained crossovers. On chromosome A9, the homologous A. thaliana LEA genes are distributed across all five Arabidopsis chromosomes. The clustering process from A. thaliana to B. rapa is more obvious (Fig. 5B), and all groups of genes linked to B. napus contain crossovers, suggesting that the crossover events occurred during allopolyploidization. The BnLEA gene clusters likely formed via the duplication of an ancestral gene during the WGD event, followed by tandem duplication and segmental duplication in the clusters. In the cluster on chromosome C5, BnLEA66/BnLEA65 and BnLEA12/BnLEA13 are tandemly duplicated genes, although phylogenetic analysis grouped BnLEA63 and BnLEA11 together with these genes, respectively, indicating that these genes might have descended from a common ancestor (Fig. 6A). Moreover, in the gene cluster, BnLEA4 and BnLEA6 are associated with segmental duplication because they exhibit synteny relationships with BnLEA3 and BnLEA5, respectively. Phylogenetic analysis also demonstrated that BnLEA3/BnLEA4 and BnLEA5/BnLEA6 are pairs of homologous genes, suggesting that the four genes might have descended from two ancestors. Interestingly, BnLEA3 and BnLEA5 are located in close proximity on chromosome A10, which implies that segmental duplication also played a role in LEA gene cluster formation (Fig. 6B).
Figure 5

Synteny analysis map of gene clusters in B. napus chromosomes.

(A) Genes located on B. napus chromosome C4 are syntenic with genes of B. oleracea and A. thaliana. (B) Genes located on B. napus chromosome A9 are syntenic with genes of B. rapa and A. thaliana. The different colors of the gene IDs indicate their individual LEA families (brown: dehydrin; light green: LEA_4; bottle green: LEA_3; red: LEA_1; purple: LEA_6; blue: SMP).

Figure 6

Phylogenetic relationships and hypothetical evolutionary process of BnLEA gene clustering on B. napus chromosome C5.

(A) Phylogenetic relationships of selected BnLEA genes in the cluster. (B) Hypothetical mechanism of BnLEA gene cluster formation. The letters T, S, and W in the schematic diagram of the hypothetical origins of BnLEA genes indicate putative tandem duplication, segmental duplication and whole-genome duplication, respectively.

Synonymous (Ks) and nonsynonymous (Ka) substitution rates were used to explore the selective pressure on duplicated BnLEA genes. In general, a Ka/Ks ratio greater than 1 indicates positive selection, a ratio less than 1 indicates functional constraint (purifying selection), and a ratio equal to 1 indicates neutral evolution37. The orthologous LEA gene pairs between the B. napus and A. thaliana genomes were used to estimate Ka, Ks, and Ka/Ks (Table 2). The results revealed that most of the BnLEA genes have Ka/Ks ratios greater than 0.1; the lowest ratio is only 0.0179 (BnLEA48), and the highest is 0.6434 (BnLEA3). Because even the highest ratio is below 1, all of these genes appear to be under functional constraint. The genes of the LEA_3 and LEA_6 families exhibit relatively high Ka/Ks ratios, whereas the LEA_2 and LEA_5 gene families have lower Ka/Ks ratios. The Ka/Ks ratios of the other families are 0.2–0.3. These findings indicate that LEA_2 and LEA_5 genes might preferentially conserve function and structure under selective pressure.
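
The interpretation rule for Ka/Ks can be made concrete with a few rows of Table 2. The following is a minimal sketch, not the authors' code: the function names and the tolerance used to decide that a ratio "equals" 1 are assumptions introduced here for illustration. The (Ka, Ks) pairs are copied from Table 2. Because the tabulated values are rounded, the recomputed ratios can differ from the published column in the last digit.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return Ka/Ks; the ratio is undefined when Ks is not positive."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


def selection_regime(ratio: float, tol: float = 1e-6) -> str:
    """Map a Ka/Ks ratio onto the three categories described in the text.

    `tol` is an illustrative assumption: ratios within `tol` of 1 are
    treated as neutral.
    """
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


# (Ka, Ks) pairs copied from Table 2.
pairs = {
    "BnLEA34": (0.2487, 2.1063),
    "BnLEA81": (0.1592, 0.3073),
    "BnLEA99": (0.4517, 0.8193),
}

results = {gene: ka_ks_ratio(ka, ks) for gene, (ka, ks) in pairs.items()}
for gene, ratio in results.items():
    print(f"{gene}: Ka/Ks = {ratio:.4f} ({selection_regime(ratio)})")
```

All three example genes fall below 1, so the sketch labels each of them as being under purifying selection.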

Expression profile analysis of BnLEA genes in different tissues

To investigate the expression patterns of LEA genes in B. napus, qRT-PCR analysis of the BnLEA genes was performed. The results indicated that BnLEA transcript accumulation differed among tissues and that the expression pattern also differed among the LEA gene families (Fig. 7). Paired genes generally showed similar expression patterns. Further analysis revealed that the expression of more than two-thirds of the BnLEAs was increased in leaves, especially BnLEA91 and BnLEA43. Compared with early developmental stage seeds (19 weeks after seeding), late developmental stage seeds (40 weeks after seeding) showed much higher expression levels of BnLEAs, for example, BnLEA93 and BnLEA34. Leaves are sensitive to environmental stress: they wilt or die under stress conditions, which affects photosynthesis in plants1. Late developmental stage seeds frequently undergo dehydration, and the high expression of BnLEA genes observed here in late developmental stage seeds is consistent with the reported function of LEA proteins7. Interestingly, some phylogenetic gene pairs have different expression patterns (BnLEA7/BnLEA9, BnLEA25/BnLEA26, BnLEA60/BnLEA61, BnLEA91/BnLEA93). This result suggests that even genes with a close phylogenetic relationship may develop different biological functions.
Figure 7

Hierarchical clustering of the expression profiles of BnLEA genes in different tissues.

The log-transformed values of the relative expression levels of BnLEA genes were used for hierarchical cluster analysis (original data shown in Table S4). The color scale represents relative expression levels, with increased transcript levels in yellow and decreased transcript levels in purple. Early-stage seeds were collected 19 weeks after seeding, and late-stage seeds were collected 40 weeks after seeding.

Discussion

The LEA gene family has been reported in many crops, such as rice and maize7. However, the genome-wide identification and annotation of LEA genes has not been reported in B. napus. In this study, 108 LEA family genes were identified in B. napus. The BnLEA gene family is larger than the LEA families of related crucifer plants, such as B. oleracea (40 LEA genes), B. rapa (66 LEA genes) and A. thaliana (51 LEA genes). B. napus originated from the hybridization of B. oleracea and B. rapa, and its assembled genome size is larger than that of B. oleracea (540 Mb) and B. rapa (312 Mb)26. The preservation of LEA genes during a polyploidy event suggests that these genes play important roles in plant development7242728.

In general, genes that respond to stress contain fewer introns28. Consistent with this observation, 92 of the 108 BnLEA genes have no more than two introns. Low intron numbers have also been observed in other stress-response gene families, such as the trehalose-6-phosphate synthase gene family38. Introns can have a deleterious effect on gene expression by delaying transcript production. Moreover, introns can extend the length of the nascent transcript, resulting in an additional expense for transcription39.

The motif numbers and composition of each family vary, although some amino acid-rich regions were detected, similar to the Gly-rich region in Arabidopsis LEA_2 proteins, and the most-conserved motif is rich in lysine (K) residues8. The amino acid composition of the LEA proteins suggests a disordered structure along their sequences540. Although LEA proteins are relatively small and intrinsically unstructured, they play important roles in cells30, likely by forming flexible, residual structural elements30, such as α-helical structures and polyproline II (PII) helices41. These elements contribute to structural flexibility and thus enable proteins to bind DNA, RNA, and proteins as interaction partners31.
Conformational changes may facilitate interactions between LEA proteins and other macromolecules, such as membrane proteins, to maintain cell stability42. These results demonstrate that LEA proteins feature unique conserved amino acid-rich regions and an unstructured form that allow them to function as flexible interactors to protect other molecules under stress conditions83031.

Gene duplication not only expands genome content but also diversifies gene function to ensure optimal adaptability and evolution of plants33. Brassica species have undergone WGD events during their evolution32, and B. napus was formed by allopolyploidy26. Several independent lineage-specific WGD events have been identified in Brassicaceae3543. In this study, only 6 tandemly duplicated genes were identified, in contrast to the LEA genes in Prunus mume (tandem duplication = 40%), perhaps because WGD did not occur in this species29. Most BnLEA genes showed a close relationship with respect to the block locations of Arabidopsis LEA genes. Phylogenetic and homology analyses suggested that WGD contributed to BnLEA gene family expansion. WGD has also been observed in the LEA family of another Brassicaceae species (Arabidopsis)21. The Arabidopsis genome contains 51 LEA gene family members8; therefore, a WGT event would be expected to produce more than 150 LEA genes in the B. rapa or B. oleracea genome, ultimately leading to even more LEA genes in B. napus. However, only 108 genes remain in the B. napus genome. This finding implies that more than 50% of duplicated LEA genes were lost after WGT, likely due to extensive chromosome reshuffling during rediploidization after WGT. In fact, natural selection drove the rediploidization process via chromosomal rearrangement, thus removing extra homologous chromosomes, and further rounds of genomic reshuffling of the rediploid ancestor occurred at different evolutionary time points to create the different species of Brassica32.
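
The retention argument above is simple arithmetic and can be checked directly. The sketch below is illustrative only: it takes a naive no-loss baseline in which every Arabidopsis LEA locus is tripled by the WGT, and it assumes (this is an assumption made here, not a statement from the study) that the B. napus baseline is the sum of the two progenitor baselines. The observed gene counts are those quoted in this Discussion.

```python
AT_LEA_GENES = 51   # LEA genes in A. thaliana (from the text)
WGT_COPIES = 3      # a whole-genome triplication triples each locus

# Observed LEA gene counts quoted in the Discussion.
observed = {"B. rapa": 66, "B. oleracea": 40, "B. napus": 108}

# Naive expectation if no gene had been lost after the WGT.
expected_diploid = AT_LEA_GENES * WGT_COPIES
expected = {
    "B. rapa": expected_diploid,
    "B. oleracea": expected_diploid,
    # Assumption: the allotetraploid baseline is the A plus C subgenomes.
    "B. napus": 2 * expected_diploid,
}

# Fraction of the naive expectation that is no longer present.
loss = {species: 1 - observed[species] / expected[species] for species in observed}

for species, fraction in loss.items():
    print(f"{species}: expected {expected[species]}, "
          f"observed {observed[species]}, lost {fraction:.1%}")
```

Under these assumptions each of the three genomes retains fewer than half of the copies predicted by the no-loss baseline, which is in line with the "more than 50%" statement in the text.
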
The number of LEA genes was possibly sufficient for Brassica during the long natural selection process, and thus some duplicated LEA genes were not retained in the B. napus genome. Similar deletions or losses of genes after WGT have been observed in the NBS-encoding genes of Brassica species44.

Segmental duplication also plays a role in BnLEA superfamily expansion, and 72 BnLEA genes were determined to have one or two close relatives in the corresponding duplicated regions. Therefore, approximately 67% (72/108) of the BnLEA genes can be accounted for by segmental duplication. This finding is similar to observations of the LEA gene family of Arabidopsis (12 pairs of 51 genes)8. Synteny analysis demonstrated that most LEA gene family members are located in well-conserved syntenic regions, although some genes were deleted or gained. These findings indicate that some genes might have been translocated into a non-syntenic region. Similar collinear genomic regions with some deleted genes have been identified in other gene families44. These present and previous findings suggest that segmental duplication and WGD likely played an important role in the expansion of the LEA family in B. napus, even though some genes were lost after WGT. Tandem duplication was also identified but played only a minor role.

As discussed above, WGD and segmental duplication may be the main mechanisms underlying the expansion of the B. napus LEA gene family. During duplication, mutational targets may increase, and some genes are convergently restored to single-copy status45. In this study, genes of the A genome from B. rapa and the C genome from B. oleracea exhibited greater homology to B. napus genes than to A. thaliana genes. A clustering phenomenon was also observed, accompanied by the loss or gain of some genes. Gene clustering has also been observed in the LEA gene families of other species28.
Synteny analysis of LEA gene clustering on C4 and A9 revealed that some translocation and inversion events occurred during the evolution of A. thaliana, B. rapa, B. oleracea and B. napus. These events may have been the result of chromosomal rearrangement during the evolution of Brassica32. The formation of LEA gene clusters might have been affected by the subgenome dominance effect, resulting in one subgenome that retained more genes via gene fractionation after WGT3246. This hypothetical BnLEA gene clustering mechanism is similar to that identified in the LEA family of Populus28. First, the WGD event promoted genomic reshuffling accompanied by chromosome reduction, which contributed to the speciation of diploid Brassica plants. Genomic differentiation of the three basic genomes then generated the stable allotetraploid species B. napus. Second, after WGD, biased gene retention via gene fractionation and multicopy gene appearance promoted the gene-level evolution of Brassica species263246. This proposed mechanism of BnLEA gene cluster formation reflects both the WGD effect during the evolution of Brassica species and other duplications after WGD that resulted in the abundant morphotypes and genotypes of Brassica species.

Genomic comparison is a rapid means of obtaining knowledge about less-studied taxa35. Many studies have revealed that LEA genes contribute to abiotic stress tolerance, particularly to drought stress716. Given the expression patterns of BnLEA genes in different tissues, it would be interesting to functionally characterize these genes in B. napus. Many BnLEA genes showed higher expression levels in the same tissues (leaves and late developmental stage seeds), which indicates functional conservation within this gene family. Some BnLEA genes were more abundant in other tissues, which points toward functional differences. Similar expression patterns have also been observed in other gene families of Brassica species44.
Therefore, existing knowledge of the function of LEA genes may explain why the LEA gene family expanded in terrestrial plants but not in algae: algae are rarely exposed to drought stress27. LEA families of closely related taxa generally exhibit similar sizes and distributions. However, the sizes of the LEA gene family differ between maize and rice. Due to variations in the evolutionary rates of the whole genomes of grasses, which are subject to broad changes in environmental conditions, maize exhibits divergence signals that are associated with directionally selected traits and are functionally related to stress responses. These results suggest that stress adaptation in maize might have involved the evolution of protein-coding sequences47. Additionally, these evolutionary changes probably led to the observed differences in the LEA gene families of maize and rice.

In Brassicaceae, BrLEAs, BoLEAs and BnLEAs are homologous to AtLEAs. In the A. thaliana genome, chromosomes are divided into 24 blocks48. The chromosomal locations of the BnLEA genes exhibit a genomic block distribution similar to that of the LEA genes in A. thaliana21. B. napus inherited most of its genes in the homologous genomic blocks of B. oleracea and B. rapa. The chromosome evolution of Brassica plants involved these genomic blocks32. The WGT events promoted gene-level and genomic evolution, thus contributing to the diversification of Brassica plants3248. The evolution of LEA gene families in Brassicaceae is part of the long evolutionary history of Brassicaceae species.

In conclusion, a total of 108 LEA genes were identified in B. napus and classified into eight groups. Chromosomal mapping and synteny analysis revealed that the 108 BnLEA genes are distributed across all B. napus chromosomes, with some gene clustering. Segmental duplication and WGD were identified as the main patterns of LEA gene expansion in B. napus. The BnLEA genes all contain the LEA motif and have few introns. Genes belonging to the same family exhibit similar gene structures, consistent with their Ka/Ks ratios. This study increases our understanding of LEA genes in B. napus and lays the foundation for further investigations of the functions of these LEA proteins in oilseed rape.

Methods

Identification of LEA family genes in B. napus and other species

LEA genes were identified in B. napus based on homology with the 51 LEA protein sequences from Arabidopsis8 using the BLAT search program in the CNS-Genoscope database (http://www.genoscope.cns.fr/brassicanapus/)26. Redundant sequences were removed manually. All BnLEA gene candidates were analyzed using the Hidden Markov Model of the Pfam database (http://pfam.sanger.ac.uk/search)18, SMART database (http://smart.embl-heidelberg.de/)49, and NCBI Conserved Domain Search database (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi)50 to confirm that each gene was a member of the LEA family. Using the Pfam nomenclature, the LEA gene family of B. napus was divided into eight groups: LEA_1 to LEA_6, SMP and dehydrin. A unique name consisting of two italic letters denoting the source organism, the family name, and a subfamily numeral was assigned to each LEA gene (e.g., BnLEA1). To trace the evolutionary origin of the LEA gene family in plants, LEAs were identified in other plant species using Phytozome (http://phytozome.jgi.doe.gov/pz/portal.html)828, including Oryza sativa, Zea mays, Gossypium hirsutum, Glycine max, Arabidopsis thaliana, Brassica rapa, Brassica oleracea, Selaginella moellendorffii, Physcomitrella patens, Thalassiosira pseudonana, Vitis vinifera, Populus trichocarpa and Setaria italica. Finally, sixteen species were chosen, including three green algae, a moss, two lycophytes, three gramineae, four cruciferae, grape, populus, cotton and soybean. The number of amino acids, CDS lengths and chromosome locations of the BnLEA genes were obtained from the B. napus database. The physicochemical parameters, including molecular weight (kDa) and pI, of each BnLEA protein were calculated using the compute pI/Mw tool of ExPASy (http://www.expasy.org/tools/). GRAVY (grand average of hydropathy) values were calculated using the PROTPARAM tool (http://web.expasy.org/protparam/)51. 
Subcellular location prediction was conducted using the TargetP1.1 (http://www.cbs.dtu.dk/services/TargetP/) server52 and Protein Prowler Subcellular Localisation Predictor version 1.2 (http://bioinf.scmb.uq.edu.au/pprowler_webapp_1-2/)53.
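
For readers who want to see what a GRAVY value is, the calculation is the mean Kyte-Doolittle hydropathy over all residues of a protein. The sketch below is a stand-alone illustration and not the ProtParam implementation. The function name and the toy peptide are hypothetical; the peptide is simply a repeat of alanine, lysine and glutamic acid, the residues reported as abundant in BnLEA proteins.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean hydropathy over all residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Hypothetical toy peptide, not a real BnLEA sequence.
toy_peptide = "AKEAKEAKE"
toy_gravy = gravy(toy_peptide)
print(f"GRAVY({toy_peptide}) = {toy_gravy:.3f}")
```

A negative GRAVY value, as obtained for this toy peptide, indicates an overall hydrophilic sequence.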

Multiple alignment and phylogenetic analysis of BnLEA family genes

Multiple sequence alignment of all predicted BnLEA protein sequences was performed using ClustalW software. An unrooted phylogenetic tree of the 108 full-length LEA protein sequences was constructed using MEGA 6 with the Neighbor Joining (NJ) method, and bootstrap analysis was conducted using 1,000 replicates5455.

Gene structure analysis of BnLEA family genes

The exon-intron structures of the BnLEA family genes were determined based on alignments of their coding sequences with the corresponding genomic sequences, and a diagram was obtained using the Gene Structure Display Server (GSDS; http://gsds.cbi.pku.edu.cn/)56. MEME (Multiple Expectation Maximization for Motif Elicitation) (http://alternate.meme-suite.org/) was used to identify the conserved motif structures encoded by the BnLEA family genes57. In addition, each structural motif was annotated using the Pfam (http://pfam.sanger.ac.uk/search)18 and SMART (http://smart.embl-heidelberg.de/)49 tools. To confirm the gene structures, all 108 BnLEA gene sequences were queried against published transcriptome RNA-seq data from B. napus in the NCBI database using BLAST (all gene sequences were consistent with transcriptome data sets ERX515977, ERX515976, ERX515975, ERX515974, or ERX397800)2658.

Chromosomal location and gene duplication of BnLEA family genes

The chromosomal locations of the BnLEA genes were determined based on the positional information obtained from the B. napus database. Tandemly duplicated LEA genes were defined as homologous LEA genes located adjacent to each other on a B. napus chromosome or within a sequence distance of 50 kb44. The synteny relationships between the BnLEAs and A. thaliana LEAs, B. rapa LEAs, and B. oleracea LEAs were evaluated using the search syntenic genes tool in BRAD (http://brassicadb.org/brad/)46 and synteny tools of the B. napus Genome Browser (http://www.genoscope.cns.fr/brassicanapus/cgi-bin/gbrowse_syn/colza/)26.
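
The 50 kb distance criterion for tandem duplicates can be sketched as a simple coordinate check. This is an illustrative simplification rather than the procedure actually used: it implements only the distance rule, uses membership in the same LEA family as a stand-in for homology, and runs on made-up gene records. The class, function, gene names and coordinates are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations

MAX_GAP_BP = 50_000  # 50 kb threshold from the text


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int
    family: str


def tandem_pairs(genes, max_gap=MAX_GAP_BP):
    """Return same-family gene pairs on one chromosome within max_gap bp."""
    ordered = sorted(genes, key=lambda g: (g.chrom, g.start))
    pairs = []
    for first, second in combinations(ordered, 2):
        if first.chrom != second.chrom or first.family != second.family:
            continue
        # Genes are sorted by start, so this is the intergenic distance
        # (negative if the two gene models overlap).
        gap = second.start - first.end
        if gap <= max_gap:
            pairs.append((first.name, second.name))
    return pairs


# Hypothetical records; names and coordinates are invented for illustration.
genes = [
    Gene("geneA", "chrC05", 100_000, 101_000, "SMP"),
    Gene("geneB", "chrC05", 120_000, 121_500, "SMP"),
    Gene("geneC", "chrC05", 400_000, 401_000, "SMP"),
    Gene("geneD", "chrA09", 110_000, 111_000, "SMP"),
]

candidates = tandem_pairs(genes)
print(candidates)
```

Only geneA and geneB are reported here, because they share a chromosome and a family and lie 19 kb apart; geneC is too distant and geneD is on another chromosome.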

Calculation of the Ka/Ks values of BnLEA family genes

The LEA gene sequences of each paralogous pair were first aligned using ClustalW. The files containing the multiple sequence alignments of the LEA gene sequences were then converted to a PHYLIP alignment using MEGA 6. Finally, the converted sequence alignments were imported into the YN00 program of PAML to calculate synonymous and non-synonymous substitution rates59.
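
The format-conversion step in this workflow (aligned sequences to a PHYLIP file that can be passed to YN00) can be sketched in a few lines. The study performed this conversion in MEGA 6; the function below is only an illustration of what such a file looks like. The layout chosen here (names padded to 10 characters and followed by two spaces) is an assumption about what downstream tools tolerate, so the PAML documentation should be consulted for the exact requirements. The two sequences and their names are hypothetical.

```python
def to_phylip(alignment: dict[str, str]) -> str:
    """Write a codon alignment as sequential PHYLIP text.

    `alignment` maps sequence names to aligned nucleotide strings.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    (length,) = lengths
    if length % 3 != 0:
        raise ValueError("a codon alignment must have a length divisible by 3")

    # Header line: number of sequences and alignment length.
    lines = [f" {len(alignment)} {length}"]
    for name, seq in alignment.items():
        # Assumed layout: name padded to 10 characters, then two spaces.
        lines.append(f"{name[:10]:<10}  {seq.upper()}")
    return "\n".join(lines) + "\n"


# Hypothetical two-sequence codon alignment (names and bases are invented).
pair = {"BnLEA_x": "ATGGCCAAG", "AtLEA_y": "ATGGCAAAG"}
phylip_text = to_phylip(pair)
print(phylip_text)
```

The length check reflects the fact that Ka and Ks are codon-based quantities, so the alignment passed on should preserve the reading frame.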

RNA extraction and qRT-PCR analysis

An RNAprep Pure Plant Kit (Tiangen) was used to isolate total RNA from each frozen sample, and first-strand cDNA was synthesized from the RNA using a PrimeScript™ RT Master Mix Kit (TaKaRa) according to the manufacturer’s instructions. Gene-specific primers were designed using Primer5.0 (Table S3). Each reaction was carried out in triplicate in a reaction volume of 20 μl containing 1.6 μl of gene-specific primers (1.0 μM), 1.0 μl of cDNA, 10 μl of SYBR Green (TaKaRa), and 7.4 μl of sterile distilled water. The PCR conditions were as follows: stage 1: 95 °C for 3 min; stage 2: 40 cycles of 15 s at 95 °C and 45 s at 60 °C; stage 3: 95 °C for 15 s, 60 °C for 1 min, 95 °C for 15 s. At stage 3, a melting curve was generated to estimate the specificity of the reactions. A housekeeping gene (actin) constitutively expressed in B. napus was used as a reference for normalization, and the reactions were analyzed using an ABI3100 DNA sequencer (Applied Biosystems; Quantitation-Comparative: ΔΔCT)60.
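
The comparative ΔΔCT calculation named above can be illustrated with a short sketch. It assumes the common form of the method, in which relative expression equals 2 raised to the power of minus ΔΔCT (that is, roughly 100% amplification efficiency); the source does not state the exact formula, so this is an assumption. The Ct values are hypothetical and were chosen only to give a round result; the reaction volumes are those given in the text.

```python
from statistics import mean

# Reaction components from the text, in microlitres.
reaction_mix_ul = {"primers": 1.6, "cDNA": 1.0, "SYBR Green": 10.0, "water": 7.4}
total_volume_ul = sum(reaction_mix_ul.values())  # should come to 20 ul


def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(sample, calibrator):
    """Relative expression by 2^-ddCt (assumes ~100% PCR efficiency)."""
    ddct = (delta_ct(sample["target"], sample["actin"])
            - delta_ct(calibrator["target"], calibrator["actin"]))
    return 2.0 ** (-ddct)


# Hypothetical Ct triplicates for one gene, normalized to actin.
late_seed = {"target": [22.1, 22.0, 21.9], "actin": [18.1, 18.0, 17.9]}
early_seed = {"target": [25.1, 25.0, 24.9], "actin": [18.0, 18.0, 18.0]}

relative_expression = fold_change(late_seed, early_seed)
print(f"total reaction volume: {total_volume_ul:.1f} ul")
print(f"late vs early seed fold change: {relative_expression:.2f}")
```

With these invented values the target gene is about eightfold more abundant in the late-stage sample than in the early-stage calibrator.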

Additional Information

How to cite this article: Liang, Y. et al. Genome-wide identification, structural analysis and new insights into late embryogenesis abundant (LEA) gene family formation pattern in Brassica napus. Sci. Rep. 6, 24265; doi: 10.1038/srep24265 (2016).
  56 in total

1.  Preformed structural elements feature in partner recognition by intrinsically unstructured proteins.

Authors:  Monika Fuxreiter; István Simon; Peter Friedrich; Peter Tompa
Journal:  J Mol Biol       Date:  2004-05-14       Impact factor: 5.469

2.  Divergence of duplicate genes in exon-intron structure.

Authors:  Guixia Xu; Chunce Guo; Hongyan Shan; Hongzhi Kong
Journal:  Proc Natl Acad Sci U S A       Date:  2012-01-09       Impact factor: 11.205

3.  Dehydration-induced expression of LEA proteins in an anhydrobiotic chironomid.

Authors:  Takahiro Kikawada; Yuichi Nakahara; Yasushi Kanamori; Ken-ichi Iwata; Masahiko Watanabe; Brian McGee; Alan Tunnacliffe; Takashi Okuda
Journal:  Biochem Biophys Res Commun       Date:  2006-07-12       Impact factor: 3.575

4.  Inventory, evolution and expression profiling diversity of the LEA (late embryogenesis abundant) protein gene family in Arabidopsis thaliana.

Authors:  Natacha Bies-Ethève; Pascale Gaubier-Comella; Anne Debures; Eric Lasserre; Edouard Jobet; Monique Raynal; Richard Cooke; Michel Delseny
Journal:  Plant Mol Biol       Date:  2008-02-12       Impact factor: 4.076

5.  Late embryogenesis abundant proteins: versatile players in the plant adaptation to water limiting environments.

Authors:  Yadira Olvera-Carrillo; José Luis Reyes; Alejandra A Covarrubias
Journal:  Plant Signal Behav       Date:  2011-04-01

6.  Identification of the trehalose-6-phosphate synthase gene family in winter wheat and expression analysis under conditions of freezing stress.

Authors:  D W Xie; X N Wang; L S Fu; J Sun; W Zheng; Z F Li
Journal:  J Genet       Date:  2015-03       Impact factor: 1.166

7.  Convergent gene loss following gene and genome duplications creates single-copy families in flowering plants.

Authors:  Riet De Smet; Keith L Adams; Klaas Vandepoele; Marc C E Van Montagu; Steven Maere; Yves Van de Peer
Journal:  Proc Natl Acad Sci U S A       Date:  2013-02-04       Impact factor: 11.205

8.  A metal-binding member of the late embryogenesis abundant protein family transports iron in the phloem of Ricinus communis L.

Authors:  Claudia Kruger; Oliver Berkowitz; Udo W Stephan; Rudiger Hell
Journal:  J Biol Chem       Date:  2002-04-30       Impact factor: 5.157

9.  The genome of the mesopolyploid crop species Brassica rapa.

Authors:  Xiaowu Wang; Hanzhong Wang; Jun Wang; Rifei Sun; Jian Wu; Shengyi Liu; Yinqi Bai; Jeong-Hwan Mun; Ian Bancroft; Feng Cheng; Sanwen Huang; Xixiang Li; Wei Hua; Junyi Wang; Xiyin Wang; Michael Freeling; J Chris Pires; Andrew H Paterson; Boulos Chalhoub; Bo Wang; Alice Hayward; Andrew G Sharpe; Beom-Seok Park; Bernd Weisshaar; Binghang Liu; Bo Li; Bo Liu; Chaobo Tong; Chi Song; Christopher Duran; Chunfang Peng; Chunyu Geng; Chushin Koh; Chuyu Lin; David Edwards; Desheng Mu; Di Shen; Eleni Soumpourou; Fei Li; Fiona Fraser; Gavin Conant; Gilles Lassalle; Graham J King; Guusje Bonnema; Haibao Tang; Haiping Wang; Harry Belcram; Heling Zhou; Hideki Hirakawa; Hiroshi Abe; Hui Guo; Hui Wang; Huizhe Jin; Isobel A P Parkin; Jacqueline Batley; Jeong-Sun Kim; Jérémy Just; Jianwen Li; Jiaohui Xu; Jie Deng; Jin A Kim; Jingping Li; Jingyin Yu; Jinling Meng; Jinpeng Wang; Jiumeng Min; Julie Poulain; Jun Wang; Katsunori Hatakeyama; Kui Wu; Li Wang; Lu Fang; Martin Trick; Matthew G Links; Meixia Zhao; Mina Jin; Nirala Ramchiary; Nizar Drou; Paul J Berkman; Qingle Cai; Quanfei Huang; Ruiqiang Li; Satoshi Tabata; Shifeng Cheng; Shu Zhang; Shujiang Zhang; Shunmou Huang; Shusei Sato; Silong Sun; Soo-Jin Kwon; Su-Ryun Choi; Tae-Ho Lee; Wei Fan; Xiang Zhao; Xu Tan; Xun Xu; Yan Wang; Yang Qiu; Ye Yin; Yingrui Li; Yongchen Du; Yongcui Liao; Yongpyo Lim; Yoshihiro Narusaka; Yupeng Wang; Zhenyi Wang; Zhenyu Li; Zhiwen Wang; Zhiyong Xiong; Zhonghua Zhang
Journal:  Nat Genet       Date:  2011-08-28       Impact factor: 38.330

10.  Plant genetics. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome.

Authors:  Boulos Chalhoub; France Denoeud; Shengyi Liu; Isobel A P Parkin; Haibao Tang; Xiyin Wang; Julien Chiquet; Harry Belcram; Chaobo Tong; Birgit Samans; Margot Corréa; Corinne Da Silva; Jérémy Just; Cyril Falentin; Chu Shin Koh; Isabelle Le Clainche; Maria Bernard; Pascal Bento; Benjamin Noel; Karine Labadie; Adriana Alberti; Mathieu Charles; Dominique Arnaud; Hui Guo; Christian Daviaud; Salman Alamery; Kamel Jabbari; Meixia Zhao; Patrick P Edger; Houda Chelaifa; David Tack; Gilles Lassalle; Imen Mestiri; Nicolas Schnel; Marie-Christine Le Paslier; Guangyi Fan; Victor Renault; Philippe E Bayer; Agnieszka A Golicz; Sahana Manoli; Tae-Ho Lee; Vinh Ha Dinh Thi; Smahane Chalabi; Qiong Hu; Chuchuan Fan; Reece Tollenaere; Yunhai Lu; Christophe Battail; Jinxiong Shen; Christine H D Sidebottom; Xinfa Wang; Aurélie Canaguier; Aurélie Chauveau; Aurélie Bérard; Gwenaëlle Deniot; Mei Guan; Zhongsong Liu; Fengming Sun; Yong Pyo Lim; Eric Lyons; Christopher D Town; Ian Bancroft; Xiaowu Wang; Jinling Meng; Jianxin Ma; J Chris Pires; Graham J King; Dominique Brunel; Régine Delourme; Michel Renard; Jean-Marc Aury; Keith L Adams; Jacqueline Batley; Rod J Snowdon; Jorg Tost; David Edwards; Yongming Zhou; Wei Hua; Andrew G Sharpe; Andrew H Paterson; Chunyun Guan; Patrick Wincker
Journal:  Science       Date:  2014-08-21       Impact factor: 47.728
