In silico identification and comparative genomics of candidate genes involved in biosynthesis and accumulation of seed oil in plants.

Arti Sharma, Rajinder Singh Chauhan.

Abstract

Genes involved in fatty acid biosynthesis, modification, and oil body formation are expected to be conserved in structure and function in different plant species. However, significant differences in the composition of fatty acids and total oil content in seeds have been observed in different plant species. Comparative genomics was performed on 261 genes involved in fatty acid biosynthesis, TAG synthesis, and oil body formation in Arabidopsis, Brassica rapa, castor bean, and soybean. In silico expression analysis revealed that stearoyl desaturase, FatB, FAD2, oleosin, and DGAT are highly abundant in seeds and are therefore considered ideal candidates for mining of favorable alleles in natural populations. Gene structure analysis for the major genes ACCase, FatA, FatB, FAD2, FAD3, and DGAT, which are known to play a crucial role in oil synthesis, revealed that there are uncommon variations (SNPs and INDELs) which lead to varying content and composition of fatty acids in seed oil. The predicted variations can provide good targets for seed oil QTL identification, understanding the molecular mechanism of seed oil accumulation, and genetic modification to enhance seed oil yield in plants.

Year:  2012        PMID: 22312320      PMCID: PMC3270531          DOI: 10.1155/2012/914843

Source DB:  PubMed          Journal:  Comp Funct Genomics        ISSN: 1531-6912


1. Introduction

A major challenge facing mankind in this century is the gradual exhaustion of fossil energy resources. The combustion of fossil fuels used in transportation is one of the key factors responsible for global warming and environmental pollution due to large-scale carbon dioxide emissions. Thus, alternative energy sources based on sustainable and ecologically friendly processes are urgently required. At present, gasoline and diesel are being substituted mainly by two biofuels, bioethanol and biodiesel, capturing ∼90% of the market [1]. Biodiesel is made from renewable biomass mainly by alkali-catalysed transesterification of triacylglycerols (TAGs) from plant oils [2]. Manipulation of biosynthetic pathways offers a number of exciting opportunities for plant biologists to redesign plant metabolism toward production of specific TAGs. The biosynthesis of fatty acids in plants begins with the formation of acetyl CoA from pyruvate. The acetyl CoA produced in plastids is activated to malonyl CoA; the malonyl group is subsequently transferred to acyl carrier protein (ACP), giving rise to malonyl ACP, the primary substrate of the fatty acid synthase complex. The formation of malonyl CoA is the committed step in fatty acid synthesis and is catalyzed by the highly regulated plastidic acetyl CoA carboxylase complex [3]. De novo fatty acid synthesis in the plastids occurs through a repeated series of condensation, reduction, and dehydration reactions that add two-carbon units derived from malonyl ACP to the elongating fatty acid chain. The condensation reactions proceed first with acetyl CoA and malonyl ACP, then with acyl ACP acceptors. Three separate condensing enzymes, or 3-ketoacyl-ACP synthases (KAS I–III), are necessary for the production of an 18-carbon fatty acid. Three additional reactions are required after each condensation step to obtain a saturated fatty acid that is two carbons longer than at the start of the cycle.
These reactions are catalysed by 3-ketoacyl-ACP reductase (KAR), 3-hydroxyacyl-ACP dehydratase (HD), and enoyl-ACP reductase (ENR). The first desaturation step also occurs in the plastid; while the acyl chain is still conjugated to ACP, a Δ9-desaturase converts stearoyl ACP to oleoyl ACP. Termination of fatty acid elongation is catalyzed by acyl ACP thioesterases, of which there are two main types in plants. The FatA class removes oleate from ACP, whereas FatB thioesterases act on saturated and unsaturated acyl ACPs and, in some species, on shorter-chain-length acyl ACPs [4-6]. After release from ACP, the free fatty acids are exported from the plastid and converted to acyl CoAs. Nascent fatty acids can be incorporated into TAGs in developing seeds [4]. Oleic acid can be further desaturated to linoleic acid by FAD2 [7] and FAD6 [8] in the cytosol and the plastid, respectively. Cytosolic and plastid ω-3 desaturations that result in the production of linolenic acid are catalyzed by FAD3 [9] and FAD7 [10], respectively. Fatty acids can be incorporated into TAGs in developing seeds in a number of ways. For example, a series of reactions known as the Kennedy pathway results in the esterification of two acyl chains from acyl CoA to glycerol-3-phosphate to form phosphatidic acid (PA) and, following phosphate removal, diacylglycerol (DAG). A diacylglycerol acyltransferase (DGAT), using acyl CoA as an acyl donor, converts DAG to TAG. Two classes of DGAT enzymes have been isolated [11, 12], and orthologs have been identified in numerous plant species. DAG and phosphatidylcholine (PC) are interconvertible via the action of cholinephosphotransferase, suggesting a route for the flux of fatty acids into and out of PC. Acyl chains from PC can be incorporated into TAG, either via conversion back to DAG or by the action of a phospholipid diacylglycerol acyltransferase (PDAT) that uses PC as an acyl donor to convert DAG to TAG.
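The arithmetic of the plastidial elongation cycle described above can be illustrated with a short sketch. It assumes a two-carbon acetyl primer and one two-carbon addition per condensation, as stated in the text; the function names are hypothetical and chosen only for illustration.

```python
def condensation_steps(target_carbons: int, primer_carbons: int = 2) -> int:
    """Count the condensation reactions needed to reach a given chain length.

    Assumes a 2-carbon primer (acetyl group) and 2 carbons added per cycle.
    """
    extra = target_carbons - primer_carbons
    if extra < 0 or extra % 2:
        raise ValueError("chain length must be an even number >= primer length")
    return extra // 2


def total_reactions(target_carbons: int) -> int:
    """Each condensation is followed by three further reactions (KAR, HD, ENR)."""
    return condensation_steps(target_carbons) * 4


for length in (16, 18):
    print(length, condensation_steps(length), total_reactions(length))
```

Under these assumptions, an 18-carbon saturated chain requires eight condensations, each followed by the three reduction and dehydration steps.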
There are two predominant seed oil storage proteins in plants: caleosin and oleosin. TAG assembled with these storage proteins forms oil bodies in seeds. The fatty acid composition of seed oil varies considerably both between and within species. The variation occurs both in chain length and in degree of desaturation. Consequently, the fuel properties of biodiesel derived from a mixture of fatty acids depend on the composition of fatty acids in seed oil. Altering the fatty acid profile can, therefore, improve fuel properties of biodiesel such as cold-temperature flow characteristics, oxidative stability, and NOx emissions [13]. The fatty acid biosynthetic pathway is highly conserved in plants, but there are significant variations in fatty acid content and composition among plants (Table 1). What determines differences in the content and composition of fatty acids, and subsequently the total oil yield in seeds, is not understood. The availability of whole genome sequences, ESTs, and individual gene sequences from different oil-rich plant species provides an opportunity to investigate which differences in the structure and sequence of genes determine variation in content and composition, so as to identify distinguishing gene signatures to assist in genetic improvement of crop plants either through marker-assisted breeding or by metabolic engineering [32]. Tanhuanpää et al. [33] developed an allele-specific PCR marker for oleic acid by comparing the wild-type and high-oleic alleles of the FAD 2 gene locus in spring turnip rape (Brassica rapa ssp. oleifera). The accumulation of ricinoleic acid in transgenic Arabidopsis seeds was doubled by expressing the castor FAH12 hydroxylase in a FAD 2/FAE1 mutant [34].
The FatA and FatB genes of castor bean were heterologously expressed in Escherichia coli for biochemical characterization after purification, which revealed high catalytic efficiency of RcFatA on oleoyl-ACP and palmitoleoyl-ACP and high efficiency of RcFatB on oleoyl-ACP and palmitoyl-ACP. The expression of these genes was highest in expanding tissues that are typically very active in lipid biosynthesis, such as developing seed endosperm and young expanding leaves [35]. The Arabidopsis thaliana diacylglycerol acyltransferase (DGAT) gene, coding for a key enzyme in TAG biosynthesis, was expressed in tobacco under the control of a strong ribulose-bisphosphate carboxylase small subunit promoter. This modification led to an up to 20-fold increase in TAG accumulation in tobacco leaves and translated into an overall twofold increase in extracted fatty acids, up to 5.8% of dry biomass in Nicotiana tabacum [36]. Dimov and Mollers [37] tested genetic variation for saturated fatty acid content in two sets of modern winter oilseed rape cultivars (Brassica napus L.) in field experiments under typical German growing conditions. They observed highly significant genetic differences among the cultivars for total saturated fatty acid content, which ranged from 6.8% to 8.1%. Singh et al. [38] constructed a genetic map of oil palm using AFLP, RFLP, and SSR markers. They detected quantitative trait loci (QTLs) controlling oil quality (measured in terms of iodine value and fatty acid composition) and identified significant QTLs associated with iodine value (IV), myristic acid (C14:0), palmitic acid (C16:0), palmitoleic acid (C16:1), stearic acid (C18:0), oleic acid (C18:1), and linoleic acid (C18:2) content. The Brassica napus mutant line DMS100 carrying a G-to-A base substitution at the 5′ splice site of intron 6 in FAD 3 had reduced C18:3 content in seed oil [39].
These studies suggest that the comparative analysis of oil biosynthesis and accumulation genes is a suitable strategy to investigate the molecular basis of oil content and composition variation in seed oils of different plant species. Additionally, these variations can be used to develop functional markers for increasing selection efficiency by marker-assisted selection in plant breeding.

Table 1

Variations for fatty acids and TAG biosynthesis pathway genes associated with high oil content in different plant species.

| Targeted genes | Description of variations | Gene regions harboring variations | Plant/organism | Reference |
|---|---|---|---|---|
| FAD 2, FAD 3 | SNP for high oleic acid and low linolenic acid | Exon | Brassica | Hu et al., 2006 [14] |
| Stearoyl-ACP desaturase | SNP for high stearic acid | Exon | Soybean | Zhang et al., 2008 [15] |
| FAD 2 | SNPs for high oleic acid | Exon | Peanut | López et al., 2000 [16] |
| FAD 3 | SNP for low linolenic acid | Intron-exon junction | Soybean | Bilyeu et al., 2005 [17] |
| KAS I | SNPs and indel associated with oleic acid content | 5′UTR, exon, intron | Soybean | Ha et al., 2010 [18] |
| KAS III, ACCase, stearoyl-ACP desaturase, DGAT | Indels, SNPs, and SSRs associated with variation in composition and concentration of oil | | Maize | Yang et al., 2010 [19] |
| FAD 2 | 3 base pair variation leads to change in amino acid which contributes to high oleate content in oil | Exon | Peanut | Bruner et al., 2001 [20] |
| DGAT1 | 3 bp insertion leads to high oleic acid content | Exon | Maize | Zheng et al., 2008 [21] |
| FAD 2 | SSR linked to oleic acid content | | Soybean | Bachlava et al., 2008 [22] |
| FAD 3 | Deletion in soybean FAD 3 leads to reduced linolenate | Exon | Soybean | Anai et al., 2005 [23] |
| KAS III | SNP associated with high palmitic acid content | Exon | Soybean | Aghoram et al., 2006 [24] |
| Stearoyl-ACP desaturase | SSRs associated with high stearic acid | | Soybean | Spencer et al., 2003 [25] |
| Stearoyl-ACP desaturase | SSRs and INDELs associated with high stearic acid | | Sunflower | Pérez-Vich et al., 2006 [26] |
| FatB | Deletions associated with low palmitic acid content | Exons and introns | Soybean | Cardinal et al., 2007 [27] |

In the present study, four plant species, Arabidopsis, Brassica, soybean, and castor bean, were considered for comprehensive analysis of fatty acid biosynthesis genes due to the availability of their genome sequences and several EST collections. Moreover, soybean and Brassica species are the biggest sources of plant oil in the world, whereas castor bean contains the unusual fatty acid ricinoleate, which has chemical properties useful for industrial applications. The total seed oil contents of Arabidopsis, castor bean, Brassica, and soybean are 30–37%, 40–45%, 30–40%, and 15–20%, respectively (Table 2) [28-31]. Plant oils are mostly composed of five common fatty acids, namely, palmitate (16:0), stearate (18:0), oleate (18:1), linoleate (18:2), and linolenate (18:3), although, depending on the particular species, longer or shorter fatty acids may also be major constituents. These fatty acids differ from each other in acyl chain length and number of double bonds, leading to different physical properties. Here we put forward two questions: (1) are there common variations in genes, if any, which contribute to increased seed oil content in plants? (2) Which are the major genes responsible for the higher amounts of the five fatty acids mentioned above in different plant species? To answer these questions, the present study aimed at (1) the identification of candidate genes for fatty acid biosynthesis, TAG synthesis, and oil body formation proteins in the plant species under study, (2) the comparative structure analysis of these candidate genes, (3) the in silico identification of sequence variations in fatty acid biosynthesis genes, and (4) the in silico association of sequence variations in candidate genes with oil content and composition.

Table 2

Fatty acid composition of four plant species.

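| Fatty acid composition (%) | Arabidopsis [28] | Castor bean [29] | Brassica [30] | Soybean [31] |
|---|---|---|---|---|
| Palmitic acid | 8.7 | 2.0 | 1.5 | 7–11 |
| Stearic acid | 3.6 | 1.0 | 0.4 | 2–6 |
| Oleic acid | 15.0 | 7.0 | 22.0 | 22–34 |
| Linolenic acid | 29.0 | | 6.8 | 5–11 |
| Linoleic acid | 19.2 | 5.0 | 14.2 | 43–56 |
| Ricinoleic acid | | 86–90 | | |
| Others | 24.5 | | 47 (Erucic) | |
| Total oil content | 30–37 | 45–50 | 33–40 | 15–20 |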
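Because saturated fatty acids bear on fuel properties such as oxidative stability, it can be useful to sum the palmitic and stearic acid values of Table 2 per species. The sketch below does this using only the Table 2 values; single values are stored as (low, high) pairs with equal ends so that the soybean ranges can be handled the same way. The dictionary layout and function name are illustrative assumptions.

```python
# Palmitic and stearic acid percentages from Table 2, as (low, high) pairs.
SATURATED = {
    "Arabidopsis": {"palmitic": (8.7, 8.7), "stearic": (3.6, 3.6)},
    "Castor bean": {"palmitic": (2.0, 2.0), "stearic": (1.0, 1.0)},
    "Brassica": {"palmitic": (1.5, 1.5), "stearic": (0.4, 0.4)},
    "Soybean": {"palmitic": (7.0, 11.0), "stearic": (2.0, 6.0)},
}


def saturated_range(species: str) -> tuple:
    """Return the (low, high) sum of palmitic and stearic acid percentages."""
    acids = SATURATED[species]
    low = sum(bounds[0] for bounds in acids.values())
    high = sum(bounds[1] for bounds in acids.values())
    return (round(low, 1), round(high, 1))


for name in SATURATED:
    print(name, saturated_range(name))
```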

2. Materials and Methods

2.1. Retrieval of Sequences

Thirty-two genes involved in the biosynthesis and storage of fatty acids were retrieved from the Arabidopsis lipid gene database (http://lipids.plantbiology.msu.edu/) by referring to the comprehensive lipid gene catalog provided by Beisson et al. [40]. The selected genes covered all the major biochemical events in the biosynthesis and storage of fatty acids [41, 42]. The protein sequences of these genes were used as queries against the castor bean database at TIGR (http://blast.jcvi.org/er-blast/index.cgi?project=rca1) and the soybean database at SoyBase (http://soybase.org/). Full-length coding sequences of Brassica were downloaded from GenBank (http://www.ncbi.nlm.nih.gov/genbank/GenbankSearch.html). Protein functional domains were examined with CDD from NCBI (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml).

2.2. Prediction of Gene Structures

Gene models for the castor bean and soybean genomes were downloaded from Phytozome (http://www.phytozome.net/). The Arabidopsis gene models were downloaded from TAIR (http://www.arabidopsis.org/). Arabidopsis, castor bean, Brassica rapa, and soybean sequences were further annotated for gene models (open reading frames, including the 5′UTRs and 3′UTRs) using the gene prediction algorithm FGenesH (http://linux1.softberry.com/berry.phtml?topic=fgenesh&group=programs&subgroup=gfind) [43-45] (see Table 1 of the Supplementary Material available online at doi:10.1155/2012/914843). Sequence identity among Brassica rapa, castor bean, soybean, and Arabidopsis genes was confirmed using ClustalW in MegAlign in DNASTAR (DNASTAR Inc., Madison, WI, USA). The in silico expression status of candidate genes belonging to different families was searched with an e-value cutoff of 0.0 in the EST database of NCBI (National Center for Biotechnology Information) at http://www.ncbi.nlm.nih.gov/BLAST/ and of TIGR (The Institute for Genomic Research) at http://blast.jcvi.org/er-blast/index.cgi?project=rca1.
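The EST matching step can be sketched as a filter over BLAST hits. The example below assumes that hits are available in the standard 12-column BLAST tabular format (e-value in the eleventh column), which the text does not state; the sample rows, gene names, and EST names are hypothetical and serve only to show the e-value cutoff of 0.0 being applied.

```python
from collections import Counter

EVALUE_CUTOFF = 0.0  # cutoff stated in the text

# Hypothetical BLAST tabular rows: qseqid, sseqid, pident, length, mismatch,
# gapopen, qstart, qend, sstart, send, evalue, bitscore (tab separated).
SAMPLE_HITS = """\
gene_A\tEST_001\t100.0\t1152\t0\t0\t1\t1152\t1\t1152\t0.0\t2128
gene_A\tEST_002\t98.5\t400\t6\t0\t1\t400\t10\t409\t1e-150\t540
gene_B\tEST_003\t100.0\t1239\t0\t0\t1\t1239\t1\t1239\t0.0\t2289
"""


def count_est_hits(tabular_text: str, cutoff: float = EVALUE_CUTOFF) -> dict:
    """Count, per query gene, the EST hits whose e-value is at or below the cutoff."""
    counts = Counter()
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        query, evalue = fields[0], float(fields[10])
        if evalue <= cutoff:
            counts[query] += 1
    return dict(counts)


print(count_est_hits(SAMPLE_HITS))
```

With a cutoff of 0.0, only hits reported with an e-value of exactly zero are retained, which is a very stringent criterion.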

3. Results

3.1. Comparative Genomics of Fatty Acid Biosynthesis Genes in Major Oil Seed Plant Species

The fatty acid biosynthesis pathway includes 32 gene families involved in the conversion of acetyl CoA into different fatty acids and their storage in oil bodies. A total of 68 protein sequences were retrieved for the 32 gene families from the comprehensive lipid gene catalog of Arabidopsis [40], and functional domains were identified for each gene family. The 68 protein sequences from Arabidopsis were used to query the Brassica rapa, soybean, and castor bean databases for fatty acid biosynthesis genes. A total of 261 genes belonging to the 32 gene families were identified and retrieved from the four plant species, of which 68 were from Arabidopsis, 62 from Brassica rapa, 55 from castor bean, and 76 from soybean (Table 3). Detailed gene structures and exon-intron coordinates of each gene are given in Supplementary Table 1.
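As a quick consistency check, the per-species gene counts reported above can be summed and expressed as shares of the total. The variable names in this sketch are illustrative; the counts are those given in the text.

```python
# Gene counts per species as reported in the text.
genes_per_species = {
    "Arabidopsis": 68,
    "Brassica rapa": 62,
    "Castor bean": 55,
    "Soybean": 76,
}

total_genes = sum(genes_per_species.values())

# Share of the total contributed by each species, in percent.
shares = {
    species: round(100 * count / total_genes, 1)
    for species, count in genes_per_species.items()
}

print(total_genes)
print(shares)
```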
Table 3

Oil synthesis and accumulation genes in Arabidopsis, Brassica rapa, castor bean, and soybean.

| Category | Gene name | Accession: Arabidopsis | Accession: Brassica rapa | Accession: castor bean | Accession: soybean | Coding DNA sequence lengths (bp; Arabidopsis, Brassica rapa, castor bean, soybean, in order) |
|---|---|---|---|---|---|---|
| ACCase | ACCase | At1g36180 | X77382 | 29908.m005991 | Glyma04g11550 | 5997219367236834 |
| ACCase | Alpha-carboxyl transferase | At2g38040 | AY538675 | 27798.m000585 | Glyma18g42280 | 2346229523552130 |
| ACCase | Beta-carboxyl transferase | ATCG00500 | Z50868 | 28890.m000006 | | 636984528 |
| ACCase | Biotin carboxylase | At5g35360 | AY034410 | 30185.m000954 | Glyma05g36450 | 1683160819351731 |
| ACCase | Biotin carrier | At5g15530 | AY538674 | 29630.m000809, 29929.m004560 | Glyma08g03120 | 7267837809241611 |
| Elongase | Malonyl CoA transacylase | At2g30200 | AJ007046 | 30113.m001448 | Glyma18g06500 | 11049939871113 |
| Elongase | Beta-ketoacyl ACP synthase I | At5g46290 | AF244519 | 29693.m002034 | Glyma10g04680 | 1422138014371452 |
| Elongase | Beta-ketoacyl ACP synthase II | At1g74960 | AF244520 | 29739.m003711 | Glyma13g19010 | 1770130220131305 |
| Elongase | Beta-ketoacyl ACP synthase III | At1g62640 | AF179854 | 28455.m000368 | Glyma09g41380, Glyma15g00550, Glyma18g44350 | 1215186123311948311254 |
| Elongase | 3-Ketoacyl-ACP dehydrase | At1g62610, At3g46170, At3g55290, At3g55310 | AF382146 | 30147.m013777 | Glyma08g10760, Glyma18g01280 | 924867822822 735924963 |
| Elongase | 3-Ketoacyl CoA reductase (KAR) | At1g24360 | AY196197 | 29929.m004732 | Glyma11g37320 | 927960987963 |
| Elongase | Enoyl-ACP reductase (ENR) | At2g05990 | AJ243087, AJ243088, AJ243089, AJ243090, x95462 | 27843.m000160, 29650.m000277 | Glyma11g10770, Glyma12g03060, Glyma18g31780 | 111551158116111611164115810831083117612031533 |
| Elongase | Hydroxyacyl ACP dehydrase (HD) | At2g22230 | AF382146 | 30200.m000354 | Glyma05g24650, Glyma08g07870, Glyma08g19200, Glyma15g05800 | 663672534417513219822 |
| Elongase | Plastidial 1-acylglycerol phosphate acyltransferase | At4g30580 | | 29687.m000572 | Glyma06g28540, Glyma12g28470 | 107198724061776 |
| Elongase | Plastidial glycerol phosphate acyltransferase | At1g32200 | | 30068.m002660 | Glyma09g34110 | 138012361413 |
| Desaturase | Monogalactosylacylglycerol desaturase (FAD 5) | At3g15850 | | 29841.m002863 | Glyma07g03370, Glyma08g22730 | 1116 116111011173 |
| Desaturase | Stearoyl-ACP desaturase | At1g43800, At2g43710, At3g02610, At3g02630, At3g02620, At5g16230, At5g16240 | X63364, X74782, AY642537 | 27985.m000877, 28470.m000428, 29929.m004515, 30020.m000203 | Glyma02g15600, Glyma07g32850, Glyma13g08970, Glyma14g27990 | 11761206984119198412061185120012061200117633696011911176117611851014 |
| Desaturase | Oleate desaturase (FAD 6) | At4g30950 | AY642535, AY642540 | 29696.m000105 | Glyma02g36460 | 1347133212938251287 |
| Desaturase | Linoleate desaturase (FAD 7) | At3g11170 | AY592974, AY599884, FJ985689, FJ985690, FJ985691, L01418, L22962 | 28176.m000273, 29681.m001360, 29814.m000719 | Glyma01g29630, Glyma03g07570, Glyma07g18350 | 14671320115211341299133511521134135911311383135913621362 |
| Desaturase | Linoleate desaturase (FAD 8) | At5g05580 | AY592974, AY599884, FJ985691 | 28176.m000273, 29681.m001360, 29814.m000719 | Glyma01g29630, Glyma03g07570, Glyma07g18350 | 13081152132011521335135911311383135913621362 |
| Desaturase | ER oleate desaturase (FAD 2) | At3g12120 | AY577313, DQ518276, DQ518277, DQ518278, FJ907397, FJ907398, FJ907399, FJ907400, FJ907401, FJ952144 | 28035.m000362 | Glyma03g30070, Glyma09g17170, Glyma10g42470, Glyma19g32940, Glyma20g24530 | 11521155780780780115511551155115511551155116411521161114011521140 |
| Desaturase | ER linoleate desaturase (FAD 3) | At2g29980 | AY592974, AY599884, FJ985689, L01418, L22962 | 29681.m001360 | Glyma01g29630, Glyma07g18350 | 116113201152113411521134113113591362 |
| Thioesterase | Acyl-ACP thioesterase (FatA) | At3g25110, At4g13050 | X87842 | 30217.m000262, 29842.m003515 | Glyma08g46360, Glyma18g36130 | 108911041176126975311911125 |
| Thioesterase | Palmitoyl-ACP thioesterase (FatB) | At1g08510 | DQ847275, Fj715952 | 29848.m004677 | Glyma0421910, Glyma05g08060, Glyma06g23560, Glyma17g12940 | 12391245123912601332125114221251 |
| TAG synthesis | Diacylglycerol acyltransferase (DGAT 1) | At2g19450 | AF164434 | 29912.m005373 | Glyma13g16560 | 1593151218301347 |
| TAG synthesis | Diacylglycerol acyltransferase (DGAT 2) | At3g51520 | AF155224 | 29682.m000581 | Glyma09g32790 | 9451056768987 |
| TAG synthesis | Lysophosphatidic acid acyltransferase (LPAAT) | At1g01610, At1g51260, At1g78690, At1g80950, At2g27090, At2g38110, At3g05510, At3g11430, At3g18850, At3g57650, At4g00400, At5g06090 | AF111161, Gu045434, GU04535, Gu045436, Z49860 | 27810.m000646, 29851.m002448, 30169.m006433, 30170.m013990, 29736.m002070, 29822.m003441, 29969.m000267, 30174.m008615 | Glyma01g27900, Glyma03g01070, Glyma03g14180, Glyma07g07580, Glyma07g17720, Glyma10g23560, Glyma18g42580 | 1512111987311402232150613471509114611701503150310351173117611739361188118810508527381515153959415061536678858675149114281620 |
| TAG synthesis | Diacylglycerol cholinephosphotransferase | At3g25585 | AY179560 | 30138.m003845 | Glyma12g08720, Glyma0214210 | 1170144914497711188 |
| TAG synthesis | Digalactosyldiacylglycerol synthase (DGD1) | At3g11670 | | 28726.m000069 | Glyma03g36050, Glyma19g38720 | 2388253823522361 |
| TAG synthesis | ER phosphatidate phosphatase | At1g15080 | | 29586.m000620, 29660.m000760, 29660.m000759, 29660.m000759 | Glyma09g18450, Glyma10g41580, Glyma20g25650 | 813954564930945957969909 |
| Oil body protein | Caleosin | At1g23240, At1g23250, At1g70670, At1g70680, At2g33380, At4g26740, At5g55240 | AY966447, DQ140380 | 29673.m000932, 30008.m000820 | Glyma3g41030, Glyma09g22310, Glyma09g22330, Glyma09g22580, Glyma09g25350, Glyma10g33350, Glyma19g43680, Glyma20g34300 | 669663588552711738732717705597702723615606384384570723402 |
| Oil body protein | Oleosin | At1g48990, At2g25890, At3g01570, At3g18570, At3g27660, At4g25140, At5g40420, At5g51210 | DQ328612, S37032 | 29794.m003372, 30147.m013891, 30147.m014333 | Glyma05g07880, Glyma14g15020, Glyma17g13120 | 510450552501576522600426699426564489495492492471 |

3.2. Expression Status of Fatty Acid Biosynthesis Genes

In silico expression analysis revealed that, for the 32 gene families, ESTs were detected for 68 genes in Arabidopsis, 62 genes in Brassica, 49 genes in castor bean, and 76 genes in soybean (Figure 1). Thirteen genes from Arabidopsis, 15 from castor bean, 8 from soybean, and 2 from Brassica showed tissue-preferential expression patterns, as judged by their identity to ESTs from tissue-specific libraries. Twenty-two genes from the four plant species were expressed in seeds, 4 in leaves, 3 in flowers, and 1 in roots (Table 4). FAD 2 and one homolog of the stearoyl desaturase gene had the maximum number of seed ESTs in castor bean.
Figure 1

In silico transcript abundance (based on matching ESTs available in the database) of oil biosynthesis and accumulation genes in different tissues.

Table 4

In silico expression status of fatty acids biosynthesis and accumulation genes.

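| Tissue | Gene | Arabidopsis | B. rapa | Soybean | Castor bean |
|---|---|---|---|---|---|
| Seeds | Alpha-carboxyl transferase | | | | 27798.m000585 |
| Seeds | Enoyl ACP reductase | | | | 27843.m000160 |
| Seeds | Stearoyl desaturase | | X74782 | Glyma13g08970 | 27985.m000877 |
| Seeds | FAD 2 | | | Glyma10g42470 | 28035.m000362 |
| Seeds | ER phosphatidate phosphatase | | | | 29660.m000760 |
| Seeds | DGAT 2 | | | Glyma17g06120 | 29682.m000581 |
| Seeds | FatB | | | Glyma17g12940 | 29842.m003515 |
| Seeds | Oleosin | | S37032 | Glyma14g15020 | 30147.m014333 |
| Seeds | Oleosin | At5g40420 | | Glyma17g13120 | 30147.m013891 |
| Seeds | Oleosin | | | | 29794.m003372 |
| Seeds | Hydroxyacyl ACP dehydrase | | | | 30200.m000354 |
| Seeds | Caleosin | At5g55240 | | Glyma20g34300 | |
| Leaves | FatB | | | | 29848.m004677 |
| Leaves | LPAAT | | | Glyma07g07580 | |
| Leaves | 3-Ketoacyl-ACP dehydrase | At3g55290, At3g55310 | | | |
| Flowers | DGD1 | | | | 28726.m000069 |
| Flowers | Beta-carboxyl transferase | | | | 28890.m000006 |
| Flowers | ACCase | | | | 29908.m005991 |
| Roots | Stearoyl desaturase | At3g02620 | | | |
| Roots + flowers | Stearoyl desaturase | At3g02610 | | | |
| Seeds + flowers | Oleosin | At1g48990 | | | |
| Leaves + flowers | FAD 7 | At3g11170 | | | |
| Leaves + flowers | Oleosin | At2g25890, At3g18570 | | | |
| Leaves + flowers | Caleosin | At1g23240, At1g23250, At4g26740 | | | |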
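The per-species and per-tissue counts quoted in Section 3.2 can be cross-checked against the accessions listed in Table 4. The sketch below assigns each accession to a species from its identifier pattern. The regular expressions are an assumption based on the identifier styles visible in the tables (AtNgNNNNN for Arabidopsis, GlymaNNgNNNNN for soybean, NNNNN.mNNNNNN for castor bean), and any other accession is treated as a B. rapa GenBank entry.

```python
import re
from collections import Counter

# Accessions from Table 4, grouped by tissue.
TABLE4 = {
    "Seeds": [
        "27798.m000585", "27843.m000160", "X74782", "Glyma13g08970",
        "27985.m000877", "Glyma10g42470", "28035.m000362", "29660.m000760",
        "Glyma17g06120", "29682.m000581", "Glyma17g12940", "29842.m003515",
        "S37032", "Glyma14g15020", "30147.m014333", "At5g40420",
        "Glyma17g13120", "30147.m013891", "29794.m003372", "30200.m000354",
        "At5g55240", "Glyma20g34300",
    ],
    "Leaves": ["29848.m004677", "Glyma07g07580", "At3g55290", "At3g55310"],
    "Flowers": ["28726.m000069", "28890.m000006", "29908.m005991"],
    "Roots": ["At3g02620"],
    "Roots + flowers": ["At3g02610"],
    "Seeds + flowers": ["At1g48990"],
    "Leaves + flowers": [
        "At3g11170", "At2g25890", "At3g18570",
        "At1g23240", "At1g23250", "At4g26740",
    ],
}

# Assumed identifier patterns; anything else is counted as B. rapa (GenBank).
PATTERNS = [
    ("Arabidopsis", re.compile(r"^At\dg\d{5}$")),
    ("Soybean", re.compile(r"^Glyma\d{2}g\d{5}$")),
    ("Castor bean", re.compile(r"^\d{5}\.m\d{6}$")),
]


def species_of(accession: str) -> str:
    """Infer the species of an accession from its identifier pattern."""
    for species, pattern in PATTERNS:
        if pattern.match(accession):
            return species
    return "B. rapa"


species_counts = Counter(
    species_of(acc) for accessions in TABLE4.values() for acc in accessions
)
tissue_counts = {tissue: len(accs) for tissue, accs in TABLE4.items()}

print(dict(species_counts))
print(tissue_counts)
```

The tallies agree with the counts given in the text: 13 Arabidopsis, 15 castor bean, 8 soybean, and 2 Brassica genes, with 22 genes in seeds, 4 in leaves, 3 in flowers, and 1 in roots.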

3.3. Comparative Analysis of Gene Structures in Different Plant Species

Comparative genomics of fatty acid biosynthesis genes was performed to understand what determines the differences, if any, in content and composition of fatty acids among plant species. The gene structure analysis revealed that the exon-intron structures of fatty acid biosynthesis gene homologs in castor bean and soybean were more similar to each other than to those of Arabidopsis. However, insertions, deletions, and intron size variations were found in castor bean and soybean genes with reference to Arabidopsis. Fatty acid biosynthesis genes of Brassica rapa were not analyzed for gene structure because, for most of the Brassica genes, only coding DNA sequences were available in GenBank. Conversion of acetyl CoA to malonyl CoA by acetyl CoA carboxylase (ACCase) is the committed step in fatty acid biosynthesis. The exon number and CDS length of the ACCase gene were almost the same in castor bean (31 exons) and soybean (33 exons), and slightly lower in Arabidopsis (26 exons). Comparative structural analysis revealed that the homomeric ACCase gene from Arabidopsis (exons 1–26) showed microsynteny with castor bean (exons 6–31) and soybean (exons 6–33), with a 3 bp deletion in the 8th and 26th exons of castor bean, a 3 bp deletion and a 3 bp insertion in the 29th and 31st exons of soybean, and a 12 bp insertion in the 24th and 26th exons of castor bean and soybean, respectively. The first five exons of homomeric ACCase in castor bean and soybean (missing in Arabidopsis) showed colinearity for exon size, with the exception of a 3 bp insertion in the first exon of the castor bean gene. The sixteenth exon of ACCase in castor bean showed sequence identity to three exons (16th, 17th, and 18th) of soybean (Figure 2).
Figure 2

Structure of the ACCase gene in Arabidopsis (26 exons), castor bean (31 exons), and soybean (33 exons); thick arrows and thin lines represent exons and introns, respectively. Arabidopsis exons 1–26 showed identity to exons 6–31 of castor bean and exons 6–33 of soybean; the 16th exon of castor bean showed identity to three exons of soybean (16th, 17th, and 18th). A 3 bp deletion (del) was found in the 8th and 26th exons of castor bean, a 3 bp deletion and a 3 bp insertion (in) in the 31st and 29th exons of soybean, and a 12 bp insertion in the 24th and 26th exons of castor bean and soybean, respectively. At1g36180: Arabidopsis ACCase gene; 29908.m005991: castor bean ACCase gene; Glyma04g11550: soybean ACCase gene.
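The exon correspondence described for ACCase can be written down as a simple mapping. This sketch assumes that, apart from the castor bean exon 16 that matches soybean exons 16 to 18, exons correspond one-to-one and in order, which the text implies but does not state explicitly; the function names are hypothetical.

```python
def arabidopsis_to_castor(exon: int) -> int:
    """Arabidopsis ACCase exons 1-26 correspond to castor bean exons 6-31."""
    if not 1 <= exon <= 26:
        raise ValueError("Arabidopsis ACCase has exons 1-26")
    return exon + 5


def castor_to_soybean(exon: int) -> list:
    """Castor bean exon 16 matches soybean exons 16-18; later exons shift by two."""
    if not 1 <= exon <= 31:
        raise ValueError("castor bean ACCase has exons 1-31")
    if exon < 16:
        return [exon]
    if exon == 16:
        return [16, 17, 18]
    return [exon + 2]


for at_exon in (1, 11, 26):
    castor_exon = arabidopsis_to_castor(at_exon)
    print(at_exon, castor_exon, castor_to_soybean(castor_exon))
```

Under this assumption the mapping reproduces the exon ranges given in the caption: the last Arabidopsis exon (26) corresponds to castor bean exon 31 and soybean exon 33.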

Two distinct classes of thioesterases, FatA and FatB, are responsible for the release of fatty acids from ACP. FatA gene structure was diverse with respect to exon number (varying from 5 to 11) among four plant species. Two homologs of the FatA gene were present in Arabidopsis, castor bean, and soybean, whereas the FatB gene had 4 homologs in soybean. Compared to Arabidopsis, the first exon of the FatB gene had a 3 bp insertion in castor bean and a 27 bp insertion in one soybean homolog (Glyma04g21910), whereas the other three soybean homologs (Glyma05g08060, Glyma17g12940, and Glyma06g23560) had a 6 bp deletion (Figure 3). A 69 bp insertion of one exon was present in the FatB genes of castor bean and soybean but was absent in Arabidopsis. The last exon of FatB (5th exon) in Arabidopsis showed homology to the last exon (6th exon) of one soybean homolog (Glyma04g21910) and to the last two exons (6th and 7th) of another soybean homolog (Glyma06g23560), whereas the last exon of castor bean showed homology to the last exon of the other two soybean homologs (Glyma05g08060 and Glyma17g12940).
Figure 3

Structure of the FatB (palmitoyl thioesterase) gene in Arabidopsis (At1g08510), castor bean (29848.m004677), and four soybean homologs (Glyma04g21910, Glyma05g08060, Glyma17g12940, and Glyma06g23560). The 5th exon of FatB in Arabidopsis showed homology to the 6th exon of one soybean homolog (Glyma04g21910) and to the last two exons (6th and 7th) of another soybean homolog (Glyma06g23560), whereas the 6th exon of castor bean showed homology to the 6th exons of the other two soybean homologs (Glyma05g08060 and Glyma17g12940).

The stearoyl ACP desaturase gene had the maximum number of homologs (6 in Arabidopsis, 3 in Brassica, 4 in soybean, and 4 in castor bean) in the fatty acid desaturase category of enzymes. The oleoyl desaturase (FAD 2) and linoleate desaturase (FAD 3) genes showed more relatedness in the number and sizes of exons and introns in each homolog among four plant species. Oleoyl desaturase (FAD 2) had only one exon in Arabidopsis, castor bean, and soybean, with a 12 bp insertion in the exon of castor bean and a 9 bp insertion in the exon of one soybean homolog (Glyma09g17170). FAD 3 gene structure was conserved with respect to exon-intron number and size between Arabidopsis, castor bean, and soybean, except for the first and last exons. A 21 bp deletion was observed in the first exon of castor bean (29681.m001360), and insertions of 210 and 213 bp were observed in two soybean homologs (Glyma01g29630 and Glyma07g18350), respectively. Two deletions of 3 and 12 bp were observed in the last exon (8th exon) of castor bean and soybean, respectively. A deletion of 6 bp was observed in the 3rd exon of FAD 3 of castor bean. An SNP (G→A) was also identified at the exon-intron junction of the FAD 3 gene in the 3rd exon of one soybean homolog (Glyma01g29630) with respect to castor bean, Arabidopsis, and the other soybean homologs (Figure 4).
Figure 4

Structure of FAD 3 (linoleoyl desaturase) gene in Arabidopsis (At2g29980), castor bean (29681.m001360), and two soybean homologs (Glyma01g29630, Glyma07g18350). Exon/intron numbers are conserved in FAD 3 while variation in sizes was observed in the first and last exons. SNP identified in the 6th exon of soybean homolog (Glyma01g29630) was reported to be associated with low linolenic acid content [17].

The DGAT gene involved in TAG (triacylglycerol) synthesis has two isoforms, DGAT-1 and DGAT-2. These two genes showed variation in the number and sizes of exons and introns. The DGAT-1 gene had 15 exons in Arabidopsis, 13 exons in castor bean, and 16 exons in soybean. DGAT-2 had 8 exons in Arabidopsis and castor bean and 7 exons in soybean. The detailed comparative genomics of fatty acid biosynthesis genes in 4 oil seed plant species provided insights for the identification and utilization of castor bean fatty acid biosynthesis genes and sequence variations for the development of candidate gene markers in Jatropha. Fatty acid biosynthesis genes showed evolutionary relatedness, but there is no synteny in gene order or in the position of genes on the chromosomes. The locations of genes on chromosomes in Arabidopsis and soybean are given in Supplementary Table 2.

4. Discussion

In general, plant oil biosynthesis follows the common biosynthetic pathways for fatty acids in the plastid and for TAG in the endoplasmic reticulum (ER), and the oil then accumulates in oil bodies. However, there are significant differences in the content and composition of seed oil among plant species. Using comparative genomics, we tried to infer the effect of gene structure differences on oil content in different plant species. In this study, 261 genes involved in biosynthesis and accumulation of seed oil were identified in four oil seed plant species, Arabidopsis, Brassica, castor bean, and soybean. The genes corresponded to six different categories (ACCase, desaturase, elongase, thioesterase, TAG synthesis, and oil body proteins). Gene families corresponding to these six categories of enzymes had multiple copies in plant species, with the exception of homomeric ACCase. In higher plants, many proteins and enzymes are encoded by gene families, and in Arabidopsis, it has been estimated that 20% of genes are members of gene families [46]. The existence of gene families can sometimes reflect additional levels of genetic control or isoforms of proteins with specific functions. Therefore, it is of interest to detect potential gene families involved in the fatty acid biosynthesis pathway. It is possible that different copies of fatty acid biosynthesis genes are present in low oil content genotypes and give leaky phenotypes, as in the starch biosynthesis pathway, where different copies of genes were responsible for low, medium, and high amylose contents in rice [47]. Oil biosynthesis may be limited by the production of fatty acids [48], which is regulated by acetyl CoA carboxylase (ACCase). Reduction of ACCase activity lowered the fatty acid content (1.5–16%) in transgenic seeds [49]. Conversion of acetyl CoA to malonyl CoA by ACCase is the committed step in fatty acid biosynthesis.
ACCase of castor bean and soybean showed microsynteny to Arabidopsis, with a 3 bp deletion in the 8th and 26th exons in castor bean, a 3 bp deletion and a 3 bp insertion in the 29th and 31st exons in soybean, and a 12 bp insertion in the 24th and 26th exons of castor bean and soybean, respectively, relative to Arabidopsis. These sequence variations in ACCase genes may influence the variations in fatty acid composition and content in seed oil among Arabidopsis, castor bean, and soybean, as fatty acid content and composition have been altered in many plant species by variations in the sequence or expression of the ACCase gene [19, 50]. Yang et al. [19] identified two SNPs (T→G, G→A) in the ACCase gene which lead to an increase (1.3%) in oleic acid, linolenic acid, and linoleic acid content in maize. Addition of a plastid transit sequence targeted the introduced ACCase protein to chloroplasts, ultimately resulting in a 5% increase in seed oil of rapeseed [50]. The insertions or deletions identified in our analysis between Arabidopsis, castor bean, and soybean might be responsible for reduction or enhancement of ACCase activity, which is associated with the variations in total fatty acid composition in seed oil among these plant species. Studies in transgenic plants have demonstrated that thioesterases contribute to the regulation of fatty acid chain length [51]. Typically, FatB accepts saturated acyl-ACP substrates of varying length, while FatA is specific to unsaturated fatty acids and acts on oleoyl (C18:1) ACPs [51]. In Brassica napus and Arabidopsis, genetic engineering of acyl-ACP thioesterase (FatB) resulted in a maximum increase of 58% in palmitic acid content [52, 53]. Preventing the release of saturated fatty acids from ACP by downregulating FatB, which encodes a palmitoyl ACP thioesterase, lowered the levels of saturated fatty acids [54]. Variations in palmitate content in seed oil among plant species can be related to variations in the FatB gene [27, 52, 53].
Cardinal et al. [27] identified a deletion at an exon-intron junction in one homolog of the FatB gene, which was associated with low palmitic acid content in soybean cultivar Century (N79-2077 and N93-2008). Palmitate content was ~8% in Arabidopsis [55], ~2% in castor bean [56], and 7–11% in soybean [57]. Variations in the amount of palmitic acid in the seeds of Arabidopsis, castor bean, and soybean might be due to deletions in the first exon of the FatB gene, which can be further used to identify markers associated with a high level of palmitate (a saturated fatty acid) in total seed oil in plant species desired for biodiesel. Soybean lines with high levels of oleic acid (85%) and low levels of saturated fatty acids (6%) have been developed using a transgenic strategy that downregulates two genes involved in fatty acid synthesis, FAD2 and FatB. Downregulation of FAD2, which encodes a Δ12 fatty acid desaturase, prevented the conversion of oleic acid to polyunsaturated fatty acids, resulting in increased levels of oleic acid. Downregulation of FatB, which encodes a palmitoyl ACP thioesterase, prevented the release of saturated fatty acids from acyl carrier protein (ACP) and so lowered the levels of saturated fatty acids [54]. Hu et al. [14] sequenced the FAD2 gene fragment from the mutant line DMS100 and the wild-type line Quantum of Brassica napus and identified a single nucleotide mutation (C→T) in the FAD2 gene. This mutation created a stop codon (TAG), leading to premature termination of the peptide chain during translation and hence to the high oleic acid content of the mutant line DMS100. B. napus mutant line DMS100, which carries a G-to-A substitution at the 5′ splice site of intron 6 in FAD3, had reduced linolenic acid content in seed oil [39].
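The effect of a C→T substitution that creates a TAG stop codon, as reported for FAD2 in DMS100, can be illustrated with a minimal Python sketch. The 15 bp sequence and the mutated position below are hypothetical and chosen only for illustration; they are not the FAD2 sequence or the position reported by Hu et al. [14]. The helper names are likewise assumptions.

```python
from typing import Optional

# Stop codons of the standard genetic code, written in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(cds: str) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, or None."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i // 3
    return None


def substitute(cds: str, pos: int, new_base: str) -> str:
    """Return a copy of cds with the base at 0-based position pos replaced."""
    return cds[:pos] + new_base + cds[pos + 1:]


# Hypothetical coding fragment (NOT the real FAD2 sequence):
# ATG GCT CAG TTC GGA
wild_type = "ATGGCTCAGTTCGGA"

# A C->T change at the first base of the third codon turns CAG into TAG.
mutant = substitute(wild_type, 6, "T")

print(first_stop_index(wild_type))  # no in-frame stop codon in the fragment
print(first_stop_index(mutant))     # stop codon at the third codon (index 2)
```

In this toy fragment, translation of the mutant would terminate after two codons, which mirrors how a single nucleotide change can truncate the encoded peptide.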
In our analysis, insertions or deletions in the FAD2 and FAD3 genes of soybean might be the possible causes of higher oleate and linoleate content in high oil yielding soybean genotypes. The higher amount of ricinoleic acid in castor bean may be due to an insertion in the FAD2 gene that results in a higher level of oleic acid, because oleic acid is the substrate that fatty acid hydroxylase (FAH) converts to ricinoleate. The low level of linoleate in castor bean oil may be due to a deletion in the 3rd exon of the FAD3 gene, because each copy of FAD3 in Arabidopsis and soybean is conserved. In our analysis, the acyl-CoA:diacylglycerol acyltransferase (DGAT) gene was highly diverse, which might contribute to the overall variation in triacylglycerols in the oil among the plant species, as DGAT is a key enzyme in determining the levels of triacylglycerols in seed oils [58, 59]. Burgal et al. [58] demonstrated that coexpressing the castor bean DGAT2 gene with the castor FA 12 hydroxylase almost doubled the levels of hydroxylated fatty acids in neutral lipids (up to 30% of total, compared with 17% in the absence of DGAT2). In our study, most of the variations observed in the coding regions are insertions or deletions of 3 bp or a multiple of three, that is, of whole codons; such changes preserve the reading frame but add or remove amino acids, and the resulting functional changes are expected to be related to oil content. Thus, the sequence variations identified in fatty acid biosynthesis genes in this study can be tested for their functional role in altering the content and composition of seed oil in Jatropha.
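Whether a coding-region indel keeps or shifts the reading frame follows from its length modulo three. A minimal Python sketch of that arithmetic is given below; the function names are assumptions introduced for illustration. The 3 bp and 12 bp sizes are those reported for ACCase in this discussion, while the 1 bp case is a hypothetical contrast and not an observation from the study.

```python
def classify_indel(length_bp: int) -> str:
    """Classify a coding-region indel by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("indel length must be a positive number of base pairs")
    # A length divisible by three adds or removes whole codons only.
    return "in-frame" if length_bp % 3 == 0 else "frameshift"


def codons_changed(length_bp: int) -> int:
    """Number of whole codons added or removed by an in-frame indel."""
    if classify_indel(length_bp) != "in-frame":
        raise ValueError("only defined for in-frame indels")
    return length_bp // 3


# 3 bp and 12 bp: sizes reported for ACCase in this discussion.
# 1 bp: hypothetical contrast, not observed in the study.
for size in (3, 12, 1):
    print(size, "bp:", classify_indel(size))
```

An in-frame indel changes the protein by a small number of residues (one for 3 bp, four for 12 bp), whereas a frameshift alters every downstream codon.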

5. Conclusion

Comparative genomics of gene structures and coding sequence variations was performed on 261 genes involved in fatty acid biosynthesis, TAG synthesis, and oil body formation in four oil seed plant species, Arabidopsis, Brassica rapa, castor bean, and soybean, to understand whether differences in gene structure or coding sequence determine the preferential biosynthesis of particular fatty acids and their contents in the seeds of different plant species. The overall comparison of gene structures of fatty acid biosynthesis related genes provided insight into improving oil quality for biodiesel by exploiting the variations to engineer FAD5, FAD6, and FatB genes and so enhance the content of saturated fatty acids. The variations in FAD2, FAD3, stearoyl desaturase, DGAT-1, and DGAT-2 will be helpful for enhancing the oil content in plants. The close relationship between the genes under study would be helpful for comparative genomics of these genes in related species for oil content modification.