Kuang-Lim Chan1,2, Tatiana V Tatarinova3,4, Rozana Rosli1,5, Nadzirah Amiruddin1, Norazah Azizi1, Mohd Amin Ab Halim1, Nik Shazana Nik Mohd Sanusi1, Nagappan Jayanthi1, Petr Ponomarenko4, Martin Triska6, Victor Solovyev7, Mohd Firdaus-Raih2, Ravigadevi Sambanthamurthi1, Denis Murphy5, Eng-Ti Leslie Low8.
Abstract
BACKGROUND: Oil palm is an important source of edible oil. The importance of the crop, together with its long breeding cycle (10-12 years), led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, contained many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-related genes, especially fatty acid (FA)-related genes, are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, the identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools.
Keywords: Fatty acids; Gene prediction; Intronless; Oil palm; Resistance genes; Seqping
Year: 2017 PMID: 28886750 PMCID: PMC5591544 DOI: 10.1186/s13062-017-0191-4
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Fig. 1 Integration workflow of Fgenesh++ and Seqping gene predictions. Trans – Gene models with oil palm transcriptome evidence; Prot – Gene models with RefSeq protein evidence. # The 26,059 gene models formed the representative gene set that was used for further analysis. The representative gene set was also used to identify and characterize oil palm IGs, R genes and FA biosynthesis genes
Fig. 2 Overlap thresholds assessed using the rate of increase of single-isoform loci. Based on the widening divergence at 85%, this level was selected as the overlap threshold
Fig. 3 Distribution of oil palm gene models. a Number of genes vs. number of exons per gene. b Number of genes vs. length of CDS
Fig. 4 GC3 distribution in oil palm gene models. a GC (red) and GC3 (blue) composition of coding regions of E. guineensis. b Genome signature for GC3-rich and -poor genes. c GC3 gradient along the open reading frames of GC3-rich and -poor genes. d CG3 skew gradient along the open reading frames of GC3-rich and -poor genes. Panels c and d: the x-axis is the number of codons in the coding sequence. Panel d: C3 and G3 are the frequencies of cytosine and guanine, respectively, in the third position of the codon. CG3 is the combined frequency of cytosine and guanine in the third position of the codon
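The third-position quantities named in the Fig. 4 caption can be illustrated with a minimal Python sketch. The function names `third_bases`, `gc3_content` and `cg3_skew` are hypothetical, and the skew formula (C3 - G3) / (C3 + G3) is an assumption based on the common definition; the caption itself does not state the formula or the software used by the authors.

```python
def third_bases(cds: str) -> str:
    """Return the third base of every complete codon in a coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return seq[2:usable:3]


def gc3_content(cds: str) -> float:
    """Fraction of codons whose third base is G or C (GC3)."""
    third = third_bases(cds)
    if not third:
        raise ValueError("CDS must contain at least one complete codon")
    return sum(base in "GC" for base in third) / len(third)


def cg3_skew(cds: str) -> float:
    """Assumed definition: (C3 - G3) / (C3 + G3) over third codon positions."""
    third = third_bases(cds)
    c3 = third.count("C")
    g3 = third.count("G")
    if c3 + g3 == 0:
        return 0.0  # no C or G in third positions; skew is undefined, report 0
    return (c3 - g3) / (c3 + g3)


# Codons ATG GCC AAA TAA have third bases G, C, A, A.
example = "ATGGCCAAATAA"
print(gc3_content(example))  # 0.5
print(cg3_skew(example))     # 0.0
```

Applying such functions in a sliding window of codons along each open reading frame would give per-position values of the kind plotted in panels c and d, although the window size and binning used for the figure are not given here.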
Fig. 5 GC3 contents of oil palm intronless and multi-exonic genes
Fig. 6 Classification of oil palm intronless genes (IG) in different taxonomy groups. The Venn diagram shows the projections of 26,059 oil palm high quality loci and 3,658 oil palm IG (in parentheses) into the three domains of life based on homology: archaea, bacteria and eukaryotes. The sub-diagram shows the distribution of oil palm IG from the eukaryote domain into three major taxonomy groups of life: Green Plants, Fungi and Animals. ORFans are unique sequences that share no significant similarity with sequences from other organisms
Fig. 7 Classification of candidate R genes. a Distribution of the genes in oil palm, A. thaliana, Z. mays, O. sativa, S. bicolor and V. carteri. b Examples of key domains identified via InterProScan in oil palm candidate R genes. Numbers of identified candidate oil palm genes are in brackets
Fig. 8 Fatty acid biosynthesis in E. guineensis. a Schematic pathway diagram for fatty acid biosynthesis. Numbers of identified oil palm candidate genes are in brackets. b Fatty acid composition in mesocarp and kernel
Fig. 9 Transcriptome analysis of a FABF, b FAB2, c FAD2, d FAD3, e FATA and f FATB genes in mesocarp and kernel
Fig. 10 Evolutionary relationship of FAB2 in oil palm (E. guineensis), A. thaliana and Z. mays. Analyses were carried out using the UPGMA method in MEGA 6 software. Abbreviations: Eg - E. guineensis; At - A. thaliana; Zm - Z. mays