Literature DB >> 31390382

Genome-wide association study for leaf area, rachis length and total dry weight in oil palm (Eleaeisguineensis) using genotyping by sequencing.

B Kalyana Babu1, R K Mathur1, G Ravichandran1, P Anitha1, M V B Venu1.   

Abstract

The marker-trait association for complex traits using genotyping by sequencing (GBS) method is being widely spread in plants. The study aimed to identify significant single nucleotide polymorphism (SNP) associations for rachis length (RL), leaf area (LA) and total dry weight (TrDW) in oil palm among diverse African germplasm. The Illumina NextSeq platform has been used for SNP genotyping and retained 4031 fully informative SNPs after applying the filter criterion. These 4031 SNPs were used for genome wide association study for the above three traits. The LD decay rates of the African germplasm using GBS data of SNP is observed to be 25 Kb at 0.45 of average pair wise correlation coefficient (r2). Association mapping led to the identification of seven significant associations for three traits using MLM approach at a P value of ≤ 0.001. Three associations were identified for total dry weight, two each for leaf area index and rachis length. The qtlLA1 was found to be highly significant at a P value of 7.39E-05 (18.4% phenotypic variance) which is located on chromosome 4. Two QTLs (qtlLA2 and qtlRL1) were located on chromosome 1, which explained 11.9% and 12.4% of phenotypic variance respectively. Three QTLs for total dry weight were located on chromosome 2, 14 and 16, all-together explained 40% phenotypic variance. The results showed that the SNP-trait associations identified in the present study could be used in selection of elite oil palm germplasm for higher yields.

Entities:  

Mesh:

Substances:

Year:  2019        PMID: 31390382      PMCID: PMC6685610          DOI: 10.1371/journal.pone.0220626

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Oil palm (Elaeis guineensis Jacq.) belongs to the family Arecaceae, which is a major contributor (approximately 40%) of world edible vegetable oil. The genome size of oil palm is 1.8 Gb with haploid chromosome number of 16. The genome sequence of an elite dura palm was published [1] and freely available in the open source. It is a perennial crop having potential oil yield capacity of 10 t/ha, against the current yields which varied between 2 to 6 t/ha [2]. Biotechnological tools like molecular markers facilitates in bridging the gap for genetic improvement in yield, oil quality and other important agro-morphological traits of interest. Leaf area and rachis length are found to be highly correlated with oil palm yield which is proven in other crops as well. Hence, leaf area and rachis length are considered as important traits influencing the yield in oil palm. Several studies using GBS for genome-wide association study (GWAS) were reported in rice [3], date palm [4] and oil palm [5], but no reports is available for leaf area, rachis length and total dry weight in oil palm using African germplasm. Molecular markers like RFLPs, AFLPs [6], EST-SSRs [7] and SSRs [5] were used for identification of QTLs for important traits in oil palm. However, genome-wide association study (GWAS) is often hindered due to lack of high diverse genetic pool and well structured germplasm [8]. The GWAS analysis using SNPs emerged as a powerful tool initially in human population to identify complex diseases and then widely implemented in several crops including rice [9], foxtail millet [10-11], maize [12] and in some tree species [13]. The genotyping by sequencing (GBS) method based on restriction digestion of DNA samples has been widely used for identification of QTLs. In oil palm, there are few reports on GBS for GWAS studies for different traits like oil to bunch, oil to dry mesocarp etc. which were mostly based on linkage mapping populations. Recently, Ithnin et al. [14] performed GWAS for oil yield and vegetative traits where they identified 19 QTLs for 8 traits. However, all the studies were reported on F1 populations, which lacks clear segregation pattern which can be improved by using advanced segregating generations or highly diverse natural population for association mapping [15]. Till now, no report is available on GWAS for leaf area, rachis length and total dry weight content using SNPs by association mapping. In perennial crops, natural populations offer advantages like much higher mapping resolution, greater allele number and broader reference population, and less research time in establishing an association [16]. The African sub-continent holds a large diverse germplasm which has highly potential oil yielding and quality traits [17]. In the last 50 years, there was high increase in crude palm oil yield (2.0 to 4.1 t/ha) by conventional breeding, which is still much lower than the theoretical estimates of 18 t/ha [18]. Oil palm yield is highly influenced by leaf area, rachis length and total dry weight content [19]. The shape, leaf area and rachis length of crowns determine light interception and thus influence yield of oil palms (Elaeis guineensis Jacq.) by increasing the photosynthetic efficiency as evidenced by high significant correlation observed by several workers [14]. The long breeding cycle of oil palm makes a complicated process for oil palm improvement by conventional breeding. Hence, molecular markers like SNPs plays important role in identification of markers linked to important traits. Hence, in the present study African germplasm were used for identifying SNP-trait associations using GWAS by GBS method. The objectives of the study are 1) Phenotypic characterization of oil palm germplasm for leaf area, rachis length and total dry matter weight, and 2) GWAS analysis for identification of SNP-trait associations for leaf area, rachis length and total dry weight content in oil palm germplasm.

Methodology

Plant materials and DNA extraction

Oil palm germplasm representing four African countries were used for SNP genotyping (S1 Fig). The germplasm used in the study were obtained from Malaysian Palm Oil Board (MPOB), Malaysia as part of MoU between ICAR-Indian Institute of Oil Palm Research (ICAR-IIOPR) and MPOB. The palms are grown at the ICAR-Indian Institute of Oil Palm Research (ICAR-IIOPR), Pedavegi, Andhra Pradesh, India experimental fields as per the standard procedures (latitude 160 480N, longitude 81070E). All the germplasm belongs Dura fruit type. The 96 oil palm germplasm were treated identically in terms of fertilizer application and pest controls. The samples selection was based on the phenotypic variation of selected morphological traits. The experiments were conducted with the consent of Director, ICAR-IIOPR, India. The genomic DNA of oil palm genotypes was isolated by standard method as described by Murray and Thomson [20] with few modifications as described in Babu et al. [21]. The genomic DNA quality and quantity was checked on 0.8% agarose gel using lambda DNA as standard.

Phenotypic data analysis

The phenotypic data of leaf area, rachis length and total dry weight content of oil palm were recorded for four consecutive years (2010–2014) according to Corley et al. [22]. The traits used for association mapping are leaf area, rachis length and total dry weight content. Statistical analysis of the traits was estimated for their mean, range, standard deviation and 95% confidence interval. The phenotypic data of rachis length, leaf area and total dry weight is given in S1 Table. The correlation coefficient between the traits and the above statistical parameters calculated using the software JMP [23] and correlation thresholds were considered significant at alpha of 0.001 and 0.01. The distribution of phenotypic data of leaf area and rachis length is given in Fig 1.
Fig 1

The bar charts for rachis length (RL) and leaf area (LA) of oil palm.

Genotyping by sequencing, SNP calling and annotation

SNP genotyping among the oil palm genotypes was done by following the GBS approach as given by Elshire et al. [24] with ApeKI restriction enzyme used for complexity reduction. The library was prepared using paired end method in the Illumina NextSeq 500 platform. The GBS was done by outsourcing to the Bio Serve (India, Pvt. Ltd). Data normalization has been done at minimum allele frequency (MAF) of 0.05, and by removing 80% missing SNP calls. Only SNP calls >90% has been considered for further analysis. The sequences were mapped to oil palm reference genome [1] and SNPs were called using TASSEL ver. 4.3.10 GBS pipe line [25].

Genome-wide association mapping

The population structure analysis was done by model based clustering method [26] as adopted in the STRUCTURE v2.3.4 software [27]. Marker-trait associations were performed by using data of 96 oil palm germplasm, SNP genotypic data by GBS approach and Q matrix obtained from the population structure data by using the software TASSEL [28]. The GWAS was performed using MLM approach embedded in the TASSEL software. The kinship matrix was used in addition to the genotypic, phenotypic and Q matrix data in the MLM approach. The significant threshold for the association was set at P of 1X10-4. The genome wide patterns of LD decay were estimated by average of R2 in distance intervals across the 16 chromosomes of whole genome of oil palm.

Results and discussion

Genotyping using GBS for SNP discovery

A set of 96 diverse germplasm from four African countries viz., Cameroon, Tanzania, Guinea-Bissau, and Zambia were used for SNP discovery by genotyping by sequencing method. The genomic DNA of 96 African germplasm were subjected to ApeKI digested paired end libraries and sequenced on Illumina NextSeq500 platform. The sequencing resulted in 325 million reads covering 50.78 Gb, at a mean of 3.4 million reads per sample. The sequence data obtained by GBS is lesser than earlier studies [29-31]. Pootakham et al. [13] obtained 524 million reads in F1 oil palm population. Teh et al. [32] obtained 900 million raw reads from 132 oil palm F1 progeny. The GBS raw data was imputed to SNP calling and data normalization has been done using MAF of 0.05 and by removing 80% missing SNP calls. Only SNP calls >90% has been considered for further analysis. A total of 325 million raw reads generated, of which 2.4 million SNPs passed bar coding and quality thresholds. Finally, 4031SNPs were used for mapping to the reference genome with a range between 157 on chromosome 15 to 455 on chromosome 1 (S2 Table). The physical location of SNPs mapped on the oil palm genome was given in Fig 2. The number of SNPs was similar to the earlier reports [14] where they identified 4451 SNPs across the 422 samples. The SNP allele data files were given in supplementary information (S1 Text).
Fig 2

The location of QTLs on oil palm chromosomes.

The population structure analysis

The population structure analysis is conducted among African germplasm of oil palm to estimate the population structure and admixture using SNP markers. To estimate the exact population structure among the oil palm germplasm, Ks were ran from 1 to 10 with 10 iterations and grouping was done based on LnP(D) values. The highest peak value of ΔK was observed at K = 4. High amount of admixture also presented among different groups which may be due to involvement of exotic germplasm. The same type of admixture was also found in earlier reports [32]. The structure analysis revealed that there are four structural patterns existed among the African germplasm which divided them into four groups (Fig 3). The grouping pattern observed to be according the geographic origin of the germplasm to some extent. High amount of admixture of alleles also evident in the populations, which showed a high germplasm exchange between African countries for better breeding programmes. It is also evident that all the present germplasm of oil palm were derived from two Bogor palms, hence high admixture percentage is present.
Fig 3

The population structure among the 96 African germplasm as revealed by SNPs by genotyping by sequencing method.

Linkage disequilibrium (LD)

The pattern of LD decay is crucial for identifying significant QTLs of any association mapping studies [33] and for interpretation of association peaks [34]. To estimate the mapping resolution for genome wide association mapping, the average extent of LD decay was quantified. The LD decay rates of the oil palm germplasm using GBS data of SNP was 25 Kb at 0.45 of average pair wise correlation coefficient (r2). The long range LD in the highly diverse populations can be further explained by breeding selections. The process of selective sweep will much enhance the breeding value of the crops like oil palm. The LD of oil palm plantations in general decays considerably faster than in annual crops like rice and foxtail millet. In both the crops LD decays 100 Kb to 200 Kb (2–5), which decays to 50% of its initial value by 1 Kb with in 150 Kb. This difference may be due to low genome coverage of markers and fewer genotypes in the study. Similar LD decay also observed by Teh et al. [32] where they also found 25Kb and 20Kb at 0.12 and 0.15 of average R2 value.

Genome wide association study (GWAS)

High yielding oil palm germplasm is an immediate output required for the farming community. There is a big gap in the theoretical CPO yield (18 tons/ha) and global actual average yield (2.0 to 4.0 tons/ha) over the past 50 years [18]. Due to long breeding cycle of oil palm, it became a complicated process in selection of high yielding germplasm. Oil palm yield is highly influenced by leaf area and rachis length. These are complex, quantitative and often controlled by large number of genetic loci [35]. It is important to search for effective QTLs associated with these important traits and utilizing in identification of desirable palms with good yield related traits. Hence, the present investigation aimed to identify significant QTLs using SNPs by genotyping by sequencing method. The study represents the first GWAS study of African germplasm by association mapping approach. Association of leaf area, rachis length and total dry weight using SNPs, resulted identification of seven highly significant QTLs at a P value of ≤ 0.001 by MLM approach. A high significant cut off was given in order to remove the errors generated due to false associations as expected. Three SNP loci were identified to be associated with total dry weight content, followed by two each for leaf area and rachis length. The physical location of QTLs linked to three traits on chromosomes given in Fig 2. All the seven significant QTLs were identified on five chromosomes viz., 1, 2, 4, 14, and 16 which explained a phenotypic variance from 11.5 to 18.5%. Two significant QTLs were identified for leaf area, which together explained 30.5% of phenotypic variance. Among them, qtlLA1 (SGI|741241788|) found significant at P value of 7.39E-05 (18.5% phenotypic variance). The qtlLA2 (SGI|741241880|) linked at P of 0.001 and explained 12% R2 value. These two QTLs located on chromosome 4 and 1 respectively. The manhatton plot for leaf area given in Fig 4.
Fig 4

The Man-Hatton plot for leaf area of oil palm as revealed by association mapping.

Three QTLs for total dry weight content and two for rachis length were identified. The loci for total dry weight content were located on chromosome 2, 14 and 16. The QTL (qtlTDM1) located on chromosome 14 found significant at P of 0.0005 which explained 15% phenotypic variance. The three QTLs for total dry weight content together explained 42% of total phenotypic variance. Likewise two QTLs were found significant for rachis length. The manhatton plots for rachis length is given in Fig 5. These two QTLs located on chromosome 1 and 16, together explained 24% of R2 value (Table 1). The present study results are similar to the earlier reports [14–15, 31]. Ithnin et al [14] and Billottee et al [31] found significant SNP loci for rachis length on chromosome 2, 4, 10 and 16 on F1 progeny by linkage mapping studies. In the present study also QTLs for rachis length identified on 16th chromosome along with unique loci on chromosome 1. The results showed that chromosome 16 harboring the major QTL for rachis length which need attention in cloning the whole gene for rachis length. Billottee et al [31] also identified QTLs for rachis length on 2, 4 and 16 chromosomes. These two reports were based on F1 progeny, however the present study is based on genome-wide association study which gives more polymorphism and high precision in identifying QTLs for specific traits. In addition to the above results, the study also identified unique loci for leaf area and rachis length. The identified QTLs in the present study helps in selection of desired oil palm germplasm for high yield characters in different genetic backgrounds which helps in reducing time and increases accuracy.
Fig 5

The Man-Hatton plot for rachis length of oil palm.

Table 1

The details of significant QTLs identified by GBS method and their P values and phenotypic variances.

TraitQTL nameLinked SNP lociP valueR2 (%)Chromosome number
Leaf AreaqtlLA1SGI|741241788|7.39E-0518.54
qtlLA2SGI|741241880|0.001121
Total dry weight contentqtlTDM1SGI|741241610|0.00051514
qtlTDM2SGI|741241857|0.000913.52
qtlTDM3SGI|741241587|0.000913.516
Rachis lengthqtlRL1SGI|741241880|0.00112.51
qtlRL2SGI|741241589|0.00111.516

Conclusion

The genotyping by sequencing method used for GWAS study of African germplasm using GBS method to identify the QTLs for three yield related traits viz., leaf area, rachis length and total dry weight content. Association mapping resulted in identification of seven highly significant QTLs for different traits at a P value of ≤ 0.001 by MLM approach. More genetic loci were identified to be associated with total dry weight, followed by leaf area and rachis length. The QTLs identified in the present study could be used in selection of elite oil palm germplasm for higher yields.

The map showing the distribution of oil palm germplasm in four African countries.

(DOCX) Click here for additional data file.

The phenotypic data of the traits under association mapping.

(DOCX) Click here for additional data file.

The number of SNPs chromosome wise along with their positions.

(DOCX) Click here for additional data file.

Allele calling data files for some markers.

(DOCX) Click here for additional data file.
  22 in total

1.  A method for fine mapping quantitative trait loci in outbred animal stocks.

Authors:  R Mott; C J Talbot; M G Turri; A C Collins; J Flint
Journal:  Proc Natl Acad Sci U S A       Date:  2000-11-07       Impact factor: 11.205

2.  Genome-wide association studies of 14 agronomic traits in rice landraces.

Authors:  Xuehui Huang; Xinghua Wei; Tao Sang; Qiang Zhao; Qi Feng; Yan Zhao; Canyang Li; Chuanrang Zhu; Tingting Lu; Zhiwu Zhang; Meng Li; Danlin Fan; Yunli Guo; Ahong Wang; Lu Wang; Liuwei Deng; Wenjun Li; Yiqi Lu; Qijun Weng; Kunyan Liu; Tao Huang; Taoying Zhou; Yufeng Jing; Wei Li; Zhang Lin; Edward S Buckler; Qian Qian; Qi-Fa Zhang; Jiayang Li; Bin Han
Journal:  Nat Genet       Date:  2010-10-24       Impact factor: 38.330

3.  Recombination and linkage disequilibrium in Arabidopsis thaliana.

Authors:  Sung Kim; Vincent Plagnol; Tina T Hu; Christopher Toomajian; Richard M Clark; Stephan Ossowski; Joseph R Ecker; Detlef Weigel; Magnus Nordborg
Journal:  Nat Genet       Date:  2007-08-05       Impact factor: 38.330

4.  TASSEL: software for association mapping of complex traits in diverse samples.

Authors:  Peter J Bradbury; Zhiwu Zhang; Dallas E Kroon; Terry M Casstevens; Yogesh Ramdoss; Edward S Buckler
Journal:  Bioinformatics       Date:  2007-06-22       Impact factor: 6.937

5.  A high-performance computing toolset for relatedness and principal component analysis of SNP data.

Authors:  Xiuwen Zheng; David Levine; Jess Shen; Stephanie M Gogarten; Cathy Laurie; Bruce S Weir
Journal:  Bioinformatics       Date:  2012-10-11       Impact factor: 6.937

6.  QTL detection by multi-parent linkage mapping in oil palm (Elaeis guineensis Jacq.).

Authors:  N Billotte; M F Jourjon; N Marseillac; A Berger; A Flori; H Asmady; B Adon; R Singh; B Nouy; F Potier; S C Cheah; W Rohde; E Ritter; B Courtois; A Charrier; B Mangin
Journal:  Theor Appl Genet       Date:  2010-02-25       Impact factor: 5.699

7.  A haplotype map of genomic variations and genome-wide association studies of agronomic traits in foxtail millet (Setaria italica).

Authors:  Guanqing Jia; Xuehui Huang; Hui Zhi; Yan Zhao; Qiang Zhao; Wenjun Li; Yang Chai; Lifang Yang; Kunyan Liu; Hengyun Lu; Chuanrang Zhu; Yiqi Lu; Congcong Zhou; Danlin Fan; Qijun Weng; Yunli Guo; Tao Huang; Lei Zhang; Tingting Lu; Qi Feng; Hangfei Hao; Hongkuan Liu; Ping Lu; Ning Zhang; Yuhui Li; Erhu Guo; Shujun Wang; Suying Wang; Jinrong Liu; Wenfei Zhang; Guoqiu Chen; Baojin Zhang; Wei Li; Yongfang Wang; Haiquan Li; Baohua Zhao; Jiayang Li; Xianmin Diao; Bin Han
Journal:  Nat Genet       Date:  2013-06-23       Impact factor: 38.330

8.  A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species.

Authors:  Robert J Elshire; Jeffrey C Glaubitz; Qi Sun; Jesse A Poland; Ken Kawamoto; Edward S Buckler; Sharon E Mitchell
Journal:  PLoS One       Date:  2011-05-04       Impact factor: 3.240

9.  A large maize (Zea mays L.) SNP genotyping array: development and germplasm genotyping, and genetic mapping to compare with the B73 reference genome.

Authors:  Martin W Ganal; Gregor Durstewitz; Andreas Polley; Aurélie Bérard; Edward S Buckler; Alain Charcosset; Joseph D Clarke; Eva-Maria Graner; Mark Hansen; Johann Joets; Marie-Christine Le Paslier; Michael D McMullen; Pierre Montalent; Mark Rose; Chris-Carolin Schön; Qi Sun; Hildrun Walter; Olivier C Martin; Matthieu Falque
Journal:  PLoS One       Date:  2011-12-08       Impact factor: 3.240

10.  Mapping quantitative trait loci (QTLs) for fatty acid composition in an interspecific cross of oil palm.

Authors:  Rajinder Singh; Soon G Tan; Jothi M Panandam; Rahimah Abdul Rahman; Leslie C L Ooi; Eng-Ti L Low; Mukesh Sharma; Johannes Jansen; Suan-Choo Cheah
Journal:  BMC Plant Biol       Date:  2009-08-26       Impact factor: 4.215

View more
  3 in total

Review 1.  CRISPR/Cas mediated base editing: a practical approach for genome editing in oil palm.

Authors:  Rajesh Yarra; Hongxing Cao; Longfei Jin; Yang Mengdi; Lixia Zhou
Journal:  3 Biotech       Date:  2020-06-15       Impact factor: 2.406

Review 2.  Phenomics, genomics of oil palm (Elaeis guineensis Jacq.): way forward for making sustainable and high yielding quality oil palm.

Authors:  B Kalyana Babu; R K Mathur; P Anitha; G Ravichandran; H P Bhagya
Journal:  Physiol Mol Biol Plants       Date:  2021-03-14

3.  Genome wide association study of agronomic and seed traits in a world collection of proso millet (Panicum miliaceum L.).

Authors:  Sameh Boukail; Mercy Macharia; Mara Miculan; Alberto Masoni; Alessandro Calamai; Enrico Palchetti; Matteo Dell'Acqua
Journal:  BMC Plant Biol       Date:  2021-07-09       Impact factor: 4.215

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.