Identification of single nucleotide polymorphism in ginger using expressed sequence tags.

Arumugam Chandrasekar, Aikkal Riju, Kandiyl Sithara, Sahadevan Anoop, Santhosh J Eapen.

Abstract

UNLABELLED: Ginger (Zingiber officinale Rosc) (Family: Zingiberaceae) is a herbaceous perennial whose rhizomes are used as a spice, and it is well known for its medicinal applications. EST-derived SNPs are a free by-product of the currently expanding EST (Expressed Sequence Tag) databases. The development of high-throughput methods for detecting SNPs (Single Nucleotide Polymorphisms) and small indels (insertions/deletions) has led to a revolution in their use as molecular markers. The 38139 available ginger EST sequences were mined from dbEST at NCBI, and the CAP3 program was used to assemble them into contigs. Candidate SNPs and indel polymorphisms were detected using the Perl script AutoSNP version 1.0, which used 31905 ESTs for detecting SNP and indel sites. We found 64026 SNP sites and 7034 indel polymorphisms, with a frequency of 0.84 SNPs per 100 bp. Among the three tissues from which the EST libraries had been generated, rhizomes had the highest frequency (1.08 SNPs/indels per 100 bp), leaves had the lowest (0.63 per 100 bp), and roots were intermediate (0.82 per 100 bp). The transition to transversion ratio is 0.90; overall, transversions outnumber transitions among the detected SNPs. These detected SNPs can be used as markers for genetic studies. AVAILABILITY: The results of the present study are hosted on our web server at www.spices.res.in/spicesnip.

Keywords:  Expressed Sequence Tag; Ginger; Indel; SNPs; Zingiber officinale Rosc; in silico

Year:  2009        PMID: 20198184      PMCID: PMC2828891          DOI: 10.6026/97320630004119

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Ginger (Zingiber officinale) is a perennial plant in the family Zingiberaceae whose rhizome is commonly used as a cooking spice throughout the world. The ginger plant has a long history of cultivation; it is known to have originated in China and then spread to India, Southeast Asia, West Africa, and the Caribbean. India is a leading producer of ginger, which is cultivated in most Indian states; Kerala and Meghalaya are the major ginger-growing states in the country. The rhizomes and stems of ginger have assumed significant roles in Chinese, Japanese, and Indian medicine since the 1500s [1]. The oleoresin of ginger is often contained in digestive, antitussive, antiflatulent, laxative, and antacid compounds [2]. Ginger has a large genome of 23618 Mbp distributed in 2n=22 chromosomes. The phytochemistry and pharmacology of ginger are well studied, but the molecular biological processes involved have not yet been studied. Single-pass sequencing of the 5' and/or 3' ends of randomly selected cDNA clones is an effective approach to obtaining genetic information about an organism. These sequences can serve as markers or tags for transcripts, and have been used in the development of SNP markers for reference genetic maps and in the recovery of full-length cDNA and genomic sequences. Expressed sequence tags (ESTs) are also useful for the discovery of novel genes, investigation of genes of unknown function, comparative genomic studies, and recognition of exon/intron boundaries. Currently, there are 38139 ginger sequences available in GenBank, and the majority of these are ESTs deposited at NCBI (dbEST) http://www.ncbi.nlm.nih.gov/dbEST/. The lack of sequence information has limited the progress of gene discovery and characterization, global transcript profiling, probe design for the development of gene arrays, and generation of molecular markers for ginger.
In this study, we categorized the 38139 ESTs into three tissue libraries: leaves (13274 ESTs), rhizomes (12763 ESTs) and roots (12092 ESTs). The availability of these EST sequences will allow comparative genomic studies between ginger and other monocotyledonous and dicotyledonous plants, the development of molecular markers for establishing a reference genetic map, and the design and construction of cDNA microarrays for global gene expression profiling. Single nucleotide polymorphisms (SNPs) are a second class of genetic markers that can be mined from sequence data and are useful for characterizing allelic variation, for genome-wide mapping, and as a tool for marker-assisted selection. In the field of human genetics, SNPs are a major focus of efforts to increase the efficiency of mapping [3-6] and are already being used for detection and mapping of a variety of diseases [7-9]. In many crop plants, SNPs are present with sufficient frequency to offer an alternative for genetic mapping and marker-assisted selection. Although SNPs can be identified by sequencing selected DNA fragments, a practical limitation of this approach for ginger is that the sequencing error rate is often higher than the polymorphism rate. The cost of SNP discovery through sequencing amplified fragments is therefore high, even with reductions in the cost of sequencing. The objective of the research described in this paper was to assess the potential of existing public databases for the discovery of single nucleotide polymorphisms. We mined the updated EST tissue libraries of Zingiber officinale to find SNP and indel polymorphisms. The SNP-detecting Perl script AutoSNP version 1.0 was used to identify the SNP and indel polymorphisms and the types of DNA substitution (transversion versus transition) and indel [10].
There are several other SNP detection programs, such as SEAN [11], PolyPhred [12], PolyBayes [13], TRACE_DIFF [14] and HarvEST (http://harvest.ucr.edu), but AutoSNP offers a user-friendly approach and reports interpretable results as an HTML file. Ten kinds of SNP/indel (two types of transition, four types of transversion and four groups of indels) are thus possible at the SNP/indel sites in EST libraries. We used three tissue libraries [15,16] of Zingiber officinale.
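The count of ten SNP/indel kinds mentioned above can be checked by enumerating every unordered pair of the four bases plus a gap. The following is a minimal sketch, not part of AutoSNP; the gap symbol `-` and the function names are assumptions made for illustration.

```python
# A transition stays within one chemical class (purine or pyrimidine);
# a transversion crosses between the two classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
GAP = "-"  # assumed symbol for an indel in an aligned column
VALID = PURINES | PYRIMIDINES | {GAP}


def classify_variant(allele_a: str, allele_b: str) -> str:
    """Return 'transition', 'transversion' or 'indel' for a biallelic site."""
    a, b = allele_a.upper(), allele_b.upper()
    if a not in VALID or b not in VALID:
        raise ValueError("alleles must be one of A, C, G, T or the gap symbol")
    if a == b:
        raise ValueError("alleles must differ")
    if GAP in (a, b):
        return "indel"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def variant_key(allele_a: str, allele_b: str) -> str:
    """Unordered label, so that A/G and G/A are counted as the same kind."""
    return "/".join(sorted((allele_a.upper(), allele_b.upper())))


# Enumerate every unordered pair of distinct symbols.
symbols = "ACGT" + GAP
kinds = {}
for i, x in enumerate(symbols):
    for y in symbols[i + 1:]:
        kinds[variant_key(x, y)] = classify_variant(x, y)

for label, kind in sorted(kinds.items()):
    print(label, kind)
```

Treating pairs as unordered is what gives ten kinds; counting A to G and G to A separately would double the substitution classes.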

Methodology

The EST database of NCBI (dbEST release 092509) contains 38139 Zingiber officinale expressed sequence tags. We mined these 38139 EST sequences, which comprise three tissue libraries: leaves, 13274 (DV544275-ES560515); rhizomes, 12763 (DY350707-DY363469); and roots, 12092 (DY363470-DY375561). The CAP3 program was used to assemble the EST sequences into contigs. The SNP detection tool AutoSNP version 1.0 was used to find candidate SNPs in these libraries. AutoSNP requires input in ACE or FASTA format, and the Perl script was edited manually to analyse input in either format. The sequence assembly program CAP3 is integrated in AutoSNP to assemble FASTA files into contigs (http://bioweb.pasteur.fr/seqanal/interfaces/cap3.html) [17]. The ratio of transition (Ts) to transversion (Tv) substitutions was also calculated for all the ginger libraries.
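To illustrate the general idea of calling candidate SNPs from assembled ESTs, the sketch below scans the columns of an aligned contig and keeps positions where at least two alleles are each supported by several reads. This is not AutoSNP's code: the function name, the `min_redundancy` threshold and the toy reads are hypothetical, and the actual script applies its own rules.

```python
from collections import Counter


def candidate_sites(aligned_reads, min_redundancy=2):
    """Return (column, allele_counts) for columns with >= 2 well-supported alleles.

    aligned_reads: equal-length strings from one contig, '-' marking a gap.
    min_redundancy: hypothetical minimum number of reads that must carry an
    allele before it is counted, to filter out likely sequencing errors.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    sites = []
    for col in range(length):
        counts = Counter(read[col].upper() for read in aligned_reads)
        counts.pop("N", None)  # ignore ambiguous base calls
        supported = {a: n for a, n in counts.items() if n >= min_redundancy}
        if len(supported) >= 2:
            sites.append((col, supported))
    return sites


# Toy alignment, invented for illustration only.
reads = [
    "ACGTAC-TA",
    "ACGTACGTA",
    "ACCTACGTA",
    "ACCTAC-TA",
    "ACGTACGTT",
]
sites = candidate_sites(reads)
for col, alleles in sites:
    print(col, alleles)
```

In the toy alignment the final column differs in only one read, so it is not reported: a variant seen once is more likely a sequencing error than a polymorphism.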

Discussion

In this study, a total of 64026 SNP sites and 7034 indel polymorphisms were discovered in the 38139 ESTs analyzed, with an average frequency of 0.84 SNPs per 100 bp. Results of the tissue-wise SNP and indel discovery are listed in Table 1 (see supplementary material) and Figure 1. The ginger leaf tissue libraries showed the highest number of indels (1983) compared with the other tissues. Rhizome tissue showed the highest SNP frequency, about 1 SNP per 100 bp. In ginger, a total of 27083 transitions, 29909 transversions and 7034 indels were found in the analysis. Tissue-wise, however, transitions were most numerous in rhizome (13433). Rhizome tissue had more SNPs than the other tissues, and the rhizome is more expressed than the other tissues. Across all detected SNPs, the overall transition to transversion ratio is 0.90. In other plants, studies on the occurrence and nature of SNPs are beginning to receive considerable attention, particularly in Arabidopsis, where over 37,000 SNPs have been identified through the comparison of two accessions [18]. In maize, a frequency of one non-coding SNP per 31 bp and one coding SNP per 124 bp has been reported in 18 genes assayed in 36 inbred lines [19]. Moreover, recent evidence indicates that SNPs appear to be even more abundant in plant systems than in the human genome. Germano and Klein [20] identified five SNPs in 1 kb of cDNA of Picea rubens and Picea mariana, and also discovered SNPs in the chloroplasts of these species. Recently, in soybean (Glycine max), two SNPs were found in approximately 400 bp [21]. In maize (Zea mays), SNPs have been detected even more frequently, with one SNP approximately every 48 bp in 3' untranslated regions and every 130 bp in coding regions [22,23].
In an SNP analysis of apple (Malus domestica) ESTs, bi-allelic SNPs occurred on average every 706 bp [24], and a study of maize ESTs [25] also showed a relative increase in transversion and transition sites. This in silico analysis will help ginger researchers in studies of single nucleotide polymorphism and nucleotide substitution in this important crop.
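The overall counts reported in the Discussion can be checked with a few lines of arithmetic. The sketch below recomputes the Ts/Tv ratio from the stated transition and transversion counts and sums the three classes. The helper `per_100_bp` is hypothetical, and the assembled sequence length it needs is not given in the text, so its example call uses invented numbers.

```python
# Overall counts as reported in the Discussion.
transitions = 27083
transversions = 29909
indels = 7034

ts_tv_ratio = transitions / transversions
total_sites = transitions + transversions + indels


def per_100_bp(n_sites: int, assembled_bp: int) -> float:
    """Variant frequency per 100 bp of assembled sequence (hypothetical helper)."""
    if assembled_bp <= 0:
        raise ValueError("assembled_bp must be positive")
    return 100.0 * n_sites / assembled_bp


print(f"Ts/Tv = {ts_tv_ratio:.4f}")                 # Ts/Tv = 0.9055
print(f"substitutions + indels = {total_sites}")    # 64026

# Invented example: 84 sites in 10,000 bp correspond to 0.84 per 100 bp.
print(per_100_bp(84, 10_000))
```

The ratio of about 0.9055 agrees with the reported value of 0.90, and the three classes together sum to 64026, the same figure given for the total number of SNP sites.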
Figure 1: DNA substitution and indel polymorphism of SNPs in Ginger EST libraries.

Large-scale sequencing of Expressed Sequence Tags and complete genomes offers information of use to plant breeding programs. With the completion of the first crop genome sequencing projects [26,27], the potential for plant breeding to be impacted by new technology has never been greater. In ginger, sequencing projects offer a potential solution to the scarcity of markers that can be used in elite breeding populations. Of special interest is the ability to discover DNA polymorphisms by mining sequence data [28,29]. The frequency of single nucleotide polymorphisms that we detected is considerably lower than that reported for maize, wheat, barley, and soybean. Not surprisingly, it is also lower than the one SNP per approximately 100 bases that was detected in some of the tissue libraries [30]. There was a relative increase in the proportion of transitions (6805) over transversions (7258) in ginger ESTs, except in the leaf libraries (Figure 1). The C/T transition was found to be frequent in ginger (Table 1 in supplementary material). A high frequency of the C to T mutation is usually attributed to methylation. We also used the Shannon information index to analyze the proportions of the ten possible types of SNP/indel. ESTs from root tissue showed the highest index value (0.164), whereas leaves had the lowest (0.150) and rhizome a slightly higher value (0.152). The higher number and Shannon index of SNP/indel sites in root tissue than in the other tissues also give additional information about genomic variation in genes expressed specifically in root tissue. The ratio of transitions to transversions (Ts/Tv) has been very useful for comparing the genotypes of hepatitis C virus and also the differences among the mitochondrial genomes of animals. Our study provides a method that compares the ten possible types of SNP/indel in a single index. The detected SNPs can be accessed online at www.spices.res.in/spicesnip/
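For reference, the standard Shannon index over category proportions p_i is H = -sum(p_i * ln p_i). The sketch below applies that textbook definition to a list of counts, one per SNP/indel type. The text does not state the logarithm base or any normalization used, so values from this sketch should not be expected to reproduce the reported indices of 0.150 to 0.164; the example counts are invented.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p * ln p) over categories with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one count must be positive")
    h = 0.0
    for n in counts:
        if n < 0:
            raise ValueError("counts must be non-negative")
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


# Invented counts for the ten SNP/indel types, for illustration only.
example_counts = [120, 110, 40, 35, 30, 25, 10, 8, 7, 5]
print(round(shannon_index(example_counts), 3))

# Sanity checks: ten equally frequent types give ln(10); a single type gives 0.
uniform = shannon_index([1] * 10)
single = shannon_index([50] + [0] * 9)
```

The index is highest when all ten types are equally frequent and falls towards zero as one type dominates, which is what allows the ten proportions to be summarized in a single number per tissue.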

Conclusion

In total, we identified 64026 candidate SNP polymorphisms, with a frequency of 0.84 SNPs per 100 bp, in ginger EST sequence data, along with two measures of confidence for each predicted polymorphism. Segregation of these SNPs with haplotype, along with validation, demonstrates that candidate SNPs with high redundancy and co-segregation confidence scores are likely to represent true SNPs. The transition to transversion ratio and indel size frequencies correspond to those observed by analysis methods of SNP discovery and suggest that the majority of the SNPs and indels predicted using this approach represent true genetic variation in ginger. Overall, transversions are more frequent because ginger is vegetatively propagated through the rhizome. This in silico analysis shows the potential of SNP markers for use in ginger breeding, and the online information we created will help in designing new primers, developing more markers and saturating linkage maps.
  26 in total

1.  Redundancy based detection of sequence polymorphisms in expressed sequence tag data using autoSNP.

Authors:  Gary Barker; Jacqueline Batley; Helen O' Sullivan; Keith J Edwards; David Edwards
Journal:  Bioinformatics       Date:  2003-02-12       Impact factor: 6.937

2.  Data mining of public SNP databases for the selection of intragenic SNPs.

Authors:  Jan Aerts; Yves Wetzels; Nadine Cohen; Jeroen Aerssens
Journal:  Hum Mutat       Date:  2002-09       Impact factor: 4.878

3.  Automated detection of point mutations using fluorescent sequence trace subtraction.

Authors:  J K Bonfield; C Rada; R Staden
Journal:  Nucleic Acids Res       Date:  1998-07-15       Impact factor: 16.971

4.  A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms.

Authors:  R Sachidanandam; D Weissman; S C Schmidt; J M Kakol; L D Stein; G Marth; S Sherry; J C Mullikin; B J Mortimore; D L Willey; S E Hunt; C G Cole; P C Coggill; C M Rice; Z Ning; J Rogers; D R Bentley; P Y Kwok; E R Mardis; R T Yeh; B Schultz; L Cook; R Davenport; M Dante; L Fulton; L Hillier; R H Waterston; J D McPherson; B Gilman; S Schaffner; W J Van Etten; D Reich; J Higgins; M J Daly; B Blumenstiel; J Baldwin; N Stange-Thomann; M C Zody; L Linton; E S Lander; D Altshuler
Journal:  Nature       Date:  2001-02-15       Impact factor: 49.962

5.  A draft sequence of the rice genome (Oryza sativa L. ssp. indica).

Authors:  Jun Yu; Songnian Hu; Jun Wang; Gane Ka-Shu Wong; Songgang Li; Bin Liu; Yajun Deng; Li Dai; Yan Zhou; Xiuqing Zhang; Mengliang Cao; Jing Liu; Jiandong Sun; Jiabin Tang; Yanjiong Chen; Xiaobing Huang; Wei Lin; Chen Ye; Wei Tong; Lijuan Cong; Jianing Geng; Yujun Han; Lin Li; Wei Li; Guangqiang Hu; Xiangang Huang; Wenjie Li; Jian Li; Zhanwei Liu; Long Li; Jianping Liu; Qiuhui Qi; Jinsong Liu; Li Li; Tao Li; Xuegang Wang; Hong Lu; Tingting Wu; Miao Zhu; Peixiang Ni; Hua Han; Wei Dong; Xiaoyu Ren; Xiaoli Feng; Peng Cui; Xianran Li; Hao Wang; Xin Xu; Wenxue Zhai; Zhao Xu; Jinsong Zhang; Sijie He; Jianguo Zhang; Jichen Xu; Kunlin Zhang; Xianwu Zheng; Jianhai Dong; Wanyong Zeng; Lin Tao; Jia Ye; Jun Tan; Xide Ren; Xuewei Chen; Jun He; Daofeng Liu; Wei Tian; Chaoguang Tian; Hongai Xia; Qiyu Bao; Gang Li; Hui Gao; Ting Cao; Juan Wang; Wenming Zhao; Ping Li; Wei Chen; Xudong Wang; Yong Zhang; Jianfei Hu; Jing Wang; Song Liu; Jian Yang; Guangyu Zhang; Yuqing Xiong; Zhijie Li; Long Mao; Chengshu Zhou; Zhen Zhu; Runsheng Chen; Bailin Hao; Weimou Zheng; Shouyi Chen; Wei Guo; Guojie Li; Siqi Liu; Ming Tao; Jian Wang; Lihuang Zhu; Longping Yuan; Huanming Yang
Journal:  Science       Date:  2002-04-05       Impact factor: 47.728

6.  A draft sequence of the rice genome (Oryza sativa L. ssp. japonica).

Authors:  Stephen A Goff; Darrell Ricke; Tien-Hung Lan; Gernot Presting; Ronglin Wang; Molly Dunn; Jane Glazebrook; Allen Sessions; Paul Oeller; Hemant Varma; David Hadley; Don Hutchison; Chris Martin; Fumiaki Katagiri; B Markus Lange; Todd Moughamer; Yu Xia; Paul Budworth; Jingping Zhong; Trini Miguel; Uta Paszkowski; Shiping Zhang; Michelle Colbert; Wei-lin Sun; Lili Chen; Bret Cooper; Sylvia Park; Todd Charles Wood; Long Mao; Peter Quail; Rod Wing; Ralph Dean; Yeisoo Yu; Andrey Zharkikh; Richard Shen; Sudhir Sahasrabudhe; Alun Thomas; Rob Cannings; Alexander Gutin; Dmitry Pruss; Julia Reid; Sean Tavtigian; Jeff Mitchell; Glenn Eldredge; Terri Scholl; Rose Mary Miller; Satish Bhatnagar; Nils Adey; Todd Rubano; Nadeem Tusneem; Rosann Robinson; Jane Feldhaus; Teresita Macalma; Arnold Oliphant; Steven Briggs
Journal:  Science       Date:  2002-04-05       Impact factor: 47.728

Review 7.  Ginger--chemistry, technology, and quality evaluation: part 1.

Authors:  V S Govindarajan
Journal:  Crit Rev Food Sci Nutr       Date:  1982       Impact factor: 11.176

8.  Association of the G289S single nucleotide polymorphism in the HSD17B3 gene with prostate cancer in Italian men.

Authors:  Katia Margiotti; Eugene Kim; C Leigh Pearce; Enrico Spera; Giuseppe Novelli; Juergen K V Reichardt
Journal:  Prostate       Date:  2002-09-15       Impact factor: 4.104

9.  Analysis and functional annotation of expressed sequence tags (ESTs) from multiple tissues of oil palm (Elaeis guineensis Jacq.).

Authors:  Chai-Ling Ho; Yen-Yen Kwan; Mei-Chooi Choi; Sue-Sean Tee; Wai-Har Ng; Kok-Ang Lim; Yang-Ping Lee; Siew-Eng Ooi; Weng-Wah Lee; Jin-Ming Tee; Siang-Hee Tan; Harikrishna Kulaveerasingam; Sharifah Shahrul Rabiah Syed Alwee; Meilina Ong Abdullah
Journal:  BMC Genomics       Date:  2007-10-22       Impact factor: 3.969

10.  SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines.

Authors:  Ada Ching; Katherine S Caldwell; Mark Jung; Maurine Dolan; Oscar S Smith; Scott Tingey; Michele Morgante; Antoni J Rafalski
Journal:  BMC Genet       Date:  2002-10-07       Impact factor: 2.797

