
Analysis of expressed sequence tags (ESTs) from cocoa (Theobroma cacao L.) upon infection with Phytophthora megakarya.

Sudalaimuthu Asari Naganeeswaran, Elain Apshara Subbian, Manimekalai Ramaswamy.   

Abstract

Phytophthora megakarya, the causative agent of cacao black pod disease in West African countries, causes extensive yield loss. In this study we analyzed four EST libraries derived from Phytophthora megakarya infected cocoa leaf and pod tissues. In total, 6379 redundant sequences were retrieved from the ESTtik database, and EST processing was performed using the SeqClean tool. Clustering and assembly with CAP3 generated 3333 non-redundant sequences (907 contigs and 2426 singletons). Primary sequence analysis of the 3333 non-redundant sequences showed a GC content of 42.7% and sequence lengths ranging from 101 to 2576 nucleotides. Functional analysis (BLAST, InterProScan, Gene Ontology and KEGG search) was then carried out, and 1230 orthologous genes were annotated. In total, 272 enzymes corresponding to 114 metabolic pathways were identified. Functional annotation revealed that most of the sequences are related to molecular function, stress response and biological processes. The annotated enzymes include aldehyde dehydrogenase (E.C: 1.2.1.3), catalase (E.C: 1.11.1.6), acetyl-CoA C-acetyltransferase (E.C: 2.3.1.9), threonine ammonia-lyase (E.C: 4.3.1.19), acetolactate synthase (E.C: 2.2.1.6) and O-methyltransferase (E.C: 2.1.1.68), which play an important role in amino acid biosynthesis and phenylpropanoid biosynthesis. All of this information was stored in a MySQL database management system for future use in reconstructing the biotic stress response pathway in cocoa.

Keywords:  EST; annotation; cDNA; cocoa

Year:  2012        PMID: 22359437      PMCID: PMC3282258          DOI: 10.6026/97320630008065

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Theobroma cacao (cocoa) is a diploid tree grown in tropical countries [1]. Worldwide, many people depend on cocoa for their income. Cocoa is grown under a range of conditions, from full sun to, more traditionally, shade. In India, cocoa is grown as a mixed crop under the shade of arecanut, coconut and oil palm. Demand for cocoa has increased tremendously, not only as a raw material for the chocolate industry but also for its flavor and other properties that impart several health benefits [2, 3]. Diseases are a major cause of declining cocoa production, with annual crop losses of 20–30% [4]. The major diseases of cocoa include black pod (Phytophthora spp.), witches' broom (Crinipellis perniciosa) and frosty pod rot (Moniliophthora roreri), which cause heavy production losses worldwide. Phytophthora megakarya, the causative agent of black pod disease in West African countries, is the most damaging pathogen in the cocoa industry. Although Phytophthora megakarya occurs only in Africa, the species Phytophthora palmivora and Phytophthora capsici are responsible for the disease in South America and India. Fungicides are used to control the disease, with varying success and at significant cost to smallholder farmers [5, 6]. Genomic research provides new tools to study the genetic and molecular bases of different traits. The complete genome of cocoa has recently been published [7]. Expressed Sequence Tags (ESTs) are sequenced regions of cDNA copies of mRNA that are expressed under different conditions and represent part of the transcribed portion of the genome [8]. ESTs can be used for gene annotation, gene discovery and sequence determination. Various cocoa EST sequencing projects have been carried out to understand the cocoa transcriptome [9, 10]. EST sequence information is essential for the molecular assays that lead to cocoa crop improvement.
With the objective of identifying the functional genes expressed under disease conditions (black pod), we analyzed four EST libraries derived from Phytophthora megakarya infected cocoa leaf and pod tissues. These studies should lead to the development of a secondary cocoa EST database for specific stress conditions (biotic, abiotic) that will be helpful to researchers working on cocoa crop improvement.

Methodology

Primary sequence source:

Four libraries of EST sequences derived from Phytophthora megakarya infected cocoa leaf and pod tissues belonging to the genotypes PNG and UPA134 were used in this study. In total, 6379 redundant EST sequences were retrieved from the ESTtik database [9] (Table 1; see supplementary material).

EST Analysis:

EST analysis includes the following steps: 1) EST preprocessing, 2) EST assembly, and 3) functional annotation. The implemented steps are illustrated in Figure 1.
Figure 1: EST analysis workflow.

EST pre-processing and assembly:

EST processing steps, namely removal of vector contamination, trimming of poly-A/T tails and low-complexity regions, and removal of linker and adaptor sequences, were performed using the SeqClean tool [11]. The vector contamination database UniVec was configured with local BLAST [12] and used within SeqClean. The RepeatMasker tool (http://www.repeatmasker.org) [13] was used to remove low-complexity regions from the EST sequences. Clustering and assembly were carried out with the CAP3 tool [14].
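As a rough illustration of one of the cleaning steps, the sketch below trims poly-A/T tails from a single EST. It is not SeqClean's algorithm: the function name `trim_poly_tails` and the minimum run length of 10 bases are assumptions chosen for the example, and vector, linker and adaptor screening are not covered.

```python
import re


def trim_poly_tails(seq: str, min_run: int = 10) -> str:
    """Strip a trailing poly-A run and a leading poly-T run.

    Hypothetical helper: min_run (assumed value) is the shortest
    homopolymer run that is treated as a tail.
    """
    seq = seq.upper()
    # Trailing poly-A tail, as found at the 3' end of an mRNA-derived read.
    seq = re.sub(r"A{%d,}$" % min_run, "", seq)
    # Leading poly-T run, as found when the read is the reverse complement.
    seq = re.sub(r"^T{%d,}" % min_run, "", seq)
    return seq


if __name__ == "__main__":
    print(trim_poly_tails("ACGTTGCA" + "A" * 15))
```

Runs shorter than `min_run` are left alone, so short A or T stretches inside or at the end of a genuine coding sequence are not removed.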

Primary sequence analysis:

Primary sequence analysis, covering the GC percentage, the average sequence length and the length range, was carried out using a custom Perl script (DSA.pl). To determine the number of clustered sequences present in each contig, the CAP3 assembly files (.ace) were analyzed using a second Perl script (cap3_analyzer.pl).
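The DSA.pl script itself is not shown in the paper. A minimal Python sketch of the same kind of summary (GC percentage, mean length and length range over a FASTA collection) could look as follows; the function names and the returned dictionary keys are assumptions made for this example.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA-formatted lines into a {name: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def summarize(records: Dict[str, str]) -> Dict[str, float]:
    """Pooled GC percentage plus mean, minimum and maximum length."""
    if not records:
        raise ValueError("no sequences to summarize")
    lengths = [len(seq) for seq in records.values()]
    total = sum(lengths)
    gc = sum(seq.count("G") + seq.count("C") for seq in records.values())
    return {
        "n": len(lengths),
        "gc_percent": 100.0 * gc / total,
        "mean_length": total / len(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
    }


if __name__ == "__main__":
    demo = [">seq1", "GGCC", ">seq2 toy record", "ATATAT"]
    print(summarize(parse_fasta(demo)))
```

Here the GC percentage is pooled over all bases in the collection rather than averaged per sequence; whether the original script used the pooled or the per-sequence definition is not stated.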

Functional annotation:

The non-redundant EST sequences were subjected to a blastx [12] similarity search. The set of homologous sequences was then made more stringent by retaining only hits with an E-value below 1e-10. Gene Ontology [15] search, enzyme search, InterProScan and KEGG mapping were carried out using the Blast2GO tool (www.blast2go.org) [16].
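The E-value cut-off can be sketched as a filter over BLAST results. The example assumes the hits are available in the standard 12-column tab-separated layout (query ID in column 1, subject ID in column 2, E-value in column 11); the paper does not state which output format was used, and the function name `best_hits` is hypothetical.

```python
from typing import Dict, Iterable, Tuple


def best_hits(lines: Iterable[str],
              max_evalue: float = 1e-10) -> Dict[str, Tuple[str, float]]:
    """Keep, per query, the lowest-E-value hit that is below max_evalue."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue >= max_evalue:
            continue  # not significant under the chosen threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


if __name__ == "__main__":
    # Toy rows with made-up subject IDs, for illustration only.
    rows = [
        "q1\tsubjA\t95.0\t200\t10\t0\t1\t600\t1\t200\t1e-50\t300",
        "q2\tsubjB\t40.0\t50\t30\t0\t1\t150\t1\t50\t1e-05\t40",
    ]
    print(best_hits(rows))
```

Queries whose best hit does not pass the threshold simply do not appear in the result, which corresponds to leaving those sequences without a similarity-based annotation.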

Database design:

The information obtained from the processing and annotation of the EST sequences was deposited in a MySQL relational database. Three tables were created using SQL to store the sequences, the BLAST hits and the functional annotation, respectively.
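The paper does not give the table definitions. The sketch below shows one way three linked tables of this kind could be laid out and queried by keyword. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and every table name, column name and demo row is an assumption made for illustration.

```python
import sqlite3

# Hypothetical schema: one table each for sequences, BLAST hits and
# functional annotation, linked through the sequence name.
SCHEMA = """
CREATE TABLE sequences (
    serial_no   INTEGER PRIMARY KEY,
    seq_name    TEXT UNIQUE NOT NULL,
    length      INTEGER,
    gc_percent  REAL,
    sequence    TEXT
);
CREATE TABLE blast_hits (
    serial_no       INTEGER PRIMARY KEY,
    seq_name        TEXT NOT NULL REFERENCES sequences(seq_name),
    hit_description TEXT,
    evalue          REAL
);
CREATE TABLE annotations (
    serial_no    INTEGER PRIMARY KEY,
    seq_name     TEXT NOT NULL REFERENCES sequences(seq_name),
    go_terms     TEXT,
    ec_number    TEXT,
    kegg_pathway TEXT
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding a single placeholder row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO sequences (seq_name, length, gc_percent, sequence) "
        "VALUES (?, ?, ?, ?)",
        ("demo_contig", 8, 50.0, "ACGTACGT"),
    )
    conn.execute(
        "INSERT INTO blast_hits (seq_name, hit_description, evalue) "
        "VALUES (?, ?, ?)",
        ("demo_contig", "catalase (demo row)", 1e-50),
    )
    conn.execute(
        "INSERT INTO annotations (seq_name, go_terms, ec_number, kegg_pathway) "
        "VALUES (?, ?, ?, ?)",
        ("demo_contig", None, "1.11.1.6", None),
    )
    conn.commit()
    return conn


def keyword_search(conn: sqlite3.Connection, keyword: str):
    """Join the three tables and match the keyword in the hit description."""
    return conn.execute(
        "SELECT s.seq_name, b.hit_description, a.ec_number "
        "FROM sequences s "
        "JOIN blast_hits b ON b.seq_name = s.seq_name "
        "JOIN annotations a ON a.seq_name = s.seq_name "
        "WHERE b.hit_description LIKE ?",
        ("%" + keyword + "%",),
    ).fetchall()


if __name__ == "__main__":
    print(keyword_search(build_demo_db(), "catalase"))
```

Keeping one row per sequence in each table, keyed by the same sequence name, lets a single join recover the sequence, its best hit and its annotation together.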

Results and Discussion

The 6379 EST sequences retrieved from the ESTtik database were processed using the SeqClean tool, resulting in 6349 good-quality EST sequences that were used for further analysis. Contig assembly with the CAP3 tool yielded 3333 non-redundant EST sequences (907 contigs and 2426 singletons) (Table 2; see supplementary material). Primary sequence analysis showed that the total GC content of the non-redundant EST collection was 42.7%, the average length was 419 residues per sequence, and the sequence length ranged from 101 to 2576 residues (Table 2; see supplementary material). CAP3 clustered the ESTs into 907 contigs. The number of sequences per contig ranged from 2 to 141. Contig374 contained the largest number of sequences, 141 (Figure 2).
Figure 2: Contigs vs number of sequences clustered in the corresponding contigs.
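The per-contig counts plotted in Figure 2 can be read directly from the contig header lines of a CAP3 .ace file. The sketch below is not the authors' cap3_analyzer.pl; it assumes the standard ACE layout, in which each contig starts with a line of the form `CO <name> <bases> <reads> <segments> <U|C>`, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def reads_per_contig(ace_lines: Iterable[str]) -> Dict[str, int]:
    """Map each contig name to the number of reads assembled into it."""
    counts: Dict[str, int] = {}
    for line in ace_lines:
        # Contig header: CO <name> <n_bases> <n_reads> <n_segments> <U|C>
        if line.startswith("CO "):
            fields = line.split()
            counts[fields[1]] = int(fields[3])
    return counts


def largest_contig(counts: Dict[str, int]) -> Tuple[str, int]:
    """Return the contig with the most reads and its read count."""
    name = max(counts, key=counts.get)
    return name, counts[name]


if __name__ == "__main__":
    # Toy ACE fragment with made-up contigs, for illustration only.
    toy = [
        "AS 2 5",
        "",
        "CO ToyContigA 12 2 1 U",
        "ACGTACGTACGT",
        "",
        "CO ToyContigB 8 3 1 U",
        "ACGTACGT",
    ]
    counts = reads_per_contig(toy)
    print(counts, largest_contig(counts))
```

Free-text tag blocks in an ACE file could in principle contain a line starting with `CO `; a stricter parser would track which section it is in rather than matching on the prefix alone.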

A similarity search (blastx) was run against the non-redundant database. In total, 1230 orthologous genes were annotated with a significant E-value of < 1e-10. The BLAST results showed that Contig473 (2 sequences assembled) had high similarity (95.75%, E-value 3.5E-138) to heat shock protein, and Contig396 (3 sequences assembled) had high similarity (93%, E-value 1.9E-71) to low molecular weight heat shock protein. Contig338, Contig345, Contig668 and Contig687 showed similarity to defense-related proteins. Various biotic stress related proteins could be identified in the present study, including GTP-binding protein (Contig366, Contig635), chitinase (Contig568), beta-cyanoalanine synthase (Contig292), metallothionein (Contig542), thaumatin, trypsin inhibitor (Contig810), heat shock protein, hydroxyproline-rich glycoprotein (Contig526), O-methyltransferase (Contig846), ABC transporter family proteins, 12-oxophytodienoate reductase (Contig230, Contig379), carbonic anhydrase (Contig576), glutamine synthetase (Contig308), thioredoxin (Contig80, Contig280, Contig667), cyclophilin (Contig758), F-box family protein (Contig660), glutathione peroxidase (Contig184), ascorbate peroxidase (Contig573, Contig487), lipid transfer protein (Contig480, Contig342), cellulose synthase (Contig356), expansin and pathogen-related protein (Contig868). Other major proteins involved in cell growth, cellular communication, cellular transport, transport mechanisms, energy pathways, protein destination and protein synthesis can also be found in our EST collection (see supplementary material).
Gesteira et al. [17] identified pathogenesis-related proteins, receptor kinase, MAP kinase and trypsin inhibitors as proteins related to Moniliophthora perniciosa infection in cocoa through comparative analysis of ESTs. In a similar work, Verica et al. [18] reported that proteins such as chitinase, heat shock proteins and beta-cyanoalanine synthase were upregulated in cocoa treated with inducers of the defense response. The cDNAs developed for the differentially expressed genes in cocoa in response to witches' broom disease were putatively categorized as belonging to the signal transduction, response to biotic and abiotic stress, metabolism, RNA and DNA metabolism, protein metabolism and cellular maintenance classes [19].
Gene Ontology (GO) classification, HMMER search against the Pfam database, InterProScan and enzyme search were carried out using the Blast2GO tool. The Gene Ontology results revealed that most of the sequences were related to cellular function, stress response and biological process (see supplementary material). The enzyme search against KEGG annotated 272 enzymes belonging to 114 metabolic pathways. The annotated enzymes were aldehyde dehydrogenase (E.C: 1.2.1.3), catalase (E.C: 1.11.1.6), acetyl-CoA C-acetyltransferase (E.C: 2.3.1.9), threonine ammonia-lyase (E.C: 4.3.1.19), acetolactate synthase (E.C: 2.2.1.6), dihydroxy-acid dehydratase (E.C: 4.2.1.9), O-methyltransferase (E.C: 2.1.1.68) and cinnamoyl-CoA reductase (E.C: 1.2.1.44), most of which play an important role in amino acid biosynthesis. Many other enzymes involved in the biosynthesis of secondary metabolites, fatty acid metabolism, and fructose and mannose metabolism were also annotated.
Three tables were created using SQL commands in the MySQL relational database management system. The results of EST processing and primary sequence analysis were organized in the first table. The second table held the information obtained from the similarity search, and the remaining functional annotation results were saved in the third table. The three tables were logically linked, and each row in a table was assigned a unique serial number. The information was deposited in 3333 rows in each table and can be retrieved by either logical or keyword search.

Conclusion

Four libraries of EST sequences derived from Phytophthora megakarya infected cocoa tissues have been analysed. Functional annotation yielded 1230 orthologous genes, including 272 enzymes; the others were defense-related genes and genes with cellular functions. The annotated information was organized in a MySQL database. This information will be useful for the reconstruction of biotic stress response pathways in cocoa.
  14 in total

1.  The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro.

Authors:  Evelyn Camon; Michele Magrane; Daniel Barrell; David Binns; Wolfgang Fleischmann; Paul Kersey; Nicola Mulder; Tom Oinn; John Maslen; Anthony Cox; Rolf Apweiler
Journal:  Genome Res       Date:  2003-03-12       Impact factor: 9.043

2.  Isolation of ESTs from cacao (Theobroma cacao L.) leaves treated with inducers of the defense response.

Authors:  Joseph A Verica; Siela N Maximova; Mary D Strem; John E Carlson; Bryan A Bailey; Mark J Guiltinan
Journal:  Plant Cell Rep       Date:  2004-08-31       Impact factor: 4.570

3.  STATUS OF CACAO WITCHES' BROOM: biology, epidemiology, and management.

Authors:  L H Purdy; R A Schmidt
Journal:  Annu Rev Phytopathol       Date:  1996       Impact factor: 13.078

4.  Genes differentially expressed in Theobroma cacao associated with resistance to witches' broom disease caused by Crinipellis perniciosa.

Authors:  Gildemberg Amorim Leal; Paulo S B Albuquerque; Antonio Figueira
Journal:  Mol Plant Pathol       Date:  2007-05       Impact factor: 5.663

5.  The genome of Theobroma cacao.

Authors:  Xavier Argout; Jerome Salse; Jean-Marc Aury; Mark J Guiltinan; Gaetan Droc; Jerome Gouzy; Mathilde Allegre; Cristian Chaparro; Thierry Legavre; Siela N Maximova; Michael Abrouk; Florent Murat; Olivier Fouet; Julie Poulain; Manuel Ruiz; Yolande Roguet; Maguy Rodier-Goud; Jose Fernandes Barbosa-Neto; Francois Sabot; Dave Kudrna; Jetty Siva S Ammiraju; Stephan C Schuster; John E Carlson; Erika Sallet; Thomas Schiex; Anne Dievart; Melissa Kramer; Laura Gelley; Zi Shi; Aurélie Bérard; Christopher Viot; Michel Boccara; Ange Marie Risterucci; Valentin Guignon; Xavier Sabau; Michael J Axtell; Zhaorong Ma; Yufan Zhang; Spencer Brown; Mickael Bourge; Wolfgang Golser; Xiang Song; Didier Clement; Ronan Rivallan; Mathias Tahi; Joseph Moroh Akaza; Bertrand Pitollat; Karina Gramacho; Angélique D'Hont; Dominique Brunel; Diogenes Infante; Ismael Kebe; Pierre Costet; Rod Wing; W Richard McCombie; Emmanuel Guiderdoni; Francis Quetier; Olivier Panaud; Patrick Wincker; Stephanie Bocs; Claire Lanaud
Journal:  Nat Genet       Date:  2010-12-26       Impact factor: 38.330

6.  Fungal and plant gene expression during the colonization of cacao seedlings by endophytic isolates of four Trichoderma species.

Authors:  B A Bailey; H Bae; M D Strem; D P Roberts; S E Thomas; J Crozier; G J Samuels; Ik-Young Choi; K A Holmes
Journal:  Planta       Date:  2006-07-11       Impact factor: 4.116

7.  SSR mining in coffee tree EST databases: potential use of EST-SSRs as markers for the Coffea genus.

Authors:  Valérie Poncet; Myriam Rondeau; Christine Tranchant; Anne Cayrel; Serge Hamon; Alexandre de Kochko; Perla Hamon
Journal:  Mol Genet Genomics       Date:  2006-08-19       Impact factor: 3.291

8.  Comparative analysis of expressed genes from cacao meristems infected by Moniliophthora perniciosa.

Authors:  Abelmon S Gesteira; Fabienne Micheli; Nicolas Carels; Aline C Da Silva; Karina P Gramacho; Ivan Schuster; Joci N Macêdo; Gonçalo A G Pereira; Júlio C M Cascardo
Journal:  Ann Bot       Date:  2007-06-08       Impact factor: 4.357

9.  Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research.

Authors:  Ana Conesa; Stefan Götz; Juan Miguel García-Gómez; Javier Terol; Manuel Talón; Montserrat Robles
Journal:  Bioinformatics       Date:  2005-08-04       Impact factor: 6.937

10.  Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions.

Authors:  Xavier Argout; Olivier Fouet; Patrick Wincker; Karina Gramacho; Thierry Legavre; Xavier Sabau; Ange Marie Risterucci; Corinne Da Silva; Julio Cascardo; Mathilde Allegre; David Kuhn; Joseph Verica; Brigitte Courtois; Gaston Loor; Regis Babin; Olivier Sounigo; Michel Ducamp; Mark J Guiltinan; Manuel Ruiz; Laurence Alemanno; Regina Machado; Wilberth Phillips; Ray Schnell; Martin Gilmour; Eric Rosenquist; David Butler; Siela Maximova; Claire Lanaud
Journal:  BMC Genomics       Date:  2008-10-30       Impact factor: 3.969
