BACKGROUND: The fermented dried seeds of Theobroma cacao (cacao tree) are the main ingredient in chocolate. World cocoa production was estimated at 3 million tons in 2010, with an estimated average annual growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases including black pod, frosty pod, and witches' broom. To address these issues, genome-sequencing efforts have recently been initiated to facilitate identification of genetic markers and genes that could be used to accelerate the release of robust T. cacao cultivars. However, the sequencing strategies selected have unavoidably run into problems inherent in the assembly and resolution of distal regions of complex eukaryotic genomes, such as gaps, chimeric joins, and unresolvable repeat-induced compressions.
RESULTS: Here, we describe the construction of a BAC-based integrated genetic-physical map of the T. cacao cultivar Matina 1-6, which is designed to augment and enhance these sequencing efforts. Three BAC libraries, each providing 10× coverage, were constructed and fingerprinted. A total of 230 genetic markers from a high-resolution genetic recombination map and 96 Arabidopsis-derived conserved ortholog set (COS) II markers were anchored using pooled overgo hybridization. A dense tile path consisting of 29,383 BACs was selected and end-sequenced. The physical map consists of 154 contigs and 4,268 singletons. Forty-nine contigs are genetically anchored and ordered to chromosomes for a total span of 307.2 Mbp. The 105 unanchored contigs span 67.4 Mbp, and therefore the estimated genome size of T. cacao is 374.6 Mbp. A comparative analysis with A. thaliana, V. vinifera, and P. trichocarpa suggests that comparisons of the genome assemblies of these distantly related species could provide insights into genome structure, evolutionary history, conservation of functional sites, and improvements in physical map assembly. A comparison between the two T. cacao cultivars Matina 1-6 and Criollo indicates a high degree of collinearity in their genomes, although rearrangements were also observed.
CONCLUSIONS: The results presented in this study are a stand-alone resource for functional exploitation and enhancement of Theobroma cacao, and they are also expected to complement and augment ongoing genome-sequencing efforts. This resource will serve as a template for refinement of the T. cacao genome through gap-filling, targeted re-sequencing, and resolution of repetitive DNA arrays.
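The genome-size estimate reported in the results is simply the sum of the anchored and unanchored contig spans. As a minimal sketch of that arithmetic, using only the figures given in the abstract, the Python below recomputes the unanchored contig count, the estimated genome size, and the fraction of the estimate that is genetically anchored. The function and variable names are illustrative assumptions and are not part of the study's own tooling.

```python
# Figures taken from the abstract; all names here are hypothetical,
# chosen for illustration only.
TOTAL_CONTIGS = 154
ANCHORED_CONTIGS = 49
ANCHORED_SPAN_MBP = 307.2
UNANCHORED_SPAN_MBP = 67.4


def summarize_physical_map(total_contigs, anchored_contigs,
                           anchored_span_mbp, unanchored_span_mbp):
    """Recompute the summary statistics quoted in the abstract."""
    unanchored_contigs = total_contigs - anchored_contigs
    # Round to one decimal place to match the precision of the inputs
    # and to avoid floating-point noise in the sum.
    genome_size_mbp = round(anchored_span_mbp + unanchored_span_mbp, 1)
    anchored_fraction = anchored_span_mbp / genome_size_mbp
    return {
        "unanchored_contigs": unanchored_contigs,
        "estimated_genome_size_mbp": genome_size_mbp,
        "anchored_fraction": anchored_fraction,
    }


summary = summarize_physical_map(
    TOTAL_CONTIGS, ANCHORED_CONTIGS, ANCHORED_SPAN_MBP, UNANCHORED_SPAN_MBP
)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these inputs the anchored contigs account for roughly four fifths of the estimated genome size, which is consistent with the 49 anchored and 105 unanchored contigs summing to the 154 contigs reported.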