High density genotype storage for plant breeding in the Chado schema of Breedbase.

Nicolas Morales, Guillaume J Bauchet, Titima Tantikanjana, Adrian F Powell, Bryan J Ellerbrock, Isaak Y Tecle, Lukas A Mueller.

Abstract

Modern breeding programs routinely use genome-wide information for selecting individuals to advance. The large volumes of genotypic information required present a challenge for data storage and query efficiency. Major use cases require genotyping data to be linked with trait phenotyping data. In contrast to phenotyping data, which are often stored in relational database schemas, next-generation genotyping data are traditionally stored in non-relational storage systems because of their very large size. This study presents a novel data model implemented in Breedbase (https://breedbase.org/) for uniting relational phenotyping data and non-relational genotyping data within the open-source PostgreSQL database engine. Breedbase is an open-source web database designed to manage all of a breeder's informatics needs: management of field experiments, phenotypic and genotypic data collection and storage, and statistical analyses. The genotyping data are stored in a PostgreSQL data type known as binary JavaScript Object Notation (JSONb), where the JSON structures closely follow the Variant Call Format (VCF) data model. The Breedbase genotyping data model can handle different ploidy levels, structural variants, and any genotype encoded in VCF. JSONb is both compressed and indexed, resulting in a space- and time-efficient system. Furthermore, file caching maximizes data retrieval performance. Integration of all breeding data within the Chado database schema retains referential integrity that may be lost when genotyping and phenotyping data are stored in separate systems. Benchmarking demonstrates that the system is fast enough for computation of a genomic relationship matrix (GRM) and a genome-wide association study (GWAS) for datasets involving 1,325 diploid Zea mays, 314 triploid Musa acuminata, and 924 diploid Manihot esculenta samples genotyped with 955,690, 142,119, and 287,952 genotyping-by-sequencing (GBS) markers, respectively.
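
To make the idea of JSON structures that follow the VCF data model more concrete, the following is a minimal Python sketch of how one VCF data line might be turned into a JSON-serializable structure. It is an illustration only, not the Breedbase implementation: the function name `vcf_line_to_json`, the key names (`chrom`, `pos`, `name`, `ref`, `alt`, `dosage`), and the example marker are all assumptions, since the abstract does not specify the actual JSON layout. The dosage is computed as the count of non-reference alleles in the GT field, which works for any ploidy level.

```python
import json


def vcf_line_to_json(sample_names, line):
    """Convert one tab-separated VCF data line into (marker, calls) dicts.

    The key names used here are hypothetical; they are not taken from
    the Breedbase schema.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, marker_name, ref, alt = fields[:5]
    format_keys = fields[8].split(":")  # e.g. ["GT", "DP"]

    marker = {
        "chrom": chrom,
        "pos": int(pos),
        "name": marker_name,
        "ref": ref,
        "alt": alt.split(","),
    }

    calls = {}
    for sample, raw in zip(sample_names, fields[9:]):
        # Keep every FORMAT field of the VCF record for this sample.
        values = dict(zip(format_keys, raw.split(":")))
        # GT alleles are separated by "/" (unphased) or "|" (phased);
        # the number of alleles reflects the ploidy of the sample.
        alleles = values.get("GT", ".").replace("|", "/").split("/")
        if "." in alleles:
            values["dosage"] = None  # missing call
        else:
            values["dosage"] = sum(1 for a in alleles if a != "0")
        calls[sample] = values
    return marker, calls


# Hypothetical example: one diploid, one triploid, one missing call.
samples = ["s1", "s2", "s3"]
line = "1\t1000\tS1_1000\tA\tG\t.\tPASS\t.\tGT:DP\t0/1:12\t1/1/0:7\t./.:0"
marker, calls = vcf_line_to_json(samples, line)

# The resulting text could be stored in a PostgreSQL JSONB column.
payload = json.dumps({marker["name"]: calls})
```

Keeping all FORMAT fields per sample, rather than only the genotype call, is what lets a structure of this kind represent any genotype that VCF can encode.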

Year:  2020        PMID: 33175872      PMCID: PMC7657515          DOI: 10.1371/journal.pone.0240059

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


