
BioBin: a bioinformatics tool for automating the binning of rare variants using publicly available biological knowledge.

Carrie B Moore, John R Wallace, Alex T Frase, Sarah A Pendergrass, Marylyn D Ritchie.

Abstract

BACKGROUND: As the cost of genome sequence data has fallen, interest has grown in rare variants and in methods to detect their association with disease. We developed BioBin, a flexible collapsing method inspired by biological knowledge that can be used to automate the binning of low frequency variants for association testing. We also built the Library of Knowledge Integration (LOKI), a repository of data assembled from public databases. LOKI contains resources such as dbSNP and Entrez Gene database information from the National Center for Biotechnology Information (NCBI), pathway information from Gene Ontology (GO), Protein families database (Pfam), Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, NetPath (signal transduction pathways), Open Regulatory Annotation Database (ORegAnno), Biological General Repository for Interaction Datasets (BioGRID), Pharmacogenomics Knowledge Base (PharmGKB), Molecular INTeraction database (MINT), and evolutionarily conserved regions (ECRs) from the UCSC Genome Browser. The novelty of BioBin is access to comprehensive knowledge-guided multi-level binning. For example, bin boundaries can be formed using genomic locations from functional regions, evolutionarily conserved regions, genes, and/or pathways.
METHODS: We tested BioBin using simulated data and 1000 Genomes Project low-coverage data, evaluating the method with simulated causative variants and with a pairwise comparison of rare variant (MAF < 0.03) burden differences between Yoruba individuals (YRI) and individuals of European descent (CEU). Lastly, we analyzed the NHLBI GO Exome Sequencing Project Kabuki dataset (Kabuki syndrome is a congenital disorder that affects multiple organs and often involves intellectual disability), contrasted with Complete Genomics data as controls.
RESULTS: Our simulation studies indicate that the type I error rate is controlled; however, power falls quickly for small sample sizes when variants have modest effect sizes. Using BioBin, we were able to find simulated variants in genes with fewer than 20 loci, but sensitivity was much lower in large bins. We also highlighted the scale of population stratification between two 1000 Genomes Project populations, CEU and YRI. Lastly, we applied BioBin to natural biological data from dbGaP and identified an interesting candidate gene for further study.
CONCLUSIONS: We have established that BioBin will be a very practical and flexible tool to analyze sequence data and potentially uncover novel associations between low frequency variants and complex disease.
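
The collapsing idea described in the abstract (group low frequency variants into bins whose boundaries come from genomic regions, then summarize each bin per sample) can be illustrated with a minimal sketch. This is not BioBin's implementation and does not use LOKI. The region list, variant list, function names, and the simple per-bin allele-count burden are all hypothetical, chosen only for illustration; the 0.03 threshold mirrors the MAF cutoff mentioned in the METHODS.

```python
from collections import defaultdict

# Hypothetical gene regions: (name, chromosome, start, end), inclusive coordinates.
REGIONS = [
    ("GENE_A", "1", 100, 500),
    ("GENE_B", "1", 400, 900),
]

# Hypothetical variants: (id, chromosome, position, minor allele frequency).
VARIANTS = [
    ("v1", "1", 150, 0.010),
    ("v2", "1", 450, 0.020),  # falls in both overlapping regions
    ("v3", "1", 600, 0.200),  # common variant, excluded by the MAF filter
    ("v4", "1", 950, 0.005),  # rare, but outside every region
]


def bin_rare_variants(variants, regions, maf_threshold=0.03):
    """Assign each variant with MAF below the threshold to every region containing it."""
    bins = defaultdict(list)
    for var_id, chrom, pos, maf in variants:
        if maf >= maf_threshold:
            continue  # keep only low frequency variants
        for name, r_chrom, start, end in regions:
            if chrom == r_chrom and start <= pos <= end:
                bins[name].append(var_id)
    return dict(bins)


def bin_burden(genotypes, bins):
    """Sum minor allele counts per bin for each sample.

    genotypes maps sample -> {variant id: minor allele count (0, 1, or 2)}.
    """
    return {
        sample: {
            name: sum(calls.get(v, 0) for v in members)
            for name, members in bins.items()
        }
        for sample, calls in genotypes.items()
    }


bins = bin_rare_variants(VARIANTS, REGIONS)
genotypes = {"s1": {"v1": 1, "v2": 2}, "s2": {"v2": 1}}
burden = bin_burden(genotypes, bins)
print(bins)
print(burden)
```

In this sketch a variant lying in overlapping regions is counted in each of them, which is one possible design choice; the per-sample bin totals would then feed whatever association test the analyst chooses.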

Year:  2013        PMID: 23819467      PMCID: PMC3654874          DOI: 10.1186/1755-8794-6-S2-S6

Source DB:  PubMed          Journal:  BMC Med Genomics        ISSN: 1755-8794            Impact factor:   3.063


