Cloud-based interactive analytics for terabytes of genomic variants data.

Cuiping Pan, Gregory McInnes, Nicole Deflaux, Michael Snyder, Jonathan Bingham, Somalee Datta, Philip S Tsao.

Abstract

MOTIVATION: Large-scale genomic sequencing is now widely used to address questions in diverse realms such as biological function, human diseases, evolution, ecosystems, and agriculture. Given the quantity and diversity of these data, a robust and scalable solution for data handling and analysis is needed.
RESULTS: We present interactive analytics using a cloud-based columnar database built on Dremel to perform information compression, comprehensive quality controls, and biological information retrieval in large volumes of genomic data. We demonstrate that such Big Data computing paradigms can provide orders-of-magnitude faster turnaround for common genomic analyses, transforming long-running batch jobs submitted via a Linux shell into questions that can be asked from a web browser in seconds. Using this method, we assessed a study population of 475 deeply sequenced human genomes for genomic call rate, genotype and allele frequency distribution, variant density across the genome, and pharmacogenomic information.
AVAILABILITY AND IMPLEMENTATION: Our analysis framework is implemented on Google Cloud Platform and BigQuery. Code is available at https://github.com/StanfordBioinformatics/mvp_aaa_codelabs.
CONTACT: cuiping@stanford.edu or ptsao@stanford.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.
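
To make two of the quality-control metrics named in the abstract concrete, the following is a minimal Python sketch of per-sample call rate and per-variant alternate allele frequency. It is not taken from the authors' repository and does not use BigQuery; the toy records, the function names `call_rate_per_sample` and `alt_allele_frequency`, and the convention that `-1` marks a no-call are all assumptions made for illustration. In a columnar database the same aggregations would typically be expressed as grouped SQL queries.

```python
from collections import defaultdict

# Toy genotype calls: (sample_id, variant_id, genotype).
# Alleles: 0 = reference, 1 = alternate, -1 = no-call.
# All records below are made up for illustration only.
CALLS = [
    ("S1", "chr1:1000:A:G", (0, 1)),
    ("S1", "chr1:2000:C:T", (1, 1)),
    ("S1", "chr2:500:G:A", (-1, -1)),
    ("S2", "chr1:1000:A:G", (0, 0)),
    ("S2", "chr1:2000:C:T", (0, 1)),
    ("S2", "chr2:500:G:A", (0, 1)),
]


def call_rate_per_sample(calls):
    """Fraction of sites per sample where every allele was called."""
    called = defaultdict(int)
    total = defaultdict(int)
    for sample, _variant, genotype in calls:
        total[sample] += 1
        if all(allele >= 0 for allele in genotype):
            called[sample] += 1
    return {sample: called[sample] / total[sample] for sample in total}


def alt_allele_frequency(calls):
    """Alternate allele count over called allele count, per variant.

    Variants with no called alleles at all are omitted from the result.
    """
    alt = defaultdict(int)
    called = defaultdict(int)
    for _sample, variant, genotype in calls:
        for allele in genotype:
            if allele >= 0:
                called[variant] += 1
                if allele > 0:
                    alt[variant] += 1
    return {variant: alt[variant] / called[variant] for variant in called}


call_rates = call_rate_per_sample(CALLS)
allele_freqs = alt_allele_frequency(CALLS)
print(call_rates)
print(allele_freqs)
```

Both functions make a single pass over the records and group by one key, which is the access pattern that column-oriented engines are designed to parallelize over very large tables.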

Year:  2017        PMID: 28961771      PMCID: PMC5860318          DOI: 10.1093/bioinformatics/btx468

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


