
Large datasets in biomedicine: a discussion of salient analytic issues.

Anshu Sinha, George Hripcsak, Marianthi Markatou.

Abstract

Advances in high-throughput and mass-storage technologies have led to an information explosion in both biology and medicine, presenting novel challenges for analysis and modeling. With regard to multivariate analysis techniques such as clustering, classification, and regression, large datasets present unique and often misunderstood challenges. The authors' goal is to discuss the salient problems encountered in the analysis of large datasets as they relate to modeling and inference, both to inform principled and generalizable analysis and to highlight the interdisciplinary nature of these challenges. The authors present a detailed study of germane issues, including high dimensionality, multiple testing, scientific significance, dependence, information measurement, and information management, with a focus on the methodologies available to address these concerns. A firm understanding of the challenges and statistical technology involved ultimately contributes to better science. The authors further suggest that the community consider facilitating discussion through interdisciplinary panels, invited papers, and curriculum enhancement to establish guidelines for analysis and reporting.


Year:  2009        PMID: 19717808      PMCID: PMC3002128          DOI: 10.1197/jamia.M2780

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


