Garrick Wallstrom¹, Karen S Anderson, Joshua LaBaer.
¹ Center for Personalized Diagnostics, Biodesign Institute, ASU, 1001 S. McAllister Ave, Tempe, AZ 85287, USA. garrick.wallstrom@asu.edu
Abstract
BACKGROUND: Modern genomic and proteomic studies reveal that many diseases are heterogeneous, comprising multiple different subtypes. The common notion that one biomarker can be predictive for all patients may need to be replaced by an understanding that each subtype has its own set of unique biomarkers, which affects how discovery studies are designed and analyzed.

METHODS: We used Monte Carlo simulation to measure and compare the performance of eight selection methods with homogeneous and heterogeneous diseases using both single-stage and two-stage designs. We also applied the selection methods in an actual proteomic biomarker screening study of heterogeneous breast cancer cases.

RESULTS: Different selection methods were optimal for homogeneous and heterogeneous diseases, and sample sizes more than two-fold larger were needed for heterogeneous diseases than for homogeneous diseases. We also found that for larger studies, two-stage designs can achieve nearly the same statistical power as single-stage designs at significantly reduced cost.

CONCLUSIONS: We found that disease heterogeneity profoundly affected biomarker performance. We report sample size requirements and provide guidance on the design and analysis of biomarker discovery studies for both homogeneous and heterogeneous diseases.

IMPACT: We have shown that studies to identify biomarkers for the early detection of heterogeneous disease require different statistical selection methods and larger sample sizes than if the disease were homogeneous. These findings provide a methodologic platform for biomarker discovery of heterogeneous diseases.
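The kind of Monte Carlo comparison described under METHODS can be illustrated with a minimal sketch. It is not the authors' simulation: the two selection statistics (a Welch t-statistic and sensitivity at 95% specificity), the Gaussian marker model, and every parameter name and default (`subtype_fraction`, `effect`, `n_null`, `top_k`) are assumptions chosen for illustration only. One true marker is elevated in a fraction of cases (1.0 mimics a homogeneous disease, a small fraction mimics a subtype-specific marker), and "power" is taken here to mean the proportion of simulated studies in which that marker ranks among the top `top_k` candidates.

```python
import numpy as np


def simulate_power(n_per_group, subtype_fraction, effect=2.0, n_null=200,
                   top_k=10, n_sims=200, seed=0):
    """Estimate how often one true marker is selected among null markers.

    All modelling choices are illustrative assumptions, not the paper's setup.
    Column 0 holds the true marker; the remaining n_null columns are noise.
    """
    rng = np.random.default_rng(seed)
    hits = {"t_stat": 0, "sens_at_95_spec": 0}
    n = n_per_group

    for _ in range(n_sims):
        controls = rng.standard_normal((n, n_null + 1))
        cases = rng.standard_normal((n, n_null + 1))

        # Only cases in the responding subtype show an elevated true marker.
        responders = rng.random(n) < subtype_fraction
        cases[responders, 0] += effect

        # Statistic 1: Welch t-statistic, sensitive to a shift in the mean.
        diff = cases.mean(axis=0) - controls.mean(axis=0)
        se = np.sqrt(cases.var(axis=0, ddof=1) / n
                     + controls.var(axis=0, ddof=1) / n)
        t_stat = diff / se

        # Statistic 2: fraction of cases above the controls' 95th percentile,
        # which can respond to a marker elevated in only a subset of cases.
        cutoff = np.quantile(controls, 0.95, axis=0)
        sens = (cases > cutoff).mean(axis=0)

        for name, score in (("t_stat", t_stat), ("sens_at_95_spec", sens)):
            # Count markers scoring strictly higher than the true marker;
            # ties are therefore resolved in the true marker's favour.
            n_better = int(np.sum(score > score[0]))
            hits[name] += int(n_better < top_k)

    return {name: count / n_sims for name, count in hits.items()}


if __name__ == "__main__":
    for fraction in (1.0, 0.2):
        for n in (20, 50, 100):
            print(fraction, n, simulate_power(n, fraction))
```

Running the script prints estimated selection rates for each combination of subtype fraction and per-group sample size, which gives a rough feel for how sample size requirements might grow when a marker is present in only a subset of cases. The numbers depend entirely on the assumed effect size and marker model, so they should not be read as reproducing the reported results.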