Vishnu M Bashyam1, Jimit Doshi1, Guray Erus1, Dhivya Srinivasan1, Ahmed Abdulkadir1, Ashish Singh1, Mohamad Habes2, Yong Fan1, Colin L Masters3, Paul Maruff3, Chuanjun Zhuo4,5, Henry Völzke6,7, Sterling C Johnson8, Jurgen Fripp9, Nikolaos Koutsouleris10, Theodore D Satterthwaite1,11, Daniel H Wolf11, Raquel E Gur11,12, Ruben C Gur11,12, John C Morris13, Marilyn S Albert14, Hans J Grabe15,16, Susan M Resnick17, Nick R Bryan18, Katharina Wittfeld15,16, Robin Bülow19, David A Wolk20, Haochang Shou21, Ilya M Nasrallah12, Christos Davatzikos1.
1. Artificial Intelligence in Biomedical Imaging Lab, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
2. Biggs Alzheimer's Institute, University of Texas San Antonio Health Science Center, San Antonio, Texas, USA.
3. Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, Victoria, Australia.
4. Tianjin Mental Health Center, Nankai University Affiliated Tianjin Anding Hospital, Tianjin, China.
5. Department of Psychiatry, Tianjin Medical University, Tianjin, China.
6. Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany.
7. German Centre for Cardiovascular Research, Partner Site Greifswald, Greifswald, Germany.
8. Wisconsin Alzheimer's Institute, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA.
9. CSIRO Health and Biosecurity, Australian e-Health Research Centre CSIRO, Brisbane, Queensland, Australia.
10. Department of Psychiatry and Psychotherapy, Ludwig Maximilian University of Munich, Munich, Germany.
11. Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
12. Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
13. Department of Neurology, Washington University in St. Louis, St. Louis, Missouri, USA.
14. Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA.
15. Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Greifswald, Germany.
16. German Center for Neurodegenerative Diseases (DZNE), Site Rostock/Greifswald, Greifswald, Germany.
17. Laboratory of Behavioral Neuroscience, National Institute on Aging, Baltimore, Maryland, USA.
18. Department of Diagnostic Medicine, University of Texas at Austin, Austin, Texas, USA.
19. Institute of Diagnostic Radiology and Neuroradiology, University Medicine Greifswald, Greifswald, Germany.
20. Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
21. Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Abstract
BACKGROUND: In the medical imaging domain, deep learning-based methods have yet to see widespread clinical adoption, in part due to limited generalization performance across different imaging devices and acquisition protocols. The deviation between estimated brain age and biological age is an established biomarker of brain health, and models that estimate it may benefit from increased cross-site generalizability.
PURPOSE: To develop and evaluate a deep learning-based image harmonization method to improve cross-site generalizability of deep learning age prediction.
STUDY TYPE: Retrospective.
POPULATION: A total of 8876 subjects from six sites. Harmonization models were trained using all subjects. Age prediction models were trained using 2739 subjects from a single site and tested using the remaining 6137 subjects from the other five sites.
FIELD STRENGTH/SEQUENCE: Brain imaging with magnetization prepared rapid acquisition with gradient echo or spoiled gradient echo sequences at 1.5 T and 3 T.
ASSESSMENT: StarGAN v2 was used to perform a canonical mapping from diverse datasets to a reference domain to reduce site-based variation while preserving semantic information. Generalization performance of deep learning age prediction was evaluated using harmonized, histogram matched, and unharmonized data.
STATISTICAL TESTS: Mean absolute error (MAE) and Pearson correlation between estimated age and biological age quantified the performance of the age prediction model.
RESULTS: Our results indicated a substantial improvement in age prediction in out-of-sample data, with the overall MAE improving from 15.81 (±0.21) years to 11.86 (±0.11) years with histogram matching and to 7.21 (±0.22) years with generative adversarial network (GAN)-based harmonization. In the multisite case, across the five out-of-sample sites, MAE improved from 9.78 (±6.69) years to 7.74 (±3.03) years with histogram normalization and to 5.32 (±4.07) years with GAN-based harmonization.
DATA CONCLUSION: While further research is needed, GAN-based medical image harmonization appears to be a promising tool for improving cross-site deep learning generalization.
LEVEL OF EVIDENCE: 4
TECHNICAL EFFICACY: Stage 1.
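The two statistics named under STATISTICAL TESTS can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names and the toy ages are hypothetical, and the abstract does not say how the reported ± values were obtained, so no spread estimate is computed here.

```python
import math


def mean_absolute_error(estimated_age, true_age):
    """Average absolute difference, in years, between estimated and true age."""
    if len(estimated_age) != len(true_age) or not true_age:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(e - t) for e, t in zip(estimated_age, true_age))
    return total / len(true_age)


def pearson_correlation(estimated_age, true_age):
    """Pearson r: covariance of the two series divided by the product of their spreads."""
    if len(estimated_age) != len(true_age) or len(true_age) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_e = sum(estimated_age) / len(estimated_age)
    mean_t = sum(true_age) / len(true_age)
    # Centered sums; the 1/n factors cancel between numerator and denominator.
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(estimated_age, true_age))
    var_e = sum((e - mean_e) ** 2 for e in estimated_age)
    var_t = sum((t - mean_t) ** 2 for t in true_age)
    return cov / math.sqrt(var_e * var_t)


# Toy values for illustration only; these are not data from the study.
true_age = [20.0, 30.0, 40.0, 50.0]
estimated_age = [22.0, 28.0, 43.0, 49.0]

mae = mean_absolute_error(estimated_age, true_age)
r = pearson_correlation(estimated_age, true_age)
print(f"MAE = {mae:.2f} years, Pearson r = {r:.3f}")
```

The two measures are complementary: a model that overestimates every age by a constant offset still has a Pearson correlation of 1 but a nonzero MAE, which is one reason to report both when comparing harmonized and unharmonized inputs.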