Michael D Parkes (1), Arezu Z Aliabadi (2), Martin Cadeiras (3), Maria G Crespo-Leiro (4), Mario Deng (3), Eugene C Depasquale (3), Johannes Goekler (2), Daniel H Kim (5), Jon Kobashigawa (6), Alexandre Loupy (7), Peter Macdonald (8), Luciano Potena (9), Andreas Zuckermann (2), Philip F Halloran (10).
1. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada.
2. Department of Cardiac Surgery, Medical University of Vienna, Vienna, Austria.
3. Ronald Reagan UCLA Medical Center, Los Angeles, California, USA.
4. Complexo Hospitalario Universitario A Coruña (CHUAC)-CIBERCV, A Coruña, Spain.
5. Department of Medicine, University of Alberta, Edmonton, Alberta, Canada.
6. Cedars-Sinai Medical Center, Beverly Hills, California, USA.
7. Hôpital Necker, Paris, France.
8. The Victor Chang Cardiac Research Institute, Sydney, New South Wales, Australia.
9. Cardiovascular Department, University of Bologna, Bologna, Italy.
10. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada; Department of Medicine, University of Alberta, Edmonton, Alberta, Canada. Electronic address: phallora@ualberta.ca.
Abstract
BACKGROUND: We previously reported a microarray-based diagnostic system for heart transplant endomyocardial biopsies (EMBs), using either 3-archetype (3AA) or 4-archetype (4AA) unsupervised algorithms to estimate rejection. In the present study, we examined the stability of machine-learning algorithms in new biopsies, compared the 3AA and 4AA algorithms, assessed supervised binary classifiers trained on histologic or molecular diagnoses, created a report combining many scores into an ensemble of estimates, and examined possible automated sign-outs.
METHODS: We studied 889 EMBs from 454 transplant recipients at 8 centers: the initial cohort (N = 331) and a new cohort (N = 558). Published 3AA algorithms derived in Cohort 331 were tested in Cohort 558, the 3AA and 4AA models were compared, and supervised binary classifiers were created.
RESULTS: Algorithms derived in Cohort 331 performed similarly in new biopsies despite differences in case mix. In the combined cohort, the 4AA model, which includes a parenchymal injury score, retained correlations with histologic rejection and donor-specific antibody (DSA) similar to those of the 3AA model. Supervised molecular classifiers predicted molecular rejection (areas under the curve [AUCs] >0.87) better than histologic rejection (AUCs <0.78), even when trained on histology diagnoses. A report incorporating many AA and binary classifier scores, interpreted by 1 expert, showed highly significant agreement with histology (p < 0.001), but with many discrepancies, as expected from the known noise in histology. An automated random forest score closely predicted expert diagnoses, confirming the potential for automated sign-outs.
CONCLUSIONS: Molecular algorithms are stable in new populations and can be assembled into an ensemble that combines many supervised and unsupervised estimates of the molecular disease states.