Manik Kuchroo, Jessie Huang, Patrick Wong, Jean-Christophe Grenier, Dennis Shung, Alexander Tong, Carolina Lucas, Jon Klein, Daniel B Burkhardt, Scott Gigante, Abhinav Godavarthi, Bastian Rieck, Benjamin Israelow, Michael Simonov, Tianyang Mao, Ji Eun Oh, Julio Silva, Takehiro Takahashi, Camila D Odio, Arnau Casanovas-Massana, John Fournier, Shelli Farhadian, Charles S Dela Cruz, Albert I Ko, Matthew J Hirn, F Perry Wilson, Julie G Hussin, Guy Wolf, Akiko Iwasaki, Smita Krishnaswamy.
Abstract
As the biomedical community produces datasets that are increasingly complex and high dimensional, there is a need for more sophisticated computational tools to extract biological insights. We present Multiscale PHATE, a method that sweeps through all levels of data granularity to learn abstracted biological features directly predictive of disease outcome. Built on a coarse-graining process called diffusion condensation, Multiscale PHATE learns a data topology that can be analyzed at coarse resolutions for high-level summarizations of data and at fine resolutions for detailed representations of subsets. We apply Multiscale PHATE to a coronavirus disease 2019 (COVID-19) dataset with 54 million cells from 168 hospitalized patients and find that patients who die show CD16hiCD66blo neutrophil and IFN-γ+ granzyme B+ Th17 cell responses. We also show that population groupings from Multiscale PHATE directly fed into a classifier predict disease outcome more accurately than naive featurizations of the data. Multiscale PHATE is broadly generalizable to different data types, including flow cytometry, single-cell RNA sequencing (scRNA-seq), single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq), and clinical variables.
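The coarse-graining idea behind diffusion condensation can be illustrated with a small sketch. This is not the authors' implementation or the Multiscale PHATE API: the function names `diffusion_condensation` and `_merge_close`, the fixed Gaussian bandwidth `sigma`, the merge threshold `merge_tol`, and the rule of keeping the first point as the representative of a merged group are all assumptions made for illustration. Each iteration builds a row-normalized Gaussian affinity matrix on the current points, moves every point to the weighted average of its neighbours, and merges points that have come within `merge_tol` of each other, so that successive iterations give progressively coarser groupings.

```python
import numpy as np


def _merge_close(X, labels, tol):
    """Greedily merge rows of X closer than tol (hypothetical helper).

    The first row of each merged group is kept as its representative.
    """
    reps = []                              # indices of rows that are kept
    remap = np.empty(len(X), dtype=int)    # old row index -> new row index
    for i, x in enumerate(X):
        for new_idx, r in enumerate(reps):
            if np.linalg.norm(x - X[r]) < tol:
                remap[i] = new_idx
                break
        else:
            remap[i] = len(reps)
            reps.append(i)
    return X[reps], remap[labels]


def diffusion_condensation(points, sigma=1.0, merge_tol=1e-3, max_iter=100):
    """Return one label array per level, ordered from fine to coarse.

    Assumption: a fixed bandwidth sigma is used at every iteration.
    """
    X = np.asarray(points, dtype=float).copy()
    labels = np.arange(len(X))             # each original point -> row of X
    history = [labels.copy()]
    for _ in range(max_iter):
        if len(X) == 1:
            break
        # Pairwise squared distances and Gaussian affinities.
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
        K = np.exp(-d2 / (2.0 * sigma ** 2))
        # Row-normalize to obtain a Markov (diffusion) operator.
        P = K / K.sum(axis=1, keepdims=True)
        # One diffusion step: each point moves toward its neighbours.
        X = P @ X
        X, labels = _merge_close(X, labels, merge_tol)
        history.append(labels.copy())
    return history


if __name__ == "__main__":
    pts = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]]
    levels = diffusion_condensation(pts, max_iter=20)
    print(levels[0])   # finest level: every point is its own group
    print(levels[-1])  # coarsest level reached in this run
```

Each entry of the returned list assigns every original point to a group at one level of granularity, so a downstream analysis could pick a coarse level for a summary or a fine level for detail. With a fixed bandwidth, well-separated groups may never merge in this sketch.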
Year: 2022 PMID: 35228707 DOI: 10.1038/s41587-021-01186-x
Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 68.164