Dimitris Spathis, Ignacio Perez-Pozuelo, Laia Marques-Fernandez, Cecilia Mascolo.
Abstract
Medicine is undergoing an unprecedented digital transformation, as massive amounts of health data are being produced, gathered, and curated, ranging from in-hospital (e.g., intensive care unit [ICU]) to person-generated data (wearables). Annotating all these data for training purposes in order to feed to deep learning models for pattern recognition is impractical. Here, we discuss some exciting recent results of self-supervised learning (SSL) applications to high-resolution health signals. These examples leverage unlabeled data to learn meaningful representations that can generalize to situations where the ground truth is inadequate or simply infeasible to collect due to the high burden or associated costs. The most prominent bottleneck of deep learning today is access to labeled, carefully curated datasets, and self-supervision on health signals opens up new possibilities to eliminate data silos through general-purpose models that can transfer to low-resource environments and tasks.
Keywords: biomedical informatics; health signals; machine learning; transfer learning
Year: 2022 PMID: 35199063 PMCID: PMC8848012 DOI: 10.1016/j.patter.2021.100410
Source DB: PubMed Journal: Patterns (N Y) ISSN: 2666-3899
Figure 1. Self-supervised learning for health signals
Here, we illustrate prominent methods that leverage unlabeled data with self-supervised learning, using ECG signals as the running example.
(A) Contrastive training maximizes the agreement between the original and the distorted view (flipped, rotated, or other augmentations).
(B) Generative models such as GANs involve two networks that contest with each other in a game to generate more realistic data.
(C) Time-aware models try to guess whether the signal follows the arrow of time.
(D) Masked models hide part of the signal and challenge the model to predict the original. The representations learned by these methods can be reused in downstream transfer-learning tasks with linear models (blue box). Self-supervision is more label-efficient in low-data regimes (top-right graph).
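To make the four panels concrete, the sketches below pair each pretext task with minimal PyTorch code on synthetic, single-lead, ECG-like tensors. Every architecture, augmentation, and hyperparameter here is an illustrative assumption for exposition, not the implementation behind the figure. Panel (A): two distorted views of the same signal are encoded and pulled together with a SimCLR-style NT-Xent contrastive loss, while views of other signals in the batch are pushed apart.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    # Tiny 1-D CNN encoder for a single-lead, ECG-like signal (made-up shapes).
    def __init__(self, embed_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )

    def forward(self, x):
        return self.net(x)

def augment(x):
    # Distort the signal: random amplitude scaling, additive jitter, random flips.
    x = x * (0.8 + 0.4 * torch.rand(x.size(0), 1, 1))
    x = x + 0.05 * torch.randn_like(x)
    flip = torch.rand(x.size(0)) < 0.5
    x[flip] = torch.flip(x[flip], dims=[-1])
    return x

def nt_xent(z1, z2, temperature=0.5):
    # Maximize agreement between the two views of each example (NT-Xent).
    z = F.normalize(torch.cat([z1, z2]), dim=1)            # (2N, D)
    sim = z @ z.t() / temperature                          # pairwise cosine similarity
    n = z1.size(0)
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

encoder = Encoder()
x = torch.randn(8, 1, 256)                                 # unlabeled batch of signals
loss = nt_xent(encoder(augment(x)), encoder(augment(x)))
loss.backward()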
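Panel (B): in a vanilla GAN, the generator synthesizes signals from noise while the discriminator learns to separate real from fake; the two networks improve by contesting with each other. All sizes below are made up for the sketch.

import torch
import torch.nn as nn

# Generator maps noise to a fake 256-sample signal; discriminator scores realness.
G = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 256), nn.Tanh())
D = nn.Sequential(nn.Linear(256, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(16, 256)          # stand-in for a batch of real signals
noise = torch.randn(16, 32)

# Discriminator step: real signals are labeled 1, generated signals 0.
fake = G(noise).detach()
d_loss = bce(D(real), torch.ones(16, 1)) + bce(D(fake), torch.zeros(16, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to fool the discriminator into labeling fakes as real.
g_loss = bce(D(G(noise)), torch.ones(16, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()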
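Panel (C): the arrow-of-time task gets its labels for free by reversing a random half of the signals and training a classifier to guess which ones were flipped.

import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(               # tiny forward-vs-reversed classifier
    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(16, 1),
)

x = torch.randn(32, 1, 256)          # unlabeled signals
reverse = torch.rand(32) < 0.5       # "labels" come for free from the data itself
x[reverse] = torch.flip(x[reverse], dims=[-1])

loss = F.binary_cross_entropy_with_logits(model(x), reverse.float().unsqueeze(1))
loss.backward()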
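Panel (D): masked modeling hides random time points, scores reconstruction only on the hidden parts, and then reuses the frozen encoder with a linear probe, which is the downstream transfer step shown in the blue box; the labels y and the probe are hypothetical stand-ins for a downstream task.

import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 64))
decoder = nn.Linear(64, 256)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))

x = torch.randn(32, 256)             # unlabeled signals, flattened to 256 samples
mask = torch.rand(32, 256) < 0.25    # hide a random 25% of the time points
recon = decoder(encoder(x.masked_fill(mask, 0.0)))
loss = ((recon - x)[mask] ** 2).mean()   # reconstruction error on hidden parts only
opt.zero_grad(); loss.backward(); opt.step()

# Downstream transfer (blue box): frozen representations plus a linear classifier,
# which is what stays label-efficient in the low-data regime (top-right graph).
with torch.no_grad():
    feats = encoder(x)               # pretend these 32 signals now carry labels y
y = torch.randint(0, 2, (32,))       # hypothetical binary labels
probe = nn.Linear(64, 2)
probe_loss = F.cross_entropy(probe(feats), y)
probe_loss.backward()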