| Literature DB >> 29522900 |
Chensi Cao, Feng Liu, Hai Tan, Deshou Song, Wenjie Shu, Weizhong Li, Yiming Zhou, Xiaochen Bo, Zhi Xie.
Abstract
Advances in biological and medical technologies have been providing us with explosive volumes of biological and physiological data, such as medical images, electroencephalography, and genomic and protein sequences. Learning from these data facilitates the understanding of human health and disease. Developed from artificial neural networks, deep learning-based algorithms show great promise in extracting features and learning patterns from complex data. The aim of this paper is to provide an overview of deep learning techniques and some of their state-of-the-art applications in the biomedical field. We first introduce the development of artificial neural networks and deep learning. We then describe the two main components of deep learning, i.e., deep learning architectures and model optimization. Subsequently, examples of deep learning applications are presented, including medical image classification, genomic sequence analysis, and protein structure classification and prediction. Finally, we offer our perspectives on future directions in the field of deep learning.
Keywords: Big data; Bioinformatics; Biomedical informatics; Deep learning; High-throughput sequencing; Medical image
Year: 2018 PMID: 29522900 PMCID: PMC6000200 DOI: 10.1016/j.gpb.2017.07.003
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Figure 1. Timeline of the development of deep learning and commonly-used machine learning algorithms. The development of deep learning and neural networks is shown in the top panel, and several commonly-used machine learning algorithms are shown in the bottom panel. NN, neural network; BP, backpropagation; DBN, deep belief network; SVM, support vector machine; AE, auto-encoder; VAE, variational AE; GAN, generative adversarial network; WGAN, Wasserstein GAN.
Figure 2. Illustration of a convolutional neural network
A. In the convolution layer, fields (different color blocks in the table) of the input patch (represented by a) are multiplied element-wise by the convolution kernel (represented by k). B. In the pooling layer, the results of the convolution are summarized (max pooling is taken as the example here). aij, cij, and kij denote the entry in row i and column j of the corresponding matrix.
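The two operations described in the caption can be sketched in a few lines of NumPy. The 5×5 input patch, the particular 2×2 kernel, and the pooling window size below are illustrative assumptions, not values taken from the figure; the sliding-window arithmetic is the standard cross-correlation form used by most deep learning libraries.

```python
import numpy as np

def conv2d(a, k):
    """Valid 2D convolution (cross-correlation form): slide kernel k over
    input a and take the element-wise product-sum at each position."""
    kh, kw = k.shape
    out = np.zeros((a.shape[0] - kh + 1, a.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(a[i:i + kh, j:j + kw] * k)
    return out

def max_pool(c, size=2):
    """Non-overlapping max pooling: keep the maximum of each size x size block."""
    out = np.zeros((c.shape[0] // size, c.shape[1] // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = c[i * size:(i + 1) * size, j * size:(j + 1) * size].max()
    return out

a = np.arange(25.0).reshape(5, 5)         # a 5x5 input patch (hypothetical values)
k = np.array([[1.0, 0.0], [0.0, -1.0]])   # a 2x2 convolution kernel
c = conv2d(a, k)                          # 4x4 feature map
p = max_pool(c, 2)                        # 2x2 pooled summary
```

With this kernel each feature-map entry is a[i, j] − a[i+1, j+1], so both the feature map and its pooled summary come out constant, which makes the shrinking shapes (5×5 → 4×4 → 2×2) easy to verify by hand.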
Figure 3. Illustration of a recurrent neural network
A. A common feedforward neural network in unfolded form (top) and as a schema (bottom). B. An illustration of a recurrent neural network (top) and its unfolded form (bottom). The red square represents a one-time-step delay. Different from panel A, the arrows in panel B represent sets of connections. W and B represent the weight matrix and bias vector, respectively. x and y represent the input and output of the network, respectively; h indicates the hidden units of the network; L consists of couples of transformations, such as densely-connected layers or dropout layers; U indicates the transformation between two neighboring time points; and t represents the time point.
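The recurrence in panel B, where the hidden state h is carried forward through the transformation U at each time step t, can be sketched as a minimal Elman-style update. The tanh nonlinearity, the layer sizes, and the random inputs below are assumptions for illustration only.

```python
import numpy as np

def rnn_forward(xs, W, U, b, h0):
    """Unfold a simple recurrent layer over time:
    h_t = tanh(W @ x_t + U @ h_{t-1} + b), collecting each hidden state."""
    h = h0
    hs = []
    for x in xs:                      # one iteration per time point t
        h = np.tanh(W @ x + U @ h + b)
        hs.append(h)
    return hs

rng = np.random.default_rng(0)
n_in, n_hid, T = 3, 4, 5              # hypothetical input size, hidden size, sequence length
W = rng.normal(size=(n_hid, n_in)) * 0.1   # input-to-hidden weights
U = rng.normal(size=(n_hid, n_hid)) * 0.1  # hidden-to-hidden weights (the time-step link)
b = np.zeros(n_hid)                        # bias vector
xs = [rng.normal(size=n_in) for _ in range(T)]
hs = rnn_forward(xs, W, U, b, np.zeros(n_hid))
```

The same weight matrices W and U are reused at every time step; unfolding the loop reproduces the bottom diagram of panel B, with one copy of the layer per time point.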
Applications of deep learning frameworks in biomedical informatics
| Application area | Architecture | Task |
| --- | --- | --- |
| Medical image analysis | CNN | Brain tumor segmentation (ranked in the top two in BRATS) |
| | | Segmentation of the pancreas in CT images |
| | | Knee cartilage segmentation |
| | | Segmentation of the hippocampus |
| | | Prediction of semantic descriptions from medical images |
| | | Segmentation of MR brain images |
| | | Anatomy-specific classification of medical images |
| | | Detection of cerebral microbleeds in MR images |
| | | Coronary artery calcium scoring in CT images |
| | | Nuclei detection in routine colon cancer histology images |
| | | Histopathological cancer classification |
| | | Invasive ductal carcinoma segmentation in WSI |
| | | Detection of mammographic lesions |
| | | Detection of haemorrhages in fundus images |
| | | Detection of exudates in fundus images |
| | SAE | Segmentation of the hippocampus from infant brains |
| | | Organ detection in 4D patient data |
| | | Histological characterization of healthy skin and healing wounds |
| | | Scoring of percentage mammographic density and mammographic texture related to breast cancer risk |
| | | Optic disc detection from fundus photographs |
| | DBN | Segmentation of the left ventricle of the heart from MR data |
| | | Discrimination of retinal-based diseases |
| | DNN | Brain tumor segmentation in MR images (2nd place in BRATS) |
| | | Prostate MR segmentation |
| | | Gland instance segmentation |
| | | Semantic segmentation of tissues in CT images |
| | | Mitosis detection in breast cancer histological images |
| | RNN | EEG-based prediction of epileptic seizure propagation using a time-delayed NN |
| | | Classification of patterns of EEG synchronization for seizure prediction |
| | | EEG-based lapse detection |
| | | Prediction of epileptic seizures |
| Genomic sequencing and gene expression analysis | DNN | Gene expression inference |
| | | Identification of |
| | | Prediction of enhancers |
| | | Prediction of splicing patterns in individual tissues and differences in splicing patterns across tissues |
| | | Annotation of the pathogenicity of genetic variants |
| | DBN | Modeling structural binding preferences and predicting binding sites of RNA-binding proteins |
| | | Prediction of splice junctions at the DNA level |
| | | Prediction of transcription factor binding sites |
| | | Annotation and interpretation of the noncoding genome |
| | | Prediction of noncoding variant effects |
| | RNN | Prediction of miRNA precursors and miRNA targets |
| | | Detection of splice junctions from DNA sequences |
| | | Prediction of noncoding function |
| | | Analysis of human splicing codes and their determination of diseases |
| Protein structure prediction | DBN | Modeling structural binding preferences and predicting binding sites of RBPs |
| | | Prediction of protein disorder |
| | | Prediction of secondary structures, local backbone angles, and solvent accessible surface area of proteins |
| | CNN | Prediction of protein order/disorder regions |
| | | Prediction of protein secondary structures |
| | | Prediction of protein structure properties, including secondary structure, solvent accessibility, and disorder regions |
| | SAE | Sequence-based prediction of backbone Cα angles and dihedrals |
| | RNN | Prediction of protein secondary structure |
| | | Prediction of protein contact map |
Note: NN, neural networks; CNN, convolutional NN; SAE, stacked auto-encoder; DBN, deep belief network; RNN, recurrent NN.
Figure 4. Popularity of deep learning frameworks on GitHub
The distributions of GitHub stars for deep learning frameworks written in C++, Lua, Python, Matlab, Julia, and Java are shown in the pie chart. More stars on GitHub indicate higher popularity. The font size of each framework name in the pie chart reflects its number of stars.