Athanasios Voulodimos, Nikolaos Doulamis, Anastasios Doulamis, Eftychios Protopapadakis.
Abstract
Over recent years, deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, that is, Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, a brief overview is given of future directions in designing deep learning schemes for computer vision problems and the challenges involved therein.
Year: 2018 PMID: 29487619 PMCID: PMC5816885 DOI: 10.1155/2018/7068349
Source DB: PubMed Journal: Comput Intell Neurosci
Important milestones in the history of neural networks and machine learning, leading up to the era of deep learning.
| Milestone/contribution | Contributor, year |
|---|---|
| MCP model, regarded as the ancestor of the Artificial Neural Network | McCulloch & Pitts, 1943 |
| Hebbian learning rule | Hebb, 1949 |
| First perceptron | Rosenblatt, 1958 |
| Backpropagation | Werbos, 1974 |
| Neocognitron, regarded as the ancestor of the Convolutional Neural Network | Fukushima, 1980 |
| Boltzmann Machine | Ackley, Hinton & Sejnowski, 1985 |
| Restricted Boltzmann Machine (initially known as Harmonium) | Smolensky, 1986 |
| Recurrent Neural Network | Jordan, 1986 |
| Autoencoders | Rumelhart, Hinton & Williams, 1986; Ballard, 1987 |
| LeNet, starting the era of Convolutional Neural Networks | LeCun, 1990 |
| LSTM | Hochreiter & Schmidhuber, 1997 |
| Deep Belief Network, ushering in the “age of deep learning” | Hinton, 2006 |
| Deep Boltzmann Machine | Salakhutdinov & Hinton, 2009 |
| AlexNet, starting the era of CNNs for ImageNet classification | Krizhevsky, Sutskever & Hinton, 2012 |
Figure 1. Example architecture of a CNN for a computer vision task (object detection).
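As an illustration of the two core operations every CNN layer family in the figure relies on, the sketch below implements a valid 2-D convolution followed by a ReLU nonlinearity and non-overlapping max pooling in plain NumPy. The sizes and random values are illustrative assumptions, not the architecture of any model discussed in the review.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling, the source of local translation invariance."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# A toy 8x8 "image" passed through one conv -> ReLU -> pool stage.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))
feature_map = np.maximum(conv2d(image, kernel), 0.0)  # ReLU keeps non-negative responses
pooled = max_pool(feature_map)                        # feature_map: (6, 6), pooled: (3, 3)
```

Stacking several such stages, followed by fully connected layers, yields the kind of architecture sketched in the figure.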
Figure 2. Deep Belief Network (DBN) and Deep Boltzmann Machine (DBM). The top two layers of a DBN form an undirected graph and the remaining layers form a belief network with directed, top-down connections. In a DBM, all connections are undirected.
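Both DBNs and DBMs are built from Restricted Boltzmann Machines, whose bipartite ("restricted") connectivity makes block Gibbs sampling tractable: all hidden units can be sampled at once given the visible units, and vice versa. A minimal NumPy sketch of one such sampling step follows; the layer sizes and weight scale are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b_h, b_v, rng):
    """One block-Gibbs step of an RBM: sample hiddens given visibles,
    then visibles given hiddens. The absence of within-layer connections
    is what makes each conditional factorize."""
    p_h = sigmoid(v @ W + b_h)                        # P(h=1 | v)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b_v)                      # P(v=1 | h)
    v_new = (rng.random(p_v.shape) < p_v).astype(float)
    return v_new, h

rng = np.random.default_rng(1)
n_visible, n_hidden = 6, 4
W = 0.1 * rng.standard_normal((n_visible, n_hidden))
b_h = np.zeros(n_hidden)
b_v = np.zeros(n_visible)
v = (rng.random(n_visible) < 0.5).astype(float)       # random binary visible state
v_new, h = gibbs_step(v, W, b_h, b_v, rng)
```

Greedy layer-wise training of a stack of such RBMs is what yields a DBN; a DBM couples the layers into a single undirected model.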
Figure 3. Denoising autoencoder [56].
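The denoising autoencoder in the figure corrupts its input and is trained to reconstruct the clean version, forcing the hidden representation to capture robust structure rather than the identity map. Below is a minimal forward-pass sketch in NumPy, assuming masking noise and tied encoder/decoder weights (both common choices, not necessarily those of [56]); all names and sizes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dae_forward(x, W, b, c, corruption=0.3, rng=None):
    """Forward pass of a denoising autoencoder: corrupt the input with
    masking noise, encode, decode with tied weights (W, W.T), and score
    the reconstruction against the *clean* input."""
    rng = rng if rng is not None else np.random.default_rng()
    mask = rng.random(x.shape) >= corruption   # zero out ~30% of input units
    x_tilde = x * mask
    h = sigmoid(x_tilde @ W + b)               # encoder
    x_hat = sigmoid(h @ W.T + c)               # decoder (tied weights)
    loss = np.mean((x_hat - x) ** 2)           # reconstruction error vs clean x
    return x_hat, loss

rng = np.random.default_rng(0)
n_in, n_hidden = 8, 3
W = 0.1 * rng.standard_normal((n_in, n_hidden))
b = np.zeros(n_hidden)
c = np.zeros(n_in)
x = rng.random(n_in)
x_hat, loss = dae_forward(x, W, b, c, rng=rng)
```

Training minimizes this reconstruction loss; stacking several trained encoders produces the Stacked Denoising Autoencoder (SdA) compared in the table below.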
Comparison of CNNs, DBNs/DBMs, and SdAs with respect to a number of properties. + denotes a good performance in the property and − denotes bad performance or complete lack thereof.
| Model properties | CNNs | DBNs/DBMs | SdAs |
|---|---|---|---|
| Unsupervised learning | − | + | + |
| Training efficiency | − | − | + |
| Feature learning | + | − | − |
| Scale/rotation/translation invariance | + | − | − |
| Generalization | + | + | + |
Figure 4. Object detection results comparison from [66]. (a) Ground truth; (b) bounding boxes obtained with [32]; (c) bounding boxes obtained with [66].