Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.

Abstract

Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 × 224) input image. This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning. The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102× faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvement made for this competition.
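The fixed-length property described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names (`spatial_pyramid_pool`, `make_feature_map`), the pyramid levels (1, 2, 4), and the way bin boundaries are rounded are all assumptions made for illustration. It pools a feature map of any spatial size into a vector whose length depends only on the number of channels and the pyramid levels.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map (nested lists) into a fixed-length list.

    For each pyramid level n, the map is split into an n x n grid of bins and
    the maximum of each bin is kept, so the output length is
    C * sum(n * n for n in levels), independent of H and W.
    """
    pooled = []
    for n in levels:
        for channel in feature_map:
            height, width = len(channel), len(channel[0])
            for i in range(n):
                # Bin boundaries: floor for the start, ceiling for the end, so
                # every bin is non-empty (an assumption of this sketch, not
                # necessarily the paper's exact window/stride rule).
                top = (i * height) // n
                bottom = ((i + 1) * height + n - 1) // n
                for j in range(n):
                    left = (j * width) // n
                    right = ((j + 1) * width + n - 1) // n
                    pooled.append(
                        max(channel[r][c]
                            for r in range(top, bottom)
                            for c in range(left, right))
                    )
    return pooled


def make_feature_map(channels, height, width):
    """Build a deterministic toy feature map (hypothetical demo data)."""
    return [[[(ch + 1) * (r * width + c) for c in range(width)]
             for r in range(height)]
            for ch in range(channels)]


# Two inputs with different spatial sizes yield vectors of the same length.
small = spatial_pyramid_pool(make_feature_map(3, 7, 5))
large = spatial_pyramid_pool(make_feature_map(3, 13, 20))
print(len(small), len(large))  # 63 63, i.e. 3 channels * (1 + 4 + 16) bins
```

Because the output length is fixed, a fully connected layer can follow the pooling step regardless of the input image size, which is the requirement the abstract says SPP-net removes.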

Year: 2015 PMID: 26353135 DOI: 10.1109/TPAMI.2015.2389824

Source DB: PubMed Journal: IEEE Trans Pattern Anal Mach Intell ISSN: 0162-8828 Impact factor: 6.226

Cited by (264 in total)

1. Non-invasive prediction of tissue Doppler-derived E/e' ratio using lung Doppler signals.

2. Mixed Maximum Loss Design for Optic Disc and Optic Cup Segmentation with Deep Learning from Imbalanced Samples.

3. [Palm vein recognition based on end-to-end convolutional neural network].

4. Simultaneous arteriole and venule segmentation with domain-specific loss function on a new public database.

5. Deep Convolutional Neural Networks for Imaging Data Based Survival Analysis of Rectal Cancer.

6. Deep Learning Classifiers for Automated Detection of Gonioscopic Angle Closure Based on Anterior Segment OCT Images.

7. A multi-scale residual network for accelerated radial MR parameter mapping.

8. Salient Object Detection Techniques in Computer Vision-A Survey. (Review)

9. Medical Image Retrieval Using Multi-Texton Assignment.

10. Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis.