Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.   

Abstract

Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 × 224) input image. This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning. The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102× faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvement made for this competition.
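
The pooling idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function name `spatial_pyramid_pool`, the helper `make_maps`, the pyramid levels (1, 2, 4), the floor/ceil bin boundaries, and the choice of max pooling are all assumptions made for illustration. The sketch shows that the output length depends only on the number of channels and the pyramid levels, not on the feature-map size, and that the same routine can pool an arbitrary region of a feature map that was computed once.

```python
def spatial_pyramid_pool(feature_maps, levels=(1, 2, 4), region=None):
    """Max-pool each channel into n x n bins for every n in `levels`.

    feature_maps: list of channels, each a list of rows (H x W numbers).
    region: optional (top, left, bottom, right) window in feature-map
            coordinates, end-exclusive; defaults to the whole map.
    Returns a flat list of length C * sum(n * n for n in levels).
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    top, left, bottom, right = region or (0, 0, height, width)
    h, w = bottom - top, right - left
    if h <= 0 or w <= 0:
        raise ValueError("region must be non-empty")

    pooled = []
    for n in levels:
        for channel in feature_maps:
            for i in range(n):
                # Assumed bin edges: floor for the start, ceil for the end,
                # so every bin covers at least one cell.
                r0 = top + (i * h) // n
                r1 = top + ((i + 1) * h + n - 1) // n
                for j in range(n):
                    c0 = left + (j * w) // n
                    c1 = left + ((j + 1) * w + n - 1) // n
                    pooled.append(max(
                        channel[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1)
                    ))
    return pooled


def make_maps(channels, height, width):
    """Hypothetical helper: synthetic feature maps with increasing values."""
    return [
        [[float(ch * 100 + r * width + c) for c in range(width)]
         for r in range(height)]
        for ch in range(channels)
    ]


# Two feature maps of different spatial size give vectors of equal length.
small = spatial_pyramid_pool(make_maps(2, 5, 7))
large = spatial_pyramid_pool(make_maps(2, 13, 9))
print(len(small), len(large))

# Pooling a sub-window reuses the same maps instead of recomputing them.
window = spatial_pyramid_pool(make_maps(2, 5, 7), levels=(1,), region=(1, 1, 3, 4))
print(window)
```

In a detection setting, the feature maps would come from one forward pass over the full image, and `region` would be a candidate window projected into feature-map coordinates; how that projection is done is outside the scope of this sketch.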

Year:  2015        PMID: 26353135     DOI: 10.1109/TPAMI.2015.2389824

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0162-8828            Impact factor:   6.226


  264 in total

1.  Non-invasive prediction of tissue Doppler-derived E/e' ratio using lung Doppler signals.

Authors:  Mina M Benjamin; Christopher Bianco; Marco Caccamo; George Sokos; Nobuyuki Kagiyama; Sirish Shrestha; Grace Verzosa; Partho P Sengupta
Journal:  Eur Heart J Cardiovasc Imaging       Date:  2020-09-01       Impact factor: 6.875

2.  Mixed Maximum Loss Design for Optic Disc and Optic Cup Segmentation with Deep Learning from Imbalanced Samples.

Authors:  Yong-Li Xu; Shuai Lu; Han-Xiong Li; Rui-Rui Li
Journal:  Sensors (Basel)       Date:  2019-10-11       Impact factor: 3.576

3.  [Palm vein recognition based on end-to-end convolutional neural network].

Authors:  Dongyang Du; Lijun Lu; Ruiyang Fu; Lisha Yuan; Wufan Chen; Yaqin Liu
Journal:  Nan Fang Yi Ke Da Xue Xue Bao       Date:  2019-02-28

4.  Simultaneous arteriole and venule segmentation with domain-specific loss function on a new public database.

Authors:  Xiayu Xu; Rendong Wang; Peilin Lv; Bin Gao; Chan Li; Zhiqiang Tian; Tao Tan; Feng Xu
Journal:  Biomed Opt Express       Date:  2018-06-15       Impact factor: 3.732

5.  DEEP CONVOLUTIONAL NEURAL NETWORKS FOR IMAGING DATA BASED SURVIVAL ANALYSIS OF RECTAL CANCER.

Authors:  Hongming Li; Pamela Boimel; James Janopaul-Naylor; Haoyu Zhong; Ying Xiao; Edgar Ben-Josef; Yong Fan
Journal:  Proc IEEE Int Symp Biomed Imaging       Date:  2019-07-11

6.  Deep Learning Classifiers for Automated Detection of Gonioscopic Angle Closure Based on Anterior Segment OCT Images.

Authors:  Benjamin Y Xu; Michael Chiang; Shreyasi Chaudhary; Shraddha Kulkarni; Anmol A Pardeshi; Rohit Varma
Journal:  Am J Ophthalmol       Date:  2019-08-22       Impact factor: 5.258

7.  A multi-scale residual network for accelerated radial MR parameter mapping.

Authors:  Zhiyang Fu; Sagar Mandava; Mahesh B Keerthivasan; Zhitao Li; Kevin Johnson; Diego R Martin; Maria I Altbach; Ali Bilgin
Journal:  Magn Reson Imaging       Date:  2020-09-01       Impact factor: 2.546

8.  Salient Object Detection Techniques in Computer Vision-A Survey. (Review)

Authors:  Ashish Kumar Gupta; Ayan Seal; Mukesh Prasad; Pritee Khanna
Journal:  Entropy (Basel)       Date:  2020-10-19       Impact factor: 2.524

9.  Medical Image Retrieval Using Multi-Texton Assignment.

Authors:  Qiling Tang; Jirong Yang; Xianfu Xia
Journal:  J Digit Imaging       Date:  2018-02       Impact factor: 4.056

10.  Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis.

Authors:  Shancheng Fang; Hongtao Xie; Zhineng Chen; Yizhi Liu; Yan Li
Journal:  Neuroinformatics       Date:  2018-10