Boosting Convolutional Features for Robust Object Proposals
Deep Convolutional Neural Networks (CNNs) have demonstrated excellent performance in image classification, but still show room for improvement in object-detection tasks with many categories, in particular for cluttered scenes and occlusion. Modern detection algorithms like Regions with CNNs (Girshick et al., 2014) rely on Selective Search (Uijlings et al., 2013) to propose regions which with high probability represent objects, where in turn CNNs are deployed for classification. Selective Search represents a family of sophisticated algorithms that are engineered with multiple segmentation, appearance and saliency cues, typically coming with a significant run-time overhead. Furthermore, (Hosang et al., 2014) have shown that most methods suffer from low reproducibility due to unstable superpixels, even for slight image perturbations. Although CNNs are subsequently used for classification in top-performing object-detection pipelines, current proposal methods are agnostic to how these models parse objects and their rich learned representations. As a result they may propose regions which may not resemble high-level objects or totally miss some of them. To overcome these drawbacks we propose a boosting approach which directly takes advantage of hierarchical CNN features for detecting regions of interest fast. We demonstrate its performance on ImageNet 2013 detection benchmark and compare it with state-of-the-art methods.
READ FULL TEXT