DiCENet: Dimension-wise Convolutions for Efficient Networks

06/08/2019
by Sachin Mehta, et al.

In this paper, we propose a new CNN model, DiCENet, which is built using: (1) dimension-wise convolutions and (2) efficient channel fusion. The introduced blocks maximize the use of information in the input tensor by learning representations across all dimensions while simultaneously reducing the complexity of the network and achieving high accuracy. Our model shows significant improvements over state-of-the-art models across various visual recognition tasks, including image classification, object detection, and semantic segmentation. Our model delivers either the same or better performance than existing models with fewer FLOPs, including task-specific models. Notably, DiCENet delivers competitive performance to neural architecture search-based methods at fewer FLOPs (70-100 MFLOPs). On MS-COCO object detection, DiCENet is 4.5% more accurate and has 5.6 times fewer FLOPs than YOLOv2. On the PASCAL VOC 2012 semantic segmentation dataset, DiCENet is 4.3% more accurate and has 3.2 times fewer FLOPs than a recent efficient semantic segmentation network, ESPNet. Our source code is available at <https://github.com/sacmehta/EdgeNets>
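The core idea of a dimension-wise convolution is to generalize depth-wise convolution: rather than learning one filter per channel only, a per-slice filter is learned along each of the three dimensions of the input tensor (channels, height, and width), and the three outputs are concatenated so that every dimension contributes learned representations. The following is a minimal NumPy sketch of that idea under stated assumptions (3x3 filters, stride 1, zero padding); function names and shapes are illustrative, not the authors' implementation:

```python
import numpy as np

def slicewise_conv(x, kernels):
    """Convolve each slice x[c] with its own k x k filter kernels[c].

    x: (C, H, W) tensor; kernels: (C, k, k), one filter per slice.
    Uses 'same' zero padding and stride 1 (assumptions for this sketch).
    """
    C, H, W = x.shape
    k = kernels.shape[1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.empty((C, H, W))
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + k, j:j + k] * kernels[c])
    return out

def dimwise_conv(x, k_channel, k_height, k_width):
    """Apply a per-slice convolution along each tensor dimension.

    x: (C, H, W). k_channel: (C, k, k), k_height: (H, k, k),
    k_width: (W, k, k). Returns the concatenation of the three
    dimension-wise responses, shape (3C, H, W).
    """
    # Channel-wise: the familiar depth-wise convolution.
    yc = slicewise_conv(x, k_channel)
    # Height-wise: treat height slices (H, C, W) as the "channels".
    yh = slicewise_conv(x.transpose(1, 0, 2), k_height).transpose(1, 0, 2)
    # Width-wise: treat width slices (W, C, H) as the "channels".
    yw = slicewise_conv(x.transpose(2, 0, 1), k_width).transpose(1, 2, 0)
    return np.concatenate([yc, yh, yw], axis=0)
```

In the paper's DiCE unit, this concatenated tensor is then combined by the efficient channel fusion step; the sketch above covers only the dimension-wise filtering.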
